Announcement

**Vim_User** · 09 July 2013, 11:45 AM

Originally posted by bridgman View Post

Sounds like something the latest DPM code in 3.11 would help with...

One could think so, but it doesn't.

**ChrisXY** · 09 July 2013, 11:50 AM

Originally posted by agd5f View Post

With the providers stuff in xserver 1.14, powerXpress should work pretty well on most systems.

Yes, except until the "auto poweroff" feature is in mainline we have these problems: https://bugzilla.kernel.org/show_bug.cgi?id=51381

**bridgman** · 09 July 2013, 01:53 PM

Originally posted by Vim_User View Post

One could think so, but it doesn't.

This (DPM not reducing temperatures) is on your HD3200, right ?

**Ericg** · 09 July 2013, 02:13 PM

Originally posted by agd5f View Post

With the providers stuff in xserver 1.14, powerXpress should work pretty well on most systems.

Hey Alex, lets throw X out the window for the sake of this question. You've got Wayland, or some other magical display manager that you (the devs) created yourself and everything is goldilocks and perfect. Also, let's throw backwards compatibility in-kernel out the window too. You've basically got a blank slate.

Whats it gonna take (infrastructure wise) to get Nvidia Optimus, PowerXpress, SLI and CrossFire working on Linux? Throw out most of the graphics stack and power management infrastructure and start over? Is the infrastructure itself solid and good and we just have to expand on it?

I know Dave did what he could with X in 1.14, and its definitely appreciated. But as the kernel bug, and your comments ON that bug mention, these are symptoms of problems.

VGAswitcharoo is a manual switch, which is fine when you want to override driver logic, but whats it gonna take to get the logic in place? And I mean perfectly running, like Windows, seamless, zero user-input necessary: the discrete card automatically takes over when its needed, shuts down when it isn't, and SLI and Crossfire can upload to the same buffers to have interwoven rendering done (as seamless as they can-- even on Windows I dont think its completely seamless)

I would think DMA-BUF and Render Nodes would help with SLI and CrossFire, DMA-BUF handling the sharing of buffers between two graphics cards, and Render Nodes allowing the second card to render content even though it doesn't actually have a display.

EDIT: "Throw out most of the graphics stack and power management infrastructure and start over?" --> Not the new DPM code, I mean the over all PCI Power Management code

**Vim_User** · 09 July 2013, 03:35 PM

Originally posted by bridgman View Post

This (DPM not reducing temperatures) is on your HD3200, right ?

Yes.
To give a bit more info, I am not sure if it is the video chip that heats up the system, since it seems that there is no supported temperature sensor on that device. The only temperature reported in my system (besides a bogus sensor that always reports 30?C) is the CPU temperature and this temperature is to high with the radeon drivers, while in reasonable ranges with Catalyst.

Seems to me like I am the unlucky one, just tested the latest drm-next-3.11 on my HD6870 and I still have heavy artifacts (maybe even a crash, the system becomes unresponsive). If I wouldn't know better (card works fine with Windows 7, Catalyst/Linux and radeon in kernel 3.9.9, also worked with drm-next-3.11-wip5) I would say hardware error, it definitely looks like one.

No luck for me with drm-next-3.11, sadly.

**agd5f** · 09 July 2013, 04:15 PM

Originally posted by Ericg View Post

Hey Alex, lets throw X out the window for the sake of this question. You've got Wayland, or some other magical display manager that you (the devs) created yourself and everything is goldilocks and perfect. Also, let's throw backwards compatibility in-kernel out the window too. You've basically got a blank slate.

Whats it gonna take (infrastructure wise) to get Nvidia Optimus, PowerXpress, SLI and CrossFire working on Linux? Throw out most of the graphics stack and power management infrastructure and start over? Is the infrastructure itself solid and good and we just have to expand on it?

I know Dave did what he could with X in 1.14, and its definitely appreciated. But as the kernel bug, and your comments ON that bug mention, these are symptoms of problems.

VGAswitcharoo is a manual switch, which is fine when you want to override driver logic, but whats it gonna take to get the logic in place? And I mean perfectly running, like Windows, seamless, zero user-input necessary: the discrete card automatically takes over when its needed, shuts down when it isn't, and SLI and Crossfire can upload to the same buffers to have interwoven rendering done (as seamless as they can-- even on Windows I dont think its completely seamless)

I would think DMA-BUF and Render Nodes would help with SLI and CrossFire, DMA-BUF handling the sharing of buffers between two graphics cards, and Render Nodes allowing the second card to render content even though it doesn't actually have a display.

Both hybrid graphics and crossfire require large amounts of driver independent infrastructure. Just about everything you'd need from the drivers is already there. Someone who really wants those features would need to sit down and just do the work and for the most part.

For hybrid graphics it really doesn't have anything to do with power management per se, and just about everything you need from the driver is already there. You just need a windowing system specific infrastructure to enumerate and select which GPU they want to render with and call down to the kernel and turn the dGPU on/off as needed. You have to teach X or wayland or mir to pick the GPU it wants to render with and make sure it's turned on when they want to use it and turned off when they don't. From the driver's perspective, it just looks like resume and suspend with a little APCI magic thrown in completely power on/off the device. Whether you use vgaswitcheroo or a custom drm ioctl, the driver code is pretty much the same. Once the window system code to handle this is in place any remaining bugs around the edges in the driver can be fixed, but the main effort is device independent.

For crossfire, you'd need some sort of layer on top of the 3D driver what would dispatch to multiple instances of the driver. Once again, largely driver independent.

Anyone interested in either of these features could start working on them without really having to any of the low level hw details. Unfortunately, neither of these features are a high priority for paying Linux customers.

**Ericg** · 09 July 2013, 04:33 PM

Originally posted by agd5f View Post

Anyone interested in either of these features could start working on them without really having to any of the low level hw details. Unfortunately, neither of these features are a high priority for paying Linux customers.

And thats fine, but this makes it clear to others what amount of work would need to be done.

Originally posted by agd5f View Post

Both hybrid graphics and crossfire require large amounts of driver independent infrastructure. Just about everything you'd need from the drivers is already there. Someone who really wants those features would need to sit down and just do the work and for the most part.

For hybrid graphics it really doesn't have anything to do with power management per se, and just about everything you need from the driver is already there. You just need a windowing system specific infrastructure to enumerate and select which GPU they want to render with and call down to the kernel and turn the dGPU on/off as needed. You have to teach X or wayland or mir to pick the GPU it wants to render with and make sure it's turned on when they want to use it and turned off when they don't. From the driver's perspective, it just looks like resume and suspend with a little APCI magic thrown in completely power on/off the device. Whether you use vgaswitcheroo or a custom drm ioctl, the driver code is pretty much the same. Once the window system code to handle this is in place any remaining bugs around the edges in the driver can be fixed, but the main effort is device independent.

For crossfire, you'd need some sort of layer on top of the 3D driver what would dispatch to multiple instances of the driver. Once again, largely driver independent.

So the next step for Optimus-style laptops is going to be at the Wayland / X / Mir level, correct? Say a window-hint (for simplicity's sake) that says "I would like to be rendered on the $powersave / $performance GPU." Which then passes that request onto the driver, who then has to decide what to do with that information?

I think it was... Dave's talk, where he said that there was an issue with lspci hanging because the device had to be basically re-initialized. Is that going to be an issue for the driver? Like when the card is OFF because nothing is rendering on it, does the kernel still knows that THAT PCI device exists? Or is it "out of sight out of mind" for the kernel where the only devices that exist are ones that have power?

I'm asking because im wondering if we need a new ACPI mode, something deeper than sleep but more alive than totally off-- so that the kernel doesn't lose track of a discrete GPU just because its not in use.

**Hamish Wilson** · 09 July 2013, 05:09 PM

Originally posted by bridgman View Post

AFAIK the rv610/630 and rs780 were the first with any kind of DPM hardware, so older chips (r600 and earlier) rely on the driver for all power management.

So basically unless a different solution is created for these cards they will be stuck with static PM, if I get your meaning correctly?

**agd5f** · 09 July 2013, 05:13 PM

Originally posted by Ericg View Post

So the next step for Optimus-style laptops is going to be at the Wayland / X / Mir level, correct? Say a window-hint (for simplicity's sake) that says "I would like to be rendered on the $powersave / $performance GPU." Which then passes that request onto the driver, who then has to decide what to do with that information?

It's a little more complex than that. Most of the logic is in the window system, you don't really need anything else in the kernel driver. The driver just does whatever userspace asks for; e.g., render to this buffer, render to that buffer, etc. The kernel driver doesn't really need to care what sort of policy the windowing system employs.

For a one time app, you can just list all the gpus in the system and let the user pick which gpu they want to use, then when the app is done, the window system can ask the kernel to turn off the gpu. However for longer running apps like the system compositor, what do you do when the gpu it's running on changes? The different gpus may support GL versions and extensions. When the GPU changes, you lose your GL context and all your surfaces and you have to start fresh on a new GPU. Making that happen relatively seamlessly is where it gets tricky.

Originally posted by Ericg View Post

I think it was... Dave's talk, where he said that there was an issue with lspci hanging because the device had to be basically re-initialized. Is that going to be an issue for the driver? Like when the card is OFF because nothing is rendering on it, does the kernel still knows that THAT PCI device exists? Or is it "out of sight out of mind" for the kernel where the only devices that exist are ones that have power?

I'm asking because im wondering if we need a new ACPI mode, something deeper than sleep but more alive than totally off-- so that the kernel doesn't lose track of a discrete GPU just because its not in use.

It's not a driver issue. lspci walks the pci bus and lists devices. If the card is powered off, it won't show up on the bus. So you'd have to power the device back up when running lspci which adds latency to lspci. IIRC, Dave already had that working, the issue was mainly the added latency. OTOH, how often do you run lspci in your everyday work. I don't think a little extra latency is a big deal. The other issue is hdmi audio. When the gpu is powered off, the hdmi audio pci device goes away as well, so you'd need to power up the gpu for hdmi audio as well.

**agd5f** · 09 July 2013, 05:15 PM

Originally posted by Hamish Wilson View Post

So basically unless a different solution is created for these cards they will be stuck with static PM, if I get your meaning correctly?

Those cards only support static pm (in addition to clockgating). They didn't have the dynamic reclocking features of the newer asics.

Announcement

Radeon DRM: Dynamic Power Management Updates

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment