Radeon DRM: Dynamic Power Management Updates

  • #61
    Originally posted by agd5f View Post
    It's a little more complex than that. Most of the logic is in the window system, you don't really need anything else in the kernel driver. The driver just does whatever userspace asks for; e.g., render to this buffer, render to that buffer, etc. The kernel driver doesn't really need to care what sort of policy the windowing system employs.
    I figured it was more complex than that, I was just trying to keep the question and idea simple just for sake of clarity.

    Originally posted by agd5f View Post
    For a one-time app, you can just list all the GPUs in the system and let the user pick which GPU they want to use; then, when the app is done, the window system can ask the kernel to turn off the GPU. However, for longer-running apps like the system compositor, what do you do when the GPU it's running on changes? The different GPUs may support different GL versions and extensions. When the GPU changes, you lose your GL context and all your surfaces and you have to start fresh on a new GPU. Making that happen relatively seamlessly is where it gets tricky.
    Listing the GPUs is an interesting idea. As for the issue of the changing system compositor... what happens on Windows with Catalyst? When you switch to the discrete GPU for a game or something, does the integrated GPU turn mostly off and just display the buffer, with ALL rendering moving to the discrete? Or does the game render on the discrete while the desktop (for the sake of argument, let's say you were running the game in windowed mode) continues to be rendered on the integrated?

    Originally posted by agd5f View Post
    It's not a driver issue. lspci walks the pci bus and lists devices. If the card is powered off, it won't show up on the bus. So you'd have to power the device back up when running lspci which adds latency to lspci. IIRC, Dave already had that working, the issue was mainly the added latency. OTOH, how often do you run lspci in your everyday work. I don't think a little extra latency is a big deal. The other issue is hdmi audio. When the gpu is powered off, the hdmi audio pci device goes away as well, so you'd need to power up the gpu for hdmi audio as well.
    Yeah Dave was only talking about the added latency in regards to lspci-- otherwise it worked fine. I was just asking because I didn't know if "The card doesn't show up on the bus" was a symptom of a problem, rather than the problem itself.

    I guess I was unclear in my original post above: I only brought up the lspci issue because the card had to be re-initialized. If the card has to be reinitialized to show up on the bus, will we hit an issue where the windowing system says "I want to run something on the discrete card" and the kernel comes back with "What discrete card? There's nothing there."? Just because the card was off and therefore not showing up as a PCI device, when in reality it IS there and is working perfectly fine, it's just... off.
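
    For what it's worth, you can peek at whether the kernel currently considers a PCI device powered, without forcing it back on, via the runtime PM files in sysfs. A minimal sketch; the 0000:01:00.0 address is an assumption and will differ per machine:

```shell
# Sketch: check a dGPU's runtime PM state via sysfs.
# Assumption: 0000:01:00.0 is the discrete GPU's PCI address on this machine.
dev=/sys/bus/pci/devices/0000:01:00.0
if [ -r "$dev/power/runtime_status" ]; then
    # "active" = powered up (visible to lspci), "suspended" = powered down
    status=$(cat "$dev/power/runtime_status")
else
    status="unknown (device absent or runtime PM not exposed)"
fi
echo "dGPU runtime PM status: $status"
```

    A device reporting "suspended" here is the "off but still there" case described above: the kernel driver still knows about it even though a bus scan would not see it.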
    All opinions are my own not those of my employer if you know who they are.



    • #62
      Originally posted by Ericg View Post
      Listing the GPUs is an interesting idea. As for the issue of the changing system compositor... what happens on Windows with Catalyst? When you switch to the discrete GPU for a game or something, does the integrated GPU turn mostly off and just display the buffer, with ALL rendering moving to the discrete? Or does the game render on the discrete while the desktop (for the sake of argument, let's say you were running the game in windowed mode) continues to be rendered on the integrated?
      It depends on a lot of factors. Hybrid laptops can be configured in many ways by the OEM. Here are a few examples:
      1. all displays connected to both GPUs; a mux selects which GPU the displays are connected to
      2. some displays connected to both GPUs, others connected to only one GPU, with a mux for the displays attached to both
      3. all displays connected to the iGPU, no displays connected to the dGPU
      4. some displays connected only to the iGPU, others connected only to the dGPU

      Depending on how your system is wired and what you are doing, the rendering and display GPUs may be the same or different. See Dave's blog post on reverse optimus:
      So I took some time today to try and code up a thing I call reverse optimus. Optimus laptops come in a lot of flavours, but one annoying one is where the LVDS/eDP panel is only connected to the Intel and the outputs are only connected to the nvidia GPU. Under Windows, either the intel is rendering…


      Originally posted by Ericg View Post
      Yeah Dave was only talking about the added latency in regards to lspci-- otherwise it worked fine. I was just asking because I didn't know if "The card doesn't show up on the bus" was a symptom of a problem, rather than the problem itself.

      I guess I was unclear in my original post above: I only brought up the lspci issue because the card had to be re-initialized. If the card has to be reinitialized to show up on the bus, will we hit an issue where the windowing system says "I want to run something on the discrete card" and the kernel comes back with "What discrete card? There's nothing there."? Just because the card was off and therefore not showing up as a PCI device, when in reality it IS there and is working perfectly fine, it's just... off.
      For lspci, the hw doesn't need to be re-initialized, it just needs to be powered up so it shows up on the bus, but we end up doing it anyway since you need to if you have the driver loaded. Once the driver has loaded, think of turning the card on/off as resuming/suspending the driver; the same thing the driver would do when you resume/suspend the entire system. You have to turn the dGPU on at some point so that the kernel can see it and load the driver. Once the driver has loaded, you can turn the hw on/off as needed since userspace talks to the kernel driver.
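
      The manual version of this on/off dance is the vga_switcheroo debugfs interface. A sketch, assuming debugfs is mounted and the kernel was built with vga_switcheroo support (needs root; the echo lines are commented out because they actually change GPU power state):

```shell
# Sketch: query (and optionally toggle) hybrid-GPU power via vga_switcheroo.
# Assumes debugfs is mounted at /sys/kernel/debug and vga_switcheroo is enabled.
sw=/sys/kernel/debug/vgaswitcheroo/switch
if [ -r "$sw" ]; then
    cat "$sw"              # lists GPUs with their Pwr/Off state
    # echo OFF > "$sw"     # power down the GPU that is not currently in use
    # echo ON  > "$sw"     # power it back up
    result="queried"
else
    result="vga_switcheroo not available"
fi
echo "$result"
```

      Dynamic power management as discussed in this thread automates what those echo commands do by hand.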



      • #63
        Originally posted by agd5f View Post
        For lspci, the hw doesn't need to be re-initialized, it just needs to be powered up so it shows up on the bus, but we end up doing it anyway since you need to if you have the driver loaded. Once the driver has loaded, think of turning the card on/off as resuming/suspending the driver; the same thing the driver would do when you resume/suspend the entire system. You have to turn the dGPU on at some point so that the kernel can see it and load the driver. Once the driver has loaded, you can turn the hw on/off as needed since userspace talks to the kernel driver.
        Gotcha, alright, I understand now. Thanks for the clarifications, Alex. It's very much appreciated.



        • #64
          Originally posted by Vim_User View Post
          Yes.
          To give a bit more info: I am not sure if it is the video chip that heats up the system, since it seems that there is no supported temperature sensor on that device. The only temperature reported on my system (besides a bogus sensor that always reports 30°C) is the CPU temperature, and it is too high with the radeon driver, while in reasonable ranges with Catalyst.

          Seems like I am the unlucky one: I just tested the latest drm-next-3.11 on my HD6870 and I still have heavy artifacts (maybe even a crash, the system becomes unresponsive). If I didn't know better (the card works fine with Windows 7, with Catalyst/Linux and radeon in kernel 3.9.9, and also worked with drm-next-3.11-wip5), I would say hardware error; it definitely looks like one.

          No luck for me with drm-next-3.11, sadly.
          To show what I mean with artifacts: http://slackeee.de/public/drm-next-3.11.mp4



          • #65
            It looks like I still need to stick to "low profile" with my HD6850. It can't run OpenCL programs yet. Maybe I can start enjoying my card with the release of 3.13...



            • #66
              Originally posted by Death Knight View Post
              It looks like I still need to stick to "low profile" with my HD6850. It can't run OpenCL programs yet. Maybe I can start enjoying my card with the release of 3.13...
              Is DPM not working for you? AFAIK "low profile" doesn't really coexist with DPM...

              OpenCL support should be unrelated to DPM, shouldn't it?
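
              A quick way to check which power management path the driver is actually using is the radeon sysfs knobs. A sketch; the file names assume the radeon sysfs interface of this era, and card0 may not be the radeon device on every system:

```shell
# Sketch: inspect radeon power management sysfs files (card0 assumed).
# power_method shows "profile", "dynpm", or "dpm"; power_profile shows
# e.g. "low"/"auto"; power_dpm_state shows e.g. "balanced" when DPM is active.
d=/sys/class/drm/card0/device
for f in power_method power_profile power_dpm_state; do
    if [ -r "$d/$f" ]; then
        printf '%s: %s\n' "$f" "$(cat "$d/$f")"
    else
        printf '%s: not exposed\n' "$f"
    fi
done
checked=yes
```

              If power_method reads "profile", the old profile code (including "low") is in charge rather than DPM.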



              • #67
                UVD still does not work and causes a soft lockup with newest-everything (E-350 and HD 6870) for me.

                [ 4.238632] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 5.249853] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 6.261076] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 7.272293] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 8.283510] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 9.294728] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 10.305945] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 11.317170] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 12.328388] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 13.339605] [drm:r600_uvd_init] *ERROR* UVD not responding, trying to reset the VCPU!!!
                [ 13.359480] [drm:r600_uvd_init] *ERROR* UVD not responding, giving up!!!
                [ 13.359531] [drm:evergreen_startup] *ERROR* radeon: error initializing UVD (-1).



                • #68
                  Originally posted by d2kx View Post
                  UVD still does not work and causes a soft lockup with newest-everything (E-350 and HD 6870) for me.
                  just get latest firmware files from http://people.freedesktop.org/~agd5f/radeon_ucode/
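
                  A sketch for sanity-checking that UVD firmware is installed at all (the *_uvd.bin glob is a pattern, not a claim about which specific file a given card needs; that depends on the GPU family):

```shell
# Sketch: check whether any UVD firmware files are installed.
# The exact file required varies by GPU family; this only counts candidates.
fw_count=$(ls /lib/firmware/radeon/*_uvd.bin 2>/dev/null | wc -l)
echo "UVD firmware files found: $fw_count"
# dmesg | grep -i uvd   # then look for "UVD initialized successfully" vs. errors
```

                  If the count is zero, the "UVD not responding" errors above would be expected; if files are present, the dmesg output is the next thing to check.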



                  • #69
                    Originally posted by Vanek View Post
                    just get latest firmware files from http://people.freedesktop.org/~agd5f/radeon_ucode/
                    They are installed.



                    • #70
                      Linux 3.11-rc0 here, and I can report that UVD is finally working with my E-450. Since the E-350 and E-450 are basically the same, you shouldn't have issues.

                      Code:
                      jul 13 20:05:25 hydragiros.estrella kernel: [drm] UVD initialized successfully.
                      jul 13 20:05:25 hydragiros.estrella kernel: [drm] Enabling audio support
                      jul 13 20:05:25 hydragiros.estrella kernel: [drm] ib test on ring 0 succeeded in 0 usecs
                      jul 13 20:05:25 hydragiros.estrella kernel: [drm] ib test on ring 3 succeeded in 0 usecs
                      jul 13 20:05:25 hydragiros.estrella kernel: [drm] ib test on ring 5 succeeded
                      jul 13 20:05:25 hydragiros.estrella kernel: [drm] radeon atom DIG backlight initialized
                      OTOH, I get this when I try to enable the new Dynamic Power Management code.

                      Code:
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: GPU lockup CP stall for more than 26095msec
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: GPU lockup (waiting for 0x0000000000000003 last fence id 0x0000000000000001)
                      jul 13 20:04:45 hydragiros.estrella kernel: [drm] Disabling audio support
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: fence driver on ring 5 use gpu addr 0x0000000000177118 and cpu addr 0xffffc900053331
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: Saved 55 dwords of commands on ring 0.
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: GPU softreset: 0x00000009
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   GRBM_STATUS               = 0xB2433828
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   GRBM_STATUS_SE0           = 0x08000007
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   GRBM_STATUS_SE1           = 0x00000007
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   SRBM_STATUS               = 0x20000040
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   SRBM_STATUS2              = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_008678_CP_STALLED_STAT2 = 0x40000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_00867C_CP_BUSY_STAT     = 0x00008000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_008680_CP_STAT          = 0x80228643
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: GRBM_SOFT_RESET=0x00007F6B
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: SRBM_SOFT_RESET=0x00000100
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   GRBM_STATUS               = 0x00003828
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   GRBM_STATUS_SE0           = 0x00000007
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   GRBM_STATUS_SE1           = 0x00000007
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   SRBM_STATUS               = 0x20000040
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   SRBM_STATUS2              = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_008674_CP_STALLED_STAT1 = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_008678_CP_STALLED_STAT2 = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_00867C_CP_BUSY_STAT     = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_008680_CP_STAT          = 0x00000000
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0:   R_00D034_DMA_STATUS_REG   = 0x44C83D57
                      jul 13 20:04:45 hydragiros.estrella kernel: radeon 0000:00:01.0: GPU reset succeeded, trying to resume
                      jul 13 20:04:45 hydragiros.estrella kernel: [drm] PCIE GART of 512M enabled (table at 0x0000000000040000).
                      ...
                      jul 13 20:04:55 hydragiros.estrella kernel: radeon 0000:00:01.0: GPU lockup CP stall for more than 10000msec
                      jul 13 20:04:55 hydragiros.estrella kernel: radeon 0000:00:01.0: GPU lockup (waiting for 0x0000000000000004 last fence id 0x0000000000000001)
                      jul 13 20:04:55 hydragiros.estrella kernel: [drm:r600_ib_test] *ERROR* radeon: fence wait failed (-35).
                      jul 13 20:04:55 hydragiros.estrella kernel: [drm:radeon_ib_ring_tests] *ERROR* radeon: failed testing IB on GFX ring (-35).
                      jul 13 20:04:55 hydragiros.estrella kernel: radeon 0000:00:01.0: ib ring test failed (-35)
                      All this while I see a white screen of death.
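
                      For reference, in this kernel the new DPM code is opt-in via a module parameter (radeon.dpm=1 on the kernel command line, or dpm=1 as a module option). A sketch for checking whether it was actually enabled at boot, assuming the radeon module is loaded:

```shell
# Sketch: check the radeon DPM opt-in parameter.
# -1 = driver default (off in 3.11), 0 = explicitly disabled, 1 = enabled.
p=/sys/module/radeon/parameters/dpm
if [ -r "$p" ]; then
    dpm=$(cat "$p")
else
    dpm="radeon module not loaded"
fi
echo "radeon.dpm = $dpm"
```

                      Seeing a lockup only with dpm=1 (as in the log above) points at the new DPM code path rather than the rest of the driver.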

