
More Radeon Power Management Improvements


  • #76
    Originally posted by agd5f View Post
    "auto" only changes the clocks when switching from battery to ac and vice versa (mid <-> high) and when the monitors blank (low).
    OK for the battery/ac switch. However, closing the lid doesn't seem to change the clocks (checked over ssh).

    Besides, "dynpm" still introduces lots of flicker and micro-freezes, but doesn't always bring the clocks up when needed (HD playback, thus jerky). glxgears alone seems to trigger reclocking.
    Is dynpm still considered experimental at this point?



    • #77
      Originally posted by dmfr View Post
      OK for the battery/ac switch. However, closing the lid doesn't seem to change the clocks (checked over ssh).
      It will if your desktop environment blanks the screen when you close the lid. The dpms reclocking happens when all the monitors are in the dpms off state. It also depends on what power states are specified in your BIOS. Some cards only have one or two states, for example.
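The rule described here can be sketched as a small predicate. This is an illustrative userspace mock-up, not the actual radeon code; the struct and function names are invented for the example:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the rule above: drop to the "low" state
 * only when every enabled CRTC is in the DPMS off state. */
struct crtc_state {
    bool enabled;
    bool dpms_off;
};

static bool can_enter_low_state(const struct crtc_state *crtcs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (crtcs[i].enabled && !crtcs[i].dpms_off)
            return false; /* at least one monitor is still lit */
    }
    return true;
}
```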

      Originally posted by dmfr View Post
      Besides, "dynpm" still introduces lots of flicker and micro-freezes, but doesn't always bring the clocks up when needed (HD playback, thus jerky). glxgears alone seems to trigger reclocking.
      Is dynpm still considered experimental at this point?
      The engines have to be idle and the CPU cannot be accessing VRAM when changing the clocks. To avoid flicker, you need to change the clocks during the vblank period (this is nearly impossible with dual-head as the vblanks may never line up). Changing the clocks requires a certain amount of time depending on how long it takes the PLLs to lock. As such, if the reclock gets scheduled too close to the end of, or outside, the vblank period, you need to reschedule it. It's even harder with two clocks (engine and memory). They should be scheduled separately to make sure each can be done in the vblank period; right now they are not.

      The pm code has to take the CP lock, wait for the engines to be idle, turn off the CRTC memory requests, and unmap all VRAM mmaps when it's going to reclock. While this happens, new command buffers cannot be sent to the GPU; that's what causes the stalls. The flickering is caused by overrunning the vblank period during the reclock.

      Deciding when to dynamically reclock is also tough. Right now we just look at the number of queued command buffers, but ideally, you'd have some intelligence as to what operations are happening and how much bandwidth you'd need from memory and the engine to complete the requested command buffer smoothly. If you miss the vblank window, you need to reschedule the reclock which means the clocks may be running slower than you'd ideally want.

      Unfortunately, it's very complex to get right.
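As a rough illustration of the scheduling constraint described above (a reclock only goes ahead if it fits entirely inside the remaining vblank window, with the cost dominated by PLL lock time), here is a sketch; all names and figures are invented for the example, not the driver's actual code:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical vblank window, in microseconds since some reference. */
struct vblank_window {
    unsigned int start_us;  /* time vblank begins */
    unsigned int end_us;    /* time vblank ends   */
};

/* Returns true if a reclock taking pll_lock_us microseconds can
 * complete before scanout resumes; otherwise it has to be
 * rescheduled for a later vblank. */
static bool reclock_fits(const struct vblank_window *vbl,
                         unsigned int now_us,
                         unsigned int pll_lock_us)
{
    if (now_us < vbl->start_us || now_us >= vbl->end_us)
        return false;                        /* outside vblank entirely */
    return now_us + pll_lock_us <= vbl->end_us;  /* fits before the end */
}
```

The post notes that engine and memory reclocks should be scheduled separately; in this model that would simply mean running the check once per clock, each with its own PLL lock time.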



      • #78
        Thanks a lot for the detailed explanation, though it's a bit beyond my knowledge.

        That may explain why dynpm was smooth when only the engine clock was tweaked.
        For now I'll stick to profile=auto.



        • #79
          Uhm, since you're AMD/ATI, don't you have info about how fglrx solves this?



          • #80
            Yes, piles and piles of code that touches just about everything.



            • #81
              Power management is probably the hardest part of GPU programming since it ties into everything.



              • #82
                Are there any delays implemented currently?

                One would think it would improve the experience if a reclock happened only at least X seconds (0.5?) after the last one. That would compensate a bit for incomplete stats.
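The suggested delay could be sketched as a simple rate limiter. This is a hypothetical illustration; the names and the 500 ms figure come from the suggestion above, not from the driver:

```c
#include <assert.h>
#include <stdbool.h>

/* Minimum spacing between reclocks, per the poster's 0.5 s example. */
#define MIN_RECLOCK_INTERVAL_MS 500

struct reclock_limiter {
    unsigned long last_reclock_ms; /* timestamp of the last reclock */
};

/* Allow a reclock only if enough time has passed since the previous
 * one; otherwise keep the current clocks and let the stats settle. */
static bool try_reclock(struct reclock_limiter *rl, unsigned long now_ms)
{
    if (now_ms - rl->last_reclock_ms < MIN_RECLOCK_INTERVAL_MS)
        return false;        /* too soon: skip this reclock */
    rl->last_reclock_ms = now_ms;
    return true;
}
```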



                • #83
                  Originally posted by agd5f View Post
                  Yes, piles and piles of code that touches just about everything.
                  But it seems to work seamlessly and efficiently on a regular desktop system, while the OSS driver does not.

                  And using the keyword "auto" for "always high clock" is misleading. Maybe it should be renamed to "laptop-power-source-dependent" or something like that.



                  • #84
                    Originally posted by Ivaldi View Post
                    But it seems to work seamlessly and efficiently on a regular desktop system, while the OSS driver does not.
                    Yes, but you can't just rip the PM code out of the Catalyst driver. Even understanding what the hell it does is probably difficult, considering it touches everything. And rewriting it to touch everything in the open source stack is also a completely different issue.



                    • #85
                      Getting smooth automatic power management without graphical glitches or hangs is a huge amount of work and the open source driver just isn't there yet (we're working on it). One could argue it's the most complex part of the driver since it ties into everything:
                      - asic init
                      - bus setup
                      - drawing engine
                      - displays
                      etc.

                      Besides the trickiness of just getting reclocking to work smoothly with the drawing and display blocks, you also need the infrastructure in place to track the amounts of bandwidth or performance required for certain operations. E.g., if you are running a video on a huge display, you can only clock memory so low before you have bandwidth contention between the displays and drawing engine. Add to that certain clock combinations that don't work well together on certain boards, etc.
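The bandwidth argument can be put in back-of-the-envelope form. This is a hedged model with invented names and a deliberately crude linear scaling, not how the driver actually accounts for bandwidth:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bandwidth one display needs just for scanout, in bytes per second. */
static uint64_t display_bandwidth(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel,
                                  uint32_t refresh_hz)
{
    return (uint64_t)width * height * bytes_per_pixel * refresh_hz;
}

/* Crude model: effective memory bandwidth scales linearly with the
 * memory clock. A candidate clock is acceptable only if scanout
 * still leaves headroom for the drawing engine. */
static bool mem_clock_acceptable(uint64_t bus_bytes_per_clock,
                                 uint64_t mem_clock_hz,
                                 uint64_t scanout_bw,
                                 uint64_t engine_bw_needed)
{
    uint64_t total = bus_bytes_per_clock * mem_clock_hz;
    return total >= scanout_bw + engine_bw_needed;
}
```

Even this toy version shows the contention the post describes: a big enough display raises the scanout term until low memory clocks stop being viable, before the drawing engine is considered at all.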

                      The pm code in the closed source driver is designed to work with a completely different driver stack, so it's not really possible to just use it. Plus there are 3rd party modules in there for things like i2c thermal chips that are not AMD's IP to release.



                      • #86
                        There are maybe 20-30 man-years of accumulated effort in the fglrx power management code.

                        IMO the open source PM code is doing remarkably well considering how new it is. As agd5f said, the driver architectures are totally different so pulling code from fglrx into the open stack would be more work than writing the open PM code from scratch.



                        • #87
                          More like regressions. It doesn't seem to be working in Lubuntu.

                          I have to do a hard reboot since the laptop is unresponsive.



                          • #88
                            I tried the recent stuff. The good thing is that the 2D driver is now stable for me: no more hard locks. On the other hand, KMS, DRI2, or whatever new thing has been introduced works so-so.

                            Performance is poor, and I'm unsure about the power management. The dynamic option is not really usable yet. I understand there are technical problems to be solved, but isn't it trying to be too smart/aggressive? If I read the past comments correctly, the flickering is due to frequency changes not being masked properly or something; what I don't get is why it is reclocking constantly. For instance, bringing a yakuake terminal up front causes some flickering. I wouldn't have thought that was needed; my old card can surely handle it at its lowest power state. Isn't it a bit trigger-happy?

                            I can't say much about the 'low' and 'high' profiles, because I have no way to measure the temperature of this machine. What surprised me is that there is no difference in performance in OpenArena: both capped at 60 FPS and choking down to the same values in the difficult parts of the map I tested. Is this the expected behavior? Are the frequency values exposed somewhere?

                            By the way, I wanted to test this with 2.6.34 and I thought it would be enough to compile the drm kernel bits from source. However, I see that the stuff from freedesktop is only the library. I'm pretty sure I used to compile both parts in the past. Did this change somehow? Is it still possible to do this?



                            • #89
                              The kernel bits of drm have moved into the kernel tree. You used to be able to compile them separately, but it was difficult to track the fast-moving kernel development, if I understood correctly.

                              I think that DRI2 is vsynced by default at the moment -- that's why OpenArena is capped at 60.

                              I've had the same experience you did with dynpm. Profiles work well and result in nice temperature drops when using the low profile. I use low by default, and switch to default if I want to run something 3d intensive.



                              • #90
                                Originally posted by pingufunkybeat View Post
                                The kernel bits of drm have moved into the kernel tree. You used to be able to compile it separately, but it was difficult to track the fast-moving kernel development, if I understood correctly.
                                Ah, so I wasn't making anything up. The reason I wanted to try an older kernel is that there's no phc (undervolting) patch for 2.6.35 yet, so any power saving gains from the graphics card are offset by the CPU, and I can't really tell how good they are.

                                I think that DRI2 is vsynced by default at the moment -- that's why OpenArena is capped at 60.
                                Do you know of an option somewhere to avoid this? The only thing I found was XV_VSYNC, but that seems to be for textured video rather than OpenGL.

                                I've had the same experience you did with dynpm. Profiles work well and result in nice temperature drops when using the low profile. I use low by default, and switch to default if I want to run something 3d intensive.
                                Yes, that's what I used to do with fglrx, and switching manually didn't bother me. Of course, dynpm will be a nice feature once it works transparently.

