AMDGPU Linux Driver No Longer Lets You Have Unlimited Control To Lower Your Power Limit


  • #51
    Originally posted by Eudyptula
    Why are so many people advocating for making choices for other people instead of educating and informing?

    Having the possibility to unlock full control doesn't have to be done without any warning and information. There should be a warning. There should be information.

    In extreme cases, the warranty could be voided. Of course, it would have to be reasonable and in accordance with the specific hardware model, chip, generation, hardware configuration, product segment binning, power delivery circuit design etc. If not, then that's like setting a limit on tire pressure across all cars even though different cars and tires run on different tire pressures.
    In this case the manufacturer is setting a limit. If CachyOS were to intentionally disable the limit and didn't include ways to inform and educate the users about it, then CachyOS would be at fault when their users mess up their hardware. Ideally, it would be hidden behind some kernel flag like amdgpu.extreme_uv=1 so the user would have to go out of their way to learn about it and enable it.

    Comment


    • #52
      Setting power limits too low generally doesn't break hardware... what it does do, though, is make errors more likely... and nobody wants to spend time tracking down errors because people tuned their GPUs too low.

      Comment


      • #53
        Originally posted by stormcrow

        ... ... *head shakes at the pseudo logic*
        Yes, I do shake my head at the pseudo logic in your original comment. Glad you realize it.

        Comment


        • #54
          Originally posted by cb88
          Setting power limits too low generally doesn't break hardware... what it does do, though, is make errors more likely... and nobody wants to spend time tracking down errors because people tuned their GPUs too low.
          How does that work? How would a lower TDP produce errors? And why wouldn't you get errors when you reduce the max frequency which is allowed by AMD?

          Comment


          • #55
            Originally posted by M@GOid

            I was doing that experiment myself. 2 1080p monitors (via DisplayPort), 144 and 120Hz. But in my case, if the first one is set to 144Hz and the second to 60Hz, the idle power stays at the minimum. But my card is an RX6600 (from Sapphire), not an RX6800.
            Having dug further into it, it seems a few people have luck with some monitor combinations and/or modified vertical blanking.

            My 3440x1400 seems to be the culprit here, nothing helps.

            I guess it's not the end of the world, every single light bulb in the house used to be 40-65W and now they are 5-7W, there's always something somewhere that can be done better or gets worse. Also when the screens sleep it goes back to 8W. I don't expect it to be ever fixed actually, as it seems to be a choice made to reduce flickering risks.

            Also, not all NVIDIA GPUs are immune to it, and it seems mostly fixed in the 7XXX series despite an atrocious start at 100+W idle. I'm waiting for next gen at least to upgrade anyway; spending hundreds to save a few bucks per year would be dumb.

            Comment


            • #56
              Originally posted by Anux
              How does that work? How would a lower TDP produce errors? And why wouldn't you get errors when you reduce the max frequency which is allowed by AMD?
              Depending on how everything interacts, lowering the TDP could send fewer volts for a given frequency state, and when that happens errors can occur. On my 580, some states could be lowered by up to 100 mV and others by none before crashing would occur. When you lower with a heuristic, you can't know for sure how much it will undervolt, or in which states, to reach your desired power draw.

              Comment


              • #57
                Originally posted by skeevy420
                lowering the TDP could send fewer volts for a given frequency state
                No, that's undervolting. TDP limits the selection of available FID/VID pairs (depending on your present current draw) but doesn't alter them.

                some states could be lowered by up to 100 mV and others by none before crashing would occur.
                I don't get why so many people here can't differentiate between TDP/power limit and undervolting.
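
A minimal Python sketch of the distinction drawn in this post: a power limit only narrows which predefined FID/VID pairs may be selected, whereas undervolting would edit the pairs themselves. The state table, the constant `k`, and the `P ≈ k·V²·f` estimate are hypothetical illustration values, not data from a real card and not the driver's actual selection logic.

```python
from typing import NamedTuple


class PState(NamedTuple):
    freq_mhz: int   # FID side of the pair (hypothetical value)
    volt: float     # VID side of the pair (hypothetical value)


# Hypothetical table of predefined states; real tables come from the card's firmware.
STATES = [
    PState(300, 0.80),
    PState(600, 0.85),
    PState(900, 0.90),
    PState(1120, 1.00),
    PState(1266, 1.15),
]


def estimated_power(state: PState, k: float = 0.1) -> float:
    """Rough dynamic-power estimate in watts; k is a made-up scaling constant."""
    return k * state.volt ** 2 * state.freq_mhz


def pick_state(states, power_limit_w: float, k: float = 0.1) -> PState:
    """Return the fastest state whose estimate fits under the limit.

    The table is only filtered, never modified. If nothing fits, fall back
    to the slowest state (a real GPU would throttle further from there).
    """
    allowed = [s for s in states if estimated_power(s, k) <= power_limit_w]
    if not allowed:
        return min(states, key=lambda s: s.freq_mhz)
    return max(allowed, key=lambda s: s.freq_mhz)
```

With this toy model, lowering `power_limit_w` makes `pick_state` settle on a slower entry, but every entry keeps its original voltage.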

                Comment


                • #58
                  Originally posted by Anux
                  TDP limits the selection of available FID/VID pairs (depending on your present current draw) but doesn't alter them.
                  This is exactly what the patch is solving. Before the patch, setting the TDP lower than specified "created" new values for FID or VID so that the TDP target could be reached. And now it doesn't.


                  People are talking about undervolting because of the formula I shared above: since power scales quadratically with voltage and linearly with the other variables, undervolting is the most effective way to lower power draw.
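
To put rough numbers on that scaling argument, here is a small Python sketch. It assumes the common dynamic-power approximation P ≈ C·V²·f, which may not be the exact formula from the earlier post, and the 10% figures are arbitrary examples rather than measurements.

```python
def relative_power(v_scale: float, f_scale: float) -> float:
    """Power relative to stock under the assumed model P ~ C * V^2 * f.

    v_scale and f_scale are multipliers on stock voltage and frequency;
    the capacitance term C cancels out of the ratio.
    """
    return v_scale ** 2 * f_scale


# Arbitrary example: trim voltage by 10% versus trim frequency by 10%.
undervolted = relative_power(0.9, 1.0)   # quadratic effect of voltage
underclocked = relative_power(1.0, 0.9)  # linear effect of frequency

print(f"10% undervolt  -> {undervolted:.2f} of stock power")
print(f"10% underclock -> {underclocked:.2f} of stock power")
```

Under this model the undervolt saves about 19% while the underclock saves 10%, and only the underclock costs performance. Whether a given chip stays stable at the lower voltage is a separate question.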



                  BTW, if I were to participate in the mailing list, what I would say is that the patch is correct. The driver is not the place to mess with this kind of stuff anyway. Of course, if you really want to, you can patch the driver, but IMO what you should do is patch the BIOS of your card. This is what I do; my card even has a switch to change from BIOS A to B, so if I mess up there is always a known good version accessible.


                  This is what vendors expect. AMD should not be breaking their partners' expectations.

                  Comment


                  • #59
                    C'mon people, it's 2024 already, we don't need to still be talking in these outdated puritanical concepts of "volts" and "amps". It's time to move on and get with the times, boomers.

                    Comment


                    • #60
                      Originally posted by DumbFsck

                      This is exactly what the patch is solving. Before the patch, setting the TDP lower than specified "created" new values for FID or VID so that the TDP target could be reached. And now it doesn't.
                      I read the links from the article and this wasn't mentioned there. Do you have any resource to support your claim?

                      I never heard of any automatic undervolting in such cases. If there were room to automatically undervolt (same FID but lower VID), why wouldn't that be the standard the card gets delivered with, instead of running it at a higher voltage? Isn't the whole point of those FID/VID pairs to have the AIB tune the card and deliver it with thoroughly tested values?

                      On my current RX 480 I have 7 power states (FID/VID pairs) that are predefined by the AIB. When I lower the power limit to its minimum, some of the highest pairs won't get selected (depending on workload), but at idle it still selects the lowest pairs without altering any of them and uses much less power, so there clearly is much room to lower the power limit even more.

                      You should clearly be able to reduce the power limit until you're locked at the lowest state (I think 600 MHz at 0.85 V in my case) and then still go lower with throttling. At that point the card would be unusably slow, but when I return to standard settings it would work just as before, with no hardware damage.

                      People are talking about undervolting because of the formula I shared above: since power scales quadratically with voltage and linearly with the other variables, undervolting is the most effective way to lower power draw.
                      That's perfectly fine and not at all in conflict with the scenario I described.

                      BTW, if I were to participate in the mailing list, what I would say is that the patch is correct. The driver is not the place to mess with this kind of stuff anyway. Of course, if you really want to, you can patch the driver, but IMO what you should do is patch the BIOS of your card. This is what I do; my card even has a switch to change from BIOS A to B, so if I mess up there is always a known good version accessible.
                      The whole point of this discussion is that the kernel overrides your firmware power limits. So setting them in firmware has now become useless.
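
For anyone who wants to check which limits the driver exposes on their own card, here is a hedged Python sketch. It assumes the amdgpu hwmon files `power1_cap_min`, `power1_cap` and `power1_cap_max` (values in microwatts), usually found under a directory like `/sys/class/drm/card0/device/hwmon/hwmon*/`; the card index is an assumption and differs between systems. The `clamp_request` helper is purely illustrative: the driver itself may reject an out-of-range value instead of clamping it.

```python
from pathlib import Path

# hwmon file names assumed to be exposed by amdgpu; values are in microwatts.
CAP_FILES = ("power1_cap_min", "power1_cap", "power1_cap_max")


def read_power_caps(hwmon_dir) -> dict:
    """Read whichever power-cap files exist in hwmon_dir and return them in watts."""
    caps = {}
    for name in CAP_FILES:
        path = Path(hwmon_dir) / name
        if path.exists():
            caps[name] = int(path.read_text().strip()) / 1_000_000
    return caps


def clamp_request(requested_w: float, caps: dict) -> float:
    """Illustrative helper: pull a requested limit into the advertised range.

    This is not how the kernel handles the write; it only shows which
    values fall inside [power1_cap_min, power1_cap_max].
    """
    low = caps.get("power1_cap_min", 0.0)
    high = caps.get("power1_cap_max", float("inf"))
    return max(low, min(requested_w, high))
```

Calling `read_power_caps` on each `hwmon*` directory of the card and comparing `power1_cap_min` with the value you had in mind shows how much headroom the current kernel leaves.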

                      Comment
