Undervolting/Overclocking only works sometimes?

  • Undervolting/Overclocking only works sometimes?

    Hi,

    I'm running a Vega 64 on Linux 5.5.10 with the open-source drivers, amdgpu.ppfeaturemask=0xffffffff, and a custom SCLK table written via /sys/class/drm/card0/device/pp_od_clk_voltage to undervolt the card (which, as I'm sure you know, can make a significant difference on a Vega card).
    That seems to work: according to hwmon the GPU draws less power and runs cooler, just as expected.
    However, under some circumstances it switches back to the default settings automatically and no longer accepts any changes.

    I think it depends on which monitor is connected, but I'm not 100% sure. Maybe someone can give me a hint?
    Usually I use a 1920x1200@60Hz screen connected via HDMI, but if I switch on my other screen (5120x1440@120Hz via DP), undervolting stops working.
    Is this due to the high resolution, FreeSync being activated, the refresh rate, DisplayPort in general, or something completely different?
    If any of these restricts undervolting, will that go away in the future?

    Regards
    Last edited by Berniyh; 24 March 2020, 02:43 PM.
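For readers unfamiliar with the interface mentioned in the first post, here is a minimal sketch of how one SCLK state might be written to pp_od_clk_voltage. It assumes the Vega 10 syntax of `s <level> <MHz> <mV>` followed by `c` to commit, root privileges, and the card0 path from the post. The function names and the `DEV` override are hypothetical, as are any level and clock values you pass in; only the 1.08 V figure appears later in the thread.

```bash
#!/usr/bin/env bash
# Sketch only: helper functions for editing the overdrive table.
# DEV defaults to the path from the post; override it to target another card.
DEV="${DEV:-/sys/class/drm/card0/device}"

# set_sclk_state LEVEL MHZ MILLIVOLTS
# Stages one SCLK state using the Vega 10 "s <level> <MHz> <mV>" syntax.
set_sclk_state() {
    echo "s $1 $2 $3" > "$DEV/pp_od_clk_voltage"
}

# Writing "c" commits the staged table to the hardware.
commit_od_table() {
    echo "c" > "$DEV/pp_od_clk_voltage"
}

# Writing "r" restores the default table.
reset_od_table() {
    echo "r" > "$DEV/pp_od_clk_voltage"
}
```

A call such as `set_sclk_state 7 1630 1080 && commit_od_table` would stage and commit one state; the level and clock in that call are placeholders, not the poster's actual table.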

  • #2
    I only use one monitor, but I had the same issue (on Vega with amdgpu): it happened any time a game started or the monitor's resolution or refresh rate changed.

    I already had a script with a loop that checks the GPU temperature every few seconds and sets the fan speed (I use custom fans and a custom cooler, so the default fan settings did not work). In that loop I added a check for the GPU voltage: if the voltage is higher than the maximum I undervolt to, it runs another script that undervolts/overclocks.

    Basically something like: if [[ $(cat /sys/class/drm/card0/device/hwmon/hwmon4/in0_input) -gt 1000 ]]; then undervolt_overclock_script; fi
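The one-liner above could be expanded into a small polling loop along these lines. This is a sketch, not the poster's actual script: the function names, the `VOLT_FILE`, `MAX_MV` and `REAPPLY_CMD` variables, and the 5-second default interval are assumptions, and the hwmon index (hwmon4 here, as in the post) can differ between systems and boots.

```bash
#!/usr/bin/env bash
# Sketch only: re-run an undervolt script whenever the GPU voltage
# climbs above a limit. in0_input reports the voltage in millivolts.
VOLT_FILE="${VOLT_FILE:-/sys/class/drm/card0/device/hwmon/hwmon4/in0_input}"
MAX_MV="${MAX_MV:-1000}"
REAPPLY_CMD="${REAPPLY_CMD:-undervolt_overclock_script}"

# Succeeds (exit status 0) when the reported voltage exceeds MAX_MV.
voltage_too_high() {
    local mv
    mv=$(<"$VOLT_FILE") || return 1
    [[ "$mv" -gt "$MAX_MV" ]]
}

# One pass of the check: reapply the settings only if needed.
check_once() {
    if voltage_too_high; then
        "$REAPPLY_CMD"
    fi
}

# Poll forever; the first argument is the interval in seconds.
watch_voltage() {
    local interval="${1:-5}"
    while true; do
        check_once
        sleep "$interval"
    done
}
```

Running `watch_voltage 5` as root would start the loop; nothing is executed until that function is called.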



    • #3
      I don't see an option to edit my post, but here is how I do what I described in my last post (the commented-out parts, lines 101 and 116):

      https://gist.github.com/kevinlekille...ontrol-sh-L101



      • #4
        In my case it really ignores the settings, even if I set them again.



        • #5
          Try setting power_dpm_force_performance_level to low, then back to auto, then undervolt again.
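The suggestion above might look like this in script form. It is a sketch under assumptions: the card0 path is taken from the first post, while the function name, the `DEV` override and the `SETTLE_SECS` pause are hypothetical. It needs root, and the undervolt itself still has to be reapplied afterwards.

```bash
#!/usr/bin/env bash
# Sketch only: toggle the performance level low -> auto.
DEV="${DEV:-/sys/class/drm/card0/device}"

cycle_perf_level() {
    local f="$DEV/power_dpm_force_performance_level"
    echo low > "$f"
    # Short pause so the driver can settle; the length is a guess.
    sleep "${SETTLE_SECS:-1}"
    echo auto > "$f"
}
```

After `cycle_perf_level`, the undervolt script would be run again as the last step.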



          • #6
            I tried that, but it doesn't change anything.

            However, I have now found out that it is most likely the 120 Hz refresh rate that enforces some specific state/voltages.
            By default (video BIOS?) the voltage is 1.2 V. I can lower it to 1.08 V without problems (so far …).
            If I set 5120×1440@60 Hz, it stays at 1.08 V; if I set 5120×1440@100 Hz, it goes up to 1.1 V; and if I set 5120×1440@120 Hz, it goes up to 1.2 V.

            In Windows it seems to work even when running at 120 Hz, so it doesn't seem to be a problem in general, but possibly the Linux driver is a bit more restrictive here?
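The refresh-rate comparison described above could be repeated with a small script like the following. This is only a sketch: it assumes an X11 session with xrandr, an output named `DisplayPort-0` (check `xrandr --query` for the real name), the hwmon4 voltage file from post #2, and that the undervolt has already been applied. The function name and the `OUTPUT`, `MODE`, `VOLT_FILE` and `SETTLE_SECS` variables are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: switch refresh rates and log the GPU voltage for each.
OUTPUT="${OUTPUT:-DisplayPort-0}"   # assumption: adjust to your output name
MODE="${MODE:-5120x1440}"
VOLT_FILE="${VOLT_FILE:-/sys/class/drm/card0/device/hwmon/hwmon4/in0_input}"

# log_voltage_per_rate RATE...
# Sets each refresh rate in turn, waits, then prints the voltage in mV.
log_voltage_per_rate() {
    local rate
    for rate in "$@"; do
        xrandr --output "$OUTPUT" --mode "$MODE" --rate "$rate" || return 1
        sleep "${SETTLE_SECS:-5}"
        printf '%s Hz: %s mV\n' "$rate" "$(<"$VOLT_FILE")"
    done
}
```

Calling `log_voltage_per_rate 60 100 120` would step through the three rates mentioned in the post.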



            • #7
              The only refresh-rate check I could see applies when the rate is over 120 Hz, but I'm not sure what it changes: https://github.com/torvalds/linux/bl...gpu_pm.c#L3499

              Maybe there's a check somewhere for pixels per second (something like if ((width*height*refresh_rate) > someNumber) { setGpuVoltage(1200); }), but I don't see it.

              Maybe it's done at the GPU BIOS level?
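To put numbers on the pixels-per-second idea, here is a quick calculation for the modes mentioned in this thread. It is only arithmetic on the active pixels (blanking intervals are ignored), and nothing here confirms that the driver or vBIOS uses such a threshold. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: active pixels per second = width * height * refresh rate.
pixel_rate() {
    echo $(( $1 * $2 * $3 ))
}

# Modes mentioned in the thread: width, height, refresh rate in Hz.
for mode in "1920 1200 60" "5120 1440 60" "5120 1440 100" "5120 1440 120"; do
    read -r w h hz <<< "$mode"
    printf '%sx%s@%s Hz: %s px/s\n' "$w" "$h" "$hz" "$(pixel_rate "$w" "$h" "$hz")"
done
```

The 5120x1440@120 Hz mode comes out at 884,736,000 px/s, about 6.4 times the 138,240,000 px/s of the 1920x1200@60 Hz screen.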



              • #8
                Maybe, I don't know.
                Doesn't have to be the refresh rate. Didn't yet try other resolutions at 120 Hz. Might work as well.
                So yes, it could be as you say.

                In general there is some sort of weird behavior when undervolting the GPU.
                e.g. the power drain on idle sometimes goes up to 7-9W instead of 3W like it is without undervolting. I think partly because it won't use the lower mclk states anymore.
                Since it's currently almost always under load that doesn't matter so much for me, but it's a bit weird.

                Edit: now that I'm reading the comment in the file you linked … maybe Arch is using the non-DC code path. I need to check that, and maybe compile my own kernel and test with DC.
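One way to check which display path is requested, without compiling a kernel, might be to read the amdgpu `dc` module parameter. This sketch assumes the parameter is exposed at /sys/module/amdgpu/parameters/dc and uses the documented meanings 1 = enable, 0 = disable, -1 = auto; the function names and the `DC_PARAM` override are hypothetical. A value of -1 only says the driver decides by itself, so the kernel log may still be needed to see what was actually chosen.

```bash
#!/usr/bin/env bash
# Sketch only: report how the amdgpu "dc" module parameter is set.
DC_PARAM="${DC_PARAM:-/sys/module/amdgpu/parameters/dc}"

# Translate the raw parameter value into a readable description.
describe_dc_param() {
    case "$1" in
        1)  echo "DC enabled" ;;
        0)  echo "DC disabled" ;;
        -1) echo "auto (driver decides)" ;;
        *)  echo "unknown value: $1" ;;
    esac
}

# Read the parameter if it is available and describe it.
report_dc() {
    if [[ -r "$DC_PARAM" ]]; then
        describe_dc_param "$(<"$DC_PARAM")"
    else
        echo "amdgpu dc parameter not readable"
    fi
}
```

Calling `report_dc` prints one line; DC can be requested explicitly with amdgpu.dc=1 on the kernel command line.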



                • #9
                  Originally posted by Berniyh
                  In general there is some sort of weird behavior when undervolting the GPU.
                  e.g. the power drain on idle sometimes goes up to 7-9W instead of 3W like it is without undervolting. I think partly because it won't use the lower mclk states anymore.
                  Since it's currently almost always under load that doesn't matter so much for me, but it's a bit weird.
                  I have the same issue: if I leave the GPU at stock settings, MCLK goes back down to its lowest state and VDDC to its minimum voltage, but as soon as I touch anything PowerPlay-related, MCLK never goes lower than state 2 (800 MHz) and the voltage stays stuck at SCLK state 6. I gave up trying to chase down the cause long ago; I'm just happy the card isn't at 1.2 V and hitting 105 °C and throttling. I'm not hopeful these issues will ever be fixed, because it's probably some kind of issue with the driver not communicating with the vBIOS, and AMD probably doesn't care about Vega anymore at this point.
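To see which states the card is sitting in, the pp_dpm_sclk and pp_dpm_mclk files can be read; the active state is the line marked with `*`. The following is a sketch assuming the card0 path from the first post and lines of the form `2: 800Mhz *`; the function name and the `DEV` override are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: print the currently active DPM state of a clock domain.
DEV="${DEV:-/sys/class/drm/card0/device}"

# active_state FILE   (FILE is pp_dpm_sclk or pp_dpm_mclk)
# Prints the line marked with '*', with the marker stripped.
active_state() {
    awk '/\*/ { sub(/[ \t]*\*[ \t]*$/, ""); print }' "$DEV/$1"
}
```

Running `active_state pp_dpm_mclk` at idle, before and after applying the undervolt, would show whether MCLK still drops to state 0.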

