WattMan in 4.17, does it work for vega10?

  • WattMan in 4.17, does it work for vega10?

    In preliminary testing, I first set power_dpm_force_performance_level to manual and then set pp_power_profile_mode to each of the profiles (0 to 4). Power consumption did not change, and sclk still hovers between 1536 and 1630 MHz.

    I'm not sure what I might be missing. Ideally I'd like a way to simply set the TDP for a given card and have amdgpu automatically choose appropriate clocks. For instance, if I manually set sclk to 1401 MHz, I almost halve the power consumption with only a very small decrease in performance, so there is a lot of potential for optimisation.

    (and to anyone who's wondering: no, I'm not mining cryptocurrencies with this card. I run other compute workloads).
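The procedure described at the top of this post can be sketched roughly as follows. This is only an illustration: the helper names `try_profile` and `current_sclk` are made up here, and the device path assumes the card shows up as card0, which may not be the case on a multi-GPU system.

```shell
#!/bin/sh
# Assumption: the Vega card is card0; override DEV otherwise.
DEV=${DEV:-/sys/class/drm/card0/device}

# try_profile DEVICE_DIR PROFILE (hypothetical helper)
# Switches the performance level to manual, then selects a power profile.
try_profile() {
    dev=$1
    profile=$2
    echo manual > "$dev/power_dpm_force_performance_level" || return 1
    echo "$profile" > "$dev/pp_power_profile_mode" || return 1
}

# current_sclk DEVICE_DIR (hypothetical helper)
# Prints the pp_dpm_sclk line for the active level, which is marked with '*'.
current_sclk() {
    grep '\*' "$1/pp_dpm_sclk"
}

# Usage (as root), cycling through the profiles mentioned in the post:
# for p in 0 1 2 3 4; do
#     try_profile "$DEV" "$p"
#     sleep 5
#     current_sclk "$DEV"
# done
```

Taking the device directory as an argument keeps the helpers usable for any card and makes them easy to try against a scratch directory first.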

  • #2
    As it turns out, the issue with manually forcing sclk to a given frequency is that power consumption is considerably higher when the GPU is idle (+30W at the wall). So instead I've written a patch to vega10_find_highest_dpm_level() that prevents the two highest power levels from being used while still allowing the lowest power level to be used when idle.

    Code:
    --- ./drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c~    2018-05-19 23:48:31.447618885 +0200
    +++ ./drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c    2018-05-19 23:57:36.837027441 +0200
    @@ -3535,7 +3535,7 @@
         uint32_t i = 0;
     
         if (table->count <= MAX_REGULAR_DPM_NUMBER) {
    -        for (i = table->count; i > 0; i--) {
    +        for (i = table->count - 2; i > 0; i--) {
                 if (table->dpm_levels[i - 1].enabled)
                     return i - 1;
             }
    With this patch, I see a decrease of about 100W at the wall when running LuxMark, with a score of about 25000 instead of ~26400 (a performance decrease of about 5%).
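To make the effect of the changed loop start concrete, here is a small shell simulation of the search. It is a sketch, not the kernel code: the function name and the list of enabled flags are hypothetical, the fallback of 0 when nothing is enabled is an assumption about what the kernel function does after the loop, and the guard for short tables is something the one-line patch itself does not have.

```shell
#!/bin/sh
# highest_enabled_level SKIP FLAG0 FLAG1 ... (hypothetical helper)
# Prints the index of the highest enabled level, ignoring the top SKIP levels.
# SKIP=0 mimics the original loop, SKIP=2 mimics "i = table->count - 2".
highest_enabled_level() {
    skip=$1
    shift
    count=$#
    # Guard against count - skip going negative; the patched C code uses
    # unsigned arithmetic and has no such check.
    if [ "$count" -le "$skip" ]; then
        echo 0
        return
    fi
    limit=$((count - skip))   # levels 0 .. limit-1 are examined
    idx=0
    best=0                    # assumed fallback when no level is enabled
    for flag in "$@"; do
        if [ "$idx" -lt "$limit" ] && [ "$flag" = 1 ]; then
            best=$idx
        fi
        idx=$((idx + 1))
    done
    echo "$best"
}

# Example with a made-up table of eight levels, all enabled:
# highest_enabled_level 0 1 1 1 1 1 1 1 1    # original loop
# highest_enabled_level 2 1 1 1 1 1 1 1 1    # patched loop
```

With eight enabled levels, the first call prints 7 and the second prints 5, so levels 6 and 7 are never selected while level 0 stays available at idle.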


    • #3
      For the record, here's a script to recompile amdgpu.ko on Debian without having to go through the (very slow) dpkg-buildpackage:

      Code:
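      #!/bin/sh
      # Rebuild only amdgpu.ko for the running Debian kernel.
      set -e
      KVER=$(uname -r)
      # Build dependencies, then let module-assistant set up the kernel headers.
      sudo apt-get install module-assistant bison flex libelf-dev libssl-dev bc
      sudo m-a prepare
      # Fetch the kernel source matching the running kernel.
      apt-get source "linux-image-$KVER"
      cd linux-*
      # Apply the one-line change to vega10_find_highest_dpm_level().
      perl -i -pe 's/i = table->count/i = table->count - 2/g' drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
      # Reuse the distribution's symbol versions so the module loads into the running kernel.
      cp "/usr/src/linux-headers-$KVER/Module.symvers" .
      make oldconfig
      make prepare
      make modules_prepare
      make SUBDIRS=scripts/mod
      make SUBDIRS=drivers/gpu/drm/amd/amdgpu modules
      # Install over the stock module and refresh the module dependency list.
      sudo cp drivers/gpu/drm/amd/amdgpu/amdgpu.ko "/lib/modules/$KVER/kernel/drivers/gpu/drm/amd/amdgpu/amdgpu.ko"
      sudo depmod -a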
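One caveat with the substitution step is that a global search-and-replace changes every matching line in the file. A more cautious variant, sketched below, only patches when the exact loop from the diff is found once. The function name `patch_dpm_loop` is made up for this example, and `sed -i` assumes GNU sed, as shipped on Debian.

```shell
#!/bin/sh
# patch_dpm_loop FILE (hypothetical helper)
# Rewrites the loop start only if the original loop appears exactly once.
patch_dpm_loop() {
    file=$1
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    # Count lines containing the exact, unpatched loop header.
    matches=$(grep -cF 'for (i = table->count; i > 0; i--)' "$file")
    if [ "$matches" -ne 1 ]; then
        echo "expected exactly 1 matching loop in $file, found $matches" >&2
        return 1
    fi
    sed -i 's/for (i = table->count; i > 0; i--)/for (i = table->count - 2; i > 0; i--)/' "$file"
}

# Usage, from the top of the kernel source tree:
# patch_dpm_loop drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
```

Running it a second time fails on purpose, since the unpatched loop is no longer present.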
      Incidentally, the latest 18.30 amdgpu package (amdgpu-pro-18.30-633530.tar.xz) has the annoying habit of checking whether it's running on Ubuntu 18.04. The check is done by amdgpu-core_18.30-633530_all.deb:

      Code:
      if [ "$VERSION_ID" != "18.04" ] ; then
          >&2 echo "ERROR: This package can only be installed on Ubuntu 18.04."
          exit 1
      fi
      This can be worked around by temporarily adding VERSION_ID=18.04 to /etc/os-release while installing the package. For Vega, only opencl-amdgpu-pro-icd_18.30-633530_amd64.deb, amdgpu-pro-core_18.30-633530_all.deb and amdgpu-core_18.30-633530_all.deb are required for OpenCL support; in fact, it might be even easier to extract libamdocl64.so and amdocl64.icd than to install those .deb packages.
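The temporary os-release change could be wrapped so that the file is always restored afterwards. The sketch below uses a made-up helper name, `with_fake_version_id`, and assumes the installer check reads VERSION_ID by sourcing the file, so that an appended line takes precedence over any earlier one.

```shell
#!/bin/sh
# with_fake_version_id OS_RELEASE_FILE COMMAND [ARGS...] (hypothetical helper)
# Appends VERSION_ID="18.04", runs the command, then restores the file.
with_fake_version_id() {
    os_release=$1
    shift
    backup=$(mktemp) || return 1
    cp "$os_release" "$backup" || return 1
    # Appended last so it wins if the file is sourced as shell variables.
    echo 'VERSION_ID="18.04"' >> "$os_release"
    "$@"
    status=$?
    # Write the old contents back in place (keeps symlinks and permissions).
    cat "$backup" > "$os_release"
    rm -f "$backup"
    return $status
}

# Usage (as root):
# with_fake_version_id /etc/os-release dpkg -i amdgpu-core_18.30-633530_all.deb
```

The wrapped command's exit status is passed through, so a failed install is still visible to the caller.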
      Last edited by cde1; 15 October 2018, 12:13 PM.


      • #4
        Thanks debianxfce!


        • #5
          Additionally, I found out that the intel_pstate driver (which runs in "active" mode by default) does not allow the userspace governor. It either has to be disabled (but then turbo states are unavailable, as acpi-cpufreq doesn't support them) or put in passive mode (intel_pstate=passive on the kernel command line), which enables the userspace governor and makes it possible to force a given frequency with cpupower frequency-set.
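As a rough sketch of the passive-mode route: instead of rebooting with intel_pstate=passive, the mode can usually also be switched at runtime through the driver's status file in sysfs. The helper names below are made up, the 2.0GHz target is only a placeholder, and the sysfs path is the usual location on kernels that expose this file.

```shell
#!/bin/sh
# Assumption: the usual sysfs location of the intel_pstate driver.
PSTATE=${PSTATE:-/sys/devices/system/cpu/intel_pstate}

# pstate_status DIR (hypothetical helper)
# Prints the driver mode (active, passive or off), or "absent".
pstate_status() {
    if [ -r "$1/status" ]; then
        cat "$1/status"
    else
        echo absent
    fi
}

# ensure_passive DIR (hypothetical helper)
# Switches an active intel_pstate driver to passive mode.
ensure_passive() {
    dir=$1
    case $(pstate_status "$dir") in
        passive) return 0 ;;
        active)  echo passive > "$dir/status" ;;
        *)       echo "intel_pstate is off or not present in $dir" >&2
                 return 1 ;;
    esac
}

# Usage (as root); the frequency is a placeholder value:
# ensure_passive "$PSTATE" &&
#     cpupower frequency-set -g userspace &&
#     cpupower frequency-set -f 2.0GHz
```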
          Last edited by cde1; 22 August 2018, 05:20 AM.
