An Early Look At The AMD P-State CPPC Driver Performance vs. ACPI CPUFreq


  • #21
    CPPC isn't magic. You can't suddenly get max performance and min power usage at the same time. The main difference between CPPC and the old P-state interface is the level of fine-grained control: the old P-state interface gives you 3 discrete clock levels, while CPPC gives you a continuum of clock levels. The end goal is not to improve performance on workloads that require maximum CPU performance; it's to reduce power usage (and extend battery life) for most workloads. What you want is the minimum CPU clock required to finish your workload in an acceptable time, and you can save more power if you have a wider range of clocks to pick from.
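
    A quick way to see that difference from userspace is to poke at the cpufreq sysfs interface (a rough sketch; which nodes exist depends on the kernel and on which driver is loaded):

        # Which cpufreq driver is active (e.g. acpi-cpufreq or amd-pstate)
        cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver

        # acpi-cpufreq advertises a handful of discrete P-states...
        cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies

        # ...while amd-pstate only advertises a min/max range that the hardware
        # can move within continuously (no scaling_available_frequencies file)
        cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq \
            /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq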



    • #22
      I currently have a Windows installation to help with benchmarking/tuning of my 5950X, as there are just more sensor readings available. I noticed that on Windows, processes get scheduled to run on my "fastest" cores more often than on Linux, where it seems more random. I always thought CPPC was responsible for this. Is this something amd-pstate could help with?



      • #23
        Hi, I had hoped for more. While my HP ProBook's idle runtime under Linux is comparable to Windows', the active-use runtime is not at all:
        browsing the web in Firefox works out to around 8-12 h on Windows but only ~3 hours on Linux, sadge.



        • #24
          Originally posted by clouddrop View Post
          I currently have a Windows installation to help with benchmarking/tuning of my 5950X, as there are just more sensor readings available. I noticed that on Windows, processes get scheduled to run on my "fastest" cores more often than on Linux, where it seems more random. I always thought CPPC was responsible for this. Is this something amd-pstate could help with?
          CPPC is just one part of the equation; the other part is preferential scheduling, which does not exist on Linux. Unfortunately, the amd_pstate effort is not related to preferential scheduling in any way.
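
          That said, on platforms that do expose CPPC through ACPI you can at least read the per-core ranking the firmware reports (a sketch; the acpi_cppc sysfs nodes are only present when the firmware advertises CPPC, and the values vary per CPU):

              # Higher highest_perf = "better" core; on parts with preferred
              # cores a few CPUs report larger values than the rest
              for c in /sys/devices/system/cpu/cpu*/acpi_cppc/highest_perf; do
                  echo "$c: $(cat "$c")"
              done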
          Last edited by intelfx; 24 September 2021, 07:48 AM.



          • #25
            Originally posted by agd5f View Post
            CPPC isn't magic. You can't suddenly get max performance and min power usage at the same time. The main difference between CPPC and the old P-state interface is the level of fine-grained control: the old P-state interface gives you 3 discrete clock levels, while CPPC gives you a continuum of clock levels. The end goal is not to improve performance on workloads that require maximum CPU performance; it's to reduce power usage (and extend battery life) for most workloads. What you want is the minimum CPU clock required to finish your workload in an acceptable time, and you can save more power if you have a wider range of clocks to pick from.
            Well, it would definitely be nice not to regress those workloads. Also, I don't think Michael found any significant power efficiency improvements either?



            • #26
              Originally posted by agd5f View Post
              CPPC isn't magic. You can't suddenly get max performance and min power usage at the same time. The main difference between CPPC and the old P-state interface is the level of fine-grained control: the old P-state interface gives you 3 discrete clock levels, while CPPC gives you a continuum of clock levels. The end goal is not to improve performance on workloads that require maximum CPU performance; it's to reduce power usage (and extend battery life) for most workloads. What you want is the minimum CPU clock required to finish your workload in an acceptable time, and you can save more power if you have a wider range of clocks to pick from.
              I agree that CPPC is better than the ACPI driver on all fronts. However, I don't think the impact of the various (non-performance) governors on video encoders should be dismissed. A video encoder just cranks on the CPU for many minutes without blocking on vblank, user input, network, or anything else. Such a workload should get the full performance of the CPU, unless the user has expressed a preference for perf/W instead of raw performance, either with the kernel's uclamp feature or by restricting the maximum clock with cpupower frequency-set -u (which could be done via something like power-profiles-daemon, etc.).
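
              For example (values are illustrative, uclampset ships with util-linux 2.37+, and the ffmpeg invocation is just a stand-in for any encoder):

                  # Cap the maximum clock system-wide (run as root)
                  cpupower frequency-set -u 3.4GHz

                  # Or clamp only the encoder's utilization signal so the
                  # governor picks a lower clock for that one task
                  uclampset -M 512 ffmpeg -i input.mkv output.mkv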

              The "acceptable time" for such a workload to finish is about 1 second, so that the UI feels like you push the button, the computer goes *gzshhhhhht*, and it's done. Implicitly, it will never finish in an acceptable time, and should always get the maximum CPU frequency allowed under the current thermal/power/energy constraints.

              And furthermore, pretty much any CPU-only workload run as a benchmark shares this characteristic! There is no latency QoS in a benchmark. CPU just goes brrrr for many seconds while you watch the wall clock. So really I am coming back around to the naive view that any substantial performance loss when using a governor points to a deficiency in the governor.

