Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance

  • Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance

    Phoronix: Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance

    Not only is the AMD "CPU frequency invariance regression", introduced by that new support in the in-development Linux 5.11 kernel, on course to be fixed, addressing the performance shortcomings I outlined last month, but with the patched kernel the performance for a number of workloads is now ahead of where it was with Linux 5.10.

    http://www.phoronix.com/vr.php?view=29875

  • #2
    This is great. Michael, you deserve a lot of credit for that one.

    One can wonder whether the change in calculating freq_next, as explained on the mailing list for the patch, would not benefit Intel processors as well. In addition, the denoise test used in the post did not show any improvement compared to 5.10, just a return to equal performance. The performance gain is a surprise, and a good one at that.
    Last edited by zeb_; 21 January 2021, 03:20 PM.

    • #3
      Also, I still do not understand why it is AMD-specific. The author writes: "The problem happens on CPU-bound workloads spanning a large number of cores. In this case schedutil won't select the maximum P-State. Actually, it's likely that it will select the minimum one." Why do these workloads spanning a large number of cores not happen on Intel? One would think this is a schedutil issue, regardless of the CPU.
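
For anyone wondering what "selecting a P-State" means in that quote: the governor asks for a target frequency, and that request is mapped onto one of a few discrete P-states. Below is a minimal sketch of that mapping, assuming the rule "lowest available frequency at or above the target". The function name `resolve_pstate` and the P-state table are hypothetical illustrations, not taken from the patch or the kernel source.

```python
from bisect import bisect_left

# Hypothetical P-state table in MHz (illustrative values only).
PSTATES_MHZ = [1500, 2000, 2250]


def resolve_pstate(target_mhz, pstates_mhz=PSTATES_MHZ):
    """Map a requested frequency onto a discrete P-state.

    Assumed rule: pick the lowest available frequency that is at or
    above the target; if the target exceeds every entry, fall back to
    the highest available frequency.
    """
    table = sorted(pstates_mhz)
    index = bisect_left(table, target_mhz)
    return table[min(index, len(table) - 1)]


if __name__ == "__main__":
    for target in (1400, 2100, 3000):
        print(target, "->", resolve_pstate(target))
```

With a table like this, any request at or below the lowest entry lands on the minimum P-state, so a target that is computed too low is enough to keep a busy core at its slowest setting.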

      • #4
        From the patch email:
        > ... essentially giving freq_next some more headroom to grow in the over-utilized case. This is the approach also followed by intel_pstate in passive mode.
        So Intel is already using the same approach.

        • #5
          So Intel and AMD have different implementations of schedutil in the kernel? I find that quite surprising. I would have assumed that there would be one algorithm to calculate the next frequency step depending on load, and that only the bare-metal calls would differ. But I am not a specialist.

          • #6
            Originally posted by zeb_
            Also, I still do not understand why it is AMD-specific. The author writes: "The problem happens on CPU-bound workloads spanning a large number of cores. In this case schedutil won't select the maximum P-State. Actually, it's likely that it will select the minimum one." Why do these workloads spanning a large number of cores not happen on Intel? One would think this is a schedutil issue, regardless of the CPU.
            Because AMD makes 64-core processors affordable?

            Edit: double-checked the original post (https://www.phoronix.com/scan.php?pa...chedutil&num=3), Xeon was not tested at all.

            • #7
              With this, schedutil is in better shape than it was in 5.10, so I'm curious how it now performs against the performance governor.

              • #8
                Originally posted by zxy_thf
                Because AMD makes 64-core processors affordable?

                Edit: double-checked the original post (https://www.phoronix.com/scan.php?pa...chedutil&num=3), Xeon was not tested at all.
                That still does not explain the difference in implementation pointed out above by luben.

                • #9
                  Originally posted by zeb_
                  That still does not explain the difference in implementation pointed out above by luben.
                  I think it's an odd assumption that Intel and AMD CPUs behave in the very same way. TSMC's 7nm process has different physical properties than Intel's 14nm, so the parameters need to be set differently. The whole boosting approach is different as well. Intel uses some sort of thermal velocity boost for short bursts, then some sort of longer boost period with a timer attached to it, and then you have your base frequency. AMD boosts its cores in a much more individual fashion, applying critical limiters such as temperature or maximum current, each with a different onset and offset. Those boosting mechanisms are backed by several sensors measuring those properties, and these also differ from Intel's.

                  So why again should it be "the same"? At this point you could say AMD is using a different paradigm, so it should be implemented differently.

                  • #10
                    Originally posted by Shevchen
                    So why again should it be "the same"? At this point you could say AMD is using a different paradigm, so it should be implemented differently.
                    I thank you for this detailed explanation. I suppose I should look directly at the code to better understand how this works.

                    As for whether or not it should be the same: I was referring to the explanation in the patch description itself: "The solution we implement here is a stop-gap one: when the driver is acpi_cpufreq and the machine an AMD EPYC, schedutil will use max_boost instead of max_P as the value for freq_max in its formula freq_next = 1.25 * freq_max * util". I had assumed those max_ values depended on software load. But if max_boost is indeed based on the CPU's physical characteristics (i.e. current and temperature), then I can see that there would be differences between the two manufacturers, since the two CPUs will behave differently under load.
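
To make the formula from the patch description concrete, here is a minimal sketch of freq_next = 1.25 * freq_max * util, evaluated once with max_P and once with max_boost as freq_max. It assumes util is a fraction between 0 and 1. The function name `next_freq`, the clamp to freq_max, and the MHz values are hypothetical and only illustrate the arithmetic; they are not taken from the patch or from the kernel source.

```python
# Hypothetical frequencies in MHz (illustrative values only).
MAX_P_MHZ = 2250       # assumed highest non-boost P-state
MAX_BOOST_MHZ = 3400   # assumed maximum boost frequency


def next_freq(freq_max_mhz, util, headroom=1.25):
    """Evaluate freq_next = headroom * freq_max * util.

    util is assumed to be a utilization fraction in [0, 1]. The result
    is clamped to freq_max, which is an assumption of this sketch.
    """
    if not 0.0 <= util <= 1.0:
        raise ValueError("util must be between 0 and 1")
    return min(headroom * freq_max_mhz * util, freq_max_mhz)


if __name__ == "__main__":
    util = 0.5
    print("freq_max = max_P:    ", next_freq(MAX_P_MHZ, util))
    print("freq_max = max_boost:", next_freq(MAX_BOOST_MHZ, util))
```

For the same util, the larger freq_max yields a proportionally higher request, which is the extra headroom the patch description talks about.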

                    It would be interesting to test a 16-core Intel CPU to see whether kernel 5.11 (before this patch) also shows a lack of performance there.
