A Fix Has Been Proposed For The Slower AMD Performance On Linux 5.11
As outlined in the original article after bisecting the sizable performance regressions, and in follow-up tests, AMD hardware performing slower on Linux 5.11 came down to the CPU frequency invariance support that was introduced this cycle and is utilized by the "Schedutil" CPU frequency scaling governor. With Schedutil often being the default for AMD systems on newer versions of the Linux kernel, this regression on Linux 5.11 compared to prior kernel releases has been unfortunate.
Giovanni Gherdovich of SUSE, who worked out the AMD CPU frequency invariance support in cooperation with AMD engineers, has now come up with a fix. From his analysis of the situation: "The problem happens on CPU-bound workloads spanning a large number of cores. In this case schedutil won't select the maximum P-State. Actually, it's likely that it will select the minimum one. A CPU-bound workload puts the machine in a state generally called "over-utilization": an increase in CPU speed doesn't result in an increase of capacity. The fraction of time tasks spend on CPU becomes constant regardless of clock frequency (the tasks eat whatever we throw at them), and the PELT invariant util goes up and down with the frequency (i.e. it's not invariant anymore)."
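To make that feedback loop a bit more concrete, here is a rough toy model in Python. It is only a sketch under stated assumptions, not the kernel's code: it assumes the invariant utilization is the busy fraction scaled by current frequency over a reference frequency, and that the governor picks roughly 1.25 x max frequency x utilization, clamped to the policy limits. The function names, the frequency values, and the idea that the reference frequency can be higher than the policy maximum are illustrative assumptions rather than details from the quoted analysis.

```python
# Toy model of a schedutil-style governor under a fully CPU-bound load.
# All names and numbers here are hypothetical, for illustration only.

def invariant_util(busy_fraction, cur_freq, ref_freq):
    """Scale the raw busy fraction by how fast the CPU runs vs. a reference."""
    return busy_fraction * cur_freq / ref_freq

def next_freq(util, policy_min, policy_max, headroom=1.25):
    """Pick a target frequency with some headroom, clamped to policy limits."""
    target = headroom * policy_max * util
    return max(policy_min, min(policy_max, target))

def settle(start_freq, ref_freq, policy_min, policy_max, steps=50):
    """Iterate the feedback loop for an over-utilized CPU (busy fraction 1.0)."""
    freq = start_freq
    for _ in range(steps):
        util = invariant_util(1.0, freq, ref_freq)
        freq = next_freq(util, policy_min, policy_max)
    return freq

if __name__ == "__main__":
    POLICY_MIN, POLICY_MAX = 1500.0, 2250.0  # MHz, made-up values

    # Reference equals the policy maximum: the loop holds the top frequency.
    print(settle(POLICY_MAX, ref_freq=2250.0,
                 policy_min=POLICY_MIN, policy_max=POLICY_MAX))

    # Reference well above the policy maximum: the loop decays to the minimum.
    print(settle(POLICY_MAX, ref_freq=3400.0,
                 policy_min=POLICY_MIN, policy_max=POLICY_MAX))
```

In this toy model the busy fraction stays pinned at 1.0, so the computed utilization simply tracks the current frequency. Whenever the reference frequency exceeds the policy maximum by more than the headroom factor, each step requests a lower frequency than the last until the minimum is reached, which is the kind of behavior the quote describes.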
Giovanni was indeed able to reproduce the significant performance hits. Of one test case he noted, "See how the 128 threads case is almost 40% worse than baseline in v5.11-rc4."
The patch fixing this Linux 5.11 Git performance regression is now out on the Linux kernel mailing list.
I'll be firing off a large round of benchmarks on multiple AMD systems tomorrow to confirm this regression is indeed sorted out and that AMD Ryzen / EPYC performance is looking in good order for Linux 5.11, which will debut as stable next month. I'll be back with confirmation in the next day or two.
UPDATE: Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance