Announcement

**zeb_** · 21 January 2021, 03:17 PM

This is great. Michael, you deserve a lot of credit for that one.

One may wonder whether the change in how freq_next is calculated, as explained on the mailing list for the patch, would benefit Intel processors as well. In addition, the denoise test used in the post did not show any improvement compared to 5.10, just a return to equal performance. The performance gain is a surprise, and a good one at that.

**zeb_** · 21 January 2021, 03:32 PM

Also, I still do not understand why it is AMD-specific. The author writes: "The problem happens on CPU-bound workloads spanning a large number of cores. In this case schedutil won't select the maximum P-State. Actually, it's likely that it will select the minimum one." Why would these workloads spanning a large number of cores not happen on Intel? One would think this is a schedutil issue, regardless of the CPU.

**luben** · 21 January 2021, 05:05 PM

From the patch email:
> ... essentially giving freq_next some more headroom to grow in the over-utilized case. This is the approach also followed by intel_pstate in passive mode.
So Intel is already using the same approach.

**zeb_** · 21 January 2021, 06:03 PM

So Intel and AMD have different implementations of schedutil in the kernel? I find that quite surprising. I would have assumed there would be one algorithm to calculate the next frequency step depending on load, and that only the bare-metal calls would differ. But I am not a specialist.

**zxy_thf** · 21 January 2021, 08:38 PM

Originally posted by zeb_
> Also, I still do not understand why it is AMD-specific. The author writes: "The problem happens on CPU-bound workloads spanning a large number of cores. In this case schedutil won't select the maximum P-State. Actually, it's likely that it will select the minimum one." Why would these workloads spanning a large number of cores not happen on Intel? One would think this is a schedutil issue, regardless of the CPU.

Because AMD makes 64 core processors affordable?

Edit: I double-checked the original post (https://www.phoronix.com/scan.php?pa...chedutil&num=3); Xeon was not tested at all.

**geearf** · 21 January 2021, 08:49 PM

With this, schedutil is in better shape than in 5.10, so I'm curious how it now performs against the performance governor.

**zeb_** · 21 January 2021, 10:47 PM

Originally posted by zxy_thf
> Because AMD makes 64 core processors affordable?
>
> Edit: double-checked the original post (https://www.phoronix.com/scan.php?pa...chedutil&num=3), Xeon was not tested at all.

That still does not explain the difference in implementation pointed out above by luben.

**Shevchen** · 22 January 2021, 05:47 AM

Originally posted by zeb_
> That still does not explain the difference in implementation pointed out above by luben.

I think it's an odd assumption that Intel and AMD CPUs behave in the very same way. TSMC's 7nm has different physical properties than Intel's 14nm, so the parameters need to be set differently. The whole boosting approach is different too. Intel uses some sort of thermal velocity boost for short bursts, then some sort of longer boost period with a timer attached to it, and then you have your base frequency. AMD boosts its cores in a much more individual fashion, applying certain critical limiters like temperature or maximum current, each with a different onset and offset. Those boosting mechanisms are backed by several sensors measuring those properties, and these also differ from Intel's.

So why again should it be "the same"? At this point you could say AMD is using a different paradigm, so it should be implemented differently.

**zeb_** · 22 January 2021, 09:19 AM

Originally posted by Shevchen
> So why again should it be "the same"? At this point you could say AMD is using a different paradigm, so it should be implemented differently.

I thank you for this detailed explanation. I suppose I should look directly at the code to better understand how this works.

As for whether or not it should be the same: I was referring to the explanation in the patch description itself: "The solution we implement here is a stop-gap one: when the driver is acpi_cpufreq and the machine an AMD EPYC, schedutil will use max_boost instead of max_P as the value for freq_max in its formula freq_next = 1.25 * freq_max * util". I assumed those max_ values depended on software load. But if max_boost is indeed based on the CPU's physical characteristics (i.e. current and temperature), then I can see why there are differences between the two vendors, since the two CPUs will behave differently under load.
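To make the quoted formula a bit more concrete, here is a rough Python sketch of freq_next = 1.25 * freq_max * util evaluated with the two candidate values for freq_max. This is only an illustration, not the kernel code: the function name `next_freq` is hypothetical, and the 2.25 GHz (max_P) and 3.4 GHz (max_boost) figures are made-up example values, not numbers from the patch.

```python
# Illustrative sketch of the formula quoted from the patch description:
#     freq_next = 1.25 * freq_max * util
# Everything below (names, frequencies) is an assumption for illustration.

MAX_P_KHZ = 2_250_000      # assumed highest non-boost P-state, in kHz
MAX_BOOST_KHZ = 3_400_000  # assumed maximum boost frequency, in kHz


def next_freq(freq_max_khz: int, util: float) -> int:
    """Return the requested frequency in kHz for a utilization in [0, 1]."""
    if not 0.0 <= util <= 1.0:
        raise ValueError("util must be between 0 and 1")
    return int(1.25 * freq_max_khz * util)


if __name__ == "__main__":
    # Same utilization, two different references for freq_max.
    for util in (0.25, 0.5, 1.0):
        with_max_p = next_freq(MAX_P_KHZ, util)
        with_boost = next_freq(MAX_BOOST_KHZ, util)
        print(f"util={util:.2f}  max_P -> {with_max_p} kHz  "
              f"max_boost -> {with_boost} kHz")
```

For the same util, the request computed from max_boost is always the higher of the two, which is the extra headroom for freq_next that the patch email mentions. The sketch leaves out any clamping of the result to what the hardware actually supports.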

It would be interesting to test a 16-core Intel CPU to see whether kernel 5.11 (before this patch) shows the same lack of performance there.

Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance