Announcement

**intelfx** · 22 January 2021, 10:23 AM

Originally posted by Shevchen

I think it's an odd assumption that Intel and AMD CPUs behave in the very same way. TSMC 7nm has different physical properties than Intel's 14nm, and thus one needs to set the parameters differently. Also, the whole boosting approach is different. Intel uses some sort of thermal velocity boost for short bursts, some sort of longer boost period with a timer attached to it, and then you have your base frequency. AMD boosts its cores in a much more individual fashion, implementing certain critical limiters like temperature or maximum current, all with different onsets and offsets. Those boosting mechanisms are backed up by several sensors measuring those properties, and these also differ from Intel's.

So why again should it be "the same"? At this point you could say AMD is using a different paradigm, so it should be implemented differently.

This is nice and all, but the CPU frequency scaling code in Linux does not take any of this into account.

**Yttrium** · 22 January 2021, 10:25 AM

Originally posted by zeb_

So Intel and AMD have different implementations of schedutil in the kernel? I find it quite surprising; I would have assumed that there would be one algorithm to calculate the next frequency step, depending on load, and that only the bare-metal calls would differ. But I am not a specialist.

The problem they are trying to solve is obscenely hard. Modern processors will vary their frequency depending on the load, the perf-scheduler will assign tasks based on load, and schedutil will assign tasks based on the maximum frequency it can attain, not the frequency and load it is currently running at. With modern high-core-count CPUs the all-core maximum frequency is very low, especially on AMD systems (this is the schedutil default for now), which means that schedutil will not utilise the high single-core frequency on high-core-count architectures. The 'stopgap' fix here is to use the maximum core frequency in its formula.
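To put rough numbers on that, here is a minimal sketch of the frequency formula as the kernel source comments describe it (`next_freq = 1.25 * max_freq * util / max`). This is not kernel code: the function name, the clamp to the reference frequency, and both example frequencies are assumptions picked for illustration, not measured values for any real CPU.

```python
def schedutil_next_freq(util: float, max_capacity: float, ref_freq_khz: int) -> int:
    """Approximate schedutil's target: 1.25 * ref_freq * util / max_capacity.

    The result is clamped to ref_freq_khz, standing in for the policy maximum
    (an assumption of this sketch).
    """
    raw = 1.25 * ref_freq_khz * util / max_capacity
    return int(min(raw, ref_freq_khz))


# Hypothetical reference frequencies, for illustration only.
ALL_CORE_KHZ = 3_400_000           # assumed all-core maximum
SINGLE_CORE_BOOST_KHZ = 4_900_000  # assumed single-core boost maximum

util, capacity = 800, 1024  # one core roughly 78% busy

low = schedutil_next_freq(util, capacity, ALL_CORE_KHZ)
high = schedutil_next_freq(util, capacity, SINGLE_CORE_BOOST_KHZ)
print(low, high)
```

With the same utilisation, the only thing that changes the requested frequency is which maximum gets plugged in, which is why the choice of reference frequency matters so much here.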

Really, none of this is going to be a good default until schedutil has an understanding of per-core power/temperature/load targets in order to know whether or not a core can reach boost frequencies. I couldn't be happier that I don't have to solve this expensive multivariable equation during runtime without significant cost.

**Shevchen** · 22 January 2021, 12:29 PM

Originally posted by Yttrium

with modern high-core-count CPUs the all-core maximum frequency is very low, especially on AMD systems

That... depends. If you tweak your CPU correctly (I know that's not a valid option for 99% of the users, but for the sake of the argument please listen to me), you can set quite a nice all-core clock. My 5950X is running at 4.7GHz on all cores, and quite a few thread-heavy workloads prefer the same frequency on every core.

The only disadvantage right now is gaming, where you want a high single core frequency and a couple of good boosting cores on top. So my 5950X with 5GHz on one core and 4.6GHz on others would benefit in games more than 4.7GHz all-core - in theory.
Still, the difference is extremely minor, as in those frequency ranges (similar to Zen 2) a higher frequency doesn't really translate into more performance anymore (still talking about games), as latency is much more important at this point. On Zen 2, the magical barrier was somewhere between 4.2 and 4.3GHz, where it stopped scaling and you went on to tweak the memory instead. For Zen 3 this limit might be a little higher - like 4.5GHz for example - but my sample does that.

So in practical terms, my all-core overclock is still the best of all worlds. But the scheduler doesn't know that. Also, it's not very energy efficient, as my idle power consumption is higher due to the fact that the processor doesn't really know idle anymore.

And I agree: Schedutil does need to be expanded to "know" about all the properties like heat, current, voltage, all the C-states, load, CCD/CCX balance (some sort of thread clustering), etc.
Until then, I'll ride my manually tuned all-core setup.

**agd5f** · 22 January 2021, 12:55 PM

Originally posted by Shevchen

And I agree: Schedutil does need to be expanded to "know" about all the properties like heat, current, voltage, all the C-states, load, CCD/CCX balance (some sort of thread clustering), etc.
Until then, I'll ride my manually tuned all-core setup.

This seems a bit like second-guessing the hardware. The power management controllers on the CPU already monitor all of this with much better latency and accuracy than the OS could. I thought the whole point of CPPC was more to give the CPU a hint as to what the target performance range is so the power management unit can better tune its dynamic power control, not as a way to override what the CPU is trying to do.
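As a rough illustration of "hint, not override", the sketch below models a CPPC-style request as a performance range plus an optional desired level. The field names loosely follow the ACPI CPPC register names as I understand them (lowest/nominal/highest performance, minimum/maximum/desired performance); the classes, the `make_request` helper, and the example values are hypothetical and do not correspond to any kernel API. Treating a desired value of 0 as "let the hardware choose" is my reading of the ACPI autonomous-selection behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CppcCaps:
    """Capabilities the platform advertises (abstract performance units)."""
    lowest_perf: int
    nominal_perf: int
    highest_perf: int


@dataclass(frozen=True)
class CppcRequest:
    """What the OS asks for: a range, plus an optional desired level."""
    min_perf: int
    max_perf: int
    desired_perf: int  # 0 = leave the choice to the hardware (assumption)


def make_request(caps: CppcCaps, min_perf: int, max_perf: int,
                 desired_perf: int = 0) -> CppcRequest:
    """Clamp the OS hint into the range the hardware says it supports."""
    lo = max(caps.lowest_perf, min(min_perf, caps.highest_perf))
    hi = max(lo, min(max_perf, caps.highest_perf))
    desired = 0 if desired_perf == 0 else max(lo, min(desired_perf, hi))
    return CppcRequest(lo, hi, desired)


caps = CppcCaps(lowest_perf=10, nominal_perf=100, highest_perf=166)
wide_open = make_request(caps, min_perf=0, max_perf=255)
bounded = make_request(caps, min_perf=50, max_perf=120, desired_perf=200)
print(wide_open, bounded)
```

In this model the OS only narrows or widens the window; where the core actually runs inside that window stays with the power management unit.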

**Shevchen** · 22 January 2021, 01:55 PM

Originally posted by agd5f

This seems a bit like second-guessing the hardware. The power management controllers on the CPU already monitor all of this with much better latency and accuracy than the OS could. I thought the whole point of CPPC was more to give the CPU a hint as to what the target performance range is so the power management unit can better tune its dynamic power control, not as a way to override what the CPU is trying to do.

Well - kinda. If the OS is aware of the limitations the CPU is using to reach its maximum performance, it can plan its scheduling accordingly. It doesn't need to be perfect (that's the job of the internal control mechanisms of the CPU), but throwing a thread on a CPU core that is already at 90°C is probably not such a good idea, nor is it when the current limiter is already close to its limit. Temps don't drop that fast, so that's a thing I think the OS can handle with lower accuracy. Current/voltage however... maybe not. There are also cores that can boost higher than others due to the silicon lottery. I'm not sure how to tackle this exactly (maybe probing them and creating a custom boost table?)
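A toy version of that idea, just to show the shape of the decision: skip cores at or above a temperature limit, then prefer the one with the best entry in a "boost table". Everything here is hypothetical: the `CoreState` tuple, the `boost_rank` values, and the 90°C default are illustration only, and a real scheduler would need far more than this.

```python
from typing import NamedTuple, Optional, Sequence


class CoreState(NamedTuple):
    cpu: int
    temp_c: float    # coarse per-core temperature reading (hypothetical source)
    boost_rank: int  # hypothetical boost-table entry, higher = boosts better


def pick_core(cores: Sequence[CoreState], temp_limit_c: float = 90.0) -> Optional[int]:
    """Return the best-boosting core below the temperature limit, or None."""
    cool = [c for c in cores if c.temp_c < temp_limit_c]
    if not cool:
        return None
    # Highest boost rank wins; ties go to the cooler core.
    best = max(cool, key=lambda c: (c.boost_rank, -c.temp_c))
    return best.cpu


cores = [
    CoreState(cpu=0, temp_c=92.0, boost_rank=200),  # best silicon, but too hot
    CoreState(cpu=1, temp_c=70.0, boost_rank=180),
    CoreState(cpu=2, temp_c=65.0, boost_rank=180),
]
choice = pick_core(cores)
print(choice)
```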

And yes, as CPUs get more complex, the scheduler needs to be aware of this complexity. So if I throw 12 threads around and 8 of them are random tasks and 4 of them are linked to the same memory area, those 4 should work together on the same CCD/CCX (depending on the CPU) to reduce cache misses/reloads/flushes etc.
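Until the scheduler does this by itself, something similar can be approximated by hand from userspace. The sketch below reads which CPUs share a cache with a given CPU from sysfs and pins the caller to that group. It assumes Linux, and it assumes that cache `index3` is the L3 shared within a CCX, which is common but not guaranteed (the `level` file next to it says which level it really is). The helper names are my own.

```python
import os
from typing import Set


def parse_cpu_list(text: str) -> Set[int]:
    """Parse a sysfs CPU list such as '0-7,16-23' into a set of CPU numbers."""
    cpus: Set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def l3_siblings(cpu: int) -> Set[int]:
    """CPUs sharing cache index3 with `cpu` (assumed to be the L3)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/cache/index3/shared_cpu_list"
    with open(path) as f:
        return parse_cpu_list(f.read())


def pin_near(cpu: int) -> None:
    """Restrict the caller to the CPUs sharing a cache with `cpu` (Linux only)."""
    os.sched_setaffinity(0, l3_siblings(cpu))
```

Calling `pin_near(0)` from each of the 4 related workers would keep them on the same cache domain as CPU 0, at the cost of taking that decision away from the scheduler.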

It's kinda a chicken-and-egg problem at work here.

edit: 1usmus has created automatic tuning software to gauge the CPU. I imagine that a side effect of his tuning algo could be used by the scheduler as optional input instead of "default" values - maybe.

**onlyLinuxLuvUBack** · 22 January 2021, 02:08 PM

Originally posted by Shevchen

Well - kinda. If the OS is aware of the limitations the CPU is using to reach its maximum performance, it can plan its scheduling accordingly. It doesn't need to be perfect (that's the job of the internal control mechanisms of the CPU), but throwing a thread on a CPU core that is already at 90°C is probably not such a good idea, nor is it when the current limiter is already close to its limit. Temps don't drop that fast, so that's a thing I think the OS can handle with lower accuracy. Current/voltage however... maybe not. There are also cores that can boost higher than others due to the silicon lottery. I'm not sure how to tackle this exactly (maybe probing them and creating a custom boost table?)

And yes, as CPUs get more complex, the scheduler needs to be aware of this complexity. So if I throw 12 threads around and 8 of them are random tasks and 4 of them are linked to the same memory area, those 4 should work together on the same CCD/CCX (depending on the CPU) to reduce cache misses/reloads/flushes etc.

It's kinda a chicken-and-egg problem at work here.

edit: 1usmus has created automatic tuning software to gauge the CPU. I imagine that a side effect of his tuning algo could be used by the scheduler as optional input instead of "default" values - maybe.

We should all get together, code the systemd_advanced_scheduler and submit it.

**Alliancemd** · 23 January 2021, 06:56 AM

Now it's one of the best perf boost patches

Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance
