Linux 5.11 Is Now Looking Great For AMD Zen 2 / Zen 3 Performance
Originally posted by zeb_ View Post
So Intel and AMD have different implementations of schedutil in the kernel? I find it quite surprising; I would have assumed that there would be one algorithm to calculate the next frequency step depending on load, and that only the bare-metal calls would differ. But I am not a specialist.
Really, none of this is going to be a good default until schedutil has an understanding of per-core power/temperature/load targets, so it knows whether or not a core can reach boost frequencies. I couldn't be happier that it's not my job to solve this expensive multivariable equation at runtime without incurring significant cost.
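To zeb_'s question: the governor (schedutil) is indeed shared across vendors; only the cpufreq driver underneath (acpi-cpufreq, intel_pstate, ...) differs in how it programs the hardware. As a rough illustration, here is a toy sketch of the frequency-selection rule schedutil uses (a simplification of `get_next_freq()` in `kernel/sched/cpufreq_schedutil.c`): target roughly 25% above what current utilization strictly requires, capped at the maximum. The numbers below are illustrative, not from real hardware.

```python
def next_freq(util: float, max_util: float, max_freq_khz: int) -> int:
    """Pick the next target frequency in kHz: ~25% headroom over what
    the current utilization demands, clipped at the hardware maximum."""
    target = 1.25 * max_freq_khz * util / max_util
    return min(int(target), max_freq_khz)

# Half-loaded CPU with a 4.0 GHz ceiling: 1.25 * 4e6 * 0.5 = 2.5 GHz.
print(next_freq(0.5, 1.0, 4_000_000))   # 2500000
# Fully loaded: the 25% headroom is clipped at the ceiling.
print(next_freq(1.0, 1.0, 4_000_000))   # 4000000
```

Note there is no per-core temperature or boost-capability term anywhere in this rule, which is exactly the gap being discussed above.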
Originally posted by Yttrium View Post
With modern high-core-count CPUs, the all-core maximum frequency is very low, especially on AMD systems.
The only disadvantage right now is gaming, where you want a high single-core frequency plus a couple of good boosting cores on top. So my 5950X with 5GHz on one core and 4.6GHz on the others would - in theory - benefit more in games than a 4.7GHz all-core overclock.
Still, the difference is extremely minor, as in those frequency ranges (similar to Zen 2) a higher frequency doesn't really translate into more performance anymore (still talking games); latency is much more important at this point. On Zen 2, the magic barrier was somewhere between 4.2 and 4.3GHz - that's where scaling stopped and you went on to tweak the memory instead. For Zen 3 this limit might be a little higher - 4.5GHz, for example - at least my sample manages that.
So in practical terms, my all-core overclock is still the best of all worlds. But the scheduler doesn't know that. It's also not very energy efficient: my idle power consumption is higher, because the processor never really idles anymore.
And I agree: schedutil does need to be expanded to "know" about all the relevant properties - heat, current, voltage, all the C-states, load, CCD/CCX balance (some sort of thread clustering), etc.
Until then, I'll ride my manually tuned all-core setup.
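On the "one 5GHz core plus a couple of good boosters" point: on Linux, per-core boost capability is exposed through the CPPC sysfs files (`/sys/devices/system/cpu/cpuN/acpi_cppc/highest_perf`) when the platform provides CPPC. A hedged sketch of ranking cores that way - the sysfs read is guarded, and the fallback numbers are hypothetical, not from a real 5950X:

```python
import glob
import re

def rank_cores(perf_by_cpu: dict[int, int]) -> list[int]:
    """CPU ids sorted from strongest to weakest boost capability."""
    return sorted(perf_by_cpu, key=perf_by_cpu.get, reverse=True)

def read_cppc_highest_perf() -> dict[int, int]:
    """Read per-core CPPC 'highest_perf' from sysfs, if the kernel exposes it."""
    perf = {}
    for path in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/acpi_cppc/highest_perf"):
        cpu = int(re.search(r"cpu(\d+)", path).group(1))
        with open(path) as f:
            perf[cpu] = int(f.read())
    return perf

perf = read_cppc_highest_perf()
if not perf:
    # Fallback demo values (hypothetical): core 2 won the silicon lottery.
    perf = {0: 221, 1: 226, 2: 231, 3: 226}
print(rank_cores(perf)[:2])  # the two best cores, e.g. for `taskset -c` pinning
```

Until the scheduler consumes this itself, pinning a game to the top-ranked cores with `taskset -c` is the manual workaround.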
Originally posted by Shevchen View Post
And I agree: schedutil does need to be expanded to "know" about all the relevant properties - heat, current, voltage, all the C-states, load, CCD/CCX balance (some sort of thread clustering), etc.
Until then, I'll ride my manually tuned all-core setup.
Originally posted by agd5f View Post
This seems a bit like second-guessing the hardware. The power management controllers on the CPU already monitor all of this with much better latency and accuracy than the OS could. I thought the whole point of CPPC was to give the CPU a hint about the target performance range, so the power management unit can better tune its dynamic power control - not to be a way to override what the CPU is trying to do.
And yes, as CPUs get more complex, the scheduler needs to be aware of this complexity. So if I throw 12 threads around and 8 of them are random tasks while 4 of them are linked to the same memory area, those 4 should work together on the same CCD/CCX (depending on the CPU) to reduce cache misses/reloads/flushes etc.
It's kind of a chicken-and-egg problem at work here.
edit: 1usmus has created automatic tuning software to gauge the CPU. I imagine a side effect of his tuning algorithm could be used by the scheduler as optional input instead of "default" values - maybe.
Last edited by Shevchen; 22 January 2021, 01:59 PM.
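What the scheduler doesn't do automatically here can be forced by hand with CPU affinity. A minimal sketch of keeping the four cooperating threads on one CCX so they share an L3 cache, using Python's `os.sched_setaffinity`. The CCX size and the linear core numbering are assumptions (4 cores per CCX on Zen 2; Zen 3 merged this to 8 per CCD):

```python
import os

CCX_SIZE = 4  # assumed: cores per CCX (Zen 2 layout, linear numbering)

def ccx_cpus(ccx_index: int, ccx_size: int = CCX_SIZE) -> set[int]:
    """CPU ids belonging to one CCX, assuming linear core numbering."""
    start = ccx_index * ccx_size
    return set(range(start, start + ccx_size))

# Pin the current process (and the threads it spawns) to CCX 0, so the
# cooperating threads share an L3 and avoid cross-CCX cache traffic.
# Intersect with what this machine actually has, so the sketch stays runnable.
wanted = ccx_cpus(0) & os.sched_getaffinity(0)
if wanted:
    os.sched_setaffinity(0, wanted)
print(sorted(ccx_cpus(1)))  # [4, 5, 6, 7]
```

The same effect is available from the shell via `taskset -c 0-3 ./program`; the chicken-and-egg part is that the kernel would need to detect the shared memory area on its own to do this without hints.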
Originally posted by Shevchen View Post
Well - kinda. If the OS is aware of the limits the CPU is running into while reaching its maximum performance, it can plan its scheduling accordingly. It doesn't need to be perfect (that's the job of the CPU's internal control mechanisms), but throwing a thread onto a core that is already at 90°C is probably not a good idea - nor is it when the current limit has already been hit. Temperatures don't drop that fast, so that's something I think the OS could handle even with lower accuracy. Current/voltage, however... maybe not. There are also cores that can boost higher than others thanks to the silicon lottery. I'm not sure how to tackle that exactly (maybe probe them and build a custom boost table?).