TurboSched Is A New Linux Scheduler Focused On Maximizing Turbo Frequency Usage
AMD CPUs don't care; they also have no artificial limit on boost clock duration. The only limit is thermal (i.e. physical): if your heatsink is able to cool the system, the CPU can stay in "boost" indefinitely.
It's way more complex than that on Zen 2. Not only thermals are taken into consideration, but also voltage and current stability. The power management is so fast and responsive that even a program using the wrong methodology for polling core voltage suffers from the observer effect and reports far higher loads/voltages at idle than are actually there. What's more, the chipset drivers on Windows take advantage of CPPC2 to switch core power states much faster.
AMD has even made the Windows 10 scheduler aware of core quality, since different cores in the processor have different maximum boost clocks, and the scheduler takes advantage of that by pinning light tasks onto the fastest cores. The 1903 update also lets the scheduler be aware of CCX topology, so the threads of a process can be scheduled in a way that avoids the increased CCX-to-CCX latency (which has been improved in Zen 2). Standard Windows SMT-aware scheduling still applies.
This leads to a situation in which the most demanding process lands on the best core, with threads going to physical cores (instead of SMT siblings) in the same CCX first.
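You can approximate that placement by hand on Linux with CPU affinity. A minimal sketch, assuming cores 0 and 1 are the highest-boosting ones on the chip (that set is made up for illustration; you'd have to work out the preferred-core ranking for your own silicon):

import os

# Assumption for this sketch: cores 0 and 1 carry the highest boost bins.
BEST_CORES = {0, 1}

pid = os.getpid()
os.sched_setaffinity(pid, BEST_CORES)   # restrict this process to those cores
print("now allowed on CPUs:", os.sched_getaffinity(pid))

Of course that's static pinning, not the dynamic placement the Windows scheduler does, but it shows the mechanism.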
Indeed, POWER can turbo all cores indefinitely, as long as your cooling can handle it (with exceptions for the very high core count models, so as not to blow up the mobo).
AFAIK Intel CPUs actually disable unused cores (putting them into deep C-states or even shutting them down completely) when in turbo boost mode. Thawing these cores isn't fast.
I strongly suspect that POWER9 processors likewise have no stupid limitations on duration, nor do they go and disable cores.
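How slow "thawing" is depends on which idle state the core was parked in, and the kernel reports its worst-case exit latency per state, so you can check your own machine. A rough sketch using the standard Linux cpuidle sysfs layout (cpu0 only, needs a kernel with cpuidle enabled):

from pathlib import Path

base = Path("/sys/devices/system/cpu/cpu0/cpuidle")
for state in sorted(base.glob("state*")):
    name = (state / "name").read_text().strip()
    latency_us = int((state / "latency").read_text())  # worst-case exit latency, microseconds
    print(f"{state.name}: {name:10s} exit latency ~{latency_us} us")

The deeper the C-state, the bigger that number gets, which is exactly the "isn't fast" part.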
I'm a little surprised nobody has bothered to try this sooner. Seems a little too obvious.
Sadly, for workloads where performance really matters (HPC and such), this would have negligible impact on runtime but a major impact on power usage. Most HPC workloads are limited by available memory bandwidth. I can demonstrate a pathological case where running the CPU at 3.6GHz finishes only 13% faster than running it at 1.2GHz. Now apply an energy-to-result metric and you suddenly realize that you're better off running your clusters at lower frequencies. That's the main reason server-grade CPUs all run at lower frequencies than desktop-grade ones. And this will hold true until server-grade CPUs come standard with some HBM on the CPU die; Intel Xeon Phi aside, that should happen in 2021 or 2022.
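Back-of-envelope on the energy side, if you assume dynamic power scales roughly with f·V² (the voltages below are invented for illustration, not measured):

# Normalized runtimes from the pathological case above:
t_fast = 1.00   # runtime at 3.6 GHz
t_slow = 1.13   # 13% slower at 1.2 GHz (memory-bound)

# Assumed dynamic power ~ f * V^2; voltage figures are made up but plausible.
p_fast = 3.6e9 * 1.2 ** 2   # at 3.6 GHz, ~1.2 V
p_slow = 1.2e9 * 0.9 ** 2   # at 1.2 GHz, ~0.9 V

print(f"energy ratio slow/fast: {p_slow * t_slow / (p_fast * t_fast):.2f}")  # ~0.21

So the slow run uses roughly a fifth of the energy per result (static power is ignored here, which flatters the slow run a little, but not enough to flip a gap that size).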
Well, I have in mind a need for processing where this causes latency. Say a query comes in that parallelizes and demands eight cores, but you've put those cores into a deep C-state to save power; suddenly they're not there when needed. Of course, this could be the case in any situation, but the way this works may make it more likely than a scheme which makes it easy to grab a hot (if not as fast) CPU.
Most CPUs still idle in the hundreds of MHz. So let's say your CPU idles at 500MHz: that's a 2 nanosecond delay between cycles. That's pretty much the longest you'll have to wait on idle cores, since the cores should otherwise ramp their frequency back up once needed. If that delay is too long for you, I think your power profile is going to be a greater concern than the scheduler.
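(Checking the arithmetic: one cycle at 500 MHz is 1 / 5.0e8 s = 2.0e-9 s, i.e. 2 ns.)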
This is a tough one, because it's a few hundred MHz versus L1/L2 cache thrashing. As a desktop user, I think I would personally prefer consistent speed across multiple cores. This is the sort of thing one keeps in the bag of tuning tools for specific applications.
I could see this having a real impact; however, it could as easily be negative as positive.
What negatives do you have in mind (on a modern CPU)? The only negative side effect I'd see is maybe on newer Intel CPUs with a limited-duration PL2. It really depends on how the CPU keeps track of how long it has been boosted. For example, if the CPU resets (or at least reduces) its boost counter whenever the load switches cores, then this scheduler could negatively impact how long it'll remain boosted.
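You can at least see the configured limits, if not the counter itself. On Linux the intel-rapl powercap interface exposes the long-term (PL1) and short-term (PL2) package power limits; a rough sketch (paths follow the standard powercap sysfs layout, usually needs root, and the internal boost bookkeeping itself isn't exposed anywhere I know of):

from pathlib import Path

pkg = Path("/sys/class/powercap/intel-rapl:0")  # package 0
for c in (0, 1):  # constraint 0 = long_term (PL1), constraint 1 = short_term (PL2)
    name = (pkg / f"constraint_{c}_name").read_text().strip()
    limit_w = int((pkg / f"constraint_{c}_power_limit_uw").read_text()) / 1e6
    window_s = int((pkg / f"constraint_{c}_time_window_us").read_text()) / 1e6
    print(f"{name}: {limit_w:.0f} W over a {window_s:.3f} s window")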