"Nest" Is An Interesting New Take On Linux Kernel Scheduling For Better CPU Performance
-
Originally posted by birdie:
I've been thinking about this for years; it's strange no one has raised the issue earlier. The Linux kernel is notorious for its penchant for juggling tasks between CPU cores for no reason, which empties whatever you had in the L1/L2 caches and adds non-zero delays, since the new core may be sitting at its lowest power state when it is handed the task.
You could simply run:
7z b -mmt1
and watch in top or any graphical process manager how the task is thrown between CPU cores.
[Granted, in a situation where you only have a handful of high-workload threads, it would be "more" ideal if those were locked to their cores and the remaining threads fought over the rest. This incurs a latency hit, though, and breaks down as the number of high-workload threads increases.]
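You can see this migration for yourself without 7-Zip. A minimal sketch, assuming Linux with coreutils `yes` and procps `ps` (any single-threaded CPU-bound command stands in for `7z b -mmt1`):

```shell
# Hypothetical stand-in for `7z b -mmt1`: any single-threaded,
# CPU-bound busy loop will do.
yes > /dev/null &
PID=$!

# PSR is the core the task last ran on (procps `ps`); sample it ten times.
for i in $(seq 1 10); do
    ps -o psr= -p "$PID"
    sleep 0.5
done

kill "$PID"
```

If the printed core number changes between samples, the scheduler has migrated the task (and your L1/L2 working set with it).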
-
Originally posted by V1tol:
BORE is just a patched CFS. None of the schedulers I know of address this task-assignment problem. They mostly focus on the efficiency of task prioritization, but none of them actually take CPU cores into account.
-
Originally posted by ms178:
I use BORE as of late, but I don't know if that already accounts for this.
-
But this implies that the die layout has to be hardcoded for every CPU, or at least every generation, especially for the large server CPUs... or maybe it follows from staying within a NUMA node (for the larger ones)? Maybe it is less problematic than I think.
Edit: there is another culprit as well.
Imagine the scheduler always chooses the best-clocking core for performance. Statistically, one out of n cores is the best and will always be, so that core would be potentially overused. This would cause more thermal wear-out on that particular core (and its neighbours).
Last edited by CochainComplex; 15 September 2022, 07:25 AM.
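The "one best core" is not hypothetical: on CPUs whose firmware exposes ACPI CPPC (e.g. AMD's preferred-core ranking), each core advertises a performance rating and the scheduler biases placement toward the highest-rated one. A sketch for inspecting that ranking, assuming the sysfs files are present (they only exist when the firmware exposes CPPC):

```shell
# Print each core's advertised highest-performance rating, if exposed.
# The scheduler prefers the core(s) with the largest value.
for cpu in /sys/devices/system/cpu/cpu[0-9]*; do
    if [ -r "$cpu/acpi_cppc/highest_perf" ]; then
        printf '%s: %s\n' "${cpu##*/}" "$(cat "$cpu/acpi_cppc/highest_perf")"
    fi
done
```

If one core's value stands clearly above the rest, that is the core the comment predicts will wear fastest.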
-
Originally posted by ms178:
Maybe some of the many custom schedulers are already more clever than CFS? I use BORE as of late, but I don't know if that already accounts for this.
-
I haven't tested it much, but on my first run, taskset -c 0 7z b -mmt1 was faster than just running 7z b -mmt1 (on an almost completely idle Ryzen 5800X).
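A minimal sketch of that comparison, assuming p7zip (`7z`) and util-linux (`taskset`) are installed:

```shell
# Run the single-threaded 7-Zip benchmark unpinned, then pinned to core 0.
# Pinning keeps the working set in one core's L1/L2 caches.
echo "unpinned:"
time 7z b -mmt1 > /dev/null

echo "pinned to core 0:"
time taskset -c 0 7z b -mmt1 > /dev/null
```

For a meaningful comparison the machine should be otherwise idle, and several runs averaged, since a single run is noisy.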
-
Basically they found that you get better performance by tweaking the scheduling of the workload to take advantage of the higher boost frequencies available when fewer cores are active. Wow, who would have thought?!
I am also not surprised that this mostly benefits low- to mid-level workloads, as frequencies fall off a cliff on core-heavy workloads. And yes, I have an undervolted 18-core Xeon with an unlocked turbo boost, and that CPU easily hits its TDP limit, mind you. So I have exploited this behavior to get better performance out of the CPU for a long time now (and I am not the only one doing so). It would be great if the Linux scheduler were aware of this behavior and made more optimal choices, of course. I only wonder why this is not the case yet, as TDP limits and differing base/turbo frequencies are nothing new.
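The boost-vs-core-count effect is easy to observe. A rough sketch, assuming Linux on x86 (the "cpu MHz" lines in /proc/cpuinfo are x86-specific) with coreutils `yes` and `nproc`:

```shell
# Load N cores with busy loops, then print the three highest per-core
# clocks reported in /proc/cpuinfo.
load_and_sample() {
    n=$1
    pids=""
    for i in $(seq 1 "$n"); do
        yes > /dev/null &
        pids="$pids $!"
    done
    sleep 2    # let the frequency governor settle
    grep 'cpu MHz' /proc/cpuinfo | sort -t: -k2 -rn | head -3
    kill $pids
}

echo "1 busy core:"
load_and_sample 1
echo "all cores busy:"
load_and_sample "$(nproc)"
```

On a TDP-limited part like the Xeon described above, the single-core run should show a noticeably higher top clock than the all-cores run.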
Volta: Maybe some of the many custom schedulers are already more clever than CFS? I use BORE as of late, but I don't know if that already accounts for this.
Last edited by ms178; 15 September 2022, 06:40 AM.