
"Nest" Is An Interesting New Take On Linux Kernel Scheduling For Better CPU Performance


  • ptr1337
    replied
    The patch now seems to work on Intel single-socket machines as well.
    A version based against 5.19 can be found here:
    https://github.com/CachyOS/kernel-pa...001-NEST.patch



  • yump
    replied
    Originally posted by jason.oliveira View Post

    It literally un-does everything retpoline is supposed to do. That's where the performance increase comes from.
    Wat? This is about how the kernel chooses which CPU to wake up threads on.



  • jason.oliveira
    replied
    Originally posted by ATLief View Post
    Like all nice things, this probably comes with some obscure security vulnerability.
    It literally un-does everything retpoline is supposed to do. That's where the performance increase comes from.



  • willmore
    replied
    Originally posted by JosiahBradley View Post
    I hate to ruin the party, but simply letting the hardware power logic speed cores up faster and race to idle faster would be much better, and that's exactly what's already being worked on. This scheduler isn't actually faster; it's trading power efficiency to do more work, which is literally done by spinning cores. This also opens an obvious side-channel attack on the scheduler itself unless the spin is random. This is a looks-neat-on-paper, bad-in-practice thing.
    There are two competing problems here:
    1. There is latency in clocking up an idle core.
    2. The last core a task ran on is optimal (if the task has a large cache footprint).
    Depending on the tasks you're running, it makes sense to optimize for one of these cases or the other. If a task has a light memory footprint, it won't benefit from being put back on the last core it ran on (on the assumption that the cache will still be 'hot'), since it doesn't rely on the cache much. It would make far better sense to put it on the most recently vacated core, as that core should still be at high clocks, and the hit from not having a 'hot' cache will be minimal since the task doesn't care about that.

    But the typical task is cache sensitive--that's why we have caches. Take a look at the recent post comparing the 5800X and the 5800X3D. The slower-clocked processor with more cache won most of the tests against the faster-per-core, smaller-cache part.

    So conventional wisdom is that keeping a task on a core is best. In the case we most optimize for (a big compute task that runs for a long time), the core it came from should still be hot, as the only reason the task isn't on that core is that something higher priority bumped it off recently. Should the core, for some reason, have been allowed to clock down, the big-cache-footprint task would still benefit from getting back on that core.

    You're really going to have to hunt around for a workload where this Nest code makes more sense. And even then, the better solution for all cases is to improve core startup latencies. So while it may be nice to have this algorithm around for the few cases where it makes sense, I don't see it being a widely useful solution.
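    The trade-off described above (warm cache vs. warm clocks) can be sketched as a toy heuristic in Python. This is purely illustrative: the field names, the 5 ms cache-hot window, and the tie-breaking rule are assumptions for the example, not anything from the Nest patch or CFS.

    ```python
    def pick_core(task, cores):
        """Toy placement heuristic: warm cache vs. warm clocks.

        task:  dict with 'id' and an optional 'cache_heavy' flag.
        cores: list of dicts with 'id', 'freq_mhz', 'idle_since' (seconds
               since the core went idle), and 'last_task' (id of the task
               that ran there most recently).
        """
        CACHE_HOT_WINDOW = 0.005  # assume the cache stays useful for ~5 ms

        if task.get("cache_heavy"):
            # Cache-sensitive task: prefer its previous core if it idled
            # recently, betting that its working set is still in cache.
            for c in cores:
                if c["last_task"] == task["id"] and c["idle_since"] < CACHE_HOT_WINDOW:
                    return c
        # Otherwise prefer the most recently vacated core, which should
        # still be running at high clocks.
        return max(cores, key=lambda c: (c["freq_mhz"], -c["idle_since"]))
    ```

    A cache-light task always chases the highest-clocked core, while a cache-heavy one goes back to its old core only if that core idled within the assumed hot window; otherwise it, too, falls through to the high-clock choice.
    
    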



  • waxhead
    replied
    Originally posted by JosiahBradley View Post
    I hate to ruin the party, but simply letting the hardware power logic speed cores up faster and race to idle faster would be much better, and that's exactly what's already being worked on. This scheduler isn't actually faster; it's trading power efficiency to do more work, which is literally done by spinning cores. This also opens an obvious side-channel attack on the scheduler itself unless the spin is random. This is a looks-neat-on-paper, bad-in-practice thing.
    Sorry for being a jackass at the party. Sure, the CPU does not magically get faster, but this is like having a lot of cars going back and forth over a hill with boxes of something. When there is no work, a car stops and has to accelerate again to get over the hill. It makes sense to throw more boxes onto cars already running to avoid the time it takes to accelerate the other cars. Yes, the hardware makes for super-fast acceleration, but you still get a little more done.

    As for energy efficiency: sometimes doing work at 200% speed at 100 W is the same as doing it at 100% speed at 50 W. And speaking of security, I think you are spot on. It will probably make paranoid people lose what's left of their hair... (darn, I forget how I look in front of a mirror!)

    The real answer here, as with so many things, is: it depends... I do not think that most of us yapping around the forums (me included) will notice anything significant on our desktops. However, on any gazillion-core CPU that actually does things other than staying idle most of the time, this may have huge benefits.



  • JosiahBradley
    replied
    I hate to ruin the party, but simply letting the hardware power logic speed cores up faster and race to idle faster would be much better, and that's exactly what's already being worked on. This scheduler isn't actually faster; it's trading power efficiency to do more work, which is literally done by spinning cores. This also opens an obvious side-channel attack on the scheduler itself unless the spin is random. This is a looks-neat-on-paper, bad-in-practice thing.



  • waxhead
    replied
    The Nest scheduler is not actually a replacement for CFS; it is a refinement of how CFS works. I did see the entire talk, and while CFS apparently looks ahead to find a free timeslice on a core, Nest instead tries to keep tasks packed onto a small set of warm cores rather than spreading them across as many cores as possible. Note that in the talk they mentioned introducing an artificial "go to the next core" roughly every second or so to avoid hotspots on the silicon and things like that.
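    The packing behavior described in the talk could be modeled roughly like this. It's a toy Python sketch under stated assumptions: the "nest" list, the expansion rule, and the fallback are invented for illustration (and the periodic core rotation from the talk is omitted), so none of this reflects the actual patch.

    ```python
    def place_task(nest, all_cores, busy):
        """Return the core id the next waking task should run on.

        nest:      ordered list of preferred (warm) core ids; mutated when
                   the nest has to grow.
        all_cores: every core id on the machine.
        busy:      set of core ids currently running a task.
        """
        # First try to reuse a warm core from the nest.
        for core in nest:
            if core not in busy:
                return core
        # All nest cores are busy: expand the nest with a cold core,
        # which then stays warm for future wakeups.
        for core in all_cores:
            if core not in nest:
                nest.append(core)
                return core
        # Machine fully loaded: fall back to sharing the first nest core.
        return nest[0]
    ```

    The key contrast with spread-out placement is that an idle core outside the nest is never chosen while a warm nest core is free, so clocks on the nest cores stay high between wakeups.
    
    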



  • yump
    replied
    Originally posted by Linuxxx View Post

    So I take it this means that this scheduler has no chance in beating CFS + performance, neither in throughput nor in power-efficiency?
    If I am interpreting slides 32 and 34 correctly, it beats CFS + performance in throughput and loses on efficiency, as you would expect from higher clock speeds. Remember that cpufreq isn't the only thing that can modulate CPU frequency: the CPU's embedded power-management controller has its own rules about turbo frequency versus the number of active cores.



  • NobodyXu
    replied
    Originally posted by yump View Post
    So this aspect of the work might not be applicable outside of server parts.
    I don't think this will be useful for servers, since they are usually overloaded with tasks to maximize profit.



  • Linuxxx
    replied
    Originally posted by yump View Post
    Finally, this might be working around schedutil's glass jaw with multi-threaded serial workloads. By packing all the threads onto one core, schedutil would see that core as fully loaded even if each task spends a lot of its time asleep. The downside would be unnecessarily higher energy use when the threads are actually independent.
    So I take it this means that this scheduler has no chance in beating CFS + performance, neither in throughput nor in power-efficiency?

