
"Nest" Is An Interesting New Take On Linux Kernel Scheduling For Better CPU Performance


  • ptr1337
    replied
    The patch now seems to work on Intel single-socket machines as well.
    A version based against 5.19 can be found here:
    https://github.com/CachyOS/kernel-pa...001-NEST.patch



  • yump
    replied
    Originally posted by jason.oliveira View Post

    It literally un-does everything retpoline is supposed to do. That's where the performance increase comes from.
    Wat? This is about how the kernel chooses which CPU to wake up threads on.



  • jason.oliveira
    replied
    Originally posted by ATLief View Post
    Like all nice things, this probably comes with some obscure security vulnerability.
    It literally un-does everything retpoline is supposed to do. That's where the performance increase comes from.



  • willmore
    replied
    Originally posted by JosiahBradley View Post
    I hate to ruin the party, but simply letting the hardware power logic speed cores up faster and race to idle faster would be much better, and that's exactly what's already being worked on. This scheduler isn't actually faster; it's trading power efficiency to do more work, which is literally done by spinning cores. This also opens an obvious side-channel attack on the scheduler itself unless the spin is random. This is a looks-neat-on-paper, bad-in-practice thing.
    There are two competing problems here:
    1. There is latency in clocking up an idle core.
    2. The last core a task ran on is optimal (if the task has a large cache footprint).
    Depending on the tasks you're running, it makes sense to optimize for one of these cases or the other. If a task has a light memory footprint, it won't benefit from being put back on the last core it ran on (on the assumption that the cache will still be 'hot'), since it doesn't rely on the cache much. It would make far better sense to put it on the most recently vacated core, as that core should still be at high clocks, and the hit from not having a 'hot' cache will be minimal since the task doesn't care about that.

    But the typical task is cache sensitive--that's why we have caches. Take a look at the recent post comparing the 5800X and the 5800X3D. The slower-clocked processor with more cache won most of the tests against the faster-per-core, smaller-cache part.

    So conventional wisdom is that keeping a task on a core is best. In the case we most optimize for (a big compute task that runs for a long time), the core it came from should still be hot, as the only reason the task isn't on that core is that something higher priority bumped it off recently. Should the core, for some reason, have been allowed to clock down, the big-cache-footprint task would still benefit from getting back on that core.

    You're really going to have to hunt around for a workload where this Nest code makes more sense. And even then, the better solution for all cases is to improve core startup latencies. So while it may be nice to have this algorithm around for the few cases where it makes sense, I don't see it being a widely useful solution.
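    The trade-off described above (warm cache vs. warm clocks) can be sketched as a toy heuristic in Python. This is purely illustrative: the field names, the 5 ms cache-hot window, and the tie-breaking rule are assumptions for the example, not anything from the Nest patch or CFS.

    ```python
    def pick_core(task, cores):
        """Toy placement heuristic: warm cache vs. warm clocks.

        task:  dict with 'id' and an optional 'cache_heavy' flag.
        cores: list of dicts with 'id', 'freq_mhz', 'idle_since' (seconds
               since the core went idle), and 'last_task' (id of the task
               that ran there most recently).
        """
        CACHE_HOT_WINDOW = 0.005  # assume the cache stays useful for ~5 ms

        if task.get("cache_heavy"):
            # Cache-sensitive task: prefer its previous core if it idled
            # recently, betting that its working set is still in cache.
            for c in cores:
                if c["last_task"] == task["id"] and c["idle_since"] < CACHE_HOT_WINDOW:
                    return c
        # Otherwise prefer the most recently vacated core, which should
        # still be running at high clocks.
        return max(cores, key=lambda c: (c["freq_mhz"], -c["idle_since"]))
    ```

    A cache-light task always chases the highest-clocked core, while a cache-heavy one goes back to its old core only if that core idled within the assumed hot window; otherwise it, too, falls through to the high-clock choice.
    
    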



  • waxhead
    replied
    Originally posted by JosiahBradley View Post
    I hate to ruin the party, but simply letting the hardware power logic speed cores up faster and race to idle faster would be much better, and that's exactly what's already being worked on. This scheduler isn't actually faster; it's trading power efficiency to do more work, which is literally done by spinning cores. This also opens an obvious side-channel attack on the scheduler itself unless the spin is random. This is a looks-neat-on-paper, bad-in-practice thing.
    Sorry for being a jackass at the party. Sure, the CPU does not magically get faster, but this is like having a lot of cars going back and forth over a hill with boxes of something. When there is no work, a car stops and has to accelerate again to get over the hill. It makes sense to throw more boxes onto cars already running to avoid the time it takes to accelerate the other cars. Yes, the hardware makes for super-fast acceleration, but you still get a little more done.

    As for energy efficiency: sometimes doing work at 200% speed at 100 W is the same as doing it at 100% speed at 50 W. And speaking of security, I think you are spot on. It will probably make paranoid people lose what's left of their hair... (darn, I forget how I look in front of a mirror!)

    The real answer here, as with so many things, is: it depends... I do not think that most of us yapping around the forums (me included) will notice anything significant on our desktops. However, on any gazillion-core CPU that actually does things other than staying idle most of the time, this may have huge benefits.



  • JosiahBradley
    replied
    I hate to ruin the party, but simply letting the hardware power logic speed cores up faster and race to idle faster would be much better, and that's exactly what's already being worked on. This scheduler isn't actually faster; it's trading power efficiency to do more work, which is literally done by spinning cores. This also opens an obvious side-channel attack on the scheduler itself unless the spin is random. This is a looks-neat-on-paper, bad-in-practice thing.



  • waxhead
    replied
    The Nest scheduler is not actually a replacement for CFS; it is a refinement of how CFS works. I did see the entire talk, and while CFS apparently looks ahead to find a free timeslice on a core, Nest instead tries to keep tasks packed onto a small set of warm cores rather than spreading them across as many cores as possible. Note that in the talk they mentioned introducing an artificial "go to the next core" roughly every second or so to avoid hotspots on the silicon and things like that.
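    The packing behavior described in the talk could be modeled roughly like this. It's a toy Python sketch under stated assumptions: the "nest" list, the expansion rule, and the fallback are invented for illustration (and the periodic core rotation from the talk is omitted), so none of this reflects the actual patch.

    ```python
    def place_task(nest, all_cores, busy):
        """Return the core id the next waking task should run on.

        nest:      ordered list of preferred (warm) core ids; mutated when
                   the nest has to grow.
        all_cores: every core id on the machine.
        busy:      set of core ids currently running a task.
        """
        # First try to reuse a warm core from the nest.
        for core in nest:
            if core not in busy:
                return core
        # All nest cores are busy: expand the nest with a cold core,
        # which then stays warm for future wakeups.
        for core in all_cores:
            if core not in nest:
                nest.append(core)
                return core
        # Machine fully loaded: fall back to sharing the first nest core.
        return nest[0]
    ```

    The key contrast with spread-out placement is that an idle core outside the nest is never chosen while a warm nest core is free, so clocks on the nest cores stay high between wakeups.
    
    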



  • yump
    replied
    Originally posted by Linuxxx View Post

    So I take it this means that this scheduler has no chance in beating CFS + performance, neither in throughput nor in power-efficiency?
    If I am interpreting slides 32 and 34 correctly, it beats CFS + performance in throughput and loses on efficiency, as you would expect from higher clock speeds. Remember that cpufreq isn't the only thing that can modulate CPU frequency: the CPU's embedded power-management controller has its own rules about turbo frequency versus the number of active cores.



  • NobodyXu
    replied
    Originally posted by yump View Post
    So this aspect of the work might not be applicable outside of server parts.
    I don't think this will be useful for servers, since they are usually overloaded with tasks to maximize profit.



  • Linuxxx
    replied
    Originally posted by yump View Post
    Finally, this might be working around schedutil's glass jaw with multi-threaded serial workloads. By packing all the threads onto one core, schedutil would see that core as fully loaded even if each task spends a lot of its time asleep. The downside would be unnecessarily higher energy use when the threads are actually independent.
    So I take it this means that this scheduler has no chance in beating CFS + performance, neither in throughput nor in power-efficiency?

