An Easy Way To Build An Ubuntu Kernel With Hopefully Better Scheduler Performance


  • An Easy Way To Build An Ubuntu Kernel With Hopefully Better Scheduler Performance

    Phoronix: An Easy Way To Build An Ubuntu Kernel With Hopefully Better Scheduler Performance

    Since the recent news about the Linux kernel being in worse shape than some people imagine, there's already been some downstream corrective action taking place. Clear Linux is one of the distributions already patching/tweaking their kernel for better scheduler performance but so far we haven't heard anything from the Ubuntu camp. Fortunately, there's been others working on their own solutions...


  • #2
    From the patch's github:
    The main point of our paper is to raise awareness about issues in the Linux scheduler. The provided patches fix the issues encountered with our workloads, but they are not intended as generic bug fixes. They may have unwanted side effects and result in performance loss or energy waste on your machine.

    so it may not be better for our use (whatever that is)



    • #3
      While I see higher throughput for my workloads, there is still some over-scheduling happening. There's definitely room for improvement in performance.

      This scheduler is definitely not designed with battery life in mind.



      • #4
        The Linux scheduler "being in worse shape than some people imagine" is kind of a meaningless statement. "Some" people can "imagine" just about anything they want, but Linux has held up pretty well running on everything from watches to supercomputers.

        I read the whole wasted cores paper, and the issues they raise are real and I think their analysis is sound. That being said, the workloads were very specific, and they mainly tested on a 64 core AMD system. Saying a test ran 138 times worse sounds fantastical, but I'd be willing to bet the performance impact isn't nearly the same on a more mundane 6, 8, or 10 core system a typical small server or workstation runs. For people chomping at the bit to see Phoronix benchmarks, don't hold your breath for some magical leap in performance. We're not suddenly going to see 100 more fps in CS:GO.

        These patches are welcome, and I look forward to seeing how they move the scheduler design into the future, but I doubt they will land in mainline as currently presented. They don't address power management, and over the last few years the focus of discussion around Linux scheduling has been around crafting a design that takes into account more than process load and time slices. The evolution of the scheduler has to take into account more power states (that have vastly different switch times), changes in cache and memory architectures, the ability to efficiently sleep while also catering to real time, etc. The wasted cores patches fix a number of bugs that mostly hurt x86 NUMA, but the smarter scheduling algorithms that actually land in mainline will be more interesting in how they interact with cpufreq/pstate and the other platform drivers.



        • #5
          Why does the referenced script build a patched 4.1 kernel? Can the same technique be applied to a 4.4 kernel?



          • #6
            Originally posted by geearf View Post
            From the patch's github:

            so it may not be better for our use (whatever that is)
            Your use would be anything except HPC.
            Everything else benefits from race to idle.



            • #7
              Originally posted by nranger View Post
              the performance impact isn't nearly the same on a more mundane 6, 8, or 10 core system.
              This is what should be highlighted again. A typical PC, single-socket with 2/4 cores and 2/4/8 threads, running ordinary desktop applications and games, does not see a measurable impact.

              For multi-socket, multi-core systems, CPU pinning (using "taskset") works around the issue.
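              For what it's worth, the pinning workaround can be sketched like this (taskset and numactl are the standard util-linux/numactl tools; the core numbers and `./my_workload` are just example placeholders):

```shell
# Pin a new process to cores 0-3 (keeps it on one socket/NUMA node on many systems)
taskset -c 0-3 ./my_workload

# Or move an already-running process (PID 1234) onto cores 0-3
taskset -cp 0-3 1234

# numactl can additionally keep memory allocations on the same node
numactl --cpunodebind=0 --membind=0 ./my_workload
```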



              • #8
                Please compare this to BFS



                • #9
                  Originally posted by flubba86 View Post
                  Why does the referenced script build a patched 4.1 kernel? Can the same technique be applied to a 4.4 kernel?
                  Not in its current form; the scheduling API has changed slightly in more recent kernel versions, so the patches would need rebasing.
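                  For anyone who wants to try a newer tree anyway, the general flow is the usual patch-and-build one. A rough sketch (the patch filename is a placeholder, and on anything other than 4.1 you should expect rejects that need manual fixing):

```shell
# Fetch the kernel source the patches were written against
git clone --depth 1 --branch v4.1 \
    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
cd linux-stable

# Apply the scheduler patches (reports rejects on a mismatched tree)
patch -p1 < ../wasted-cores-fixes.patch

# Reuse the running kernel's config, then build installable Debian packages
cp /boot/config-"$(uname -r)" .config
make olddefconfig
make -j"$(nproc)" deb-pkg
sudo dpkg -i ../linux-image-*.deb
```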



                  • #10
                    Originally posted by nranger View Post
                    The Linux scheduler "being in worse shape than some people imagine" is kind of a meaningless statement. "Some" people can "imagine" just about anything they want, but Linux has held up pretty well running on everything from watches to supercomputers.
                    Supercomputers are clusters: lots of small PCs sitting on a fast switch, running parallel HPC scale-out workloads. However, some scale-up workloads cannot be parallelized, typically business workloads such as SAP or OLTP databases; such scale-up workloads can only be run on large 16- or even 32-socket Unix servers or mainframes. In the enterprise business server arena, the largest Linux server (until a couple of months ago) had 8 sockets. For the really large business SAP workloads you need 32 sockets or so, and such Linux business servers do not exist. Recently a 16-socket Linux server was released, the HP Kraken, which is a redesigned Unix server (one that scaled to 64 sockets), but the Linux version scales only to 16 sockets. And performance is bad; just look at the SAP benchmarks. The top spots are all taken by Unix servers, and Linux comes far below. The fact is, Linux scales badly on large servers and until recently had scaling problems on 8-socket servers. I don't think the scaling has been improved on 8-socket servers? The Linux kernel developers all have 1-2 sockets, so they target 1-2 sockets. Low end. No one has access to 32 sockets, so how can Linux scale well on 32 sockets? Impossible.


                    As I wrote earlier:
                    "It is interesting to see what Con Kolivas, the Linux kernel hacker who wrote Linux schedulers, says about the Linux source code:
                    http://ck-hack.blogspot.se/2010/10/o...s-illumos.html
                    TL;DR he says that the Linux scheduler code quality sucks.

                    Also, among Unix and mainframe sysadmins, Linux has always had a bad reputation of being unstable. During light loads, Linux is stable; even Windows is stable if the server idles. But under very high load, Linux becomes very jerky and stuttery, and some threads finish fast while other threads take a very long time. Add in the "RAM overcommit syndrome", where Linux randomly kills processes when all of RAM is filled up (imagine Linux killing off the Oracle database process!), and it is understandable why Unix sysadmins would never let Linux into their high-end enterprise server halls. It is the same reason that enterprise companies who use Linux ALWAYS make sure Linux servers are lightly to moderately loaded. They know that if load increases much, there is a high probability that Linux becomes unstable.

                    Some of these Linux stability problems that Unix sysadmins have talked about for decades, might be explained by the Linux scheduler.
                    https://en.wikipedia.org/wiki/Critic...nel_criticisms
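                    For what it's worth, the overcommit behaviour complained about above is tunable via sysctl. A sketch, not a recommendation (whether strict accounting is appropriate depends entirely on the workload, and the "oracle" process name below is just an example):

```shell
# Default heuristic overcommit: the OOM killer may pick a victim under memory pressure
cat /proc/sys/vm/overcommit_memory   # 0 = heuristic, 1 = always, 2 = strict

# Strict accounting: malloc() fails up front instead of the OOM killer firing later
sudo sysctl -w vm.overcommit_memory=2
sudo sysctl -w vm.overcommit_ratio=80   # commit limit = swap + 80% of RAM

# Protect a critical process from the OOM killer (-1000 exempts it entirely)
echo -1000 | sudo tee /proc/"$(pidof -s oracle)"/oom_score_adj
```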

