Is The Linux Kernel Scheduler Worse Than People Realize?

  • #21
    I think the paper does a very good job of showing why writing a competent, let alone a good, scheduler is so hard. Cache coherency, processor pipelining, interrupt latency, thermal throttling and so on: there are many factors at work. When there were just one or two CPUs with independent caches and no HT, things were more deterministic.

    Comment


    • #22
      Originally posted by hmijail View Post
      OlafLostViking, the point is exactly that the reason you have to do the pinning is that the scheduler is doing a bad job. Once the problems are fixed, the pinning should be unnecessary.
      Sure, not arguing about that! If we got a scheduler that also takes all the effects of the memory hierarchy into account, that would be really awesome and would probably make the need for pinning a thing of the past.

      I just wanted to point out that someone running HPC code on a supercomputer or cluster will likely not see these (nice) results, since they are very likely already using some kind of pinning (as you said, a bad scheduler being one reason for that). So I just wanted to prevent disappointment due to misunderstanding, and we both agree that this paper seems to be very nice work. Have a nice day
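For reference, the kind of pinning HPC setups rely on is usually done with taskset/numactl or directly through sched_setaffinity(2); a minimal C sketch (the CPU number is chosen arbitrarily for illustration):

/* Minimal sketch: pin the calling process to CPU 2 with sched_setaffinity(2).
 * In practice HPC jobs usually get this from the MPI launcher, taskset or numactl. */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(2, &set);                      /* allow this process on CPU 2 only */

    if (sched_setaffinity(0, sizeof(set), &set) == -1) {
        perror("sched_setaffinity");
        return EXIT_FAILURE;
    }

    printf("pinned; currently running on CPU %d\n", sched_getcpu());
    return EXIT_SUCCESS;
}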

      Comment


      • #23
        Clearly what we need here is SchedulerD.

        Comment


        • #24
          Originally posted by duby229 View Post

          I guess the distinction between I/O and process scheduling is less clear in my mind than it is in yours. And I suppose that is why I see a problem.
          The reason you need to think about them differently is that the process scheduler is always running, while the I/O elevator only kicks in relatively infrequently (in general).
          For in-memory workloads, which the kernel and its developers treat as a very high priority to get right, the elevator never gets called at all.
          I haven't finished reading the paper yet, but the issue is EXACTLY that the scheduler is starving runnable processes for no good reason. It's not about resources, which might cause the elevator to get involved; it's only about minimizing the wall-clock time each job needs to run.
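That runnable-but-not-running time is visible from userspace today. A small sketch, assuming a kernel with scheduler stats enabled (CONFIG_SCHED_INFO/SCHEDSTATS), reading /proc/<pid>/schedstat, whose second field is nanoseconds spent waiting on a runqueue:

/* Sketch: report how long a task has run vs. how long it sat runnable on a
 * runqueue, from /proc/<pid>/schedstat (fields: on-CPU ns, run-delay ns,
 * number of timeslices). Pass a PID, or omit it to inspect the reader itself. */
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    char path[64];
    unsigned long long on_cpu_ns, wait_ns, slices;

    snprintf(path, sizeof(path), "/proc/%s/schedstat",
             argc > 1 ? argv[1] : "self");

    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return EXIT_FAILURE;
    }
    if (fscanf(f, "%llu %llu %llu", &on_cpu_ns, &wait_ns, &slices) != 3) {
        fprintf(stderr, "unexpected format in %s\n", path);
        fclose(f);
        return EXIT_FAILURE;
    }
    fclose(f);

    printf("ran on a CPU for      %.3f ms\n", on_cpu_ns / 1e6);
    printf("waited while runnable %.3f ms\n", wait_ns / 1e6);
    printf("timeslices            %llu\n", slices);
    return EXIT_SUCCESS;
}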


          It's good this is coming up, because the kernel is undergoing some pretty major changes in the scheduling area (the long-awaited scheduler-directed DVFS and cpuidle). That alone is a pretty big task, but they sure as hell need to keep in mind the issues this paper brings up, so that the scheduler itself can make the best possible decisions.

          Comment


          • #25
            Originally posted by liam View Post
            It's good this is coming up, because the kernel is undergoing some pretty major changes in the scheduling area (the long-awaited scheduler-directed DVFS and cpuidle). That alone is a pretty big task, but they sure as hell need to keep in mind the issues this paper brings up, so that the scheduler itself can make the best possible decisions.
            Oh, there is an upcoming Linux scheduler? I didn't know that …

            Comment


            • #26
              or a shocking 137x performance
              These researchers are real pros... when it comes to nitpicking some exotic corner case. That said, the performance is good, BUT it seems they've been silent about latency. Even if one gains 23% of extra juice, it may or may not be worth it, depending on how common the use case is and whether it brings major downsides like higher idle power consumption, worse latency or other issues.
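For what it's worth, the latency question is easy to probe crudely; the sketch below times how much later than requested a 1 ms sleep actually wakes up, which is roughly what tools like cyclictest measure properly:

/* Crude sketch of a wakeup-latency probe: request a 1 ms sleep many times and
 * report how much later than requested the thread actually woke up. */
#include <stdio.h>
#include <time.h>

static long long ns_diff(struct timespec a, struct timespec b)
{
    return (b.tv_sec - a.tv_sec) * 1000000000LL + (b.tv_nsec - a.tv_nsec);
}

int main(void)
{
    const struct timespec req = { 0, 1000000 };   /* 1 ms */
    const int iters = 1000;
    long long worst = 0, total = 0;

    for (int i = 0; i < iters; i++) {
        struct timespec before, after;

        clock_gettime(CLOCK_MONOTONIC, &before);
        nanosleep(&req, NULL);
        clock_gettime(CLOCK_MONOTONIC, &after);

        long long late = ns_diff(before, after) - req.tv_nsec;
        if (late > worst)
            worst = late;
        total += late;
    }

    printf("wakeup lateness: avg %.1f us, worst %.1f us\n",
           total / 1000.0 / iters, worst / 1000.0);
    return 0;
}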

              Comment


              • #27
                Originally posted by johnc View Post
                Clearly what we need here is SchedulerD.
                Could easily end up like that (though I guess such a thing would run kernel-side as a thread), because it is getting more complicated. Right now kernel devs are chewing on making the scheduler expose CPU load data to the frequency-scaling algorithms, so that cpufreq can be aware of CPU use as seen by the CPU scheduler. That allows the cpufreq algorithms to make better and faster decisions. And if the CPU frequency is ramped up faster when CPU load increases, it is one of those rare cases where it could improve both latency AND speed at once.
                Last edited by SystemCrasher; 16 April 2016, 05:29 PM.
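For context, the cpufreq state those algorithms control is already visible from userspace; a minimal sketch, assuming the standard cpufreq sysfs layout (paths and CPU number are illustrative):

/* Sketch: read CPU 0's current cpufreq governor and frequency from sysfs.
 * Assumes the usual /sys/devices/system/cpu/cpuN/cpufreq/ layout;
 * scaling_cur_freq is reported in kHz. */
#include <stdio.h>
#include <stdlib.h>

static int read_line(const char *path, char *buf, int len)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return 0;
}

int main(void)
{
    char governor[64], freq[64];

    if (read_line("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
                  governor, sizeof(governor)) != 0 ||
        read_line("/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq",
                  freq, sizeof(freq)) != 0) {
        perror("cpufreq sysfs");
        return EXIT_FAILURE;
    }

    printf("cpu0 governor:  %s", governor);        /* fgets keeps the newline */
    printf("cpu0 frequency: %ld kHz\n", strtol(freq, NULL, 10));
    return EXIT_SUCCESS;
}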

                Comment


                • #28
                  Originally posted by alpha_one_x86 View Post
                  That's why (plus the optimisations a single CPU allows with a non-SMP kernel) it's better to have 4x single-core machines than 1x quad-core.
                  No contention problem, no balancing problem. At least it's not a problem in the server and VM world.
                  You must be joking, right? If you're talking about multi-socket systems, they use the same scheduler, have the same issues, and add even worse NUMA cache-coherency issues. The difference is that with multiple processors the contention is on the front-side bus instead of the inter-core bus. OTOH, if you mean totally separate systems, they can't share resources (e.g. memory), and then you're comparing apples with oranges.
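As an aside, the NUMA placement being argued about here is easy to observe; a sketch using the getcpu(2) syscall (invoked directly, since the glibc wrapper is relatively recent) to show which CPU and node a thread is on:

/* Sketch: ask the kernel which CPU and NUMA node the calling thread is
 * currently running on, via the getcpu(2) syscall. On a multi-socket box the
 * node number tells you which socket's memory is local to the thread. */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
    unsigned int cpu = 0, node = 0;

    if (syscall(SYS_getcpu, &cpu, &node, NULL) == -1) {
        perror("getcpu");
        return 1;
    }

    printf("running on CPU %u, NUMA node %u\n", cpu, node);
    return 0;
}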

                  Comment


                  • #29
                    Originally posted by johnc View Post
                    Clearly what we need here is SchedulerD.
                    It's called ulatencyd, and systemd's cgroup handling is already breaking it.
                    But with systemd, reinventing the wheel is never out of the question; maybe they will even replace the Lua-based stateful policies with something that looks a lot like an INI config. Oh wait, they kind of have that already, assuming everything runs as a service.

                    Comment


                    • #30
                      Desktop kernels should use BFS anyways.

                      Comment
