The ~200 Line Linux Kernel Patch That Does Wonders
In recent weeks and months there has been quite a bit of work toward improving the responsiveness of the Linux desktop, with several significant milestones reached recently and new patches continuing to arrive. This work greatly improves the Linux desktop experience when the computer is under heavy CPU load and memory strain. Fortunately, the exciting improvements are far from over. A new patch that has not yet been merged, but has gone through a few revisions over the past several weeks, is quite small (just over 200 lines of code), yet it does wonders for the Linux desktop.
The patch in question automatically creates a task group per TTY in an effort to improve desktop interactivity under system strain. Mike Galbraith wrote the patch, which has reached its third version in recent weeks, after Linus Torvalds inspired the idea. In its third version, the patch adds only 224 lines of code to the kernel's scheduler while removing nine, so just 233 lines of code are in play.
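To see why grouping tasks by TTY can help so much, here is a rough back-of-the-envelope model in Python. It is only a sketch under loud assumptions: a single CPU, equal weights, every task always runnable, and a two-level split in which the CPU is shared equally among groups and then equally among the tasks in each group. The function name cpu_shares and the group labels are made up for illustration and are not taken from the patch.

```python
from fractions import Fraction


def cpu_shares(groups):
    """Return each task's CPU share under a two-level fair split.

    `groups` maps a group name to the number of always-runnable tasks in it.
    The CPU is divided equally among the non-empty groups, then equally among
    the tasks inside each group. Assumptions: one CPU, equal weights.
    """
    active = {name: n for name, n in groups.items() if n > 0}
    if not active:
        return {}
    per_group = Fraction(1, len(active))
    return {name: per_group / n for name, n in active.items()}


# Without grouping: 64 compile jobs and one browser task are all peers.
flat = Fraction(1, 64 + 1)

# With grouping: the terminal running "make -j64" forms one group,
# and the browser competes with that group as a whole.
grouped = cpu_shares({"make_tty": 64, "browser": 1})

print("browser share, flat:   ", flat)                 # 1/65
print("browser share, grouped:", grouped["browser"])   # 1/2
print("per-job share, grouped:", grouped["make_tty"])  # 1/128
```

In this toy model the browser goes from 1/65 of the CPU to 1/2 once the 64 compile jobs are confined to their own group. A real multi-core scheduler is far more involved, but the direction of the effect is the same.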
Tests done by Mike show the maximum latency dropping by more than a factor of ten and the average desktop latency by a factor of about 60. Linus Torvalds has already praised this miracle patch heavily in an email:
Yeah. And I have to say that I'm (very happily) surprised by just how small that patch really ends up being, and how it's not intrusive or ugly either.
I'm also very happy with just what it does to interactive performance. Admittedly, my "testcase" is really trivial (reading email in a web-browser, scrolling around a bit, while doing a "make -j64" on the kernel at the same time), but it's a test-case that is very relevant for me. And it is a _huge_ improvement.
It's an improvement for things like smooth scrolling around, but what I found more interesting was how it seems to really make web pages load a lot faster. Maybe it shouldn't have been surprising, but I always associated that with network performance. But there's clearly enough of a CPU load when loading a new web page that if you have a load average of 50+ at the same time, you _will_ be starved for CPU in the loading process, and probably won't get all the http requests out quickly enough.
So I think this is firmly one of those "real improvement" patches. Good job. Group scheduling goes from "useful for some specific server loads" to "that's a killer feature".
A Phoronix reader first tipped us off to this latest patch this morning: "Please check this out, my desktop will never be the same again, it makes a *lot* of difference for desktop usage (all things smooth, scrolling etc.)...It feels as good as Con Kolivas's patches."
Not only is this patch producing great results for Linus, Andre Goddard (the Phoronix reader who reported the latest version), and other early testers, but we are finding it to be a miracle as well. While in the midst of some major OpenBenchmarking.org "Iveland" development work, I took a few minutes to record two videos that demonstrate the benefits of the "sched: automated per tty task groups" patch alone. The results are very dramatic. UPDATE: A lot more positive feedback on this patch is now pouring into our forums as more users try it out.
This patch has worked extremely well on every test system I have tried so far, from quad-core AMD Phenom systems to Intel Atom netbooks. I recorded the two videos on a system running Ubuntu 10.10 (x86_64) with an Intel Core i7 970 "Gulftown" processor, which has six physical cores plus Hyper-Threading, giving the Linux operating system twelve threads in total.
The Linux kernel was built from source using Linus' 2.6 Git tree as of 15 November, which is close to a Linux 2.6.37-rc2 state. The only change from the latest Linux kernel Git code was applying Mike Galbraith's scheduler patch. With this patch, the automated per-TTY task grouping can be toggled at runtime by writing 0 or 1 to /proc/sys/kernel/sched_autogroup_enabled, or disabled at boot by passing "noautogroup" as a kernel parameter. Changing the sched_autogroup_enabled value was the only system difference between the two video recordings.
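For readers who want to flip the switch themselves, here is a minimal Python sketch that reads and writes the control file. The path is the one named above and exists only on a kernel carrying the patch; the helper names read_autogroup and write_autogroup are invented for this example, and writing to the file normally requires root.

```python
from pathlib import Path

# Path named in the article; present only on kernels carrying the patch.
AUTOGROUP = Path("/proc/sys/kernel/sched_autogroup_enabled")


def read_autogroup(path=AUTOGROUP):
    """Return True/False for the current setting, or None if unavailable."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


def write_autogroup(enabled, path=AUTOGROUP):
    """Write 1 or 0 to the control file (normally requires root)."""
    path.write_text("1\n" if enabled else "0\n")


if __name__ == "__main__":
    state = read_autogroup()
    if state is None:
        print("sched_autogroup_enabled is not available on this kernel")
    else:
        print("autogroup is", "enabled" if state else "disabled")
```

Calling write_autogroup(False) before a run and write_autogroup(True) before the next would reproduce the single difference between the two recordings, assuming the patched kernel is booted without "noautogroup".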
Both videos show the Core i7 970 system running the GNOME desktop with the Ogg 1080p version of the open Big Buck Bunny movie playing, glxgears, two Mozilla Firefox windows open to the Phoronix and Phoronix Test Suite websites, two terminal windows, the GNOME System Monitor, and the Nautilus file manager. The videos show how these applications respond under the load generated by compiling the latest Linux kernel with make -j64, so that 64 parallel make jobs fully utilize the Intel processor.
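If you would rather put a rough number on responsiveness than judge it by eye, one crude proxy is how late a briefly sleeping process wakes up while the build is running. The Python sketch below is not the method used for Mike's figures or for these videos; the function name wakeup_latencies and its default parameters are assumptions chosen for illustration.

```python
import time


def wakeup_latencies(samples=200, interval=0.005):
    """Sleep `interval` seconds repeatedly; return each overshoot in seconds."""
    overshoots = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(interval)
        elapsed = time.perf_counter() - start
        # Clamp at zero so timer rounding never yields a negative delay.
        overshoots.append(max(0.0, elapsed - interval))
    return overshoots


if __name__ == "__main__":
    lat = wakeup_latencies()
    print(f"average: {sum(lat) / len(lat) * 1000:.3f} ms")
    print(f"maximum: {max(lat) * 1000:.3f} ms")
```

Running it from a different terminal than the one hosting the make -j64 job, once with sched_autogroup_enabled set to 0 and once with it set to 1, should give a rough feel for the difference, though the numbers will vary from system to system.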
So let's watch these videos!