A commit within the in-development Linux 3.6 kernel has caused the PostgreSQL database server workload to regress by 15 to 20%. Fortunately, the offending commit has been identified.
For the past day there's been a Linux kernel mailing list thread about a "20% performance drop on PostgreSQL" when moving from the Linux 3.5 to Linux 3.6 RC kernels. This regression in the popular database server was quickly confirmed by Borislav Petkov of AMD, who then isolated it down to this commit: sched: Improve scalability via 'CPU buddies', which withstand random perturbations. While the commit tries to improve the scalability of the Linux scheduler, it's not without problems. Its commit message reads:
Traversing an entire package is not only expensive, it also leads to tasks bouncing all over a partially idle and possible quite large package. Fix that up by assigning a 'buddy' CPU to try to motivate. Each buddy may try to motivate that one other CPU, if it's busy, tough, it may then try its SMT sibling, but that's all this optimization is allowed to cost.
Sibling cache buddies are cross-wired to prevent bouncing.
4 socket 40 core + SMT Westmere box, single 30 sec tbench runs, higher is better:
clients      1     2     4     8    16    32     64    128
pre         30    41   118   645  3769  6214  12233  14312
post       299   603  1211  2418  4697  6847  11606  14557
A nice increase in performance.
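The buddy scheme described in the commit message can be sketched roughly as follows. This is a simplified Python illustration, not the kernel's C code: the `Cpu` class, the `pick_wakeup_cpu` function, and the four-CPU buddy map are all hypothetical, and it assumes that "its SMT sibling" means the sibling of the CPU where the wake-up started.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Cpu:
    idle: bool
    smt_sibling: int  # id of the other hardware thread on the same core


def pick_wakeup_cpu(target: int, cpus: Dict[int, Cpu], buddy: Dict[int, int]) -> int:
    """Pick a CPU for a waking task, checking at most three candidates."""
    if cpus[target].idle:
        return target
    # One assigned buddy instead of a scan over the whole package.
    if cpus[buddy[target]].idle:
        return buddy[target]
    # Buddy is busy ("tough"): fall back to the SMT sibling, then give up.
    sibling = cpus[target].smt_sibling
    if cpus[sibling].idle:
        return sibling
    return target


# Hypothetical 2-core, 4-thread layout: (0, 1) and (2, 3) are SMT pairs.
# Buddies are cross-wired in pairs (0 <-> 2, 1 <-> 3), an assumed wiring.
BUDDY = {0: 2, 2: 0, 1: 3, 3: 1}


def make_cpus(idle_ids):
    """Build the four-CPU layout with the given set of idle CPU ids."""
    siblings = {0: 1, 1: 0, 2: 3, 3: 2}
    return {i: Cpu(idle=(i in idle_ids), smt_sibling=s) for i, s in siblings.items()}
```

In this sketch, a wake-up targeting a busy CPU 0 stays on CPU 0 when only CPU 3 is idle, because CPU 3 is neither its buddy nor its sibling. The search cost is bounded, but an idle CPU can be missed.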
The commit delivered a nice increase for tbench while severely hurting PostgreSQL, another common real-world Linux workload.
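To put numbers on the tbench side of that trade-off, the quoted results can be turned into post/pre ratios per client count. The short Python sketch below does only that arithmetic on the figures from the commit message; the variable and function names are illustrative.

```python
# tbench throughput from the commit message (higher is better).
clients = [1, 2, 4, 8, 16, 32, 64, 128]
pre = [30, 41, 118, 645, 3769, 6214, 12233, 14312]
post = [299, 603, 1211, 2418, 4697, 6847, 11606, 14557]


def speedups(before, after):
    """Return after/before for each pair of measurements."""
    return [a / b for b, a in zip(before, after)]


# Map each client count to its post/pre throughput ratio.
ratios = dict(zip(clients, speedups(pre, post)))

for n, r in ratios.items():
    print(f"{n:>3} clients: {r:.2f}x")
```

The gains are concentrated at low client counts (roughly 10x to 15x at 1 to 4 clients) and shrink toward parity as the box fills up. At 64 clients the post figure is slightly below the pre figure.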
Linus Torvalds was quick to say the commit in question is confusing, although he committed it to the kernel tree weeks ago, before the Linux 3.6-rc1 kernel.
Linus ultimately said, "I vote we just revert it as 'insane'. The code really doesn't seem to make any sense." Again, this is code that has been living in mainline for weeks.
The discussion has yet to be settled, but at this point it looks like the problematic commit will simply be reverted prior to the final release of the Linux 3.6 kernel.
For those wondering about the state of the Phoromatic Linux Kernel Tracker, the overhauled version is still a work in progress and will be published soon. This daily and per-commit tracker would likely have caught this regression, seeing as there's a PostgreSQL pgbench test profile, but the interface for the kernel tracker is currently inaccessible to the public. The new version of Phoromatic is built directly atop OpenBenchmarking.org and its new architecture, which is coming as part of the Randaberg / Unterschleißheim improvements. If your company is interested in the Linux enterprise software testing capabilities of the Phoronix Test Suite, or simply wants to support these mainline efforts, please contact us.