A commit within the in-development Linux 3.6 kernel has caused PostgreSQL database server performance to regress by 15~20%. Fortunately, the offending commit has been identified.
For the past day there's been a Linux kernel mailing list thread about a "20% performance drop on PostgreSQL" when moving from the Linux 3.5 to Linux 3.6 RC kernels. This regression in the popular database server was quickly confirmed by Borislav Petkov of AMD, who then isolated it down to this commit:
"sched: Improve scalability via 'CPU buddies', which withstand random perturbations." While this commit tries to improve the scalability of the Linux scheduler, it is not without problems. Its commit message reads:
Traversing an entire package is not only expensive, it also leads to tasks bouncing all over a partially idle and possible quite large package. Fix that up by assigning a 'buddy' CPU to try to motivate. Each buddy may try to motivate that one other CPU, if it's busy, tough, it may then try its SMT sibling, but that's all this optimization is allowed to cost.
Sibling cache buddies are cross-wired to prevent bouncing.
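The bounded lookup described above can be modeled roughly as follows. This is a toy Python sketch, not the kernel code: the function name `pick_cpu`, the dictionaries, and the four-CPU layout are all hypothetical, and it assumes one reading of the commit message in which the SMT sibling tried is the buddy's sibling.

```python
# Toy model only: every name and the data layout are hypothetical,
# chosen for illustration. This is not how the kernel stores buddies.

def pick_cpu(target, idle, buddy, smt_sibling):
    """Pick a CPU for a waking task using a fixed, small number of checks.

    target:      CPU the task would otherwise wake on
    idle:        set of CPU ids that are currently idle
    buddy:       dict mapping each CPU id to its single assigned buddy
    smt_sibling: dict mapping each CPU id to its SMT sibling
    """
    if target in idle:
        return target
    candidate = buddy[target]
    if candidate in idle:
        return candidate
    # Assumption: "its SMT sibling" means the sibling of the busy buddy.
    sibling = smt_sibling[candidate]
    if sibling in idle:
        return sibling
    # Give up instead of scanning the rest of the package.
    return target


# Hypothetical 2-core, 4-thread layout: (0, 1) and (2, 3) are SMT pairs,
# and buddies are wired across the two cores.
SMT = {0: 1, 1: 0, 2: 3, 3: 2}
BUDDY = {0: 2, 2: 0, 1: 3, 3: 1}

print(pick_cpu(0, {2, 3}, BUDDY, SMT))  # buddy is idle
print(pick_cpu(0, {3}, BUDDY, SMT))     # buddy busy, its sibling idle
print(pick_cpu(0, {1}, BUDDY, SMT))     # CPU 1 is idle but never checked
```

The last call shows the trade-off: the search cost is capped at a few checks, so an idle CPU outside the buddy chain is simply not considered.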
4 socket 40 core + SMT Westmere box, single 30 sec tbench runs, higher is better:
clients       1      2      4      8     16     32     64    128
................................................................
pre          30     41    118    645   3769   6214  12233  14312
post        299    603   1211   2418   4697   6847  11606  14557
A nice increase in performance.
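To put numbers on that increase, the post/pre ratio can be computed for each client count. The short Python sketch below uses only the tbench figures quoted in the commit message; the variable and function names are illustrative.

```python
# tbench throughput figures copied from the commit message table.
clients = [1, 2, 4, 8, 16, 32, 64, 128]
pre = [30, 41, 118, 645, 3769, 6214, 12233, 14312]
post = [299, 603, 1211, 2418, 4697, 6847, 11606, 14557]


def speedups(before, after):
    """Return the after/before ratio for each column."""
    return [a / b for b, a in zip(before, after)]


ratios = speedups(pre, post)
for n, r in zip(clients, ratios):
    print(f"{n:>4} clients: {r:5.2f}x")
```

By these figures the gain is roughly 10x to 15x at 1 to 4 clients, shrinks steadily as the client count grows, and is near parity at 64 and 128 clients, with the 64-client run coming in slightly lower after the patch.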
The commit delivered a nice increase for one workload while dramatically hurting another real-world and common Linux workload.
Linus Torvalds was quick to say the commit in question is confusing, although he committed it to the kernel tree weeks ago, prior to the Linux 3.6-rc1 kernel.
Linus ultimately said, "I vote we just revert it as 'insane'. The code really doesn't seem to make any sense." Again, this is code that has been living in mainline for weeks.
The discussion has yet to be settled, but at this point it looks like the problematic commit will simply be reverted prior to the final release of the Linux 3.6 kernel.
For those wondering about the state of the Phoromatic Linux Kernel Tracker, the overhauled version is still a work-in-progress and is to be published soon. This daily and per-commit tracker would likely have caught this regression, seeing as there's a PostgreSQL pgbench test profile, but the interface for this kernel tracker is currently inaccessible to the public. The new version of Phoromatic is built directly atop OpenBenchmarking.org and its new architecture that is coming as part of the Randaberg / Unterschleißheim improvements. If your company is interested in the Linux enterprise software testing capabilities offered as part of the Phoronix Test Suite, or simply wants to support these mainline efforts, please contact us.