Autonomously Finding Performance Regressions In The Linux Kernel
This bisect module starts by analyzing the result file and picking out the selected test(s) that show a significant performance drop (or gain). Once that is done and the good and bad revisions have been tested on the running test system, git-bisect is automatically called and driven by this Phoronix Test Suite module. At each bisect step, the test is run with any specific test arguments and the module determines whether that revision produced a "good" or "bad" run. The module continues until the commit causing the problem is found. We will keep working to expand these capabilities and provide more features during the bisecting process.
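To make the good/bad decision loop concrete, here is a minimal Python sketch of the idea. It is not the Phoronix Test Suite module's actual code: the names `classify`, `find_first_bad`, and `run_benchmark` are hypothetical, the history is assumed to be a simple linear list of commits (git-bisect itself handles branching history), and a run is labeled by whichever baseline result it lands closer to, which is only one possible rule.

```python
from typing import Callable, Sequence


def classify(result: float, good: float, bad: float) -> str:
    """Label a run "good" or "bad" by whichever baseline it is closer to.

    This nearest-baseline rule is an assumption made for illustration.
    """
    return "good" if abs(result - good) <= abs(result - bad) else "bad"


def find_first_bad(
    commits: Sequence[str],
    run_benchmark: Callable[[str], float],
) -> str:
    """Return the first commit whose result resembles the bad baseline.

    Assumes commits are ordered oldest to newest, commits[0] is known
    good, commits[-1] is known bad, and performance changes exactly once.
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")

    # Benchmark both endpoints first to establish the two baselines.
    good_score = run_benchmark(commits[0])
    bad_score = run_benchmark(commits[-1])

    lo, hi = 0, len(commits) - 1  # invariant: lo is good, hi is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        score = run_benchmark(commits[mid])
        if classify(score, good_score, bad_score) == "good":
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

In a real setup, `run_benchmark` would check out the revision, build and boot it, run the selected test, and return its score; each of those steps is outside the scope of this sketch.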
While this module was written to locate a Linux kernel performance regression, it can be used with any Git-based package that is suitable for testing by the Phoronix Test Suite (read: we can just as easily track down performance regressions in the X Server, Mesa, and drivers). An external build/install script still needs to be written for setting up the environment, but the module should be very extensible. In Phoronix Test Suite 2.2 and future releases, the capabilities of the bisect module and our analytical options within this GPLv3 code should expand. All of this is for finding a performance regression from the past. For properly monitoring the performance of system code as it's developed, one could also run Phoromatic, which will be able to automatically trigger test runs after a Git commit is made and report the results back to the central repository in real-time, rather than debugging regressions well after the fact.
Did the Phoronix Test Suite end up being successful in tracking down this PostgreSQL performance regression in the Linux kernel? Oh yes it did. The very significant drop in PostgreSQL's performance in the Linux 2.6.32 kernel with default options can be attributed to this lone Git commit, a fix that addresses cache flushing in ext4_sync_file for the EXT4 file-system. This commit improves data integrity in the event of a power loss or other problem, but carries a high disk performance penalty. After the Phoronix Test Suite module reported this as the offending commit, the result was also confirmed manually. When asked about the commit, Red Hat's Eric Sandeen shared:
"Hey, thanks for doing the digging :)
It is required for safe behavior with volatile write caches on drives.
You could mount with -o nobarrier and it would go away, but a sequence like write->fsync->lose power->reboot may well find your file without the data that you synced, if the drive had write caches enabled.
If you know you have no write cache, or that it is safely battery backed, then you can mount with -o nobarrier, and not incur this penalty."
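The write->fsync sequence mentioned in the quote is the pattern a database relies on when it needs data to survive a power loss. As a rough illustration only, here is a Python sketch of that pattern; the helper name `durable_write` is hypothetical and this is not how PostgreSQL itself is implemented. Whether the synced data actually reaches stable storage depends on the file-system and drive behavior described above.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write data to path, then ask the kernel to flush it to the device."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # fsync returns once the kernel reports the file data as flushed.
        # With write barriers enabled this can include a drive cache
        # flush, which is where the performance cost comes from.
        os.fsync(fd)
    finally:
        os.close(fd)
```

A workload that calls something like this once per transaction pays the flush cost every time, which is why a database benchmark is so sensitive to this change.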
Many kernels were built in the process of tracking down this PostgreSQL performance drop triggered by the default EXT4 change, but thanks to the combination of PTS Bardu and Git, it required very little user intervention once the bisect module was running. Rather than having to run git-bisect manually and build the software each time, the Phoronix Test Suite can handle it and simply tell you if and where a performance regression occurs, all while you sit back and watch. Look for this capability in Phoronix Test Suite 2.2 "Bardu" along with major GTK user-interface improvements, verification of statistical significance in benchmark results, and many other new features. You can expect to read more on these automated revision-traversing testing capabilities at Phoronix as they are brought forward and extended to drive a new level of benchmarking in open-source software.