Statistical Significance In Benchmark Results

Posted by Michael Larabel on September 25, 2009

For those of you following the development of Phoronix Test Suite 2.2 (codenamed "Bardu"), some new benchmarking features were pushed into its Git tree this week. The latest Phoronix Test Suite 2.2 code improves FreeBSD 8.0 compatibility and adds network proxy support for the suite's network communication, but the bigger change is new support for ensuring that test results are statistically significant.

When a test profile is set to run multiple times, the Phoronix Test Suite can now compute the standard deviation across the test runs. If the standard deviation of the results exceeds a certain threshold (currently 3.50%, but adjustable through the ~/.phoronix-test-suite/user-config.xml file), the Phoronix Test Suite automatically increases the number of times the test profile is run. The aim is to lower the standard deviation of the results so that the reported result is more reliable. There are also safeguards against running a test profile too many times to no benefit, for example when the standard deviation is no longer changing.
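The behavior described above can be sketched roughly as follows. This is a minimal illustration and not the Phoronix Test Suite's actual code: the function names, the use of the sample standard deviation expressed as a percentage of the mean, the `max_runs` cap, and the `min_improvement_pct` stall check are all assumptions made for the example. Only the 3.50% default threshold comes from the article.

```python
import statistics


def relative_std_dev(results):
    """Sample standard deviation as a percentage of the mean (assumed metric)."""
    mean = statistics.mean(results)
    if len(results) < 2 or mean == 0:
        return 0.0
    return statistics.stdev(results) / abs(mean) * 100.0


def run_with_dynamic_count(run_once, initial_runs=3, threshold_pct=3.50,
                           max_runs=10, min_improvement_pct=0.1):
    """Run a test, adding extra runs while the deviation stays above the threshold.

    run_once is a hypothetical callable that performs one run and returns a number.
    max_runs and min_improvement_pct are illustrative safeguards, not PTS settings.
    """
    results = [run_once() for _ in range(initial_runs)]
    deviation = relative_std_dev(results)

    while deviation > threshold_pct and len(results) < max_runs:
        results.append(run_once())
        new_deviation = relative_std_dev(results)
        # Safeguard: give up if the extra run did not lower the deviation enough.
        stalled = (deviation - new_deviation) < min_improvement_pct
        deviation = new_deviation
        if stalled:
            break

    return results, deviation
```

With a steady test the loop never starts and only the initial runs are used; with a noisy test, runs are added until the deviation drops below the threshold, the cap is hit, or an extra run stops helping.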

This feature can be disabled entirely with the DynamicRunCount tag in the Statistics area of the user-config.xml file, which is also where the StandardDeviationThreshold tag is found. There is also a LimitDynamicToTestLength option that keeps the feature from being applied to tests that take longer than a defined amount of time to run.
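As a rough illustration of how these three options might be read, the sketch below parses a small sample document with Python's standard XML library. The tag names DynamicRunCount, StandardDeviationThreshold, and LimitDynamicToTestLength and the Statistics area come from the article; the surrounding element nesting, the `TRUE` value, and the `20` value are assumptions for the example and may not match the real user-config.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout: only the Statistics tag names are taken from the article.
SAMPLE_CONFIG = """\
<PhoronixTestSuite>
  <Options>
    <Statistics>
      <DynamicRunCount>TRUE</DynamicRunCount>
      <LimitDynamicToTestLength>20</LimitDynamicToTestLength>
      <StandardDeviationThreshold>3.50</StandardDeviationThreshold>
    </Statistics>
  </Options>
</PhoronixTestSuite>
"""


def read_statistics_options(xml_text):
    """Return the child tags of the first Statistics element as a dict of strings."""
    root = ET.fromstring(xml_text)
    stats = root.find(".//Statistics")
    if stats is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in stats}


options = read_statistics_options(SAMPLE_CONFIG)
threshold_pct = float(options["StandardDeviationThreshold"])
```

To try this against a real configuration, the sample string could be replaced with the contents of ~/.phoronix-test-suite/user-config.xml, adjusting the lookup if the actual nesting differs.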

To sum it up: by default, if the Phoronix Test Suite notices that the results for a test profile are starting to deviate, it can automatically increase the number of times the test is run in order to produce more accurate results. This new support is available through a Git snapshot today and can be found in Phoronix Test Suite 2.2 Alpha 3, to be released within the next week. Additional statistical and analytical features will also be coming to Phoronix Test Suite 2.2. To find out about some of the other features already available in 2.2 Bardu, read this news entry.
