Statistical Significance In Benchmark Results

Written by Michael Larabel in Phoronix on 25 September 2009 at 10:46 AM EDT.
For those of you following the development of Phoronix Test Suite 2.2 (codenamed "Bardu"), some new benchmarking features were pushed into its Git tree this week. The latest Phoronix Test Suite 2.2 code now has better FreeBSD 8.0 compatibility and support for routing its network communication through proxies, but the bigger addition is new support for ensuring test results are statistically significant.

When any test profile is set to run multiple times, the Phoronix Test Suite can now compute the standard deviation across the test runs. If the standard deviation of the results exceeds a certain threshold (currently 3.50%, adjustable through the ~/.phoronix-test-suite/user-config.xml file), the Phoronix Test Suite will automatically increase the number of times the test profile is run. The aim is to lower the standard deviation of the results so that the reported result is more reliable. There are also safeguards against running a test profile too many times to no benefit, for example when the standard deviation is no longer changing.
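The behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Phoronix Test Suite's actual code: the function names, the `max_runs` cap, and the `patience` safeguard are hypothetical, and it assumes the percentage threshold refers to the sample standard deviation relative to the mean.

```python
import statistics
from typing import Callable, List


def relative_std_dev(results: List[float]) -> float:
    """Sample standard deviation as a percentage of the mean (assumed metric)."""
    if len(results) < 2:
        return 0.0
    mean = statistics.mean(results)
    if mean == 0:
        return 0.0
    return statistics.stdev(results) / abs(mean) * 100.0


def run_with_dynamic_count(
    run_once: Callable[[], float],
    initial_runs: int = 3,
    threshold_percent: float = 3.50,  # default threshold named in the article
    max_runs: int = 12,               # hypothetical hard cap on total runs
    patience: int = 2,                # hypothetical: stop after this many runs without improvement
) -> List[float]:
    """Run a test, adding extra runs while the deviation stays above the threshold."""
    results = [run_once() for _ in range(initial_runs)]
    deviation = relative_std_dev(results)
    stalled = 0
    while deviation > threshold_percent and len(results) < max_runs and stalled < patience:
        results.append(run_once())
        new_deviation = relative_std_dev(results)
        # Safeguard: count consecutive extra runs that failed to lower the deviation.
        stalled = stalled + 1 if new_deviation >= deviation else 0
        deviation = new_deviation
    return results
```

A caller would pass a function that executes the test once and returns its numeric result. Stable results finish after the initial runs, while noisy ones get extra runs until the deviation drops, the cap is reached, or the deviation stops improving.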

This feature can be disabled entirely through the DynamicRunCount tag in the Statistics area of the user-config.xml file, which is also where the StandardDeviationThreshold is set. There is also a LimitDynamicToTestLength option that keeps the feature from being applied to tests that take longer than a defined amount of time to run.
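How the three options might interact is sketched below. The option names come from the article, but the `StatisticsOptions` class, the `should_extend_runs` helper, and the unit and default chosen for LimitDynamicToTestLength are assumptions made for illustration, not the Phoronix Test Suite's real defaults or API.

```python
from dataclasses import dataclass


@dataclass
class StatisticsOptions:
    """Hypothetical in-memory view of the Statistics options in user-config.xml."""
    dynamic_run_count: bool = True              # DynamicRunCount
    standard_deviation_threshold: float = 3.50  # StandardDeviationThreshold, in percent
    limit_dynamic_to_test_length: float = 20.0  # LimitDynamicToTestLength; minutes and 20 are assumed


def should_extend_runs(
    options: StatisticsOptions,
    deviation_percent: float,
    test_length_minutes: float,
) -> bool:
    """Decide whether another run should be scheduled for a test."""
    if not options.dynamic_run_count:
        return False  # feature switched off entirely
    if test_length_minutes > options.limit_dynamic_to_test_length:
        return False  # test is too long to be worth repeating
    return deviation_percent > options.standard_deviation_threshold
```

With these assumed values, a short test whose results deviate by 5% would be run again, while the same deviation on a test longer than the limit would be left alone.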

To sum it up: by default, if the Phoronix Test Suite notices that the results for any test profile are starting to deviate, it can automatically increase the number of times the test is run in order to produce more accurate results. This new support is available through a Git snapshot today and will be in Phoronix Test Suite 2.2 Alpha 3, due for release within the next week. Additional statistical and analytical features will also be coming to Phoronix Test Suite 2.2. To find out about some of the other features already available in 2.2 Bardu, read this news entry.