How to eliminate redundant benchmark run results for a given test?

    Some of my tests [1][2][3][4] contain more than one result (up to four), and these clutter the result file when I merge the runs together. How can I merge the multiple runs of a given PTS test so that each test contains only one result? Is it all right to keep only the result of the run with the smallest SE (standard error) value? Is there a command that does this for me?

    Here is my investigation into this problem. According to this book, the number of benchmark runs for a particular test depends on several factors, mainly the following.

    Whether DynamicRunCount is enabled in user-config.xml
    If this option is set to TRUE, the Phoronix Test Suite will automatically increase the number of times a test is to be run if the standard deviation of the test results exceeds a predefined threshold. This option is set to TRUE by default and is designed to ensure the statistical significance of the test results. The run count will increase until the standard deviation falls below the threshold or when the total number of run counts exceeds twice the amount that is set to run by default from the given test profile. Under certain conditions the run count may also increase further.
    (source)
    DynamicRunCount is set to TRUE in my tests. The question is, how do I know whether the standard deviation threshold has been exceeded? Is it exceeded when SE is larger than StandardDeviationThreshold?
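
    To make my reading of the quoted rule concrete, here is a minimal Python sketch of it. It is not the Phoronix Test Suite's actual code. The function name, the `run_once` callback, the default of 3 runs, and above all the choice to express the spread as a percentage of the mean are my own assumptions; the "further increase under certain conditions" mentioned in the quote is not modelled.

```python
import statistics
from typing import Callable, List


def run_with_dynamic_count(
    run_once: Callable[[], float],
    default_runs: int = 3,      # hypothetical run count from a test profile
    threshold: float = 3.50,    # StandardDeviationThreshold from user-config.xml
) -> List[float]:
    """Repeat a benchmark until the spread is small enough or the cap is reached."""
    # Cap taken from the quoted text: twice the default run count.
    max_runs = 2 * default_runs
    results = [run_once() for _ in range(default_runs)]
    while len(results) < max_runs:
        # Assumption: the spread is the sample standard deviation expressed
        # as a percentage of the mean. The real comparison may differ.
        spread = 100.0 * statistics.stdev(results) / statistics.mean(results)
        if spread <= threshold:
            break
        results.append(run_once())
    return results
```

    Under this reading, a test whose default runs already agree closely stops at the default count, while a noisy test is extended one run at a time up to the cap. The sketch needs at least two default runs and a non-zero mean.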

    Each test result contains a string "SE +/- number", e.g. "SE +/- 3.22". Are "SE", "standard deviation" and "standard error" the same thing?
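
    For reference, in textbook statistics the standard deviation and the standard error of the mean are different quantities: the standard error is the sample standard deviation divided by the square root of the number of samples. The sketch below computes both. Whether the "SE" printed by the Phoronix Test Suite follows this textbook definition is an assumption I have not confirmed, and the function name is my own.

```python
import math
import statistics
from typing import Sequence, Tuple


def spread_summary(samples: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, sample standard deviation, standard error of the mean)."""
    n = len(samples)
    mean = statistics.mean(samples)
    sd = statistics.stdev(samples)   # sample standard deviation (n - 1 divisor)
    se = sd / math.sqrt(n)           # textbook standard error of the mean
    return mean, sd, se


# Made-up run times, for illustration only.
mean, sd, se = spread_summary([10.0, 12.0, 14.0])
print(f"mean={mean:.2f} sd={sd:.2f} se={se:.2f}")
```

    Because the standard error shrinks as more runs are added even when the standard deviation stays the same, a small SE alone would not show that the standard deviation stayed under the threshold.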

    The value of StandardDeviationThreshold in user-config.xml
    If the standard deviation exceeds a predefined threshold, another benchmark run for the same test is performed. The value of StandardDeviationThreshold on my PC is 3.50, which is the default. However, some of the tests that contain more than one result have two results with SE < 3.5. How is it possible that a second benchmark run was performed if StandardDeviationThreshold was not exceeded?

    The value of the TOTAL_LOOP_TIME environment variable
    ...and possibly the values of other environment variables as well. However, the "$ export" command does not show any such variables being set, and I have seen no commands inside pre.sh that would set them.
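
    The same check can be done from a script. This is a minimal sketch with a helper name of my own choosing. It only sees the environment inherited by the Python process, so, like "$ export", it would not reveal a variable that is set only inside the benchmark's own process.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def find_set_variables(
    names: Iterable[str],
    environ: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return the subset of the given variable names that are set, with values."""
    env = os.environ if environ is None else environ
    return {name: env[name] for name in names if name in env}


# TOTAL_LOOP_TIME is the only variable named above; add others as needed.
print(find_set_variables(["TOTAL_LOOP_TIME"]))
```

    An empty dictionary means the variable is not set in the current environment.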

    Currently I do not know which of these factors caused the redundant benchmark runs. My main confusion is with the "SE" values, as already mentioned. Primarily, I need consistent, reliable results so that I can merge the four test results and create a reasonable comparison. Any tips or caveats on doing this would be appreciated.
    OpenBenchmarking.org, Phoronix Test Suite, Phoronix Global, Linux benchmarking, automated benchmarking, benchmarking results, benchmarking repository, test profiles
    Last edited by Slazer; 05-15-2016, 03:56 PM. Reason: Made the text more structured and understandable. Added tags.