The New OpenBenchmarking.org Is Launching Soon


  • #11
    I like the view of "Result Confidence". In addition, I would like to see some more features of common box plots, like the median. Especially fast tests can be performed several times in a short time, and even if the arithmetic mean remains the same, the distribution of the individual values could change, which might be of interest. In other words, a diagram that shows all the typical characteristics such as outliers and so on.
    And maybe some kind of significance test? So if you move the cursor over one entry, all other entries will automatically show a ns (not significant) or a * or ** or *** for different levels of significance?
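As a rough illustration of the significance-marker idea, here is a minimal sketch using only the Python standard library. It runs an exact two-sided permutation test on the difference of means, which is just one possible choice of test and not what OpenBenchmarking.org does. The function names, the sample run times, and the 0.05 / 0.01 / 0.001 thresholds are assumptions made for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of means.

    Enumerates every way of splitting the pooled runs into groups of the
    original sizes, so it is only practical for small run counts.
    """
    pooled = list(a) + list(b)
    n_a = len(a)
    n_b = len(pooled) - n_a
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))

    at_least_as_extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        # Small tolerance so floating-point noise does not drop ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
        n_splits += 1
    return at_least_as_extreme / n_splits


def significance_label(p):
    """Map a p-value to the conventional ns / * / ** / *** markers."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return "ns"


# Hypothetical run times for two systems (not taken from any real result).
system_a = [600, 605, 610, 615, 620]
system_b = [700, 705, 710, 715, 720]
p = permutation_p_value(system_a, system_b)
print(p, significance_label(p))
```

With only five runs per system the smallest p-value this test can produce is 2/252, so very few runs can never reach "***". That ties in with the point that such statistics only become interesting when a test is run many times.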



    • #12
      Originally posted by bugmenot3
      I like the view of "Result Confidence". In addition, I would like to see some more features of common box plots, like the median. Especially fast tests can be performed several times in a short time, and even if the arithmetic mean remains the same, the distribution of the individual values could change, which might be of interest. In other words, a diagram that shows all the typical characteristics such as outliers and so on.
      And maybe some kind of significance test? So if you move the cursor over one entry, all other entries will automatically show a ns (not significant) or a * or ** or *** for different levels of significance?

      https://en.wikipedia.org/wiki/Box_plot
      Such info should already be shown on the box plot graphs when relevant.
      Michael Larabel
      https://www.michaellarabel.com/



      • #13
        Originally posted by Michael

        Such info should already be shown on the box plot graphs when relevant.
        Okay, so let's take as an example a result from here:
        [link to a result page on OpenBenchmarking.org]


        We look at
        SQLite
        SQLite v3.30.1
        Timed SQLite Insertion

        (It would be nice to have a link directly to this test; https://openbenchmarking.org/prospec...b61c03467eb7ee leads me to the page for the complete result.)

        So then let's take a look at the results of
        CUSO C5S-EVO 120 - Realtek RTL8111 - AMD Ryzen 3 1200

        I see that the test was run 9 times, and that the mean is 611.25, with a standard error of 17.85. I can see no more information.
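For what it's worth, the standard deviation can be back-calculated from those three numbers, assuming the reported standard error is the usual standard error of the mean (sample standard deviation divided by the square root of N). That assumption is mine, and the function names below are made up for the example.

```python
import math


def std_dev_from_standard_error(standard_error, n_runs):
    """Recover the sample standard deviation, assuming SE = s / sqrt(N)."""
    return standard_error * math.sqrt(n_runs)


def relative_standard_error(standard_error, mean_value):
    """Standard error expressed as a fraction of the mean."""
    return standard_error / mean_value


# Values quoted above: 9 runs, mean 611.25, standard error 17.85.
print(round(std_dev_from_standard_error(17.85, 9), 2))            # 53.55
print(round(100 * relative_standard_error(17.85, 611.25), 2))     # 2.92 (percent)
```

So under that assumption the spread of the individual runs is about 53.55, roughly three times what the standard error alone suggests. That is the kind of detail that stays hidden without a fuller view.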

        I would like to be able to see all the additional statistical data behind this test, not just when relevant but always when available. For example, clicking on a result could open an "advanced view" with a real box plot showing the min, max, 0.25 quantile, 0.5 quantile (median) and 0.75 quantile, so that the number of runs (N), outliers, mean, median, standard error, standard deviation etc. are all visible at one glance.
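To make concrete what such an "advanced view" would need to compute, here is a small sketch in plain Python. The function name, the sample data, the "inclusive" quantile method and Tukey's 1.5 x IQR outlier rule are all assumptions for illustration; other quartile and outlier conventions exist, and this says nothing about how the site computes its graphs.

```python
import math
import statistics


def box_plot_summary(runs):
    """Collect the numbers a box plot plus summary table would show."""
    data = sorted(runs)
    n = len(data)
    # Quartiles with linear interpolation over the sample (Python 3.8+).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inliers = [x for x in data if low_fence <= x <= high_fence]
    stdev = statistics.stdev(data)  # sample standard deviation
    return {
        "n": n,
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "whisker_low": inliers[0],
        "whisker_high": inliers[-1],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
        "mean": statistics.mean(data),
        "stdev": stdev,
        "stderr": stdev / math.sqrt(n),
    }


# Hypothetical run times with one slow outlier (not real benchmark data).
summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 100])
for key, value in summary.items():
    print(f"{key:>12}: {value}")
```

In this made-up sample the single slow run drags the mean far away from the median, which is exactly the situation where a mean with a standard error bar is not enough.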

        This does not make much sense with N=3, but it may give interesting insights when tests are run very often and, say, the median shifts a bit or the number of outliers in either direction goes up or down. Maybe some kind of scheduler change does not make things faster or slower (the mean stays the same), but there is less variation, or similar.
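A tiny made-up example of that last case, where the mean is identical but the variation is not. The numbers and the function name are invented for illustration only.

```python
import statistics


def mean_and_spread(runs):
    """Return (mean, sample standard deviation) for a list of run results."""
    return statistics.mean(runs), statistics.stdev(runs)


# Hypothetical results before and after some scheduler change.
before = [90, 110, 95, 105, 100]
after = [99, 101, 98, 102, 100]

print(mean_and_spread(before))
print(mean_and_spread(after))
```

Both sets average to 100, but the standard deviation drops from about 7.9 to about 1.6. A graph that only shows the mean would call these two results identical.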

        BTW I find it irritating that for CUSO C5S-EVO 120 - MSI AMD Radeon HD 5000 there is no standard error plotted into the graph, while this is the case for the other two. I know the standard error is slightly smaller, but in the plot it looks like the standard error is much smaller, which is not the case. I think if you plot one, you should plot all.
