The Performance Impact To AMD Zen 2 Compiler Tuning On GCC 9 + Znver2


  • #21
    Originally posted by Michael View Post
    arithmetic mean would be inaccurate when there are different scales/units of measurement involved.
    Or you could just normalize by some baseline machine.

    The bigger issue with units is that some units are a rate, while others are time. A CPU that's twice as fast will have benchmarks with a time that's 50% of the baseline. However, the rate-based benchmarks will be 2x. The two should have the same effect on the result, although they won't.

    I'd convert all rate-based benchmark results to a time, which is a nice, linear measure. After computing the average, you can take its inverse to get an estimate of how much faster (or slower) the test subjects are than the baseline.

    Also, I believe median is a useful way to gauge the "typical" speedup.
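    That approach can be sketched in a few lines of Python. The benchmark names and numbers below are made up for illustration; the point is the order of operations: convert rates to times, normalize, average, then invert at the end.

    ```python
    import statistics

    # Hypothetical results: times in seconds (lower is better) and one
    # rate in ops/sec (higher is better), for a subject vs. a baseline.
    baseline = {"compile": 120.0, "encode": 45.0, "throughput": 880.0}
    subject  = {"compile": 100.0, "encode": 40.0, "throughput": 1050.0}
    rate_based = {"throughput"}  # which benchmarks report a rate

    def as_time(name, value):
        # Convert rate-based results to a time-like measure (1/rate),
        # so every benchmark is on a "lower is better" linear scale.
        return 1.0 / value if name in rate_based else value

    # Normalize each time against the baseline, then average the ratios.
    ratios = [as_time(k, subject[k]) / as_time(k, baseline[k]) for k in baseline]
    mean_ratio = statistics.mean(ratios)

    speedup = 1.0 / mean_ratio                  # invert only at the end
    typical = 1.0 / statistics.median(ratios)   # median as a "typical" speedup
    ```

    Note that the inversion happens once, on the averaged ratio, not on each individual result before averaging.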
    Last edited by coder; 11 July 2019, 11:45 PM.

    Comment


    • #22
      Originally posted by grigi View Post
      arithmetic mean gives more weight to larger values. So if one benchmark emits seriously large numbers, it is going to dominate and turn everything else into noise. It is surely the worst to use.
      You could trim the outliers.
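      For instance, a minimal trimmed-mean sketch (the scores here are made up):

      ```python
      def trimmed_mean(values, trim=1):
          # Drop the `trim` smallest and largest results, then average the rest.
          kept = sorted(values)[trim:len(values) - trim]
          return sum(kept) / len(kept)

      scores = [98, 101, 99, 100, 5000]   # one wild outlier
      print(trimmed_mean(scores))          # averages 99, 100, 101 -> 100.0
      ```

      Trimming only suppresses isolated outliers, though; it does not fix the scale-dominance problem when one benchmark's unit is simply much larger than the others.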

      Comment


      • #23
        Originally posted by carewolf View Post
        Please don't use -march=x86-64
        He should use whatever most distros use.

        Comment


        • #24
          Originally posted by coder View Post
          He should use whatever most distros use.
          Which is nothing. You don't specify an architecture if you want the generic architecture.

          Comment


          • #25
            Originally posted by carewolf View Post

            Which is nothing. You don't specify an architecture if you want the generic architecture.
            Which is what you need if you want your distribution to run on any AMD64 machine from past years. This is why I suggested that hot-spot SIMD-vectorizing JIT compilation, even for C and C++, might be a much more performant and universal setup: https://www.youtube.com/watch?v=-VZmXO381HQ

            Comment


            • #26
              Originally posted by thebear View Post
              Here's a paper on why the geometric mean is to be preferred over the arithmetic mean (I'm not familiar with the journal so I cannot speak to its peer-review process, though):

              Philip J. Fleming and John J. Wallace, "How not to lie with statistics: the correct way to summarize benchmark results", Communications of the ACM, Volume 29, Issue 3, March 1986, pp. 218-221.

              Using the arithmetic mean to summarize normalized benchmark results leads to mistaken conclusions that can be avoided by using the preferred method: the geometric mean.

              or
              https://www.cse.unsw.edu.au/~cs9242/...Wallace_86.pdf
              That paper was a nice and definitive explanation as to why the geometric mean is what we should use.

              However, considering that the desirable outcomes of some benchmarks are on an inverted scale (operations per unit time vs. time per operation), the latter should be inverted (1/time) in order to be representative of the performance.

              Shouldn't that be made clear in the presentation of the geometric mean?
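              The paper's core point can be demonstrated with a small numeric sketch (the two machines and their times below are hypothetical): the arithmetic mean of normalized results can rank two machines differently depending on which one is used as the baseline, while the geometric mean ranks them consistently.

              ```python
              import math

              # Hypothetical times in seconds (lower is better) for two
              # machines, A and B, on two benchmarks.
              A = [1.0, 10.0]
              B = [5.0, 5.0]

              def norm(x, base):
                  # Normalize each result against a chosen baseline machine.
                  return [xi / bi for xi, bi in zip(x, base)]

              def amean(x):
                  return sum(x) / len(x)

              def gmean(x):
                  return math.prod(x) ** (1 / len(x))

              # Arithmetic mean of normalized times: the winner flips with the baseline.
              a_vs_a, b_vs_a = amean(norm(A, A)), amean(norm(B, A))  # A looks faster
              a_vs_b, b_vs_b = amean(norm(A, B)), amean(norm(B, B))  # B looks faster

              # Geometric mean: the A/B ratio is the same under either baseline.
              g_ratio_base_a = gmean(norm(A, A)) / gmean(norm(B, A))
              g_ratio_base_b = gmean(norm(A, B)) / gmean(norm(B, B))
              ```

              With baseline A the arithmetic mean favors A (1.0 vs. 2.75); with baseline B it favors B (1.1 vs. 1.0). The geometric-mean ratio is identical under both baselines, which is exactly the property the paper argues for.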

              Comment


              • #27
                Originally posted by Djhg2000 View Post
                However, considering that the desirable outcomes of some benchmarks are on an inverted scale (operations per unit time vs. time per operation), the latter should be inverted (1/time) in order to be representative of the performance.
                I think you've got it backwards. You want to average the times.

                Consider the case where you run a test 3 times and get results of 3, 4, and 5 seconds. The mean is 4 seconds, which equates to an average throughput of 0.25 ops/sec.

                However, if you average the rates, then you get a mean rate of 0.26111 ops/sec, incorrectly suggesting that 3.1333 ops were completed in the combined 12 seconds of the 3 trials.
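                A quick sketch reproducing the arithmetic above:

                ```python
                # Three trials of the same test, taking 3, 4, and 5 seconds.
                times = [3.0, 4.0, 5.0]

                mean_time = sum(times) / len(times)        # 4.0 seconds
                rate_from_mean_time = 1.0 / mean_time      # 0.25 ops/sec

                # Averaging the per-trial rates instead overstates the throughput:
                rates = [1.0 / t for t in times]
                mean_rate = sum(rates) / len(rates)        # ~0.26111 ops/sec
                implied_ops = mean_rate * sum(times)       # ~3.1333 ops in 12 s, not 3
                ```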

                Comment


                • #28
                  Originally posted by coder View Post
                  I think you've got it backwards. You want to average the times.

                  Consider the case where you run a test 3 times and get results of 3, 4, and 5 seconds. The mean is 4 seconds, which equates to an average throughput of 0.25 ops/sec.

                  However, if you average the rates, then you get a mean rate of 0.26111 ops/sec, incorrectly suggesting that 3.1333 ops were completed in the combined 12 seconds of the 3 trials.
                  No, if we want larger numbers to be better then we need to use 1/time. The resulting number has no useful unit and is only valid for relative comparisons within the same set of benchmarks anyway, but all the results need to tend towards the same limit (0 or infinity) for better performance.

                  Consider this example: benchmark BA gives a score based on the number of frames processed in a given time span, benchmark BB gives the time it took to process a set of frames, and benchmark BC gives a score in an arbitrary unit where higher is better. Now, if we use the geometric mean, we get the final score S = (BA*BB*BC)^(1/3). Three machines (MA, MB, MC) are benchmarked and produce the following results:
                  Benchmark   MA    MB    MC
                  BA          100   99    101
                  BB          10    9     300
                  BC          500   510   490
                  Notice that machine MC is a very poor fit for benchmark BB. Using just those numbers yields a final score of:
                  Score            MA       MB       MC
                  S_pure           79.370   76.880   245.78
                  Here, it looks like machine MC is clearly the winner, but the individual benchmarks tell us that shouldn't be the case. If we instead use 1/BB when calculating the final score (and thus make all results proportional with respect to the desirable outcome), we get:
                  Score            MA       MB       MC
                  S_proportional   17.100   17.769   5.4844
                  Now the score properly reflects how machine MC performed horribly in benchmark BB.

                  I hope this makes my point a bit more clear.
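                  A small sketch reproducing the example's numbers (same values as in the tables above):

                  ```python
                  # Raw benchmark results for the three hypothetical machines.
                  machines = {
                      "MA": {"BA": 100, "BB": 10,  "BC": 500},
                      "MB": {"BA": 99,  "BB": 9,   "BC": 510},
                      "MC": {"BA": 101, "BB": 300, "BC": 490},
                  }

                  def gmean3(a, b, c):
                      # Geometric mean of three values.
                      return (a * b * c) ** (1 / 3)

                  # Naive score: BB (a time, lower is better) taken as-is.
                  s_pure = {m: gmean3(r["BA"], r["BB"], r["BC"])
                            for m, r in machines.items()}

                  # Proportional score: BB inverted so all inputs are "higher is better".
                  s_prop = {m: gmean3(r["BA"], 1 / r["BB"], r["BC"])
                            for m, r in machines.items()}
                  ```

                  Running this reproduces the flip: MC wins the naive score purely because its huge BB time inflates the product, and loses once BB is inverted.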

                  Comment


                  • #29
                    Originally posted by Djhg2000 View Post
                    No, if we want larger numbers to be better then we need to use 1/time.
                    No, I want more accurate numbers.

                    I think my example was pretty damn clear that averaging rates is incorrect. If you want to convert it to an average rate, you can do that at the end.

                    Comment


                    • #30
                      Originally posted by coder View Post
                      No, I want more accurate numbers.

                      I think my example was pretty damn clear that averaging rates is incorrect. If you want to convert it to an average rate, you can do that at the end.
                      But that's what we're discussing, isn't it?

                      Comment
