GCC performance testing request

  • #11
    We all love ricer flags, ask any Gentoo user about how much faster they make your system. =O

    But seriously, I would be interested in the difference the Graphite framework makes in the newer (4.4+) GCCs with multicore systems.


    • #12
      Currently on my i7 920 box with gcc-4.4.2

      CFLAGS="-march=native -O3 -msse4 -mmmx -floop-interchange -floop-strip-mine -floop-block -pipe"
      With the big 8 MB shared L3 cache I think -O3 is probably beneficial more often than not. The last benchmark I read, run on Core 2s, showed -O2 and -O3 to be about even, with -O3 maybe pulling a bit ahead.
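One way to check the -O2 versus -O3 question on your own machine is to build the same program once per flag set and time each binary. The following is a minimal sketch under stated assumptions: `bench.c` is a hypothetical CPU-bound test program, the helper names are made up for illustration, and the Graphite flags only do anything if your GCC was built with Graphite support.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path

# Flag sets to compare; the Graphite set mirrors the CFLAGS quoted above.
FLAG_SETS = {
    "O2": ["-march=native", "-O2"],
    "O3": ["-march=native", "-O3"],
    "O3+graphite": ["-march=native", "-O3", "-floop-interchange",
                    "-floop-strip-mine", "-floop-block"],
}


def build_command(cc, flags, source, output):
    """Return the argv list that compiles `source` into `output`."""
    return [cc, *flags, "-o", str(output), str(source)]


def best_runtime(binary, repeats=5):
    """Run `binary` several times and return the fastest wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([str(binary)], check=True)
        timings.append(time.perf_counter() - start)
    return min(timings)


def compare(source, cc="gcc"):
    """Compile `source` once per flag set and return {label: seconds}."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for label, flags in FLAG_SETS.items():
            exe = Path(tmp) / f"bench-{label}"
            subprocess.run(build_command(cc, flags, source, exe), check=True)
            results[label] = best_runtime(exe)
    return results


if __name__ == "__main__":
    # Only runs when gcc and the (hypothetical) bench.c are both present.
    if shutil.which("gcc") and Path("bench.c").exists():
        for label, seconds in compare("bench.c").items():
            print(f"{label:12s} {seconds:.3f} s")
```

Taking the fastest of several runs is a deliberate choice here: it filters out noise from other processes, though a real comparison would want more repeats and more than one test program.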


      • #13
        I thought I came across Intel doing some GCC compile testing in the latest Linux kernel podcast - here's the link


        • #14
          Originally posted by hmmm
          I thought I came across Intel doing some GCC compile testing in the latest Linux kernel podcast - here's the link
          The net result doesn't seem obvious to me.

          The first post clearly shows -O2 winning over -Os, but later on the -O2 kernel won only one test, and it seemed that, given more time and diligence, a kernel built with -Os would equal -O2.

          However, I digress. I still want someone to run Phoronix tests with these three compilers.


          • #15
            Originally posted by birdie
            Can you PLEASE test these versions of GCC: 4.2.4, 4.3.4 and 4.4.2 using whatever benchmarks you like (the more the better).
            There was a GCC comparison earlier this year.

            As for the compiler flag choices, check out ACOVEA and

            An article by Dunlop and others from 2008, "On the Use of a Genetic Algorithm in High Performance Computer Benchmark Tuning", concluded:

            This paper has addressed the issue of extracting the best adapted parameters for the HPL reference benchmark. Adjustment
            of the seventeen tuning parameters to achieve maximum performance is a time-consuming task that must be performed
            by hand. The use of a genetic algorithm is proposed here to manage this task with individuals corresponding to an
            HPL run. Indeed we do not provide here a description of a particular version of a GA. The Acovea framework has been
            used to validate the approach over a Beowulf cluster composed of heterogeneous resources: a majority of so-called
            “small” nodes and two “large” nodes. In particular, starting from a hand-tuned performance of 84 Gflops, it was possible
            to attain the peak performance of 111.6 Gflops on the cluster using a set of parameters determined nearly automatically by
            Last edited by sabriah; 12-14-2009, 08:23 AM.
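To make the genetic-algorithm idea concrete, here is a toy sketch that evolves a set of on/off compiler flags. It is not Acovea's implementation and not the method from the paper quoted above (which tuned HPL parameters); the function names, the flag list, and especially `toy_fitness` are assumptions made for illustration. In real use the fitness function would compile and time a benchmark, and you would want to cache its results because it is called repeatedly.

```python
import random

# A few real GCC flags to toggle; the selection is arbitrary.
CANDIDATE_FLAGS = [
    "-funroll-loops",
    "-ftree-vectorize",
    "-fomit-frame-pointer",
    "-floop-interchange",
    "-floop-block",
]


def toy_fitness(genome):
    """Stand-in score: how many flags match a made-up 'ideal' setting.

    A real fitness function would build and time a benchmark instead.
    """
    target = [True, True, False, True, False]
    return sum(g == t for g, t in zip(genome, target))


def evolve(flags, fitness, generations=30, pop_size=20,
           mutation_rate=0.05, seed=0):
    """Return the flags switched on in the best genome found."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    n = len(flags)
    # Each genome is a list of booleans, one per flag.
    population = [[rng.random() < 0.5 for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0]]              # elitism: keep the best as-is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with a small probability.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = children
    best = max(population, key=fitness)
    return [flag for flag, on in zip(flags, best) if on]


if __name__ == "__main__":
    print(" ".join(evolve(CANDIDATE_FLAGS, toy_fitness)))
```

Truncation selection with elitism is about the simplest scheme that works; it was picked for brevity, not because it is what any particular tool uses.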