Benchmarks Of GCC 4.2 Through GCC 4.7 Compilers

  • Benchmarks Of GCC 4.2 Through GCC 4.7 Compilers

    Phoronix: Benchmarks Of GCC 4.2 Through GCC 4.7 Compilers

    To see how the GCC 4.7 release is shaping up, for your viewing pleasure today are benchmarks of GCC 4.2 through a recent GCC 4.7 development snapshot. GCC 4.7 will be released next March/April with many significant changes, so here are some numbers to find out whether you can expect any broad performance improvements. Making things more interesting, the benchmarks were run on an AMD FX-8150, so you can see how the performance of this latest-generation AMD processor architecture is affected when going back to GNU Compiler Collection releases from long before this open-source compiler had any optimizations in place for it.

    http://www.phoronix.com/vr.php?view=16777

  • #2
    Meaningful tests for once (except maybe ffmpeg, which uses assembler for its critical code)

    It would be interesting to bisect down the regressions in 4.7 - I was under the impression we should be seeing speed-ups with 4.7, not slow-downs
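
    A rough sketch of what that bisect could look like, assuming a checkout of the gcc git mirror and a hypothetical run-benchmark.sh script that builds the compiler, runs the regressing test, and exits non-zero when the result is slower than some threshold:

```shell
# Hypothetical sketch - run-benchmark.sh is not a real script, just the
# shape such a driver would take.
git bisect start
git bisect bad  master              # 4.7 snapshot: regression present
git bisect good gcc-4_6_0-release   # 4.6.0: known fast
git bisect run ./run-benchmark.sh   # git walks the commits automatically
```

    The expensive part is that each bisect step requires a full bootstrap of the compiler, so in practice people narrow the range with release branch points first.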

    Can't wait to see the same benchmarks on your Sandy Bridge 2630QM

    You should really consider adding Flattr to your articles

    • #3
      measuring optimization potential

      This test basically measures the optimization potential that inner loops have. In other words, it measures how badly the inner loop was written. If your inner loop relies heavily on the compiler figuring out how best to convert it to machine code, you should really work on it. I am looking at you, GraphicsMagick. Encoder writers, on the other hand, obviously figured this out long ago and made their inner-loop performance compiler-agnostic.

      • #4
        I'd rather have seen a comparison of how well the compilers optimize than of how growing support for a particular new processor's features affects performance over time. Graphite was added at some point in these versions; I haven't yet seen any tests of it.

        -march implies -mtune. No need for both.
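
        To spell that out (command lines are illustrative; exact tuning names depend on the GCC version):

```shell
# -march=native selects both the instruction set AND the tuning for the
# build machine, so adding -mtune is redundant:
gcc -O3 -march=native               -c kernel.c
gcc -O3 -march=native -mtune=native -c kernel.c   # same code as above
# -mtune alone only makes sense when you want generic instructions tuned
# for one microarchitecture, e.g. a portable binary tuned for Bulldozer:
gcc -O3 -march=x86-64 -mtune=bdver1 -c kernel.c
```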

        • #5
          Since gcc 4.7 is not out yet, it's very likely that performance and optimization have yet to be addressed. So this comparison may be a bit ahead of its time.

          • #6
            I'm totally confused - is this a test of GCC compilers or GCC + LLVM backend?

            The table on the first page lists LLVM backend for all GCC versions so I've no idea what to think.

            • #7
              -march=native was a great choice, but maybe -O2 would have been better than -O3; the gcc documentation recommends using -O2, as -O3 does some risky optimisations.

              • #8
                Originally posted by bug77 View Post
                Since gcc 4.7 is not out yet, it's very likely that performance and optimization have yet to be addressed. So this comparison may be a bit ahead of its time.
                No, it is very timely. Now the developers have a chance of addressing these issues.

                • #9
                  Originally posted by oglueck View Post
                  This test basically measures the optimization potential that inner loops have. In other words, it measures how badly the inner loop was written. If your inner loop relies heavily on the compiler figuring out how best to convert it to machine code, you should really work on it. I am looking at you, GraphicsMagick. Encoder writers, on the other hand, obviously figured this out long ago and made their inner-loop performance compiler-agnostic.
                  Encoders optimize by writing those inner loops in assembly by hand and completely bypassing the compiler.

                  • #10
                    Originally posted by FireBurn View Post
                    Meaningful tests for once (except maybe ffmpeg, which uses assembler for its critical code)

                    It would be interesting to bisect down the regressions in 4.7 - I was under the impression we should be seeing speed-ups with 4.7, not slow-downs

                    Can't wait to see the same benchmarks on your Sandy Bridge 2630QM

                    You should really consider adding Flattr to your articles
                    Well, if YOU consider an old 8.2 release (the ffmpeg/avconv devs still recommend you use the latest git version, or at least a current 0.8.6) doing a virtually useless, antiquated, tiny AVI-to-VCD encode (who today even uses VCD? it's all HD 1080p BR, or at least 720p from HD MKV sources) anything like a reasonable test in 2011/12...

                    Or, come to that, using an even older x264 (v2010-11-22), without lots of the current SIMD improvements or any AVX for that matter, to encode non-HD content on an AVX-ready CPU is far beyond reason for a speed test, when a two-minute git pull is clearly all that's needed to get the latest code and see large speed improvements.

                    It's not like the ffmpeg and x264 devs won't give you advice on suitable samples and command lines best suited to the git versions, to use and integrate into your current Phoronix Test Suite.
                    Last edited by popper; 12-02-2011, 03:23 PM.

                    • #11
                      Originally posted by birdie View Post
                      I'm totally confused - is this a test of GCC compilers or GCC + LLVM backend?
                      The table on the first page lists LLVM backend for all GCC versions so I've no idea what to think.
                      Should be a typo, since the test aims to be about different GCC versions. Also, gcc-llvm has been deprecated in favour of DragonEgg, which does the same thing against newer GCC versions (4.5 onwards, iirc) through the plugin framework.

                      Originally posted by sabriah View Post
                      No, it is very timely. Now the developers have a chance of addressing these issues.
                      Well, the compiler developers have test suites of their own which are much more extensive than Phoronix's. As for them even considering these tests at all, I'd say that ship sailed a long time ago. Both the gcc devs and Chris Lattner (LLVM project leader) have stated that Phoronix's tests are totally worthless due to the poor conditions under which they are done. Whatever compiler options Michael states are being used, take it with a large grain of salt, as it has been shown over and over again that he doesn't seem to know how to configure these packages correctly before testing (Himeno pressure tests using -O0, Povray defaulting to tuning for AMD K8 no matter what processor is being used, etc. etc.).

                      And we still see it: what use is there in testing ffmpeg/x264 with assembly optimizations enabled? All the performance-critical code is out of reach of the compilers; they are pretty much left to optimize the command-line option handling, yay!

                      • #12
                        Originally posted by Ansla View Post
                        -march=native was a great choice, but maybe -O2 would have been better than -O3; the gcc documentation recommends using -O2, as -O3 does some risky optimisations.
                        I would prefer both -O2 and -O3. -O3 is aimed at producing the fastest code, but due to the difficulty of determining the best optimization strategy at compile time, some of the aggressive optimizations in -O3 backfire, and in those cases -O2 ends up faster. Still, if only one option is to be used, I do prefer -O3, as it is supposed to generate the fastest code.

                        • #13
                          Originally posted by XorEaxEax View Post
                          Should be a typo, since the test aims to be about different GCC versions. Also, gcc-llvm has been deprecated in favour of DragonEgg, which does the same thing against newer GCC versions (4.5 onwards, iirc) through the plugin framework.


                          Well, the compiler developers have test suites of their own which are much more extensive than Phoronix's. As for them even considering these tests at all, I'd say that ship sailed a long time ago. Both the gcc devs and Chris Lattner (LLVM project leader) have stated that Phoronix's tests are totally worthless due to the poor conditions under which they are done. Whatever compiler options Michael states are being used, take it with a large grain of salt, as it has been shown over and over again that he doesn't seem to know how to configure these packages correctly before testing (Himeno pressure tests using -O0, Povray defaulting to tuning for AMD K8 no matter what processor is being used, etc. etc.).

                          And we still see it: what use is there in testing ffmpeg/x264 with assembly optimizations enabled? All the performance-critical code is out of reach of the compilers; they are pretty much left to optimize the command-line option handling, yay!
                          Indeed, but to do it properly would mean Michael finally making the time and effort to actually care about his test suite: update it with the current ffmpeg/avconv and x264 git code, then set a configure switch to disable the assembly (and so fall back to the slow C routines) for cases like this test.
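
                          For x264 that switch is --disable-asm; ffmpeg has a similar option, though the exact switch names have varied between versions, so treat the lines below as a sketch rather than a recipe:

```shell
# x264: drop all hand-written assembly, fall back to the C paths,
# which is what a compiler benchmark actually wants to measure.
git clone git://git.videolan.org/x264.git && cd x264
./configure --disable-asm
make
# ffmpeg of this era: ./configure --disable-asm (check ./configure --help
# for your checkout, since the switch names have changed over time).
```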

                          Come to that, he doesn't seem to even care about running out-of-the-box ARM NEON SIMD results, now that we are into retail ARM quad cores such as the Asus Transformer Prime's Tegra 3, with several other quads (Freescale, Qualcomm, etc.) reaching retail soon enough - never mind all the old dual-core ARM NEON kit out there today that people and companies would like to see and compare results for.

                          Given that current ffmpeg/avconv and x264 have limited (but worthwhile to test) NEON SIMD today, these compiler tests would be perfectly suited to cross-compiled ARM/NEON testing, as they would fall back to the C code routines and perhaps show some speed improvements, and show where the auto-vectorising needs more work... and let's face it, auto-vectorising NEEDS a LOT of work still, and/or better developers who can learn some real assembly and apply it liberally in their apps' code where it helps.
                          Last edited by popper; 12-03-2011, 05:12 AM.

                          • #14
                            Originally posted by XorEaxEax View Post
                            I would prefer both -O2 and -O3. -O3 is aimed at producing the fastest code, but due to the difficulty of determining the best optimization strategy at compile time, some of the aggressive optimizations in -O3 backfire, and in those cases -O2 ends up faster. Still, if only one option is to be used, I do prefer -O3, as it is supposed to generate the fastest code.
                            A benchmark of code that produces possibly faulty results is worthless. If you ask the -O2-compiled code what 2+2 is and it says 4 in half a second, while the -O3 code says 5 in a quarter second, which is better?

                            • #15
                              Originally posted by locovaca View Post
                              A benchmark of code that produces possibly faulty results is worthless. If you ask the -O2-compiled code what 2+2 is and it says 4 in half a second, while the -O3 code says 5 in a quarter second, which is better?
                              Obviously the -O3 code, once you finally realise that the auto-vectorising code in your compiler is so badly broken (perhaps a simple typo, etc.) that it's producing faulty (or even just slow, prototype-speed) output and needs fixing ASAP. But then devs should be checking their code routines' speed improvements down to the picosecond, as it all adds up to lost time and efficiency.
                              Last edited by popper; 12-04-2011, 01:17 AM.
