Compiler Benchmarks Of GCC, LLVM-GCC, DragonEgg, Clang


  • #16
    Originally posted by XorEaxEax:
    -O3 has been stable to compile with for ages, I can't recall having encountered any program that compiles with -O2 which has problems with -O3 in years. Also I haven't encountered any cases where -O3 is slower than -O2 in ages, so obviously these tests should be done with -O3, especially since that's where most of the new optimizations will end up.
    These benchmarks disagree:

    http://www.linux-mag.com/id/7574/2/


    • #17
      While these tests are great (kudos Phoronix!), it's unfortunate that they don't test some of the more advanced optimizations that have come in the later releases. While testing PGO (profile-guided optimization) would be a bit unfair since Clang/LLVM doesn't have this optimization, LTO (link-time optimization) exists in both compilers and would be an interesting comparison. But I can understand that for practical reasons these more advanced optimizations have to be omitted. And since most people stick to -O3, I guess it's overall a fair comparison. Optimizations like PGO are mainly used by projects like Firefox, x264, emulators etc. where the added performance really makes a difference.

      Speaking of x264, in order to really compare the differences between the compilers on this package you really should compile it without the hand-optimized assembly (which I'm assuming you haven't since the results are so similar between all versions of gcc).


      • #18
        Originally posted by Drago:
        I wish there was a benchmark of the latest Intel C++ compiler alongside these.
        I agree with that; some other proprietary compilers could be compared too (IBM, HP, CodeWarrior).

        Also what about some ARM compiler benchmarks?


        • #19
          Originally posted by yotambien:
          These benchmarks disagree:

          http://www.linux-mag.com/id/7574/2/
          You are confused; these are 'time to compile' results, not performance benchmarks. Obviously it will take longer to compile with more optimizations than with fewer. But the resulting binary should be at least as fast, and most likely faster.


          • #20
            You have the performance benchmarks in the next page of that article.


            • #21
              Originally posted by yotambien:
              You have the performance benchmarks in the next page of that article.
              Well, in some tests -O3 loses to -O2, but very slightly. But this is a test from a year ago, and I can't even find which version of GCC was used, nor can I see whether it was done on 32-bit or 64-bit. I routinely test a lot of packages (Blender, p7zip, Handbrake, Dosbox, Mame etc.) with -O2 and -O3, and -O3 comes out on top.
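
A comparison like the one described above can be scripted. The following is only a minimal sketch, assuming a single-file C benchmark called bench.c (a hypothetical name) and a compiler such as gcc on the PATH; the helper names are made up for illustration, and a real test of Blender or Mame would of course go through each project's own build system.

```python
import statistics
import subprocess
import time


def binary_name(opt_level):
    """Derive an output path from the flag, e.g. '-O2' -> './bench_O2'."""
    return "./bench" + opt_level.replace("-", "_")


def compile_command(compiler, opt_level, source, output):
    """Build the argv list for one compiler invocation."""
    return [compiler, opt_level, "-o", output, source]


def time_command(argv, runs=5):
    """Run argv several times; return the median wall-clock time in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare_levels(compiler, source, levels=("-O2", "-O3"), runs=5):
    """Compile source once per level and time each resulting binary."""
    results = {}
    for level in levels:
        binary = binary_name(level)
        subprocess.run(compile_command(compiler, level, source, binary), check=True)
        results[level] = time_command([binary], runs)
    return results


# Example (needs a compiler and a bench.c in the current directory):
# print(compare_levels("gcc", "bench.c"))
```

Taking the median of several runs is a deliberate choice here: the differences between -O2 and -O3 under discussion are small enough that a single noisy run could flip the result.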


              • #22
                I hope Michael checked for every compilation that the right flags were used. Some parts of pts did not care about CFLAGS exports and compiled unoptimized crap on amd platforms.


                • #23
                  Originally posted by nanonyme:
                  -mtune=native is redundant if you're using -march=native.
                  -fomit-frame-pointer breaks debuggability in x86.
                  -O3 has bugs and might slow down run-time in many cases.
                  And since none of the systems is a debugging system, that is fine.
                  -O3 might have bugs or not, and might slow things down or make them faster. It depends on the software.

                  Oh, and setting mtune after march is just stupid.


                  • #24
                    Originally posted by XorEaxEax:
                    Speaking of x264, in order to really compare the differences between the compilers on this package you really should compile it without the hand-optimized assembly (which I'm assuming you haven't since the results are so similar between all versions of gcc).
                    Very true. There's a ./configure parameter, --no-asm or something similar, that accomplishes that, and should be used in Phoronix testing. Michael, please use it when testing compilers. Nearly everything in x264 that can be optimized by using handwritten ASM has been, so you need to fall back to regular C in order to actually test the compiler on anything other than very basic code.
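
To make the suggestion concrete: as far as I know, the option in x264's configure script is spelled --disable-asm, and the script picks up the compiler from the CC environment variable. Treat both as assumptions to check against ./configure --help for your x264 version. A minimal sketch of setting up one build per compiler, with a helper name invented for illustration:

```python
def x264_configure(cc, use_asm):
    """Return (environment overrides, argv) for configuring x264.

    Assumes configure honours CC and accepts --disable-asm; check
    ./configure --help before relying on either.
    """
    env = {"CC": cc}
    argv = ["./configure"]
    if not use_asm:
        # Fall back to the plain C code paths so the compiler is what gets measured.
        argv.append("--disable-asm")
    return env, argv


# One configuration per compiler under test, all without hand-written assembly.
for compiler in ("gcc", "clang"):
    env, argv = x264_configure(compiler, use_asm=False)
    print(env, argv)
```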


                    • #25
                      It would be more interesting to compare that hand-written asm with GCC-generated code. Unless that is done, there is no reason to turn off assembly just to create a test case that is completely detached from reality.


                      • #26
                        Well, regardless of whether -O2 beats -O3 in some tests, -O3 IS the level that is supposed to optimize the best, so it's obviously the one to use in a benchmark (unless you are benchmarking across all -O levels). As for -O3 being buggy, that's not my experience, nor is it supposed to be anything but stable.

                        Optimizations that are not considered fully working are introduced as separate flags, not as part of one of the -O levels. If/when they are considered stable (as in actually improving code and not introducing bugs), they are often added to certain -O levels. Some optimizations, for instance -funroll-loops, have been around for ages but are not part of any -O level, simply because it's very difficult for the compiler to estimate whether unrolling will pay off, so there can be great gains as well as great regressions with this optimization. (It is turned on by default if you use PGO, though, in which case the compiler has gathered enough data to make good judgements.)
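
For reference, the PGO workflow mentioned here has three steps with GCC: build with -fprofile-generate, run a representative workload, then rebuild with -fprofile-use. A minimal sketch that only assembles the command lines; bench.c and ./bench are hypothetical names, and the helper is illustrative rather than any project's actual build setup:

```python
def pgo_steps(compiler, source, binary, workload_args=()):
    """Return the three argv lists of a GCC-style PGO build, in order."""
    # 1. Instrumented build: the binary writes profile data when it runs.
    instrumented = [compiler, "-O3", "-fprofile-generate", "-o", binary, source]
    # 2. Training run: exercise the binary on a representative workload.
    training_run = [binary, *workload_args]
    # 3. Optimized build: recompile using the collected profile.
    optimized = [compiler, "-O3", "-fprofile-use", "-o", binary, source]
    return [instrumented, training_run, optimized]


for step in pgo_steps("gcc", "bench.c", "./bench"):
    print(" ".join(step))
```

The quality of the result depends on the training run: a workload that doesn't resemble real use gives the compiler misleading data.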

                        For the absolute best results though you'd most likely need to run something like Acovea (http://www.coyotegulch.com/products/...o5p4gcc40.html) which omits the -O levels and tests all the flag combinations, but it's not very practical.
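
To see why this isn't practical, it helps to count. The sketch below enumerates every on/off combination of a small flag list; the three flags are real GCC options picked only as examples. As I understand it, Acovea itself uses an evolutionary search rather than brute force, precisely because the full space is too large.

```python
from itertools import combinations


def flag_subsets(flags):
    """Yield every subset of flags, from the empty set to the full list."""
    for size in range(len(flags) + 1):
        for subset in combinations(flags, size):
            yield list(subset)


def count_subsets(n_flags):
    """Number of on/off combinations for n independent flags."""
    return 2 ** n_flags


example_flags = ["-funroll-loops", "-finline-functions", "-ftree-vectorize"]
for subset in flag_subsets(example_flags):
    print(subset)

# Three flags give 8 builds to benchmark; every additional flag doubles that.
print(count_subsets(len(example_flags)))
```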


                        • #27
                          I quite like the graphics horizontally, looks good.

                          But I'd have appreciated a better choice of colors: something of a similar tint for the GCC stuff and separate tints for the others. I found myself scrolling up and down a couple of times to check which color is which compiler.


                          • #28
                            Originally posted by energyman:
                            It would be more interesting to compare that hand-written asm with GCC-generated code. Unless that is done, there is no reason to turn off assembly just to create a test case that is completely detached from reality.
                            The point here is to compare the generated code of compilers (GCC vs LLVM), not hand optimized assembly vs compiler generated code.

                            While it would certainly also be interesting to see how much better hand-optimized assembly does against the code generated by these compilers, that's not part of THIS benchmark. So yes, there's every reason to disable hand-optimized assembly here.


                            • #29
                              Originally posted by XorEaxEax:
                              Well, in some tests -O3 loses to -O2, but very slightly. But this is a test from a year ago, and I can't even find which version of GCC was used, nor can I see whether it was done on 32-bit or 64-bit. I routinely test a lot of packages (Blender, p7zip, Handbrake, Dosbox, Mame etc.) with -O2 and -O3, and -O3 comes out on top.
                              Right. As I said, those benchmarks simply disagree with the idea that -O3 optimisations will always be at least as fast as -O2. To me, those numbers show that a) differences between -O2 and -O3 are minor; b) -O3 does not consistently produce the fastest binary. Of course, your experience is your experience, which is as valid as those tests.

                              What sort of differences do you get with Mame and Handbrake (I guess you mean x264 in this case)?


                              • #30
                                Originally posted by nanonyme:
                                -mtune=native is redundant if you're using -march=native.
                                Ok.

                                Originally posted by nanonyme:
                                -fomit-frame-pointer breaks debuggability in x86.
                                Who the hell needs debugging functionality in benchmarks? When we want to see which compiler produces the fastest code, what's the point in not generating the fastest code?

                                Originally posted by nanonyme:
                                -O3 has bugs and might slow down run-time in many cases.
                                More than a few cases? How many cases exactly? Statistics?
