Ryzen Compiler Performance: Clang 4/5 vs. GCC 6/7/8 Benchmarks

  • Ryzen Compiler Performance: Clang 4/5 vs. GCC 6/7/8 Benchmarks

    Phoronix: Ryzen Compiler Performance: Clang 4/5 vs. GCC 6/7/8 Benchmarks

    A few days back I posted some fresh AMD Ryzen compiler benchmarks of LLVM Clang now that it has its new Znver1 scheduler model, which helps the performance of some of the generated binaries tested on Ryzen under Linux. But it was still found that Haswell-tuned binaries are sometimes faster on Ryzen than the Zen "znver1" tuning itself. Continuing our fresh compiler benchmarks on AMD's new Ryzen platform, here are the latest GCC numbers.

    http://www.phoronix.com/vr.php?view=24976

  • #2
    Since it's common knowledge that -O3 is highly questionable as a generic switch and can easily produce a SLOWER binary, you should compile with -O2 instead. It is also much more useful for real-world benchmarking anyway. As of now, we cannot be sure whether the benchmark results are due to poor optimization for Ryzen or whether it's actually -O3 producing code that simply won't execute well despite the other optimizations.
    Last edited by curfew; 07-23-2017, 11:42 AM.


    • #3
      x264? Thank you Michael!


      • #4
        https://community.amd.com/thread/215...t=285&tstart=0

        Still unresolved. I hope they'll fix it before the EPYC launch.


        • #5
          Yesterday FreeBSD imported llvm & friends as soon as upstream branched 5.0.


          • #6
            Interesting how in several of the benchmarks, GCC 8 is a significant regression from GCC 7, which is a significant regression from GCC 6. So it seems that at least in some applications, sticking with GCC 6 is the better option today.


            • #7
              Originally posted by curfew
              Since it's common knowledge that -O3 is highly questionable as a generic switch and can easily produce a SLOWER binary, you should compile with -O2 instead. It is also much more useful for real-world benchmarking anyway. As of now, we cannot be sure whether the benchmark results are due to poor optimization for Ryzen or whether it's actually -O3 producing code that simply won't execute well despite the other optimizations.
              "Common knowledge" is often wrong. I find that GCC's -O3 mode does use more instruction cache, but if you're using a CPU with more cache, like an Intel Xeon or the Xeon-derived consumer chips such as the 5960X, you're good to go. Also, in my experience, using -O3 with profile-guided optimization is always faster. And one of the things that the -march and -mtune options do in GCC is adjust its parameters for how many instructions to inline and how far to unroll loops.

              And see https://stackoverflow.com/a/11546263/13422

              But sure, I haven't tried it myself on Ryzen so it'd be interesting to see an O2 vs O3 comparison.
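
A comparison like that could be set up along these lines. This is only a sketch: the source file name, the output names, and the helper `build_commands` are hypothetical, and the flags are the standard GCC ones (-O2, -O3, -march=znver1, -fprofile-generate, -fprofile-use). The helper only builds the command lines; running them is left to the caller.

```python
from typing import Dict, List


def build_commands(source: str, march: str = "znver1") -> Dict[str, List[List[str]]]:
    """Return GCC command lines for an -O2, an -O3, and an -O3 + PGO build.

    The output binary names are hypothetical placeholders.
    """
    base = ["gcc", f"-march={march}", source]
    return {
        "O2": [base + ["-O2", "-o", "bench_O2"]],
        "O3": [base + ["-O3", "-o", "bench_O3"]],
        "O3_pgo": [
            # Step 1: instrumented build that records a profile when run.
            base + ["-O3", "-fprofile-generate", "-o", "bench_pgo"],
            # Step 2: rebuild using the recorded profile. The instrumented
            # binary must be run on a representative workload in between.
            base + ["-O3", "-fprofile-use", "-o", "bench_pgo"],
        ],
    }


if __name__ == "__main__":
    for name, steps in build_commands("bench.c").items():
        for argv in steps:
            print(name, " ".join(argv))
```

Each command list can be handed to `subprocess.run(argv, check=True)`, and the resulting binaries timed under identical conditions.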


              • #8
                Originally posted by torsionbar28
                Interesting how in several of the benchmarks, GCC 8 is a significant regression from GCC 7, which is a significant regression from GCC 6. So it seems that at least in some applications, sticking with GCC 6 is the better option today.
                While there is nothing wrong with your conclusion in itself, I would stay away from drawing one at all, knowing the on-demand governor was used again. That, and the lack of any mention of variation in the results, just doesn't allow a sound conclusion for results that are within a 5% margin. Most benchmarks on a live machine these days vary by at least 0.5%-1.0% anyway, and when file or disk I/O is involved the variation is only higher. Add a CPU governor like on-demand, and you get not only further variation but also far more erratic results.
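
To make the point about variation concrete, here is a rough sketch of how repeated runs could be checked before calling something a regression. The function names and the "difference must exceed k times the noise" rule are assumptions chosen for illustration, not an established test, and no timings from the article are used.

```python
import statistics
from typing import Sequence, Tuple


def summarize(times: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, coefficient of variation in percent) for repeated runs."""
    mean = statistics.mean(times)
    stdev = statistics.stdev(times)  # sample standard deviation, needs >= 2 runs
    return mean, stdev / mean * 100.0


def difference_exceeds_noise(a: Sequence[float], b: Sequence[float], k: float = 2.0) -> bool:
    """True if the gap between the two means is larger than k times the
    larger run-to-run variation. The factor k is an arbitrary assumption."""
    mean_a, cv_a = summarize(a)
    mean_b, cv_b = summarize(b)
    diff_pct = abs(mean_a - mean_b) / min(mean_a, mean_b) * 100.0
    return diff_pct > k * max(cv_a, cv_b)
```

With only a single run per compiler there is no way to compute the variation at all, which is the underlying problem with reading much into differences of a few percent.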


                • #9
                  Originally posted by sdack
                  While there is nothing wrong with your conclusion in itself, I would stay away from drawing one at all, knowing the on-demand governor was used again.
                  Oh, was it? That isn't good. Benchmarking doesn't combine well with anything but "performance", and even then modern systems interfere with it. Ryzen X CPUs do dynamic clock boosting, and Intel chips all have various Turbo modes... A system can be locked down to constant clock speeds, but it is tricky to set up correctly.
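
Checking the governor before a run is simple enough. The sketch below assumes the usual Linux cpufreq sysfs layout (`/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor`); the function names are made up for illustration, and it only reads the setting. It does nothing about boost or Turbo behaviour.

```python
import glob
import os
from typing import Dict


def read_governors(sysfs_root: str = "/sys/devices/system/cpu") -> Dict[str, str]:
    """Map each CPU directory name (e.g. 'cpu0') to its current scaling governor."""
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    governors = {}
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as f:
            governors[cpu] = f.read().strip()
    return governors


def all_performance(governors: Dict[str, str]) -> bool:
    """True only if at least one CPU was found and all use 'performance'."""
    return bool(governors) and all(g == "performance" for g in governors.values())


if __name__ == "__main__":
    found = read_governors()
    print(found)
    if not all_performance(found):
        print("warning: not every CPU is using the 'performance' governor")
```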


                  • #10
                    Originally posted by RussianNeuroMancer
                    https://community.amd.com/thread/215...t=285&tstart=0

                    Still unresolved. I hope they'll fix it before the EPYC launch.
                    Right now I've postponed my long-overdue upgrade. I'll buy a new system once there is official word from AMD that this is fixed, or once Intel releases its Coffee Lake platform and CPUs, whichever comes first.

                    The complete silence about this from AMD worries me...
