An Early Look At The GCC 12 Compiler Performance On AMD Zen 3

  • An Early Look At The GCC 12 Compiler Performance On AMD Zen 3

    Phoronix: An Early Look At The GCC 12 Compiler Performance On AMD Zen 3

    GCC 12 won't see its stable release until around March or April, as usual. But with feature development wrapping up ahead of next month's move to the next development stage, which focuses on fixes, I recently ran some preliminary benchmarks of how GCC 12.0 currently performs against GCC 11.2 on an AMD Ryzen 9 5950X (Zen 3) system...

    https://www.phoronix.com/scan.php?pa...ber-Zen-3-Perf

  • #2
    More results that show that we need repositories that can push out packages based on gcc feature level.

    How come there aren't -O2 native and flto tests?


    • #3
      Originally posted by skeevy420
      How come there aren't -O2 native and flto tests?
      Only so much time in a day, especially after seeing not much change and this would end up only as a 1 page article.
      Michael Larabel
      http://www.michaellarabel.com/


      • #4
        Originally posted by Michael

        Only so much time in a day, especially after seeing not much change and this would end up only as a 1 page article.
        I totally get that. I'm on around hour 19 of a massive file system reshuffle. Long story short: I'm moving all my (mostly) WORM data from LZ4 over to Zstd-19.

        Update:

        Code:
        took 15h 36m 8s    at 08:02:12
        Last edited by skeevy420; 22 October 2021, 09:08 AM.


        • #5
          "GCC 12 Compiler Performance"
          -> I was expecting a benchmark of how unbearably slow gcc itself has become in recent years. I think a really interesting benchmark would be to take a few significantly older gcc releases and compare their compilation speed against the speed of the generated code, to see whether the increase in compilation time can be justified. Of course, one slight problem is that you also need benchmarks that can be built with both old and new gcc releases. That might limit which benchmarks can be used and/or which gcc versions can be tested.
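
A minimal sketch of the comparison proposed above, assuming several GCC versions are installed side by side. The compiler names (`gcc-6`, `gcc-11`), the source file `bench.c`, and the output path are hypothetical placeholders, and the helper measures a single wall-clock run with no repetition or noise control.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def compare(compilers, source, flags, binary="./bench.out"):
    """Return {compiler: (compile_seconds, run_seconds)} for one source file.

    Each compiler may be given its own flags by calling this function
    once per configuration; the flags do not have to match across runs.
    """
    results = {}
    for cc in compilers:
        compile_s = time_command([cc, *flags, source, "-o", binary])
        run_s = time_command([binary])
        results[cc] = (compile_s, run_s)
    return results


# Hypothetical usage (requires the named compilers and bench.c to exist):
# print(compare(["gcc-6", "gcc-11"], "bench.c", ["-O2"]))
```

Plotting compile time against run time per release would show whether the slower compiler is paying for itself on a given workload.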


          • #6
            Originally posted by syrjala
            "GCC 12 Compiler Performance"
            -> I was expecting a benchmark of how unbearably slow gcc itself has become in recent years. I think a really interesting benchmark would be to take a few significantly older gcc releases and compare their compilation speed against the speed of the generated code, to see whether the increase in compilation time can be justified. Of course, one slight problem is that you also need benchmarks that can be built with both old and new gcc releases. That might limit which benchmarks can be used and/or which gcc versions can be tested.
            Default flags/settings will not be the same, so it would be apples to oranges.


            • #7
              Originally posted by brucethemoose

              Default flags/settings will not be the same, so it would be apples to oranges.
              You can test whatever combination of flags you want on any of the compilers. They don't even have to match. The point is whether those extra optimization passes (or just bloat) that make it so slow are actually worth it.


              • #8
                Originally posted by syrjala

                You can test whatever combination of flags you want on any of the compilers. They don't even have to match. The point is whether those extra optimization passes (or just bloat) that make it so slow are actually worth it.
                Newer compilers are also adding more language features. It's not just optimization passes.


                • #9
                  Periodic testers (maintained by Martin Liska) at SUSE also compare different GCC release branches & trunk
                  https://lnt.opensuse.org/db_default/..._report/branch (slow to load)

                  https://lnt.opensuse.org/db_default/v4/SPEC/spec_report/branch?sorting=gcc-11%2Cgcc-trunk&all_elf_detail_stats=on (slow to load showing only gcc11 vs trunk)

                  Long story short:
                  zen1 (Ryzen 5 1600) with -O2 -flto:
                  test          base (GCC 6)  GCC 7  GCC 8  GCC 9  GCC 10  GCC 11  trunk (GCC 12)
                  SPECint 2017  3.886         4.20%  5.04%  4.59%  5.12%   4.47%   10.11%
                  zen2 (AMD EPYC 7702) with -O2 -flto:
                  test          base (GCC 6)  GCC 7  GCC 8  GCC 9  GCC 10  GCC 11  trunk (GCC 12)
                  SPECint 2017  3.609         4.32%  4.12%  3.51%  5.46%   5.89%   12.13%
                  The change between GCC 11 and GCC 12 is mostly due to vectorization being enabled by default at -O2, which greatly improves the x264 benchmark (by 44%). The difference between GCC 6 and GCC 7 comes from the exchange2 benchmark. Changes in SPECfp scores are within a 1% range.
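
As a rough worked example, the GCC 11 to trunk step can be isolated from the -O2 -flto SPECint 2017 numbers above. This assumes each percentage is a gain over the GCC 6 base score for the same machine, which is how the tables read but is not stated explicitly; `step_change` is a hypothetical helper name.

```python
def step_change(pct_old, pct_new):
    """Relative change from one release to the next, given each release's
    percentage gain over the same base (assumed here to be the GCC 6 score)."""
    return (1 + pct_new / 100) / (1 + pct_old / 100) - 1


# Percentages over the GCC 6 base from the -O2 -flto SPECint 2017 tables.
zen1 = step_change(4.47, 10.11)   # GCC 11 -> trunk on the Ryzen 5 1600
zen2 = step_change(5.89, 12.13)   # GCC 11 -> trunk on the EPYC 7702

print(f"zen1: {zen1:.2%}, zen2: {zen2:.2%}")
```

Dividing the ratios rather than subtracting the percentages keeps the result relative to GCC 11 instead of GCC 6.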

                  zen2 -Ofast -march=native -flto is:
                  test          base (GCC 6)  GCC 7   GCC 8   GCC 9   GCC 10  GCC 11  GCC 12
                  SPECfp 2017   7.096         -3.80%  10.97%  18.56%  19.05%  20.13%  22.74%
                  SPECint 2017  4.020         4.00%   8.20%   6.47%   10.23%  15.49%  14.48%
                  zen1 -Ofast -march=native -flto is:
                  test          base (GCC 6)  GCC 7  GCC 8  GCC 9  GCC 10  GCC 11  trunk (GCC 12)
                  SPECfp 2017   4.240         4.67%  7.89%  8.56%  13.90%  17.50%  17.26%
                  SPECint 2017  7.489         ~      5.32%  7.06%  7.52%   7.75%   9.02%


                  zen2 -Ofast -march=native -flto and pgo is:
                  test          base (GCC 6)  GCC 7  GCC 8   GCC 9   GCC 10  GCC 11  trunk (GCC 12)
                  SPECfp 2017   7.404         ~      11.37%  16.99%  17.75%  18.58%  18.39%
                  SPECint 2017  4.346         3.94%  5.63%   2.75%   8.94%   9.46%   9.70%
                  and zen1 -Ofast -march=native -flto and pgo is:
                  test          base (GCC 6)  GCC 7  GCC 8  GCC 9  GCC 10  GCC 11  trunk (GCC 12)
                  SPECfp 2017   4.538         4.61%  4.56%  4.44%  9.14%   9.01%   9.68%
                  SPECint 2017  7.829         ~      5.68%  6.13%  6.73%   6.73%   10.22%


                  Compile time of complete spec2017 (in seconds) with -O2 -flto:
                  base (GCC 6)  GCC 7  GCC 8   GCC 9  GCC 10  GCC 11  trunk
                  729.993       3.42%  13.22%  ~      37.52%  36.09%  57.81%
                  So performance keeps improving, but compile times also keep growing. I will definitely spend some time again speeding up trunk once stage1 development ends.
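
For a sense of scale, the compile-time percentages can be turned back into seconds. This is a small sketch that assumes each percentage is an increase over the 729.993 s GCC 6 figure; the GCC 9 entry is listed only as "~" and is left out rather than guessed.

```python
BASE_SECONDS = 729.993  # GCC 6 compile time for all of spec2017, from the post

# Percentage increase over the GCC 6 base; GCC 9 is listed as "~" and omitted.
increase_pct = {"gcc 7": 3.42, "gcc 8": 13.22, "gcc 10": 37.52,
                "gcc 11": 36.09, "trunk": 57.81}


def absolute_seconds(base, pct):
    """Convert a percentage increase over base into absolute seconds."""
    return base * (1 + pct / 100)


seconds = {name: absolute_seconds(BASE_SECONDS, pct)
           for name, pct in increase_pct.items()}

for name, value in seconds.items():
    print(f"{name:7s} {value:8.1f} s")
```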

                  More results (for kabylake, zen1 and zen2, and also for spec2006 and other flags), as well as a breakdown into individual tests, can be seen at the link above. Curiously zen1 machine seems to
                  Last edited by hubicka; 23 October 2021, 09:12 AM.


                  • #10
                    Originally posted by skeevy420
                    More results that show that we need repositories that can push out packages based on gcc feature level.
                    Once upon a time, you could do this on a practical level just by building from source. But then clusterf**ks like CMake happened, and more and more people started jumping on the NIH bandwagon and reinventing autoconf and make, so now you need 20 different build systems installed to do that.

                    It's probably just about doable still if you have infinite patience and near-infinite free time (and I'm sure there'll be an Arch user or somesuch who does exactly that, and will be more than happy to tell us all about it! :P) but unless you have a specific package that you run for hours per day it's a huge net loss overall.
