
AMD Ryzen AOCC 1.0 Compiler Tuning Benchmarks


  • AMD Ryzen AOCC 1.0 Compiler Tuning Benchmarks

    Phoronix: AMD Ryzen AOCC 1.0 Compiler Tuning Benchmarks

    On Friday I posted some benchmarks of AMD's new AOCC compiler for Ryzen compared to LLVM Clang 4.0/5.0 and GCC 6/7/8. The AOCC 1.0 benchmarks on Ryzen 7 didn't offer much over LLVM Clang, on which this "AMD Optimizing C/C++ Compiler" is based, but this article presents some tuning benchmarks.


  • #2
    Typo:

    Originally posted by phoronix View Post
    FLAC had failed to build when the stided-vectorization option was enabled.



    • #3
      User mlau has said twice ([1], [2]) that using the -mtune flag together with -march (e.g. -march=znver1 -mtune=haswell) improves performance noticeably. It would be nice if Michael tested this. Anyway, thanks for the updated benchmark.
      Last edited by Marc.2377; 21 May 2017, 12:17 PM.
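
      The flag combination discussed above can be tried on any small kernel. The loop below is a hypothetical stand-in for a real workload, not code from the article; the compile lines in the comment use the exact flags from the thread and assume a GCC new enough (6+) to know znver1:

      ```c
      /* Toy kernel for comparing -march/-mtune combinations, e.g.:
       *   gcc -O3 -march=znver1 dot.c -o dot                 (Zen ISA, Zen tuning)
       *   gcc -O3 -march=znver1 -mtune=haswell dot.c -o dot  (Zen ISA, Haswell scheduling)
       * -march picks the instruction set; -mtune only changes scheduling/cost
       * decisions, so the second binary still runs only on Zen-capable CPUs. */
      #include <stdio.h>

      double dot(const double *a, const double *b, int n) {
          double s = 0.0;
          for (int i = 0; i < n; i++)
              s += a[i] * b[i];   /* auto-vectorizable inner loop */
          return s;
      }

      int main(void) {
          enum { N = 1000 };
          double a[N], b[N];
          for (int i = 0; i < N; i++) { a[i] = 1.0; b[i] = 2.0; }
          printf("%.1f\n", dot(a, b, N));   /* prints 2000.0 */
          return 0;
      }
      ```

      Any measured difference between the two builds will of course depend heavily on the workload.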



      • #4
        Originally posted by Marc.2377 View Post
        User mlau has said twice ([1], [2]) that using the -mtune flag together with -march (e.g. -march=znver1 -mtune=haswell) improves performance noticeably. It would be nice if Michael tested this himself.
        I think AMD even said as much in a slide deck a while back: one of Zen's goals was to effectively leverage existing Intel Haswell optimizations, so that Zen-specific optimizations were not really needed and existing binaries would perform very well on Zen from day one.



        • #5
          I'd like to see a test with -march=znver1 -mtune=haswell as well, particularly if what you said about AMD actually targeting Zen at Haswell optimizations is true.



          • #6
            So basically all those optimizations are useless. Furthermore, optimizing a binary for a particular CPU is absurd considering the binary is most likely going to be run on different CPUs. The correct way to do the optimizations is to create several code paths for different supported instruction sets in assembler. It's more work, yes, and only real programmers can do it, yes, but it is worth it.



            • #7
              Originally posted by wargames View Post
              So basically all those optimizations are useless. Furthermore, optimizing a binary for a particular CPU is absurd considering the binary is most likely going to be run on different CPUs. The correct way to do the optimizations is to create several code paths for different supported instruction sets in assembler. It's more work, yes, and only real programmers can do it, yes, but it is worth it.
              You're not getting the picture.

              This is about the compilers themselves. It's not about a common practice of mediocrity. It's about when to choose a compiler other than GCC and what gains to expect from it. If mediocre results are all you need, look no further than the results of -O and -O2.

              But when you want to know more, you will have to look at all the available compilers and all their options. What stands out today will become tomorrow's default, because of the competition. If it weren't for the competition, we'd all still be using ancient versions of gcc and only compiling with -g, with only the "daredevils" using -O.

              So, no, these optimizations are not useless at all. Perhaps your opinion on the topic can be seen as useless. Hopefully you'll upgrade to a better opinion soon.



              • #8
                Originally posted by wargames View Post
                So basically all those optimizations are useless. Furthermore, optimizing a binary for a particular CPU is absurd considering the binary is most likely going to be run on different CPUs. The correct way to do the optimizations is to create several code paths for different supported instruction sets in assembler. It's more work, yes, and only real programmers can do it, yes, but it is worth it.
                Ah yes, the biggest problem of all open source projects, there's always too much free workforce. A specific assembly optimization is relevant for 6 months, then a new architecture will appear.



                • #9
                  While previous comparisons including -O2 and -O3 often showed them very close, there used to be a few tests where -O3 was more than 10% faster. At the same time, there now seem to be fewer cases where -O3 is a bit slower. What happened?

                  (Of course, it is still nice to get 1% - 4% improvements by the mere flip of a switch.)



                  • #10
                    Originally posted by caligula View Post

                    Ah yes, the biggest problem of all open source projects, there's always too much free workforce. A specific assembly optimization is relevant for 6 months, then a new architecture will appear.
                    This is where FMV (function multi-versioning) comes in; no need to write assembly to achieve this.
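
                    For the curious, here is a minimal sketch of what FMV looks like in GCC/Clang via the target_clones attribute: the compiler emits one variant of the function per listed target and picks the best one at load time. The avx2/default clone list is just an example, and the preprocessor guard is an assumption so the file still builds on non-x86 targets:

                    ```c
                    /* Function multi-versioning sketch: the compiler builds an AVX2
                     * clone and a baseline clone of sum(), then dispatches via an
                     * ifunc resolver at program startup based on the running CPU. */
                    #include <stdio.h>

                    #if defined(__x86_64__) && defined(__GNUC__)
                    __attribute__((target_clones("avx2", "default")))
                    #endif
                    int sum(const int *v, int n) {
                        int s = 0;
                        for (int i = 0; i < n; i++)
                            s += v[i];          /* vectorized in the avx2 clone */
                        return s;
                    }

                    int main(void) {
                        int v[8] = {1, 2, 3, 4, 5, 6, 7, 8};
                        printf("%d\n", sum(v, 8));   /* prints 36 */
                        return 0;
                    }
                    ```

                    The practical appeal is exactly what the comment above says: one portable C source, several instruction-set code paths, no hand-written assembly to maintain when a new architecture appears.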

