
CentOS Stream ISA Optimized Packages Show Great Results On Intel Xeon Emerald Rapids


  • CentOS Stream ISA Optimized Packages Show Great Results On Intel Xeon Emerald Rapids

    Phoronix: CentOS Stream ISA Optimized Packages Show Great Results On Intel Xeon Emerald Rapids

    As part of Red Hat evaluating x86-64-v3 for Red Hat Enterprise Linux 10, the CentOS ISA SIG has been experimenting with ISA Optimized builds for the x86-64-v3 target. The SIG makes it easy to transition an existing CentOS Stream 9 system/server over to the x86_64-v3 optimized packages. This article has some benchmarks on a modern Intel Xeon Scalable "Emerald Rapids" server showing the performance benefits when the entire Linux server OS is recompiled for x86_64-v3.


  • #2
    Strange how we see much more improvement while it should be less (v2 to v3) compared to https://www.phoronix.com/review/cachyos-x86-64-v3-v4 with v1 to v4, is it just the different selection of programs?



    • #3
      Originally posted by Anux
      Strange how we see much more improvement while it should be less (v2 to v3) compared to https://www.phoronix.com/review/cachyos-x86-64-v3-v4 with v1 to v4, is it just the different selection of programs?
      CachyOS builds packages with -O3 optimizations while CentOS uses -O2. There's an old anecdote that you should stick to -O2 because -O2 produces optimized code while -O3 produces optimized bloated code. CachyOS and CentOS are doing things differently enough that it's hard to make a 1:1 comparison here.

      We need some -O3 CentOS builds and -O2 CachyOS builds before we can really draw any conclusions.



      • #4
        Originally posted by skeevy420
        CachyOS builds packages with -O3 optimizations while CentOS uses -O2.
        I doubt that's the reason, typical difference (if any) between O2 and O3 is 1 to 3 percent. You can look up some old benches from phoronix.

        Edit, found it: https://www.phoronix.com/review/gcc11-rocket-opts



        • #5
          Originally posted by Anux
          I doubt that's the reason, typical difference (if any) between O2 and O3 is 1 to 3 percent. You can look up some old benches from phoronix.

          Edit, found it: https://www.phoronix.com/review/gcc11-rocket-opts
          whoa, a comment with evidence? y'know we don't do that here



          • #6
            Michael

            Please show results as percentages as well. It's really hard to read these graphs and insane numbers. E.g. 100% and 102% would be so much easier to comprehend.



            • #7
              Originally posted by Anux
              Strange how we see much more improvement while it should be less (v2 to v3) compared to https://www.phoronix.com/review/cachyos-x86-64-v3-v4 with v1 to v4, is it just the different selection of programs?
              It depends on which packages get tested. Not every package behaves the same, and it seems a completely different suite was tested.
              It has nothing to do with -O3.

              Edit: It also depends on how they are built, e.g. whether packages are statically or dynamically linked, and so on.



              • #8
                Originally posted by avis
                Michael
                Please show results as percentages as well. It's really hard to read these graphs and insane numbers. E.g. 100% and 102% would be so much easier to comprehend.
                I have a little "trick": just look at the first two digits. In the first benchmark, for example, it's 90 vs 91, roughly a 1% difference. But I agree that some percentage-based indicator would be nice on more complex benchmarks.
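
The eyeballing trick above amounts to a relative-difference calculation. A minimal Python sketch, assuming the two figures are 90 and 91 as in the example; the function name `percent_change` is made up for illustration, and whether a positive change is good or bad depends on whether the benchmark is higher-is-better or lower-is-better:

```python
def percent_change(baseline: float, optimized: float) -> float:
    """Relative change of `optimized` versus `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (optimized - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Illustrative values from the "90 vs 91" example, not real benchmark data.
    print(f"{percent_change(90, 91):+.2f}%")  # prints +1.11%
```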



                • #9
                  So just to be clear, this is testing a baseline compiled software stack against an x86_64-v3 optimized software stack, where the former consists of the OS and packages compiled using baseline settings and the latter consists of the OS and packages compiled using x86_64-v3 settings.

                  Here's what Red Hat has to say:



                  The x86-64-v3 x86-64 microarchitecture level primarily benefits numerical applications (for data science, for example) which do not include specialized implementations for modern CPU microarchitectures.
                  In a nutshell, it requires the software developers to either be lazy or incompetent.

                  Further, the tested system, dual Xeon Platinum 8592+, has two FMA AVX-512 units per processor, so a total of four SIMD units for this setup. In comparison my i5-1035G1 only has 1 SIMD unit capable of SSE/AVX/AVX2/AVX-512.

                  I can't help but feel that if one was to run these "optimized" builds on a system like mine, they would see a performance decrease on benchmarks that should be able to benefit from AVX/2/512 because you now have multiple programs competing for limited resources, programs that have no business using AVX extensions.
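
Whether a given machine can run x86_64-v3 packages at all comes down to which CPU flags it reports. A rough Python sketch, assuming a Linux system that exposes `/proc/cpuinfo`; the flag sets are my reading of the x86-64 psABI microarchitecture levels mapped onto the kernel's flag names (for example LZCNT shows up as `abm` and SSE3 as `pni`), and the helper names are hypothetical. It only tells you whether the instructions are available, not whether using them is a win on a CPU with fewer SIMD units:

```python
# Flag names as they appear on the "flags" line of /proc/cpuinfo (assumed mapping).
V2_FLAGS = {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}
V3_FLAGS = {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}
V4_FLAGS = {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}


def x86_64_level(flags):
    """Return the highest level (1-4) whose required flags are all present.

    Level 1 (baseline x86-64) is assumed and not checked.
    Levels are cumulative, so a missing lower level stops the search.
    """
    level = 1
    for n, required in ((2, V2_FLAGS), (3, V3_FLAGS), (4, V4_FLAGS)):
        if not required <= set(flags):
            break
        level = n
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU listed, or an empty set if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass
    return set()


if __name__ == "__main__":
    print(f"x86-64-v{x86_64_level(read_cpu_flags())}")
```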



                  • #10
                    Originally posted by ptr1337

                    It depends on which packages get tested. Not every package behaves the same, and it seems a completely different suite was tested.
                    It has nothing to do with -O3.

                    Edit: It also depends on how they are built, e.g. whether packages are statically or dynamically linked, and so on.
                    I did some more digging. Michael's AVX512 benchmarks show a 100% improvement on Zen4 for ospray, while CachyOS is actually slower with AVX512. And if you look at the total numbers (https://www.phoronix.com/review/amd-...per-pro-7995wx), I get the feeling that either Michael's test had an error or CachyOS v4 doesn't use any AVX.

