Quantifying The AVX-512 Performance Impact With AMD Zen 5 - Ryzen 9 9950X Benchmarks


  • Quantifying The AVX-512 Performance Impact With AMD Zen 5 - Ryzen 9 9950X Benchmarks

    Phoronix: Quantifying The AVX-512 Performance Impact With AMD Zen 5 - Ryzen 9 9950X Benchmarks

    With the AMD Ryzen 9 9900X and Ryzen 9 9950X Linux review out of the way yesterday, today's benchmarking of the Ryzen 9000 series is looking closely at the AVX-512 performance impact. With the Ryzen 9000 series the Zen 5 cores have a full 512-bit data-path compared to the "double pumped" 256-bit data path found in the Zen 4 processors as well as the Strix Point SKUs. In this article is an AVX-512 enabled versus disabled comparison for not only the Ryzen 9 9950X but also the prior generation Ryzen 9 7950X and looking too at the CPU power use, thermals, and peak frequency when engaging a variety of AVX-512 workloads.


  • #2
    my 11th gen tigerlake 8 core also likes avx-512 workloads - it's a shame intel - you let it die for consumers.



    • #3
      Originally posted by spiral_23
      my 11th gen tigerlake 8 core also likes avx-512 workloads - it's a shame intel - you let it die for consumers.
      Which SKU is that CPU of yours? I'm curious what your system setup is. Is it a laptop or a desktop?

      Just asking because I'm rocking a cousin of your CPU, an 8-core Rocket Lake chip.



      • #4
        However, the 9950X's multithreaded AVX-512 scalability is clearly limited by memory bandwidth, as it can't get far ahead of the 9700X in most AVX-512 workloads.



        • #5
          Originally posted by edxposed
          However, the 9950X's multithreaded AVX-512 scalability is clearly limited by memory bandwidth, as it can't get far ahead of the 9700X in most AVX-512 workloads.
          A 128-bit memory bus is indeed limiting: we're in the territory of GPU-style tasks, and 128-bit memory buses are midrange there.

          Apple made the M-series so interesting, and its GPU so performant, thanks to its wide memory bus.
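
As a rough sketch of the arithmetic behind the 128-bit figure: theoretical peak bandwidth is the bus width in bytes multiplied by the transfer rate. The Python below assumes a 128-bit (dual-channel) DDR5 bus at the 6000, 6400, and 8000 MT/s speeds mentioned in this thread. The function names and the even per-core split are illustrative assumptions, and sustained real-world bandwidth is lower than these peaks.

```python
def peak_bandwidth_gbs(bus_width_bits: int, mega_transfers_per_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


def per_core_share_gbs(total_gbs: float, cores: int) -> float:
    """Even split of total bandwidth across cores (a simplifying assumption)."""
    return total_gbs / cores


if __name__ == "__main__":
    for speed in (6000, 6400, 8000):
        total = peak_bandwidth_gbs(128, speed)
        print(
            f"DDR5-{speed}: {total:.1f} GB/s total, "
            f"{per_core_share_gbs(total, 8):.1f} GB/s per core with 8 cores, "
            f"{per_core_share_gbs(total, 16):.1f} GB/s per core with 16 cores"
        )
```

At the same memory speed, a 16-core part has half the peak bandwidth per core of an 8-core part. That is consistent with the scaling concern quoted above, though it does not prove it on its own.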



          • #6
            Originally posted by edxposed
            However, the 9950X's multithreaded AVX-512 scalability is clearly limited by memory bandwidth, as it can't get far ahead of the 9700X in most AVX-512 workloads.
            I understand that Michael is preparing memory scaling benchmarks, so we'll know soon enough.



            • #7
              That looks pretty nice. I also ordered a 9950X today, and it will be quite interesting to see how it runs on CachyOS with the Zen 4 optimized repository.

              I have Hynix M-die memory and was able to tighten the timings, raise FCLK, and push it to 6400 1:1 on my current 7950X3D, so let's hope this helps with memory bandwidth.

              With EXPO (whether at 8000, 6400, or 6000), the secondary timings can make a massive difference. Even just increasing tREFI from 7000 to the maximum value usually helps quite a bit.



              • #8
                It is interesting that they jumped in a single generation from the double pumped data path to a full 512-bit path. There was no shortage of "stupid Intel should have gone double pumped" when Zen 4 launched. There are very nice gains here with the full 512-bit approach. Intel iterated to a good spot over time on this, then killed it off due to the heterogeneous core / scheduling complexity. Now AMD gets to slaughter them even worse on the consumer side.



                • #9
                  Originally posted by spiral_23
                  my 11th gen tigerlake 8 core also likes avx-512 workloads - it's a shame intel - you let it die for consumers.
                  I have a TGL notebook and while I don't use it for massive workloads, the AVX-512 definitely works in certain situations.

                  I like this review, and my only additional recommendation for benchmarks would be x264/x265/AV1 transcoding with AVX-512 on and off.



                  • #10
                    Originally posted by blackshard

                    A 128-bit memory bus is indeed limiting: we're in the territory of GPU-style tasks, and 128-bit memory buses are midrange there.

                    Apple made the M-series so interesting, and its GPU so performant, thanks to its wide memory bus.
                    This may be a dumb idea, but I've been thinking lately that CPUs could carry a small amount of fast RAM for iGPU, CPU, and AI use, and still have classic pluggable RAM, like SODIMM and LPCAMM2. I'm not sure whether operating systems would like tiered RAM, but it would perhaps mitigate a lot of performance issues while maintaining upgradeability and repairability. As an example, we could put 16 GB of 2048-bit RAM on a 9950X or one of its successors, and let you just add your 64 GB of pluggable RAM.

                    If they could do this, you could perhaps see the 2048-bit bus for the on-chip RAM and the classic 128-bit bus for the rest.
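
A minimal sketch of how such a two-tier setup might look on paper, using the 16 GB / 2048-bit and 64 GB / 128-bit figures from this post. The 6000 MT/s rate applied to both tiers is a hypothetical simplification (very wide on-package buses typically run at different per-pin rates), and the `Tier` class and `fastest_tier_that_fits` helper are made up for illustration. They do not model how any operating system actually places memory.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    name: str
    capacity_gb: float
    bus_width_bits: int
    mega_transfers_per_s: float  # hypothetical per-pin transfer rate

    @property
    def peak_gbs(self) -> float:
        """Theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
        return self.bus_width_bits / 8 * self.mega_transfers_per_s * 1e6 / 1e9


# Capacities and bus widths come from the post; the 6000 MT/s rate is assumed.
TIERS = (
    Tier("on-package", 16, 2048, 6000),
    Tier("pluggable", 64, 128, 6000),
)


def fastest_tier_that_fits(
    working_set_gb: float, tiers: Sequence[Tier] = TIERS
) -> Optional[Tier]:
    """Return the highest-bandwidth tier that can hold the whole working set."""
    candidates = [t for t in tiers if t.capacity_gb >= working_set_gb]
    return max(candidates, key=lambda t: t.peak_gbs, default=None)


if __name__ == "__main__":
    for size in (10, 40, 100):
        tier = fastest_tier_that_fits(size)
        label = f"{tier.name} ({tier.peak_gbs:.0f} GB/s peak)" if tier else "does not fit"
        print(f"{size} GB working set -> {label}")
```

With equal per-pin rates, the wide tier has 16 times the peak bandwidth of the 128-bit tier, simply because 2048 / 128 = 16.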

                    I imagine this idea isn't original and it's potentially unfeasible. Just a thought
                    Last edited by Mitch; 15 August 2024, 12:02 PM.

