SMT Performance Benchmarks Continue To Show Benefit With AMD Zen 5/5C

  • phoronix
    Administrator
    • Jan 2007
    • 67141

    Phoronix: SMT Performance Benchmarks Continue To Show Benefit With AMD Zen 5/5C

    While Intel's upcoming Core Ultra Series 2 "Lunar Lake" laptop processors are doing away with Hyper Threading (HT) and instead focusing more on additional E cores, AMD has asserted Simultaneous Multi-Threading (SMT) is still beneficial and supported across both their Zen 5 and Zen 5C cores. For those curious about the SMT performance and power efficiency impact, here are some SMT on/off comparison benchmarks using the Ryzen AI 9 HX 370 "Strix Point" laptop processor.
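For readers who want to check the SMT state on their own Linux machine before trying a similar on/off comparison, here is a minimal sketch. It assumes a kernel that exposes the `smt` directory in sysfs; the article does not say how SMT was toggled for its benchmarks, and the helper name `read_smt_state` is made up for illustration.

```python
from pathlib import Path

# Linux sysfs directory for SMT control (assumption: the kernel exposes it).
SMT_DIR = Path("/sys/devices/system/cpu/smt")


def read_smt_state(smt_dir=SMT_DIR):
    """Return (control, active) for SMT, or None if the sysfs files are absent.

    control is the raw string from the 'control' file (for example "on" or "off");
    active is True when the 'active' file contains "1".
    """
    control_file = Path(smt_dir) / "control"
    active_file = Path(smt_dir) / "active"
    if not control_file.exists() or not active_file.exists():
        return None
    control = control_file.read_text().strip()
    active = active_file.read_text().strip() == "1"
    return control, active


if __name__ == "__main__":
    # Hypothetical helper, not part of any benchmark tooling.
    print(read_smt_state())
```

On kernels that support it, writing `off` or `on` to the `control` file as root changes the setting at runtime; a firmware (BIOS) toggle is the other common route.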

  • fitzie
    Senior Member
    • May 2012
    • 672

    #2
    No one in RISC-V or Arm is going anywhere near SMT. As Jim Keller says, there are better things to do with the transistor budget.


    • avis
      Senior Member
      • Dec 2022
      • 2176

      #3
      Power consumption and temperatures are the same because you cannot power-gate SMT; it's always on, whether it's actually enabled (e.g. in BIOS or via CPU scheduler) or not.

      The 35% performance uplift is nothing to sneeze at. I have my concerns about Arrow Lake, which will feature "only" 24 cores (8P + 16E). 16 full-fat Zen 5 cores (9950X) may actually turn out to be faster, not to mention the 9950X3D.


      • JEBjames
        Senior Member
        • Jan 2018
        • 369

        #4
        Michael

        typos on page 1

        "AMD has assered Simultaneous" I think should be "asserted"

        "comparison benchmarks usign" should be "using"


        • trivik12
          Junior Member
          • Aug 2024
          • 2

          #5
          It's good for benchmarks, but how many laptop/desktop apps need to use all the cores/threads? I think it's just not worth the transistor budget on bleeding-edge nodes. Plus it adds complexity with big cores, little cores and threads.


          • ddriver
            Senior Member
            • Jan 2014
            • 711

            #6
            Originally posted by fitzie
            No one in RISC-V or Arm is going anywhere near SMT. As Jim Keller says, there are better things to do with the transistor budget.
            Really? Better use than "more than 15% better performance for less than 1% power increase"? Because that's pretty good from an engineering perspective.

            And it is a pretty good trade-off in tech. 1% up-cost for 15% improvement that is. People are used to stuff like "100% cost/power increase for 10% perf increase".
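The trade-off in the two paragraphs above can be put into numbers with a quick sketch. It takes the quoted percentages at face value (they are the poster's figures, not measurements made here), and `perf_per_watt_change` is a hypothetical helper name.

```python
def perf_per_watt_change(perf_gain, power_gain):
    """Relative change in performance per watt.

    perf_gain and power_gain are fractions, e.g. 0.15 for +15%.
    New efficiency = (1 + perf_gain) / (1 + power_gain) times the old one.
    """
    return (1.0 + perf_gain) / (1.0 + power_gain) - 1.0


if __name__ == "__main__":
    # "15% better performance for 1% power increase" (figures from the post)
    print(f"{perf_per_watt_change(0.15, 0.01):+.1%}")  # about +13.9%
    # "100% cost/power increase for 10% perf increase"
    print(f"{perf_per_watt_change(0.10, 1.00):+.1%}")  # -45.0%
```

Under those assumptions the first case improves performance per watt by roughly 14%, while the second nearly halves it.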

            Intel is only removing it as part of their uarch design stupidity spree; their core is now so inefficient and bloated they can't afford to saturate the pipeline fully. I have no doubt that they saw far more diminished returns from it, but that's not a flaw of SMT; it's an artifact of Intel's bad CPU design.

            What better thing did Intel spend that transistor budget on? It appears to evade me. If anything, Intel appears quite eager to needlessly invite all sorts of unwarranted design and manufacturing complexity for no good reason whatsoever, despite the many issues, woes and suffering that causes them. So I'd reckon they are removing SMT not as part of some effort to optimize and streamline the architecture, but because it has gotten way too bad to even afford or merit full saturation. They are basically being forced to remove SMT; it is not really a matter of choice.

            What RISC-V or Arm CPU has managed to edge out AMD by making such better use of those transistors, really?

            If nobody's doing it because there are better things to do, I guess we should be seeing AMD struggling and lagging behind for carrying that extra burden. And last time I checked, AMD was actually the performance leader.
            Last edited by ddriver; 03 August 2024, 02:18 AM.


            • Mitch
              Senior Member
              • May 2017
              • 365

              #7
              Originally posted by fitzie
              No one in RISC-V or Arm is going anywhere near SMT. As Jim Keller says, there are better things to do with the transistor budget.
              Doesn't the worthiness of SMT depend on the die-size and power requirements to do SMT as well as opportunity costs? Intel and AMD have different implementations of SMT, so they may have separate incentives to continue or discontinue their own implementations.

              Intel and AMD also have very different 'little' cores. AMD's small-to-big size ratios are significantly greater than Intel's. If AMD could produce tiny cores that fit into whatever area SMT occupies on the big cores, removing SMT might be worthwhile.


              • pipe13
                Senior Member
                • Jun 2006
                • 392

                #8
                Originally posted by avis
                The 35% performance uplift is nothing to sneeze at. I have my concerns about Arrow Lake, which will feature "only" 24 cores (8P + 16E). 16 full-fat Zen 5 cores (9950X) may actually turn out to be faster, not to mention the 9950X3D.
                Might depend upon what they are asked to be faster at. Most applications, as Michael has benched here, most likely. But I've a BLAS/Eigen HPC program heavy on GEMM, and BLAS notes strongly recommend limiting its number of threads to the physical CPU count, rather than using the full virtual CPU count. That is, I use num_threads=16 on my 16 core 5950X. Anything larger and performance falls off a cliff.
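One way to arrive at that physical core count programmatically is sketched below. It assumes an x86 Linux `/proc/cpuinfo` layout with `physical id` and `core id` fields; `count_physical_cores` is a hypothetical helper, and using the `OMP_NUM_THREADS` environment variable is an assumption about how the BLAS build in question picks up its thread count.

```python
import os


def count_physical_cores(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo-style text.

    SMT siblings share the same pair, so each physical core is counted once.
    """
    cores = set()
    physical_id = core_id = None
    # The trailing "" makes sure the last processor block is flushed.
    for line in cpuinfo_text.splitlines() + [""]:
        if not line.strip():
            if core_id is not None:
                cores.add((physical_id, core_id))
            physical_id = core_id = None
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            core_id = value
    return len(cores)


if __name__ == "__main__":
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            n = count_physical_cores(f.read())
        if n:
            # Assumption: the BLAS library in use honours OMP_NUM_THREADS,
            # and this runs before that library is loaded.
            os.environ.setdefault("OMP_NUM_THREADS", str(n))
        print(n)
```

On a 16-core part with SMT enabled this should yield 16 rather than the 32 logical CPUs the OS reports, which matches the thread limit described above.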


                • avis
                  Senior Member
                  • Dec 2022
                  • 2176

                  #9
                  Originally posted by ddriver

                  Really? Better use than "more than 15% better performance for less than 1% power increase"? Because that's pretty good from an engineering perspective.

                  And it is a pretty good trade-off in tech. 1% up-cost for 15% improvement that is. People are used to stuff like "100% cost increase for 10% perf increase".

                  Intel is only removing it as part of their uarch design stupidity spree; their core is now so inefficient and bloated they can't afford to saturate the pipeline fully.

                  What better thing did Intel spend that transistor budget on? It appears to evade me.

                  What RISC-V or Arm CPU has managed to edge out AMD by making such better use of those transistors, really?

                  If nobody's doing it because there are better things to do, I guess we should be seeing AMD struggling and lagging behind for carrying that extra burden. And last time I checked, AMD was actually the performance leader.
                  Apple is a performance leader and they have zero HT/SMT in their CPUs. And Intel is doing exactly the same.

                  x86 uArchs trail Apple by a large margin both in terms of raw performance and efficiency.
                  Last edited by avis; 02 August 2024, 01:14 PM.


                  • V1tol
                    Senior Member
                    • May 2016
                    • 603

                    #10
                    Originally posted by trivik12
                    It's good for benchmarks, but how many laptop/desktop apps need to use all the cores/threads? I think it's just not worth the transistor budget on bleeding-edge nodes. Plus it adds complexity with big cores, little cores and threads.
                    We are in 2024. Stupid Chrome runs 22 processes with only 2 tabs. Open more tabs, launch some Electron crap - and all your cores will have something to do. And people also use Windows which has tons of background crap... And SMT helps here a lot because those processes don't do much so hardware scheduling on the same physical core doesn't hurt performance.

