SMT Performance Benchmarks Continue To Show Benefit With AMD Zen 5/5C

  • avis
    Senior Member
    • Dec 2022
    • 2273

    #31
    Originally posted by ddriver

    You are being duped by crapple's anti-competitive practice to exclusively reserve initial capacity for each new node before anyone else. This allows them to deploy their upcoming "higher transistor budget" products a cycle before the rest of the industry.

    So you are not making a generational comparison here. You are comparing across generations.

    On the same node and within similar logic budgets, amd tends to pull ahead. And when it comes to HPC, there's no contest there...
    So, Apple indeed has the performance crown, right? As a user I couldn't care less about nodes and their availability, I go out and buy products.

    And Apple M4 leaves Zen5 and yet to be released Arrow Lake in the dust.

    Your verbal and logical exercises are futile.

    Comment

    • ddriver
      Senior Member
      • Jan 2014
      • 728

      #32
      Originally posted by avis
      Your verbal and logical exercises are futile.
      So... you are essentially saying that between "logical and technical facts" and "paid for commercial media sensationalism for dummies" you see no contest, and automatically pick the latter?

      Well, that's only to be expected from someone who's incompetent and thus incapable of picking the former. It is not actually your choice... it is the product of lack of any.

      Instead you wasted your choice on... inadvertently yet proudly calling yourself stupid as the final argument in your favor... that's quite something... too bad you probably can't appreciate it, or leverage its immense learning potential...


      Comment

      • zeealpal
        Senior Member
        • Jul 2010
        • 130

        #33
        Originally posted by tenchrio

        Crazy what happens when you actually include the last top of the line performing Ryzen Laptop CPUs like the 7945HX3D and the HX 370 in the Multi core comparison:
        Screenshot from 2024-08-03 04-27-31.png
        The HX 370 (4nm node) beats the M4 (3nm node), and the 7945HX3D (5nm) might not be power efficient, but it is still the performance king (well, at least for MT).
        It's less rabid fanboys and more the fact that, generally speaking, you tend to be wrong.

        Not to mention that these are synthetic benchmarks, which don't necessarily translate to actual performance in software. And despite your constant claims of Linux being worse than Windows (which is probably what notebookcheck uses for their benchmarks), Linux has better performance across the board, as usually proven by this site and others, including on Geekbench, and even when, say, running Cinebench through Wine. That either closes the gap for x86 chips or puts them in the lead.
        image.png
        Also, from a uArch perspective, one thing that has already been mentioned but not compared here is the transistor budget needed to achieve these results.
        • Apple M3 Pro: 37b transistors.
        • Apple M4: 28b transistors.
        • Ryzen 9 7945HX3D: 17.8b transistors.
        Obviously there are trade-offs for power vs area, general vs accelerator etc... It's easy to say a processor is more performant / power efficient, but when it costs 50% more to make, it's less of an achievement.
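        To put rough numbers on the budget comparison, here is a minimal sketch that only does the arithmetic on the three figures quoted above. The counts are taken from this post as-is and not independently verified, the function names are made up for illustration, and transistor count is only a loose proxy for manufacturing cost (node, die area and yield matter too).

```python
# Transistor counts in billions, exactly as quoted in the post (unverified).
TRANSISTORS_B = {
    "Apple M3 Pro": 37.0,
    "Apple M4": 28.0,
    "Ryzen 9 7945HX3D": 17.8,
}


def relative_budget(chip: str, baseline: str) -> float:
    """Return chip's transistor count as a multiple of baseline's."""
    return TRANSISTORS_B[chip] / TRANSISTORS_B[baseline]


def score_per_billion(score: float, chip: str) -> float:
    """Benchmark points per billion transistors.

    The score is supplied by the caller; no benchmark results are
    hard-coded here.
    """
    return score / TRANSISTORS_B[chip]


if __name__ == "__main__":
    baseline = "Ryzen 9 7945HX3D"
    for chip in TRANSISTORS_B:
        ratio = relative_budget(chip, baseline)
        print(f"{chip}: {ratio:.2f}x the {baseline} budget")
```

        On these figures the M4 uses about 1.57x the transistors of the 7945HX3D and the M3 Pro about 2.08x, so "50% more" is, if anything, slightly generous to Apple. Note that these are whole-chip counts, including GPU and accelerators, not CPU cores alone.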



        Comment

        • avis
          Senior Member
          • Dec 2022
          • 2273

          #34
          Originally posted by ddriver

          So... you are essentially saying that between "logical and technical facts" and "paid for commercial media sensationalism for dummies" you see no contest, and automatically pick the latter?

          Well, that's only to be expected from someone who's incompetent and thus incapable of picking the former. It is not actually your choice... it is the product of lack of any.

          Instead you wasted your choice on... inadvertently yet proudly calling yourself stupid as the final argument in your favor... that's quite something... too bad you probably can't appreciate it, or leverage its immense learning potential...


          Blacklisted and reported.

          Not only do you lack logical thinking and the ability to accept reality, you went so far as to insult me.

          I provided a benchmark which proves beyond reasonable doubt that Apple M4 is the fastest and most efficient uArch on Earth at the moment.

          Also, your "counter arguments" are squarely invalid. NVIDIA used an inferior node (Samsung 8) for their Ampere architecture (the GeForce 30 series) and their GPUs were superior to AMD GPUs (the Radeon RX 6000 series) using a more advanced node (TSMC N7). The node itself doesn't give you an advantage if you don't know how to apply it.

          You know absolutely nothing about the semiconductor industry.
          Last edited by avis; 03 August 2024, 05:01 AM.

          Comment

          • tenchrio
            Senior Member
            • Sep 2022
            • 173

            #35
            Originally posted by avis

            Blacklisted and reported.

            Not only do you lack logical thinking and the ability to accept reality, you go so far as to insult me.

            I provided a benchmark which proves beyond reasonable doubt that Apple M4 is the fastest and most efficient uArch on Earth at the moment.
            Lol, typical Birdie, one of his claims is already debunked but he will keep repeating his "facts" and throw a hissy fit because people don't agree with him.
            Aside from everything I said before, you also do realize that the site you provided only accounts for laptop CPUs right?
            If you look at Geekbench's single-core and multi-core results, neither of them has the M4 at the top of its list.

            The M4 is power efficient but it definitely is not the fastest.

            Originally posted by avis
            Also, your "counter arguments" are squarely invalid. NVIDIA used an inferior node (Samsung 8) for their Ampere architecture (the GeForce 30 series) and their GPUs were superior to AMD GPUs (the Radeon RX 6000 series) using a more advanced node (TSMC N7). The node itself doesn't give you an advantage if you don't know how to apply it.
            Funny how you don't provide benchmarks, almost as if it would prove your claim to be false.
            And easily so. There is an entire meme about AMD hardware aging like fine wine, and RDNA2 was no exception. The 6800XT now easily beats the RTX 3080 in a multitude of games with recent drivers. That is ignoring the fact that the 6800XT had an MSRP of $649 against $699 for the RTX 3080, and that the 6800XT has about 1.5 billion fewer transistors and a smaller die. And of course the RX 6800XT already had much lower power consumption than the RTX 3080.
            image.png

            Originally posted by avis
            You know absolutely nothing about the semiconductor industry.
            Another Birdie classic, projection.

            Comment

            • chithanh
              Senior Member
              • Jul 2008
              • 2493

              #36
              Originally posted by milkylainen
              As the frontend and resources get beefier, the benefit of SMT usually increases, not the opposite?
              It depends on what you want to achieve.
              Intel had SMT in their small Atom CPUs like Diamondville, aimed at ultraportables, and in later cores aimed at smartphones as well. This was mainly to increase UI responsiveness, which is poor on single-core systems, and to allow better utilization of the early in-order designs.

              On the server side, SMT can help mask I/O latency in throughput oriented applications. Sun Niagara was the prime example, doing away with branch predictor and all that stuff which helps single thread performance, instead making the CPU switch threads in a single clock cycle.

              The downside is that every thread uses L2 cache, so the cache needs to be big. The same goes for memory bandwidth. Both are plentiful on servers but very limited on mobile.
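              The latency-masking idea can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Niagara's actual pipeline and not how SMT works on a modern out-of-order core: each hardware thread issues one instruction, then stalls for a fixed number of cycles (standing in for a memory or I/O wait), and the core can switch to any ready thread at no cost. The function name and parameters are invented for illustration.

```python
def simulate_utilization(n_threads: int, stall_cycles: int,
                         total_cycles: int = 10_000) -> float:
    """Fraction of cycles in which the toy core issues an instruction.

    Every thread repeats the same pattern: issue for one cycle, then
    wait stall_cycles before it is ready again. Each cycle the core
    picks the next ready thread in round-robin order, if there is one.
    """
    ready_at = [0] * n_threads  # first cycle at which each thread may issue
    busy_cycles = 0
    next_thread = 0             # round-robin starting point

    for cycle in range(total_cycles):
        for offset in range(n_threads):
            t = (next_thread + offset) % n_threads
            if ready_at[t] <= cycle:
                busy_cycles += 1
                ready_at[t] = cycle + 1 + stall_cycles
                next_thread = (t + 1) % n_threads
                break  # at most one instruction issued per cycle

    return busy_cycles / total_cycles


if __name__ == "__main__":
    for threads in (1, 2, 4, 8):
        util = simulate_utilization(threads, stall_cycles=3)
        print(f"{threads} thread(s): {util:.0%} of cycles busy")
```

              With a 3-cycle stall, one thread keeps this toy core 25% busy, two threads 50%, and four or more saturate it. The model ignores exactly the costs mentioned above: threads competing for cache capacity and memory bandwidth.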

              Originally posted by fitzie
              no one in risc-v or arm are going anywhere near smt. as jim keller says, there's better things to do with the transistor budget.
              Originally posted by pWe00Iri3e7Z9lHOX2Qx
              And ARM had the Cortex A65 with 2-way SMT.
              ARM also had the Neoverse E1 with SMT.
              They failed to generate much enthusiasm though.

              Comment

              • drakonas777
                Senior Member
                • Feb 2020
                • 532

                #37
                Originally posted by avis

                So, Apple indeed has the performance crown, right? As a user I couldn't care less about nodes and their availability, I go out and buy products.

                And Apple M4 leaves Zen5 and yet to be released Arrow Lake in the dust.

                Your verbal and logical exercises are futile.
                Perhaps you should take your medications, because your post is yet again a series of illogical statements detached from reality.

                First of all, your posted link does not contain any ZEN5 or Arrow Lake SKU, so there is no need to mention them, because at this point you are operating on speculation and not facts.

                Second of all, Geekbench is not an accurate representation of average performance across a wide variety of real-world applications. For example, Golden Cove's Geekbench ST advantage over ZEN3 was a lot bigger than the real-world ST average.

                Third of all, the M4 in the iPad is very power limited, so we don't know how much it will scale with a larger envelope, and we don't know how well ZEN5 and AL are going to scale either. So we come back to the first point: we lack a lot of details needed to draw reasonable conclusions.

                Fourth of all, if you are arguing that you don't care about lithography or anything else, just the product itself, then you should not use the term "x86 uarch", but ZEN4/Lunar Lake itself, because you are talking about specific implementations and not general x86-specific features. Apple tends to be more advanced than Intel/AMD in lithography and packaging at the same point in time.

                How far M4 ARM is really ahead of x86 in terms of absolute raw performance we will see when Michael tests M4 with Asahi versus ZEN6 on Linux. Spoiler alert: it won't be ahead, and ARM fanboys will cry a river here with "NoT oPTiMiZeD" comments.
                Last edited by drakonas777; 03 August 2024, 06:17 AM.

                Comment

                • qarium
                  Senior Member
                  • Nov 2008
                  • 3446

                  #38
                  This benchmark is highly misleading because it only shows the difference between hyperthreading on and hyperthreading off on the same hardware.

                  That's nonsense, because if you do not put in the transistors for hyperthreading you can add more cores, and you can add other features like AV2 decode/encode or a bigger NPU...

                  Also, it only shows the benefit and does not count the negative side: customers have to buy more RAM for the same job.

                  If people can choose between 2 notebook models with 32 GB of RAM, one a Qualcomm Snapdragon X Elite and the other an AMD Ryzen AI 300... well, with similar performance and similar power efficiency you are better off with the ARM model, because then 32 GB of RAM goes further than 32 GB of RAM with hyperthreading.
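                  The RAM argument can be written down as simple arithmetic. This is a sketch of the claim, not a measurement: it assumes a workload that starts one independent worker per hardware thread, each with a fixed working set (roughly the `make -j$(nproc)` pattern). The per-worker and base figures are hypothetical placeholders, and threads that share one address space will not scale like this.

```python
def ram_needed_gib(hw_threads: int, per_worker_gib: float,
                   base_gib: float = 4.0) -> float:
    """Estimate RAM for a one-worker-per-hardware-thread workload.

    base_gib covers the OS and shared data. per_worker_gib is the
    working set of each independent worker. Both values are
    hypothetical inputs chosen by the caller, not measured numbers.
    """
    return base_gib + hw_threads * per_worker_gib


if __name__ == "__main__":
    # Same hypothetical 1.5 GiB per worker, different thread counts.
    for label, threads in (("24 threads, no SMT", 24),
                           ("32 threads, 16 cores with SMT", 32)):
        print(f"{label}: {ram_needed_gib(threads, 1.5):.1f} GiB")
```

                  Under these assumptions the 32-thread machine wants 52 GiB against 40 GiB for the 24-thread one. Whether that matters in practice depends on whether the job is really one process per thread.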
                  Phantom circuit Sequence Reducer Dyslexia

                  Comment

                  • qarium
                    Senior Member
                    • Nov 2008
                    • 3446

                    #39
                    Originally posted by avis
                    Power consumption and temperatures are the same because you cannot power gate SMT, it's always on whether it's actually enabled (e.g. in BIOS or via CPU scheduler) or not.
                    The 35% performance uplift is nothing to sneeze at. I have my concerns about Arrow Lake that will feature "only" 24 cores (8P + 16E). 16 full fat Zen 5 cores (9950X) may actually turn out to be faster not to mention 9950X3D.
                    If money is no object, well... then OK.

                    Budget-minded people would consider the fact that 24 threads need less RAM than 32 threads...

                    Sadly, Intel's E-cores waste RAM as well..

                    Comment

                    • qarium
                      Senior Member
                      • Nov 2008
                      • 3446

                      #40
                      Originally posted by ddriver
                      Really? Better use than "more than 15% better performance for less than 1% power increase"? Because that's pretty good from an engineering perspective.
                      And it is a pretty good trade-off in tech. 1% up-cost for 15% improvement that is. People are used to stuff like "100% cost/power increase for 10% perf increase".
                      Intel is only removing it as part of their uarch design stupidity spree, their core is now so inefficient and bloated they can't afford to saturate the pipeline fully. I have no doubt that they saw far more diminished returns from it, but that's not a flaw of SMT, but an artifact of intel's bad cpu design.
                      What better thing did intel spend that transistor budget on? It appears to evade me. If anything, intel appears quite eager to needlessly invite all sorts of unwarranted design and manufacturing complexity for no good reason whatsoever, and despite the many issues woes and suffering that causes them, so I'd reckon they are removing SMT not as a part of some effort to optimize and streamline the architecture, but because it has gotten way too bad to even afford or merit full saturation. They are being forced to remove SMT basically, it is not really a matter of choice.
                      What riscv or arm cpu has managed to edge out amd by making such a better use of those transistors really?
                      If nobody's doing it because there's better things to do, I guess we should be seeing amd struggling and lagging behind for carrying that extra burden. And last time I checked, amd was actually the performance leader.
                      I think you are only focusing on the CPU chip itself... Look at the complete system, including RAM.

                      Hyperthreading systems need more or less double the RAM, and then all your cost efficiency is gone from the RAM costs alone.

                      Comment
