Amazon Talks Up Big Performance Gains For Their 7nm Graviton2 CPUs


  • #11
    BTW, these performance comparisons were actually per-vCPU (i.e. ARM core or x86 hyperthread), relative to their Intel Xeon Platinum 8175 instances -- not the current-gen Gravitons, as I had assumed.

    Amazon's estimates put their Graviton2 CPU cores at offering around 43% better SPECjvm performance, 44% better for SPEC CPU, about 24% faster for Nginx, 43% higher for Memcached, and about a 26% improvement with H.264 video encoding.
    Even though it's comparing hyperthreads to cores, that should still put some things in perspective!

    The 24-core, 48-thread Intel CPU clocks at 2.5/3.1 GHz (base/turbo). Because the comparison is per-vCPU (presumably on a multi-threaded workload) and the Graviton2 exposes 64 vCPUs to the Xeon instance's 48, scale up those numbers by about 33% (64/48) to see the aggregate performance of a single-CPU instance relative to the Xeon.
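    A quick back-of-the-envelope sketch of that scale-up (the 64-vCPU Graviton2 count and the benchmark-name labels are assumptions for illustration; the per-vCPU ratios are Amazon's published figures from above):

    ```python
    # Per-vCPU speedups Amazon quoted for Graviton2 vs. Xeon Platinum 8175.
    per_vcpu_speedup = {
        "SPECjvm": 1.43,
        "SPEC CPU": 1.44,
        "Nginx": 1.24,
        "Memcached": 1.43,
        "H.264 encode": 1.26,
    }

    XEON_VCPUS = 48       # 24 cores x 2 hyperthreads
    GRAVITON2_VCPUS = 64  # 64 cores, no SMT (assumed per-core vCPUs)

    # Ratios are per-vCPU, so multiply by 64/48 (~1.33) to estimate
    # whole-instance aggregate performance relative to the Xeon instance.
    scale = GRAVITON2_VCPUS / XEON_VCPUS
    for bench, ratio in per_vcpu_speedup.items():
        print(f"{bench}: {ratio:.2f}x per vCPU -> ~{ratio * scale:.2f}x per instance")
    ```

    So e.g. the 43% SPECjvm per-vCPU advantage works out to roughly 1.9x at the instance level, under those assumptions.
    
    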
    Last edited by coder; 03 December 2019, 08:14 PM.

    Comment


    • #12
      Originally posted by coder View Post
      I think it's a given that there was a business case for it. I'm just saying that the business case must've been more than simply "to put pressure on their suppliers". These chips would have to deliver cost savings or competitive advantage, in their own right.
      Ah, yeah, agreed. Their characteristics certainly make business sense on their own.

      Comment


      • #13
        Originally posted by _Alex_ View Post

        To wake up x86 manufacturers to the fact that there are better alternatives in terms of cost and perf/watt. If they start losing clients, maybe they'll start making better CPUs.
        Except they don't have better performance per watt. There is not a single ARM CPU out there that has better performance per watt. This is to say nothing about clock speed scaling or SMT. ARM is only great at low power scenarios. Until this changes, it will continue to take a backseat to x86 and other architectures.

        Comment


        • #14
          Originally posted by coder View Post
          BTW, these performance comparisons were actually per-vCPU (i.e. ARM core or x86 hyperthread), relative to their Intel Xeon Platinum 8175 instances -- not the current-gen Gravitons, as I had assumed.



          Even though it's comparing hyperthreads to cores, that should still put some things in perspective!

          The 24-core, 48-thread Intel CPU clocks at 2.5/3.1 GHz (base/turbo). Because it's per-vCPU (on presumably a multi-core workload), scale up those numbers by 33% to see the aggregate performance of a single-CPU instance relative to the Xeon.
          Xeon CPUs are actually relatively low end these days. EPYC Rome CPUs have up to 64 cores/128 threads (with higher IPC), a base/boost of 2.25/3.4 GHz, 128 lanes of PCIe 4.0, and the ability to have up to 2 CPUs in one system (128 cores/256 threads), all packed into a 225-watt TDP. Not trying to make this read like an AMD advertisement, but rather to point out that x86 is still way ahead of the pack in terms of computing power. Note that EPYC CPUs can be packed into a 1U chassis. That's why AMD holds at least 11 world records, and its chips are in many next-gen supercomputers. Note that you won't find ARM anywhere in there. I'll be happy when ARM catches up, but it has always been about 3-5 years away from being a good replacement for anything.

          Comment


          • #15
            Originally posted by milkylainen View Post
            Also. Why would Amazon do this?
            There are certain workloads where raw CPU is not a/the bottleneck. A slower, cheaper instance can have pricing benefits for such use cases, and ARM may be an appropriate fit where other solutions (such as serverless via Lambda) will not work (or will take too long to transition to).

            And as to why, this is always about money. AWS believes they see a way to make money on this.

            Comment


            • #16
              Originally posted by betam4x View Post

              Except they don't have better performance per watt. There is not a single ARM CPU out there that has better performance per watt. This is to say nothing about clock speed scaling or SMT. ARM is only great at low power scenarios. Until this changes, it will continue to take a backseat to x86 and other architectures.
              This is correct. As soon as they started trying to scale performance up to normal server loads and tasks (e.g. Nginx, PostgreSQL, etc.), the performance for ARM CPUs just wasn't there for the same amount of power input. This is one reason why ARM(64) in the server rack has yet to catch on. I'm sure ARM and their clients are working on the problem, but for now the numbers only work out for light workloads done at small scale for specific use cases. That's ARM's typical bread and butter, so it shouldn't be any surprise.

              Meanwhile, Intel/AMD/VIA/others have been working on their CPU power-vs.-performance needs for just as long, and there's considerable pressure from mega-partners like Microsoft, Amazon, Google, etc. to keep power consumption as low as they can manage while keeping performance on the up curve. The goalposts on performance per watt move with each iteration, whether it comes from Intel or AMD.

              I don't believe ARM Ltd (and by extension Qualcomm, Broadcom, etc) is going to be any real threat to Intel (corporation) in the server room any time soon. AMD on the other hand...

              Comment


              • #17
                Originally posted by betam4x View Post
                Xeon CPUs are actually relatively low end these days. EPYC Rome CPUs have up to 64 cores/128 threads (with higher IPC), a base/boost of 2.25/3.4 GHz, 128 lanes of PCIe 4.0, and the ability to have up to 2 CPUs in one system (128 cores/256 threads), all packed into a 225-watt TDP.
                Yeah, I know all of that. The point was just to show that the new CPU significantly outperforms their current Xeon instance, so as to give an idea of how serious this is.

                It doesn't need to outperform Epyc in order to be competitive. Simply offering a lower purchase price & better perf/W could be enough. We might never know about perf/W, but we can certainly benchmark it against Epyc when the Graviton2 instances come online. However, due to the lack of SMT, I doubt it will be able to provide comparable performance.

                There's also a case for density. AMD had to make Epyc's socket enormous, partly to support the 2-CPU configuration you mention. Amazon's marketing dept might've determined that not enough customers want > 64 vCPU instances for them to bother with that, so they can save on socket size & complexity, not to mention the decreased energy efficiency of multi-CPU configs. Plus, they can surface-mount the suckers, which would also help. For anyone who does want more vCPUs, it's not like Amazon won't still have Epyc instances available... for a price.
                Last edited by coder; 03 December 2019, 11:20 PM.

                Comment


                • #18
                  Originally posted by betam4x View Post
                  Except they don't have better performance per watt. There is not a single ARM CPU out there that has better performance per watt.
                  Also, the Earth is not round. There is not a single shred of evidence that the Earth is not flat.

                  Comment


                  • #19
                    Originally posted by coder View Post
                    However, the intrinsic perf/W advantages of ARM vs. x86 have been long- and well- established.
                    I agree on most things, but not necessarily this one.
                    Almost all modern high-end microarchitectures are post-RISC/macro-op/VLIW-whatever on the inside.
                    Instruction decode in a modern microarchitecture is almost a rounding error in both the performance and power envelope.

                    Modern performance roughly translates to spent transistors / spent power / fabrication process, regardless of ISA.
                    Spending in each category translates to characteristics that are comparable between equal-sized CPUs with the same power budget, built on an equal fabrication process.
                    The microarchitecture teams building these CPUs make deliberate tradeoffs for a specific target. So while differences do exist, there is no magic sauce to it.

                    So while, yes, an ARM CPU can be more power efficient than an x86 CPU for certain tasks, it would take a beating in other categories:
                    single-threaded performance, housing density, etc.

                    Traditionally, ARM has held the low end and x86 the higher end.
                    Now x86 has been working on power efficiency and ARM on beefier cores.

                    I would, without hesitation, say that if you want a lot of transistors doing efficient work on as variable a workload as possible for as little money as possible, you go x86.
                    If you want battery powered or some other specialty, well that is another question.

                    I don't think this will actually outperform contemporary x86 CPUs on the same power budget. Similar? Yes, perhaps. But at what cost?
                    Edit: Comparing an unreleased CPU to an almost 3-year-old microarchitecture doesn't say much about how it will perform against contemporary CPUs when released.
                    Last edited by milkylainen; 04 December 2019, 01:59 AM.

                    Comment


                    • #20
                      Originally posted by CommunityMember View Post
                      There are certain workloads where raw CPU is not a/the bottleneck. A slower, cheaper instance can have pricing benefits for such use cases, and ARM may be an appropriate fit where other solutions (such as serverless via Lambda) will not work (or will take too long to transition to).
                      Not just certain workloads, but the vast majority of internal business server workloads. Most business hypervisor environments run out of memory long before they max out the CPUs. TB of RAM per socket has been the limiting factor for nearly all my clients. Quite frankly, the CPU doesn't much matter anymore these days. I have customers still running on Dell R815 servers (AMD Opteron 6200 series), and even with dozens of VMs, they aren't exceeding 15% CPU utilization.

                      Things like scientific workloads, rendering, or audio/video processing all work way better on bare metal, or if virtualized, require very fast CPUs. But most businesses don't run those workloads. They run things like web applications, databases, email, file servers, etc., which run great on even low-end hardware.

                      Originally posted by CommunityMember View Post
                      And as to why, this is always about money. AWS believes they see a way to make money on this.
                      ^ yup. This is *the* reason.
                      Last edited by torsionbar28; 04 December 2019, 02:01 AM.

                      Comment
