Amazon Graviton3 vs. Intel Xeon vs. AMD EPYC Performance

  • PerformanceExpert
    replied
    Originally posted by HEL88
    LOL, so the x86-64 ISA came out in 2003 and it's modern too.
    No, x86-64 isn't a modern ISA like AArch64. Practically everything from x86 was kept as is - only a few opcodes were removed to be used as prefixes. It would have been a great opportunity to make major changes and remove a lot of ancient stuff, but that didn't happen unfortunately.


  • PerformanceExpert
    replied
    Originally posted by smitty3268
    I'm not sure how useful it really is to compare processors with the same threadcount when some have SMT and some don't.

    Ultimately I think performance per $ is what primarily matters for the cloud.

    But if you were attempting to compare the CPU architectures, then I think getting the same # of physical cores makes more sense, because the extra SMT "cores" are just a side-benefit of the architecture, the same way that the looser memory model is a benefit of the ARM architecture.
    Just because there are 16 threads in the chosen instances doesn't mean they are all being used in every benchmark. Several benchmarks are single-threaded (though this isn't clear from any of the results). There is a reason many people look at rate-1 and rate-N SPEC results: this also avoids the differences in CPU-specific optimizations in many Phoronix benchmarks.

    The same number of cores doesn't give a good comparison either; you'd have to look at performance per area (and power) on the same process. Consider, for example, that the Gravitons have less than half the L2/L3 cache of the EPYC instances.

    But yes, for customers the only thing that ultimately matters is perf/$, and that's exactly why Graviton is getting so popular.
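
A minimal sketch of the perf/$ comparison mentioned above. The instance names, scores, and hourly prices are hypothetical placeholders, not measurements from the article, and `perf_per_dollar` is an illustrative helper rather than anything the posters used.

```python
def perf_per_dollar(score: float, hourly_price_usd: float) -> float:
    """Benchmark score delivered per dollar of instance time (higher is better)."""
    if hourly_price_usd <= 0:
        raise ValueError("hourly price must be positive")
    return score / hourly_price_usd


# Hypothetical scores and prices, NOT taken from the benchmarks in the article.
instances = {
    "instance_a": {"score": 100.0, "hourly_price_usd": 0.50},
    "instance_b": {"score": 110.0, "hourly_price_usd": 0.80},
}

# Rank instances from best to worst value for money.
ranking = sorted(
    instances,
    key=lambda name: perf_per_dollar(**instances[name]),
    reverse=True,
)
```

With these placeholder numbers the slower but cheaper instance ranks first, which is the situation the post describes: raw speed matters less to a cloud customer than the score obtained per dollar.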


  • HEL88
    replied
    Originally posted by PerformanceExpert
    AArch64 came a decade later.
    LOL, so the x86-64 ISA came out in 2003 and it's modern too.

    "SPARC is ancient"
    The last SPARC came out in 2017, so it's not so ancient.

    Yes, but now it's dead.

    "Recent POWER CPUs are 8-way SMT to reduce per-core software licensing cost"
    Yes, if you IMPROVE performance without adding new cores, you save money on software licensing.
    Last edited by HEL88; 28 May 2022, 09:01 AM.


  • PerformanceExpert
    replied
    Originally posted by AdrianBc

    The RK3588 boards will be much faster than any *cheap* (i.e. under $300 for a complete computer) ARM boards that have ever been available until now.

    Nevertheless the Cortex-A76 cores from RK3588 are nowhere near the performance of the Neoverse V1 cores of Graviton 3 or of any of the Zen cores or of the big Intel cores.

    The Cortex-A76 cores of RK3588 have about the same speed as the Tremont cores of the Intel Jasper Lake and Elkhart Lake CPUs, which are also available in very cheap computers, even if not as cheap as the cheapest of those based on RK3588 (where some smaller boards with 8 GB DRAM should be less than $150).
    No, Cortex-A76 is basically the same core as used in Graviton 2 and Ampere Altra (Max). It achieves ~91% of the single-threaded SPECint2017 score of the EPYC 7763. So it is a pretty quick core despite being 4 years old... The problem with most of these boards is that they use cost-optimized phone CPUs with very little cache and a slow memory system.


  • HEL88
    replied
    Originally posted by mdedetrich

    I was talking about the ARM ISA specifically. And it's not that it's impossible, it's that it's a lot less necessary.
    You wrote:
    has different word sizes for its instructions.

    What does variable instruction length have to do with hyperthreading? Please tell me.

    SMT by definition is solving a problem of pipelining instructions, which is an issue that ARM doesn't really have (and it's why even the most high-powered ARM CPUs don't have SMT; they don't need it).
    Pipeline utilization depends on the program, not the ISA. If a program has many data dependencies or accesses memory randomly, pipeline utilization will be low. A good branch predictor and a big ROB only help slightly with these problems.

    So if a program accesses RAM in a random pattern, for example, and the pipeline is mostly stalled, you can put another thread into the pipeline without performance degradation. This is completely independent of the ISA.
    Last edited by HEL88; 28 May 2022, 08:56 AM.
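
The argument in the post above, that SMT pays off when a thread leaves the pipeline mostly stalled regardless of the ISA, can be illustrated with a toy model. This is a sketch under strong assumptions that are not from the thread: each hardware thread is stalled in any given cycle with the same independent probability, and the core does useful work whenever at least one thread is ready. The function names are hypothetical.

```python
def core_utilization(stall_fraction: float, threads: int) -> float:
    """Fraction of cycles in which at least one hardware thread can issue.

    Toy assumption: every thread is stalled independently with probability
    `stall_fraction`, so the core idles only when all threads stall at once.
    """
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return 1.0 - stall_fraction ** threads


def smt_speedup(stall_fraction: float, threads: int) -> float:
    """Throughput with `threads` hardware threads relative to a single thread."""
    single = core_utilization(stall_fraction, 1)
    if single == 0.0:
        raise ValueError("a thread that is always stalled makes no progress")
    return core_utilization(stall_fraction, threads) / single


# A memory-bound workload (stalled 80% of the time) gains far more from a
# second thread than a compute-bound one (stalled 10% of the time).
memory_bound_gain = smt_speedup(0.8, 2)
compute_bound_gain = smt_speedup(0.1, 2)
```

In this model the benefit depends only on how often the workload stalls, not on the instruction encoding. Real cores share caches, issue width, and other resources between threads, so actual gains are smaller than this idealized estimate.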


  • PerformanceExpert
    replied
    Originally posted by HEL88

    Why do RISC CPUs like IBM POWER have 4-way and even 8-way HT (8 threads per core)? SPARC also has 8-way HT.

    The RISC ISAs of POWER and SPARC are newer than ARM's ISA.

    So what you wrote is not true.
    SPARC is ancient (and long dead), POWER is more modern, but AArch64 came a decade later.

    Recent POWER CPUs are 8-way SMT to reduce per-core software licensing cost (they basically slap two 4-way SMT cores together...). Various Arm CPUs have SMT, but they weren't very successful. It turns out that you get more performance by adding extra cores than by adding SMT...


  • mdedetrich
    replied
    Originally posted by HEL88

    Why do RISC CPUs like IBM POWER have 4-way and even 8-way HT (8 threads per core)? SPARC also has 8-way HT.

    The RISC ISAs of POWER and SPARC are newer than ARM's ISA.

    So what you wrote is not true.
    I was talking about the ARM ISA specifically. And it's not that it's impossible, it's that it's a lot less necessary. SMT by definition is solving a problem of pipelining instructions, which is an issue that ARM doesn't really have (and it's why even the most high-powered ARM CPUs don't have SMT; they don't need it).


  • HEL88
    replied
    Originally posted by mdedetrich

    Well, ARM CPUs don't tend to have hyperthreading because hyperthreading is mainly a result of trying to squeeze more performance out of an older CISC-style ISA that has different word sizes for its instructions. The ARM ISA doesn't have this issue, so AFAIK there isn't any modern ARM CPU that has hyperthreading/SMT.
    Why do RISC CPUs like IBM POWER have 4-way and even 8-way HT (8 threads per core)? SPARC also has 8-way HT.

    The RISC ISAs of POWER and SPARC are newer than ARM's ISA.

    So what you wrote is not true.


  • AdrianBc
    replied
    Originally posted by tambre
    NVIDIA's Jetson Orin might be the best option once they release the cheaper variants in the autumn. Of course they aren't very open source friendly.
    The Jetson Orin AGX is extremely overpriced, which was also true of the previous generation, Jetson Xavier AGX.

    On the other hand, the development kit for Jetson Xavier NX had a much more acceptable price in comparison with alternatives. I have one and I am content with it, even if the NVIDIA Carmel cores are rather slow and its Volta GPU is not really faster than the GPU of an Intel NUC of the same price, where the CPU is much faster.

    The cheapest of the Jetson Orin NX modules has kept the same price as Xavier NX, even though the slow Carmel cores have been replaced with fast Cortex-A78AE cores (making the 6-core CPU part comparable to a 4-core Skylake CPU) and the GPU has been replaced by a much faster Ampere GPU. So it can be hoped that the development kit for Orin NX will also have about the same price as before.

    If that turns out to be true, then a Jetson Orin NX will be a good buy. However, it would be in a different price class than an SBC with the Rockchip RK3588.

    While the latter has similar performance and slightly lower prices than Intel Jasper Lake, a Jetson Orin NX will have a price similar to an Intel NUC with Tiger Lake or Alder Lake. A NUC will have a faster CPU and a slower GPU. The Intel GPU is not much slower, having 75% of the ALU count of Orin NX, but the main advantage of NVIDIA Orin remains the software support for NVIDIA CUDA.

    One important advantage of NVIDIA Orin is that, if it continues to have documentation as good as NVIDIA Xavier has now, the documentation needed for improving the Linux support will be much better and much more complete than the documentation for any other CPU with ARM cores of comparable speed.

    Better documentation than for NVIDIA Xavier exists only for the small and slow microcontrollers with ARM cores (from NXP, ST, Microchip, TI etc.).

    NVIDIA Xavier is very open-source friendly. Now even the GPU driver is open source (even if it was not in the beginning), so it can be hoped that this will continue to be true for NVIDIA Orin.

    Last edited by AdrianBc; 27 May 2022, 06:42 AM.


  • AdrianBc
    replied
    Originally posted by jjmcwill2003
    Interesting to see how far their Arm architecture has come in terms of performance.

    It's too bad there's nothing comparable in the desktop market. (Honeycomb LX seems comparatively ancient and lacks single thread performance).

    Maybe one of the upcoming RockChip RK3588 based boards will turn out to be interesting, depending on how quickly Linux support arrives.

    The RK3588 boards will be much faster than any *cheap* (i.e. under $300 for a complete computer) ARM boards that have ever been available until now.

    Nevertheless the Cortex-A76 cores from RK3588 are nowhere near the performance of the Neoverse V1 cores of Graviton 3 or of any of the Zen cores or of the big Intel cores.

    The Cortex-A76 cores of RK3588 have about the same speed as the Tremont cores of the Intel Jasper Lake and Elkhart Lake CPUs, which are also available in very cheap computers, even if not as cheap as the cheapest of those based on RK3588 (where some smaller boards with 8 GB DRAM should be less than $150).
