AVX-512 Performance Comparison: AMD Genoa vs. Intel Sapphire Rapids & Ice Lake
With last week's launch of Intel's 4th Gen Xeon Scalable "Sapphire Rapids" server processors, Intel heavily talked up the shiny new accelerators and the big performance potential of AMX. What was not really showcased, and only heard through the grapevine, was the improved AVX-512 implementation found with these new processors. With Sapphire Rapids there are reduced penalties from engaging AVX-512 (and for some AVX-512 instructions, no longer any measurable impact) compared to prior generation Xeon processors. This article looks at the performance for a wide variety of workloads with AVX-512 on/off, not just for Sapphire Rapids but also for prior generation Ice Lake as well as AMD's new 4th Gen EPYC "Genoa" processors, which introduce AVX-512 on EPYC for the first time.
Hearing of improved AVX-512 handling with Sapphire Rapids had me quite excited, considering the number of workloads out there right now able to leverage AVX-512 compared to the early state of software support for the new accelerators. Plus, with the new AMD Zen 4 AVX-512 implementation being quite efficient, I was eager to see how the change in performance would compare.
For today's article I ran benchmarks of the Intel Xeon Platinum 8490H (Sapphire Rapids), Intel Xeon Platinum 8380 (Ice Lake), and AMD EPYC 9654 (Genoa) processors, all in dual-socket (two CPU) configurations, as the flagship models of each generation. From there I ran a wide variety of (mostly real-world) benchmarks with AVX-512 support enabled and then disabled/compiled-out.
Aside from servers featuring an AVX-512 toggle in the BIOS, AVX-512 usage can also be manipulated using "clearcpuid=304" as a Linux kernel boot option. That clears the AVX-512 flags so they are neither used by the kernel nor exposed in the /proc/cpuinfo output that applications parse to determine which CPU features to enable. Additionally, the applications built from source were compiled with/without the AVX-512 options.
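As a rough illustration of the "clearcpuid=304" approach, here is a minimal Python sketch for checking whether AVX-512 flags are still exposed after rebooting with that option. The value 304 matches the kernel's feature number for AVX512F, which the kernel's cpufeatures header defines as word 9, bit 16 (9 * 32 + 16). The function names `avx512_flags` and `feature_number` are made up for this example and are not part of any tool used in this article.

```python
from pathlib import Path
from typing import List


def feature_number(word: int, bit: int) -> int:
    """Kernel-style feature number: 32 feature bits per capability word."""
    return word * 32 + bit


def avx512_flags(cpuinfo_text: str) -> List[str]:
    """Return the sorted AVX-512 flags from the first 'flags' line found.

    Expects text in the x86 /proc/cpuinfo layout ("flags : fpu vme ...").
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return sorted(f for f in value.split() if f.startswith("avx512"))
    return []


if __name__ == "__main__":
    # AVX512F is word 9, bit 16 in the kernel's feature numbering.
    print("clearcpuid value for AVX512F:", feature_number(9, 16))

    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        flags = avx512_flags(cpuinfo.read_text())
        if flags:
            print("AVX-512 flags exposed:", " ".join(flags))
        else:
            print("No AVX-512 flags exposed")
```

On a system booted with the option in place, the expectation is that the script reports no AVX-512 flags, while a default boot on AVX-512 hardware lists entries such as avx512f.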
In addition to looking at the raw performance across Ice Lake / Sapphire Rapids / Genoa with AVX-512 on/off, the CPU core temperatures, combined CPU power consumption, and CPU peak frequency (the highest observed every second across any of the CPU cores) were recorded as complementary data metrics for each benchmark.
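For readers curious how a "peak frequency" metric like this can be gathered, below is a minimal Python sketch. It is not the monitoring code used for this article: it assumes the Linux cpufreq sysfs interface is available and reads each core's `scaling_cur_freq` value (reported in kHz), taking the highest reading across all cores once per interval. The function names are hypothetical.

```python
import glob
import time

# Assumed sysfs location of per-core frequency readings (values in kHz).
FREQ_GLOB = "/sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq"


def read_core_freqs_khz(pattern=FREQ_GLOB):
    """Read the current frequency of every core that exposes cpufreq data."""
    freqs = []
    for path in glob.glob(pattern):
        try:
            with open(path) as fh:
                freqs.append(int(fh.read().strip()))
        except (OSError, ValueError):
            continue  # core offline or value unreadable; skip it
    return freqs


def peak_mhz(freqs_khz):
    """Highest frequency across all cores in MHz, or None with no readings."""
    return max(freqs_khz) / 1000.0 if freqs_khz else None


def sample_peaks(samples=5, interval_s=1.0,
                 reader=read_core_freqs_khz, sleep=time.sleep):
    """Record the per-interval peak frequency `samples` times."""
    peaks = []
    for _ in range(samples):
        peaks.append(peak_mhz(reader()))
        sleep(interval_s)
    return peaks
```

Calling `sample_peaks(samples=10)` while a benchmark runs would give ten one-second peak readings in MHz. The `reader` and `sleep` parameters exist only so the sampling loop can be exercised without real hardware.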
These AVX-512 results are quite interesting so let's jump straight to the data.