AMD Launches EPYC 9004 "Genoa" Processors - Up To 96 Cores, AVX-512, Incredible Performance
Following September's successful launch of the AMD Ryzen 7000 series "Zen 4" desktop processors, today AMD is lifting the embargo on its EPYC 9004 series "Genoa" server processors. EPYC Genoa moves AMD server processors to the new SP5 socket and brings up to 96 cores / 192 threads per socket, AVX-512 with Zen 4, twelve channels of DDR5 system memory, and much more. Combined, these changes put AMD and the industry at new levels of HPC performance. I've been benchmarking the AMD EPYC Genoa processors over the past few weeks with astounding results. This article looks at the feature set and platform for Genoa, while my initial AMD EPYC 9554 / EPYC 9654 Linux review and benchmarks are published separately.
In addition to Zen 4 bringing AVX-512, the updated Zen microarchitecture also provides a larger op cache, a larger register file, a larger L2 cache, and more. As on the Ryzen 7000 desktop side, AVX-512 with Genoa is implemented using a 256-bit data path with the "double pumping" approach, which has proven to be remarkably efficient. As I've shown in the AVX-512 Zen 4 desktop benchmarks, this implementation has worked out surprisingly well, and from the tests I've done on Genoa locally, the AVX-512 wins are equally compelling on the server side. They are also more impactful there, given the greater number of code-bases with AVX-512 support on the server/HPC side.
Across a range of 33 server workloads, AMD is reporting about a 14% uplift at fixed frequency and in an 8+1 CCD configuration over prior generation Milan processors. The bulk of that uplift is attributed to front-end improvements along with load/store enhancements, improved branch prediction, an enhanced execution engine, and the larger L2 cache. AMD EPYC Genoa uses 5nm chiplets alongside a 6nm I/O die.
The instruction support for Genoa is the same as with the AMD Ryzen 7000 series desktop CPUs, including BFloat16 and Vector Neural Network Instructions (VNNI) in addition to standard AVX-512. Genoa also adds support for 5-level page tables, L3 cache range reservation, quality of service for storage class memory, X2AVIC, SMT protection for guest VMs, AES-256-XTS for Secure Memory Encryption, and Automatic IBRS as an enhancement on the speculative execution side.
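For those wanting to confirm what a given Linux system actually exposes, here is a minimal sketch that checks the "flags" line of /proc/cpuinfo for a few of the features mentioned above. The flag names used (avx512f, avx512_vnni, avx512_bf16, la57) are the ones the Linux kernel reports on x86_64; the helper names parse_cpu_flags and feature_report are just illustrative, and the selection of flags is my own assumption rather than anything from AMD.

```python
from __future__ import annotations

# Kernel flag name -> human-readable feature (an illustrative subset only).
FEATURES = {
    "avx512f": "AVX-512 Foundation",
    "avx512_vnni": "AVX-512 VNNI",
    "avx512_bf16": "AVX-512 BFloat16",
    "la57": "5-level page tables",
}


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the set of CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            return set(value.split())
    return set()


def feature_report(flags: set[str]) -> dict[str, bool]:
    """Map each flag in FEATURES to whether it is present in 'flags'."""
    return {flag: flag in flags for flag in FEATURES}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            flags = parse_cpu_flags(f.read())
    except OSError:
        flags = set()  # Not Linux, or /proc is unavailable.
    for flag, present in feature_report(flags).items():
        print(f"{FEATURES[flag]:<22} {'yes' if present else 'no'}")
```

Note that a flag showing up here only means the kernel sees the CPU capability; for example, la57 being listed does not by itself mean 5-level paging is in use on that boot.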
With prior generation AMD EPYC Milan(X) and Rome processors, there was support for eight memory channels at DDR4-3200. Now with 4th Gen EPYC that's up to twelve memory channels with DDR5-4800 support. Twelve memory channels per socket of DDR5-4800 is great news for those with memory bandwidth intensive server workloads, while there is still the ability to run with 2, 4, 6, 8, or even 10 memory channels.
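To put the memory change in numbers, here is a rough back-of-the-envelope calculation of theoretical peak bandwidth per socket. It assumes 64 data bits (8 bytes) per memory channel and ignores ECC bits and real-world efficiency, so treat these as upper bounds rather than measured figures; the function name peak_bandwidth_gbs is just for illustration.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (decimal units).

    Assumes each channel moves 'bus_bytes' bytes per transfer (64 data bits by default).
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


genoa = peak_bandwidth_gbs(12, 4800)  # 12 channels of DDR5-4800
milan = peak_bandwidth_gbs(8, 3200)   # 8 channels of DDR4-3200

print(f"Genoa: {genoa:.1f} GB/s")     # Genoa: 460.8 GB/s
print(f"Milan: {milan:.1f} GB/s")     # Milan: 204.8 GB/s
print(f"Ratio: {genoa / milan:.2f}x")  # Ratio: 2.25x

# Peak figures for the reduced channel counts Genoa can also run with.
for n in (2, 4, 6, 8, 10, 12):
    print(f"{n:>2} channels: {peak_bandwidth_gbs(n, 4800):.1f} GB/s")
```

Under those assumptions, a fully populated Genoa socket has 2.25x the theoretical peak memory bandwidth of a fully populated Milan or Rome socket, with 1.5x coming from the extra channels and 1.5x from the higher transfer rate.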
Also exciting with AMD EPYC Genoa is support for Compute Express Link (CXL) 1.1. AMD is promoting this as CXL 1.1+ as it does include features from CXL 2.0 around Type-3 memory support for system memory expansion.