64K Kernel Page Size Performance Benefits For HPC Shown With NVIDIA's GH200 Grace CPU

Written by Michael Larabel in Software on 27 February 2024 at 12:00 PM EST.

By default, the AArch64 kernels on Ubuntu and other Linux distributions tend to use a standard 4K page size, but for newer AArch64 hardware, especially in the server/HPC space, there can be great benefits to using a 64K page size. As it's been a while since I last ran any 64-bit ARM 4K vs. 64K kernel page size benchmarks, and while I had remote access to the NVIDIA GH200, I ran a fresh comparison looking at the performance advantages of switching over to a 64K page size kernel. These new 64K kernel numbers are shown alongside the recent AMD EPYC and Intel Xeon CPU reference benchmark results for a look at how the 4K vs. 64K page size affects the overall computing landscape.

Ubuntu, Red Hat Enterprise Linux, and other AArch64-minded distributions tend to default to a 4K page size kernel for AArch64, but some also offer a 64K page size kernel build. Ubuntu does, as does the Ubuntu Mainline Kernel PPA, in catering to today's larger AArch64 servers.

64K kernel uname -a
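Besides checking the kernel string, the page size of the running kernel can be confirmed from user space. The following is a minimal Python sketch, not something taken from the benchmark setup; the helper name `kernel_page_size` is made up for illustration. It relies on the standard `os.sysconf` call, which is available on Linux and other Unix-like systems.

```python
import os


def kernel_page_size():
    """Return the page size of the running kernel in bytes.

    Hypothetical helper for illustration; it only wraps os.sysconf.
    """
    return os.sysconf("SC_PAGE_SIZE")


page_size = kernel_page_size()
print(f"Kernel page size: {page_size} bytes ({page_size // 1024}K)")
```

On a 64K page size kernel this should report 65536 bytes, while a typical 4K kernel reports 4096.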

A 64K page size will typically benefit high performance computing (HPC) workloads dealing with large amounts of memory. Going from a 4K to a 64K page size can lead to higher TLB hit rates, fewer page faults, and all-around better memory efficiency. NVIDIA endorses using a 64K page size kernel for their Grace Hopper Superchip, and it's becoming more common in the AArch64 space. More Linux drivers are also improving compatibility with the 64K page size, alongside work on file-system differences and other kernel code that assumes the typical 4K page size. Using a 64K page size can also lead to higher RAM use, albeit that is less of a problem in the server space.
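To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It shows how many pages are needed to map a given working set, how much memory a fixed number of TLB entries can cover, and how a small mapping gets rounded up to a whole page. The TLB entry count is a made-up illustrative figure, not a Grace CPU specification, and the function names are hypothetical.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3

# Assumption for illustration only; NOT an actual Grace CPU TLB size.
HYPOTHETICAL_TLB_ENTRIES = 1024


def pages_needed(nbytes, page_size):
    """Number of pages required to map nbytes, rounded up."""
    return -(-nbytes // page_size)  # ceiling division


def tlb_reach(entries, page_size):
    """Bytes of memory covered by the given number of TLB entries."""
    return entries * page_size


def rounded_size(nbytes, page_size):
    """Memory actually consumed when nbytes is rounded up to whole pages."""
    return pages_needed(nbytes, page_size) * page_size


for page_size in (4 * KIB, 64 * KIB):
    label = f"{page_size // KIB}K"
    print(f"{label} pages:")
    print(f"  pages to map 1 GiB : {pages_needed(GIB, page_size)}")
    reach = tlb_reach(HYPOTHETICAL_TLB_ENTRIES, page_size)
    print(f"  TLB reach          : {reach // MIB} MiB")
    print(f"  5000-byte mapping  : {rounded_size(5000, page_size)} bytes used")
```

With these assumed numbers, the 64K configuration needs 16 times fewer pages and covers 16 times more memory per TLB entry, while a tiny mapping occupies a full 64K page instead of two 4K pages. That is the source of both the performance upside and the higher RAM use.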

This article is mainly to provide some fresh reference figures for AArch64 64K vs. 4K page size performance on a modern kernel, testing various CPU workloads on the GPTshop.ai GH200 AI workstation. These benchmarks build on the recent NVIDIA GH200 vs. Intel Xeon / AMD EPYC CPU benchmarks.

GPTshop.ai NVIDIA GH200 Grace CPU Linux Benchmarks

The new data in this article includes "GPTshop.ai GH200 + Linux 6.8", a run of the GH200 with the Linux 6.8 Git kernel rather than the default Linux 6.5 kernel of Ubuntu 23.10. With Ubuntu 24.04 LTS aiming to ship with Linux 6.8, I included this run as a reference for any performance benefits in going from Ubuntu 23.10's Linux 6.5 to 6.8. The second new run in these benchmark results is "GPTshop.ai GH200 + Linux 6.8 64k" for the 64K page size kernel build. Both Linux 6.8 AArch64 kernel builds were obtained from the Ubuntu Mainline Kernel PPA for easy reproducibility.
