The "What If" Performance Cost To Kernel Page Table Isolation On AMD CPUs
Made public this week by CPU security researchers at Graz University of Technology and the CISPA Helmholtz Center for Information Security was the research paper "AMD Prefetch Attacks through Power and Time". The paper shows AMD CPUs leaking information through a side channel based on timing and power variations of the PREFETCH instruction, and it argues that AMD CPUs should activate stronger page table isolation by default. AMD has now published its security response, in which it is not recommending any mitigation changes at this time. But what if Kernel Page Table Isolation (KPTI/PTI) proves necessary for AMD CPUs? Here are some initial benchmarks showing what that performance impact could look like.
The AMD PREFETCH attack whitepaper can be read here (PDF). The researchers wrote, "We discover timing and power variations of the prefetch instruction that can be observed from unprivileged user space. In contrast to previous work on prefetch attacks on Intel, we show that the prefetch instruction on AMD leaks even more information. We demonstrate the significance of this side channel with multiple case studies in real-world scenarios. We demonstrate the first microarchitectural break of (fine-grained) KASLR on AMD CPUs. We monitor kernel activity, e.g., if audio is played over Bluetooth, and establish a covert channel. Finally, we even leak kernel memory with 52.85 B/s with simple Spectre gadgets in the Linux kernel. We show that stronger page table isolation should be activated on AMD CPUs by default to mitigate our presented attacks successfully."
This paper was accepted for the USENIX Security 2022 conference. The Graz University of Technology researchers were previously involved in other CPU microarchitecture security vulnerability discoveries, such as Meltdown, PLATYPUS, LVI, and others.
A pre-pandemic trip to Graz... TU Graz in Austria is becoming increasingly known for their CPU microarchitecture security research.
The paper by Moritz Lipp, Daniel Gruss, and Michael Schwarz suggests that stronger page table isolation (PTI) should be in place by default for AMD processors given their discoveries. The Linux kernel already supports Kernel Page Table Isolation that was brought in because of Meltdown.
This vulnerability is assigned CVE-2021-26318: "A timing and power-based side channel attack leveraging the x86 PREFETCH instructions on some AMD CPUs could potentially result in leaked kernel address space information."
AMD this week meanwhile issued bulletin AMD-SB-1017. In that security bulletin they summarized, "Researchers from Graz University of Technology with CISPA Helmholtz Center for Information Security have demonstrated timing and power-based side channel attacks leveraging the x86 PREFETCH instructions on some AMD CPUs. The attacks discussed in the paper do not directly leak data across address space boundaries. As a result, AMD is not recommending any mitigations at this time."
They do acknowledge, though, that "all AMD CPUs" are affected products, and they encourage standard mitigation best practices, including keeping system software and firmware up-to-date.
So while AMD is not recommending any new mitigation changes, the researchers are suggesting otherwise and numerous Phoronix readers have already been asking about the performance implications *if* the default were to change... So I ran some benchmarks.
With the Linux kernel already supporting Kernel Page Table Isolation, albeit not enabled by default on AMD CPUs, it's easy to test the behavior otherwise. KPTI can already be forced on for AMD CPUs under Linux by booting with the "pti=on" kernel option (not to be confused with the "kpti=1" option that controls page table isolation on AArch64 systems, unfortunately through a different knob).
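For anyone wanting to confirm that the option actually took effect after rebooting, here is a minimal Python sketch. It assumes a Linux x86_64 system where the kernel exposes a "pti" entry in the flags line of /proc/cpuinfo when page table isolation is active. The function names are made up for this illustration and are not part of any existing tool.

```python
from pathlib import Path


def pti_cmdline_setting(cmdline: str):
    """Return the value of the pti= option on a kernel command line, or None.

    If the option is repeated, the last occurrence is used (an assumption
    about how the kernel treats repeated options).
    """
    value = None
    for token in cmdline.split():
        if token.startswith("pti="):
            value = token[len("pti="):]
    return value


def cpu_has_pti_flag(cpuinfo: str) -> bool:
    """Return True if the first 'flags' line of /proc/cpuinfo text lists 'pti'."""
    for line in cpuinfo.splitlines():
        if line.startswith("flags"):
            _, _, flags = line.partition(":")
            # Split into whole words so that e.g. 'ptix' would not match.
            return "pti" in flags.split()
    return False


def read_text_or_empty(path: str) -> str:
    """Read a file, returning an empty string if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return ""


if __name__ == "__main__":
    cmdline = read_text_or_empty("/proc/cmdline")
    cpuinfo = read_text_or_empty("/proc/cpuinfo")
    print("pti= on kernel command line:", pti_cmdline_setting(cmdline))
    print("pti flag in /proc/cpuinfo:  ", cpu_has_pti_flag(cpuinfo))
```

On a default AMD system the command-line value would be expected to come back as None with no pti flag present, while a boot with "pti=on" should report both.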
Graz, Austria is also known for Arnold Schwarzenegger's time there and the "delicious" Puntigamer.
So to answer the question about the performance impact "if" (K)PTI proves necessary for AMD CPUs, here are some preliminary benchmarks showing what the impact would be for pti=on with AMD processors. Linux 5.15 Git was used for testing, and no other changes were made during testing besides rebooting the kernel with "pti=on" active.
It's also worth mentioning that besides AMD not recommending any mitigation changes at this time, external Linux kernel developers so far have not proposed any kernel patches changing the page table isolation behavior or its defaults. So for now just take these results as a hypothetical scenario for if KPTI needs to be flipped on for AMD CPUs, or for those who are very paranoid about security and side with the researchers on the need to enable it. It's also possible that, should improved page table isolation become necessary, AMD or other parties may suggest enhancements or alternatives to the existing KPTI code.
So with those notices out of the way, here is a look at the AMD Ryzen 9 5900X across various Linux workloads when booting with the "pti=on" option. This is just some quick testing for curiosity's sake, at least until there is any new or changed guidance on the matter.
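For readers repeating this kind of default versus "pti=on" comparison on their own hardware, the change for a single benchmark can be expressed as a percentage. The helper below is only a sketch: the function name is made up, and it handles both higher-is-better results (such as requests per second) and lower-is-better results (such as seconds to complete).

```python
def percent_change(baseline: float, with_pti: float, higher_is_better: bool = True) -> float:
    """Return the performance change in percent when going from the default
    boot to pti=on. A negative result means the pti=on run was slower."""
    if baseline <= 0 or with_pti <= 0:
        raise ValueError("benchmark results must be positive")
    if higher_is_better:
        # Throughput-style metric: less throughput means a performance loss.
        return (with_pti / baseline - 1.0) * 100.0
    # Time-style metric: a longer run time means a performance loss.
    return (baseline / with_pti - 1.0) * 100.0
```

Normalizing both metric types so that negative always means "slower with pti=on" makes it easier to compare workloads side by side.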