The Performance Impact Of AMD Changing Their Retpoline Method For Spectre V2
Made public this week was the Spectre-BHB / BHI vulnerability and while only Intel and Arm processors are currently believed to be impacted, in the course of that research the folks at VUSec discovered AMD's current Retpoline strategy for Spectre V2 mitigations is not adequate. This has led to a change in behavior for AMD processors and is already applied to the Linux kernel. Here is a look at what it means for desktop and server performance due to the change in return trampoline handling.
The VUSec team at Vrije Universiteit Amsterdam noted in their BHI / Spectre-BHB whitepaper that AMD's current Spectre V2 software mitigation strategy of LFENCE/JMP-based Retpolines is inadequate. Until now, the Retpoline handling on AMD CPUs has turned indirect branches into an LFENCE/JMP sequence to fend off Spectre Variant Two attacks. AMD's approach was designed to perform better on their hardware than the "generic" Retpoline code sequence that results in a RET on indirect branches. The LFENCE "AMD" Retpoline approach has been the default on AMD processors for mitigating Spectre V2.
VUSec Whitepaper on Branch History Injection (BHI). While AMD CPUs are not vulnerable to Spectre-BHB / BHI announced earlier this week, AMD's current Retpoline method was found ineffective in the process.
While AMD CPUs are not affected by Spectre-BHB / BHI, upon being informed of it AMD did discover flaws in their own "AMD" Retpoline approach. The LFENCE/JMP sequence is said to be racy and thus not safe for Zen processors. Moving forward, the "generic" Retpolines are now regarded as the recommended approach.
Oops
The patch series sent in with the BHI mitigations for Intel CPUs includes a patch from AMD that now uses generic Retpolines by default on AMD. That was applied on Tuesday to mainline Linux 5.17 and is in the process of being back-ported to existing stable Linux kernel series. AMD affirmed, "AMD retpoline may be susceptible to speculation. The speculation execution window for an incorrect indirect branch prediction using LFENCE/JMP sequence may potentially be large enough to allow exploitation using Spectre V2. By default, don't use retpoline,lfence on AMD. Instead, use the generic retpoline."
The latest Linux kernel has now modified the default Retpoline approach for AMD CPUs and other operating systems have or should be soon following suit, given the latest guidance from AMD.
Thus the same Retpoline approach as used by relevant Intel CPUs is now being used. For those desiring the "AMD" Retpoline behavior, it's been renamed to the "LFENCE" Retpoline mode. Patched versions of the Linux kernel can switch to that older but now deemed unsafe Retpoline method using the "spectre_v2=retpoline,lfence" kernel option at boot to avoid going the generic Retpoline route.
The spectre_v2=retpoline,lfence option can be used to go back to the former AMD Retpoline technique for mitigating Spectre V2, although doing so puts the system into a "vulnerable" state.
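For those wanting to confirm which Retpoline mode a running system ended up with, the kernel reports its Spectre V2 status through sysfs at /sys/devices/system/cpu/vulnerabilities/spectre_v2. Below is a minimal Python sketch that reads that file and roughly classifies the result. The classify_retpoline helper and its "lfence" / "generic" / "other" labels are hypothetical names made up for this illustration, and the substring matching is an assumption, since the exact wording of the status line varies between kernel versions.

```python
from pathlib import Path
from typing import Optional

# Standard sysfs location where the kernel reports its Spectre V2 status.
SPECTRE_V2_PATH = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")


def classify_retpoline(status: str) -> str:
    """Roughly classify a spectre_v2 status line (hypothetical helper).

    Assumption: the wording differs between kernel versions, so this only
    looks for substrings rather than matching exact strings.
    """
    text = status.lower()
    # Check the LFENCE/AMD wording first, since an "AMD retpoline" string
    # also contains the word "retpoline".
    if "lfence" in text or "amd retpoline" in text:
        return "lfence"
    if "retpoline" in text:
        return "generic"
    return "other"


def read_spectre_v2_status(path: Path = SPECTRE_V2_PATH) -> Optional[str]:
    """Return the kernel's status line, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    status = read_spectre_v2_status()
    if status is None:
        print("spectre_v2 status not available on this system")
    else:
        print(f"{classify_retpoline(status)}: {status}")
```

Running this once on the default boot and again after booting with spectre_v2=retpoline,lfence should show the classification change, assuming the kernel in use carries the new patches.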
But what is the performance cost of Linux (and presumably Windows) now switching by default from the AMD/LFENCE-based Retpoline mode to the generic Retpolines? This article lays out those initial benchmarks. On the latest Linux 5.17 kernel I booted up a few AMD Zen 3 systems, where the default is now generic Retpolines, and ran the benchmarks. I then rebooted each system with the "spectre_v2=retpoline,lfence" switch to go back to the prior AMD/LFENCE Retpoline technique, which was the default up until this week, and repeated the benchmarks for comparison.