Linux Patches Rewrite The Retpoline Rewrite Code - Possible Performance Benefit
Intel engineer and longtime kernel developer Peter Zijlstra posted a set of nine patches today that rewrite the way Retpolines are rewritten. Zijlstra explained, "currently objtool emits alternative entries for most retpoline calls. However trying to extend that led to trouble (ELF files are horrid). Therefore completely overhaul this and have objtool emit a .retpoline_sites section that lists all compiler generated retpoline thunk calls. Then the kernel can do with them as it pleases."
This rewritten code in turn ensures that Retpoline thunk calls are rewritten to plain indirect instructions when Retpoline is not enabled, and to an LFENCE followed by the indirect instruction for the AMD Retpoline handling where the size of the compiler-generated call site allows. The x86 BPF code is also updated to match the behavior of the rest of the kernel around Retpolines -- previously the BPF code wasn't checking the X86_FEATURE_RETPOLINE flags but was unconditionally emitting a thunk call. The rewritten code also brings running with the "spectre_v2=off" boot option closer to a kernel image built without the "RETPOLINE" Kconfig option enabled.
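To make the per-site decision described above more concrete, here is a toy model in Python. It is not the kernel's code and only a sketch under stated assumptions: the `Site` record, the `Mode` enum, the `rewrite` function, and the 5-byte call-site size are all hypothetical names and values chosen for illustration. The instruction lengths are the usual x86-64 encodings, and the fallback of keeping the thunk call when the LFENCE form does not fit is an assumption of this model.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    """Hypothetical stand-in for the boot-time Retpoline configuration."""
    RETPOLINE = "retpoline"          # generic retpoline: keep the thunk call
    RETPOLINE_AMD = "retpoline_amd"  # AMD handling: LFENCE + indirect branch
    OFF = "off"                      # Retpoline not enabled (e.g. spectre_v2=off)


@dataclass
class Site:
    """One entry of an imagined list of compiler-generated thunk calls."""
    offset: int  # hypothetical location of the thunk call
    reg: str     # register holding the branch target, e.g. "rax"
    size: int    # bytes the compiler left at the call site


# Registers r8..r15 need an extra prefix byte in the indirect call encoding.
EXTENDED_REGS = {f"r{n}" for n in range(8, 16)}
LFENCE_LEN = 3  # length of the lfence instruction in bytes


def indirect_len(reg: str) -> int:
    """Length in bytes of `call *%reg` for the given register."""
    return 3 if reg in EXTENDED_REGS else 2


def rewrite(site: Site, mode: Mode) -> str:
    """Pick the instruction sequence this toy model would place at a site."""
    if mode is Mode.OFF:
        # Retpoline not enabled: a plain indirect call removes the indirection.
        return f"call *%{site.reg}"
    if mode is Mode.RETPOLINE_AMD:
        # Only use the LFENCE form where the call site is large enough.
        if site.size >= LFENCE_LEN + indirect_len(site.reg):
            return f"lfence; call *%{site.reg}"
    # Assumed fallback for this model: leave the thunk call in place.
    return f"call __x86_indirect_thunk_{site.reg}"


if __name__ == "__main__":
    sites = [Site(0x10, "rax", 5), Site(0x40, "r11", 5)]
    for mode in Mode:
        for site in sites:
            print(mode.value, hex(site.offset), rewrite(site, mode))
```

In this model the `r11` site keeps its thunk call under the AMD mode because the LFENCE plus indirect call would need six bytes and the assumed call site only has five, which is one way to picture the "where size allows" caveat.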
The notable part for end users is: "All this should help improve performance by removing an indirection." This may help somewhat, but don't expect any miracles: there will still be obvious overhead from using return trampolines where they are needed.
This set of patches reworking the Retpoline logic for the Linux kernel can currently be found on the kernel mailing list.