Benchmarking Retpoline-Enabled GCC 8 With -mindirect-branch=thunk
We have looked several times already at the performance impact of Retpoline support in the Linux kernel, but what about building user-space packages with -mindirect-branch=thunk? Here is the performance cost of building some user-space performance tests with -mindirect-branch=thunk and -mindirect-branch=thunk-inline.
Retpoline support was added to GCC 8 this weekend and has since been back-ported to GCC 7. Along with the CONFIG_RETPOLINE changes in the Linux kernel, it is used for mitigating Spectre Variant Two. The -mindirect-branch=thunk option added to the GNU Compiler Collection converts indirect calls and jumps to call and return thunks. There's also -mindirect-branch=thunk-inline for converting indirect calls and jumps to inlined call and return thunks. GCC doesn't apply either conversion by default: the default value of -mindirect-branch is "keep", which leaves indirect calls and jumps unmodified.
Retpoline-patched kernels automatically make use of -mindirect-branch=thunk in the kernel build when available for full kernel protection. But you can also build user-space packages with -mindirect-branch=thunk to avoid speculative indirect calls within application code. Curious about the impact, I ran some user-space benchmarks built with -mindirect-branch=thunk and then with -mindirect-branch=thunk-inline added to the CFLAGS/CXXFLAGS.
This round of compiler benchmarking was done on an Intel Core i9 7980XE running Debian 9.3 with a Linux 4.15 Git kernel built with GCC 8.0.1 and full Retpoline protection, as well as the yet-to-be-merged RETPOLINE_UNDERFLOW support for Skylake and Kabylake systems. KPTI was also present. The GCC 8.0.1 compiler used for building this kernel and the various benchmarks with the different compiler flags was built from SVN/Git as of 15 January. All of these benchmarks were carried out using the Phoronix Test Suite.