The Brutal Performance Impact From Mitigating The LVI Vulnerability
On Tuesday the Load Value Injection (LVI) attack was disclosed by Intel and security researchers as a new class of transient-execution attack that can inject data into a victim program and in turn steal data, including from within SGX enclaves. While Intel has publicly stated they don't believe the LVI attack to be practical, one of their open-source compiler wizards did go ahead and add mitigation options to the GNU Assembler, which is part of GNU Binutils in the GNU toolchain. Here are benchmarks showing how significantly those new LVI mitigation options can affect run-time performance in real-world workloads.
The website, LVIattack.eu, was set up by the independent researchers who discovered LVI. They describe Load Value Injection as an attack that "bypasses all existing mitigations against transient-execution attacks, such as Meltdown, Spectre, Foreshadow, ZombieLoad, RIDL, and Fallout. We show that LVI is especially relevant in the context of Intel SGX, where LVI may arbitrarily hijack transient execution in a victim enclave and ultimately leak arbitrary secrets, breaking confidentiality guarantees in the Intel SGX ecosystem. LVI unifies the transient-execution research landscape by applying gadget-driven techniques from the Spectre world to reversely exploit prior Meltdown-type data leakages. LVI furthermore marks the end of transparently patching Meltdown-type processor vulnerabilities in CPU microcode, as LVI necessitates expensive software updates to serialize the processor pipeline and disable speculation after potentially every load operation."
The Load Value Injection mitigations authored by Intel's compiler team, which were merged into GNU Binutils on Wednesday, insert LFENCE barriers around vulnerable instructions. The new GNU Assembler options are -mlfence-after-load=yes for generating an LFENCE after load instructions, -mlfence-before-indirect-branch=none|all|memory|register for inserting an LFENCE before indirect branches, and -mlfence-before-ret=or|not for inserting an LFENCE before return (ret) instructions. LFENCE (Load Fence) serializes load operations: all loads issued before the LFENCE must complete before any later instruction begins executing.
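Since these are assembler options rather than compiler options, one common way to reach them when building C/C++ code is GCC's -Wa,<option> driver flag, which forwards an option to the assembler. The following is only a minimal sketch of that idea: the helper name gcc_assembler_flags is hypothetical, and the allowed values are simply the ones listed in the paragraph above, not a complete reference for the assembler.

```python
# Values taken from the option list in the article text above.
# This table is an illustration, not an authoritative list of what
# every Binutils release accepts.
ALLOWED = {
    "mlfence-after-load": {"yes"},
    "mlfence-before-indirect-branch": {"none", "all", "memory", "register"},
    "mlfence-before-ret": {"or", "not"},
}


def gcc_assembler_flags(**options):
    """Return GCC driver flags that forward LVI options to the assembler.

    Keyword names use underscores in place of dashes, e.g.
    mlfence_after_load="yes". (Hypothetical helper for illustration.)
    """
    flags = []
    for name, value in options.items():
        opt = name.replace("_", "-")
        if opt not in ALLOWED:
            raise ValueError(f"unknown assembler option: -{opt}")
        if value not in ALLOWED[opt]:
            raise ValueError(f"unsupported value for -{opt}: {value}")
        # -Wa,<option> asks the GCC driver to pass <option> to the assembler.
        flags.append(f"-Wa,-{opt}={value}")
    return flags


if __name__ == "__main__":
    print(" ".join(gcc_assembler_flags(mlfence_after_load="yes",
                                       mlfence_before_ret="not")))
```

The resulting string could be appended to CFLAGS/CXXFLAGS for a build, assuming the assembler in use is new enough to recognize the options.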
The mitigations aren't enabled by default even for affected Intel CPUs; they are options that must be passed to the assembler when compiling the code. As a reminder from Tuesday, LVI is believed to principally impact Intel CPUs with SGX (basically Skylake and newer). Newer Intel CPUs with some hardware mitigations for recent vulnerabilities (Cascade Lake, Comet Lake, certain Coffee Lake steppings) are said to be only partially vulnerable to LVI, while Ice Lake appears to be the first generation post-Skylake not affected at all by LVI.
The LVI researchers noted of their own testing that "we observe extensive overheads of factor 2 to 19 for prototype implementations of the full mitigation." Of course, with the GNU toolchain mitigations now merged, I was eager to see the performance impact in real workloads.
I have been running benchmarks on an Intel Xeon E3-1275 v6 (Kaby Lake) server running a development snapshot of Ubuntu 20.04. The GNU Assembler / Binutils were built from Git master as of 11 March.
The only change between test runs was rebuilding the programs under test with different assembler flags. Tests were first done without any mitigations and then with the key combinations of flags for inserting LFENCE instructions around loads, indirect branches, and ret instructions. Besides looking at the impact of each option individually, the combined performance impact of enabling all three assembler options was also evaluated. Via the Phoronix Test Suite, dozens of benchmarks were carried out on this Intel Xeon server for a high-level look at the impact of these extra LFENCEs on performance.
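For readers wanting to set up a similar comparison, the test matrix described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact configuration used for these results: the base "-O2" flags, the configuration names, and the particular values chosen for the indirect-branch and ret options ("all" and "not") are all assumptions.

```python
# Hypothetical base flags; the article does not state which
# optimization level the benchmarks were built with.
BASE_CFLAGS = "-O2"

# One entry per build configuration: unmitigated, each LFENCE option
# on its own, and then all three together. Option values are assumed.
CONFIGS = {
    "no-mitigations": [],
    "lfence-after-load": ["-mlfence-after-load=yes"],
    "lfence-before-indirect-branch": ["-mlfence-before-indirect-branch=all"],
    "lfence-before-ret": ["-mlfence-before-ret=not"],
}
CONFIGS["all-three"] = [
    opt
    for name in ("lfence-after-load",
                 "lfence-before-indirect-branch",
                 "lfence-before-ret")
    for opt in CONFIGS[name]
]


def cflags_for(config):
    """Build a CFLAGS string that forwards the assembler options via -Wa,."""
    forwarded = [f"-Wa,{opt}" for opt in CONFIGS[config]]
    return " ".join([BASE_CFLAGS] + forwarded)


if __name__ == "__main__":
    for name in CONFIGS:
        print(f"{name}: CFLAGS=\"{cflags_for(name)}\"")
```

Each CFLAGS string would then be exported before rebuilding the programs under test, so that the assembler flags are the only difference between runs.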