Linux Developers Evaluating New "DOITM" Security Mitigation For Latest Intel CPUs
Last summer Intel published guidance around the Data Operand Independent Timing (DOIT) instruction mode that can be enabled on recent generations of Intel processors to ensure constant-time execution for a subset of the Intel instruction set, which can be particularly important for cryptographic algorithms. Linux kernel developer discussions over handling this DOIT functionality, for what is described as a CPU vulnerability with recent Intel CPUs, fizzled out last year. However, a new Linux kernel patch from a Google developer would enable this mode by default for newer Intel CPUs, though it raises performance concerns.
Last year it was disclosed by Intel as well as Arm that instructions on recent and future processors aren't guaranteed to be "constant time" with respect to their data operands unless a special model specific register flag is set. This raised concerns, particularly around the cryptography code in Linux, that there is no longer a guarantee of constant time and that instruction execution time can vary depending upon the data operated on. Constant-time execution is necessary to avoid possible side channel attacks. But enabling the new Intel flag to ensure constant time comes with admitted performance implications.
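To illustrate why constant-time execution matters, here is a minimal Python sketch (not taken from the kernel or from Intel's guidance) contrasting a comparison whose running time depends on the data with one written in the usual constant-time style. The function names are hypothetical. Note that the constant-time variant only works if the underlying XOR and OR operations themselves take the same time regardless of their operands, which is precisely the hardware-level assumption at issue here. Python is used purely for readability; an interpreted language gives no real timing guarantees.

```python
import hmac


def leaky_equal(a: bytes, b: bytes) -> bool:
    """Early-exit comparison: returns sooner the earlier the first mismatch is."""
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False  # timing reveals how many leading bytes matched
    return True


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Examine every byte and fold the differences together with XOR/OR."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # no branch on secret data
    return diff == 0


if __name__ == "__main__":
    secret = b"correct horse"
    guess = b"correct hxrse"
    print(leaky_equal(secret, guess))
    print(constant_time_equal(secret, guess))
    # The standard library ships a comparison intended for this purpose.
    print(hmac.compare_digest(secret, guess))
```

In real code the standard library's `hmac.compare_digest` is the better choice than a hand-rolled loop; the loop is shown only to make the XOR/OR pattern visible.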
Google engineer and longtime Linux kernel developer Eric Biggers sent out a patch this week for enabling the Intel Data Operand Independent Timing Mode control in the Linux kernel. The flag would be enabled by default for newer Intel CPUs, though the patch does provide a knob for disabling this security mitigation/feature too. Biggers described the issue and motivation with this patch message:
According to documentation that Intel published recently, Intel CPUs based on the Ice Lake and later microarchitectures don't guarantee "data operand independent timing" by default. I.e., instruction execution times may depend on the values of data operated on. This is true for a wide variety of instructions, including many instructions that are heavily used in cryptography and have always been assumed to be constant-time, e.g. additions, XORs, and even the AES-NI instructions.
Cryptography algorithms require constant-time instructions to prevent side-channel attacks that recover cryptographic keys based on execution times. Therefore, without this CPU vulnerability mitigated, it's generally impossible to safely do cryptography on the latest Intel CPUs.
It's also plausible that this CPU vulnerability can expose privileged kernel data to unprivileged userspace processes more generally.
To mitigate this CPU vulnerability, it's possible to enable "Data Operand Independent Timing Mode" (DOITM) by setting a bit in a MSR. While Intel's documentation suggests that this bit should only be set where "necessary", that is highly impractical, given the fact that cryptography can happen nearly anywhere in the kernel and userspace, and the fact that the entire kernel likely needs to be protected anyway.
Therefore, let's simply enable DOITM globally by default to fix this vulnerability. At most this gives up an "optimization" on the very latest CPUs, restoring the correct behavior from previous CPUs.
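For readers who want to see what "setting a bit in a MSR" looks like in practice, below is a hedged Python sketch that reads the relevant registers through the Linux `msr` driver (`/dev/cpu/N/msr`, which needs root and the `msr` module loaded). The register addresses and bit positions are not given in the patch message quoted above; they are assumptions based on Intel's published DOIT documentation (IA32_ARCH_CAPABILITIES at 0x10A with bit 12 enumerating DOITM, and IA32_UARCH_MISC_CTL at 0x1B01 with bit 0 enabling it) and should be verified against that documentation before being relied upon. The helper names are hypothetical.

```python
import os
import struct

# Assumed values from Intel's DOIT documentation; verify before relying on them.
MSR_IA32_ARCH_CAPABILITIES = 0x10A
ARCH_CAP_DOITM = 1 << 12          # CPU supports the DOITM control
MSR_IA32_UARCH_MISC_CTL = 0x1B01
UARCH_MISC_DOITM = 1 << 0         # DOITM currently enabled


def read_msr(cpu: int, msr: int) -> int:
    """Read one 64-bit MSR via the Linux msr driver (root + msr module needed)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


def doitm_supported(arch_caps: int) -> bool:
    """True if the capabilities value advertises the DOITM control."""
    return bool(arch_caps & ARCH_CAP_DOITM)


def doitm_enabled(uarch_misc_ctl: int) -> bool:
    """True if the DOITM bit is set in the control register value."""
    return bool(uarch_misc_ctl & UARCH_MISC_DOITM)


if __name__ == "__main__":
    try:
        caps = read_msr(0, MSR_IA32_ARCH_CAPABILITIES)
        if not doitm_supported(caps):
            print("CPU 0 does not enumerate DOITM")
        else:
            ctl = read_msr(0, MSR_IA32_UARCH_MISC_CTL)
            print("DOITM enabled on CPU 0:", doitm_enabled(ctl))
    except (OSError, struct.error) as exc:
        # Missing msr module, insufficient privileges, or unsupported MSR.
        print("Could not read MSR:", exc)
```

This only inspects CPU 0; since the MSR is per logical CPU, a fuller check would loop over every CPU. Writing the bit is deliberately left out, as that is what the proposed kernel patch would handle.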
The proposed kernel documentation for the Data Operand Dependent Timing (DODT) vulnerability sums up the situation as:

DODT - Data Operand Dependent Timing
====================================

Data Operand Dependent Timing (DODT) is a CPU vulnerability that makes the execution times of instructions depend on the values of the data operated on.

This vulnerability potentially enables side-channel attacks on data, including cryptographic keys. Most cryptography algorithms require that a variety of instructions be constant-time in order to prevent side-channel attacks.

Affected CPUs
-------------

This vulnerability affects Intel Core family processors based on the Ice Lake and later microarchitectures, and Intel Atom family processors based on the Gracemont and later microarchitectures.

Mitigation
----------

Mitigation of this vulnerability involves setting a Model Specific Register (MSR) bit to enable Data Operand Independent Timing Mode (DOITM).

By default, the kernel does this on all CPUs. This mitigation is global, so it applies to both the kernel and userspace.

This mitigation can be disabled by adding ``doitm=off`` to the kernel command line. It's also one of the mitigations that can be disabled by ``mitigations=off``.

Intel's published guidance on the Data Operand Independent Timing lays out the clear performance risks of this operating mode:
DOIT requires disabling hardware optimizations and/or performance features on some processors; for example, enabling data operand independent timing might disable data-dependent prefetching. This means that the DOIT mode may have a performance impact, and Intel expects the performance impact of this mode may be significantly higher on future processors.
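Given that cost, the opt-out switches named in the proposed kernel documentation (`doitm=off` and `mitigations=off`) are worth a closer look. Here is a minimal Python sketch of checking whether a running system was booted with either of them. Keep in mind that `doitm=off` comes from a proposed patch and may not exist in any released kernel, and that the function name is hypothetical. The sketch only looks for the exact tokens and does not model how the kernel resolves repeated or conflicting parameters.

```python
def doitm_disabled_by_cmdline(cmdline: str) -> bool:
    """Return True if the command line contains doitm=off or mitigations=off."""
    params = cmdline.split()
    return "doitm=off" in params or "mitigations=off" in params


if __name__ == "__main__":
    try:
        # /proc/cmdline holds the command line the running kernel was booted with.
        with open("/proc/cmdline") as f:
            cmdline = f.read()
    except OSError:
        cmdline = ""
    print("DOITM opt-out requested:", doitm_disabled_by_cmdline(cmdline))
```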
Intel's guidance is basically to use this DOIT mode sparingly, and only for software that has already applied other Intel-recommended techniques to mitigate software timing side channels. The Linux kernel, with its massive code-base built up over the years, isn't necessarily in tune with all of Intel's recommended software techniques. Somewhat worrisome is that the performance impact of this DOIT mode may be "significantly higher" on future processors.
Notably, that Linux kernel mailing list patch comes with no performance numbers or indications of just how severe the performance penalty is, reportedly due to a lack of hardware, with the author looking for testing from the kernel community.
Following that patch by Eric Biggers, kernel discussion ensued over whether the entire kernel needs to be protected, whether it would be possible to protect just the cryptography operations within the kernel, and, if this blanket enabling is not pursued, how user-space could be properly protected too. There is no consensus at this point, but given the lack of solid numbers... I've been firing up some initial benchmarks on current generation Intel systems.