Linux 6.7 To Update Intel IBRS Mitigation Handling To Enhance System Performance
Motivated by a 25% performance degradation seen on an Intel Xeon Scalable dual socket server due to Indirect Branch Restricted Speculation (IBRS), Red Hat's Waiman Long has been working on a patch series to update the IBRS handling in different conditions for affected Intel processors on Linux.
Queued today in TIP.git's sched/core branch is the patch "x86/idle: Disable IBRS when CPU is offline to improve single-threaded performance". As Waiman Long explained in the earlier patch series on the mailing list:
For Intel processors that need to turn on IBRS to protect against Spectre v2 and Retbleed, the IBRS bit in the SPEC_CTRL MSR affects the performance of the whole core even if only one thread is turning it on when running in the kernel. For user space heavy applications, the performance impact of occasionally turning IBRS on during syscalls shouldn't be significant. Unfortunately, that is not the case when the sibling thread is idling in the kernel. In that case, the performance impact can be significant.
When DPDK is running on an isolated CPU thread processing network packets in user space while its sibling thread is idle. The performance of the busy DPDK thread with IBRS on and off in the sibling idle thread are:
                          IBRS on    IBRS off
                          -------    --------
packets/second:           7.8M       10.4M
avg tsc cycles/packet:    282.26     209.86
This is a 25% performance degradation. The test system is a Intel Xeon 4114 CPU @ 2.20GHz.
Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle") disables IBRS when the CPU enters long idle (C6 or below). However, there are existing users out there who have set "intel_idle.max_cstate=1" to decrease latency. Those users won't be able to benefit from this commit. This patch series extends this commit by providing a new "intel_idle.ibrs_off" module parameter to force disable IBRS even when "intel_idle.max_cstate=1" at the expense of increased IRQ response latency. It also includes a commit to allow the disabling of IBRS when a CPU becomes offline.
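The 25% figure in the DPDK comparison quoted above can be checked from the packet rates, and the cycle counts offer a rough cross-check. This is a minimal sketch using only the quoted numbers; the one assumption, not stated in the source, is that the TSC ticks at the CPU's nominal 2.20GHz (`ASSUMED_TSC_HZ`). The function names are illustrative only.

```python
# Figures quoted from the patch cover letter.
PPS_IBRS_ON = 7.8e6       # packets/second, sibling idle with IBRS on
PPS_IBRS_OFF = 10.4e6     # packets/second, sibling idle with IBRS off
CYCLES_IBRS_ON = 282.26   # avg TSC cycles per packet, IBRS on
CYCLES_IBRS_OFF = 209.86  # avg TSC cycles per packet, IBRS off

# Assumption (not in the source): the TSC runs at the nominal 2.20 GHz.
ASSUMED_TSC_HZ = 2.20e9


def throughput_loss(pps_off: float, pps_on: float) -> float:
    """Fraction of throughput lost relative to the IBRS-off baseline."""
    return (pps_off - pps_on) / pps_off


def implied_pps(tsc_hz: float, cycles_per_packet: float) -> float:
    """Packet rate implied by a per-packet cycle cost on one busy thread."""
    return tsc_hz / cycles_per_packet


loss = throughput_loss(PPS_IBRS_OFF, PPS_IBRS_ON)
extra_cycles = CYCLES_IBRS_ON / CYCLES_IBRS_OFF - 1.0
rate_on = implied_pps(ASSUMED_TSC_HZ, CYCLES_IBRS_ON)
rate_off = implied_pps(ASSUMED_TSC_HZ, CYCLES_IBRS_OFF)

print(f"throughput loss with IBRS on: {loss:.1%}")
print(f"extra cycles per packet:      {extra_cycles:.1%}")
print(f"implied rate, IBRS on:        {rate_on / 1e6:.2f}M packets/s")
print(f"implied rate, IBRS off:       {rate_off / 1e6:.2f}M packets/s")
```

Under that TSC assumption, the rates implied by the cycle counts land within about 1% of the quoted packets/second, so the two rows of the table tell the same story: each packet costs roughly a third more cycles when the idle sibling keeps IBRS set.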
The patch message goes on to add, "Commit bf5835bcdb96 ("intel_idle: Disable IBRS during long idle") disables IBRS when the CPU enters long idle. However, when a CPU becomes offline, the IBRS bit is still set when X86_FEATURE_KERNEL_IBRS is enabled. That will impact the performance of a sibling CPU. Mitigate this performance impact by clearing all the mitigation bits in SPEC_CTRL MSR when offline. When the CPU is online again, it will be re-initialized and so restoring the SPEC_CTRL value isn't needed."
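As a toy model of the offline behavior described in that patch message, the sketch below treats the SPEC_CTRL MSR as a plain integer. The bit positions are the architectural ones for IA32_SPEC_CTRL; the helper `spec_ctrl_when_offline` is hypothetical and is not the kernel's code. Leaving the value untouched when kernel IBRS is not in use is an assumption of this model rather than something the source states.

```python
# Architectural bit positions in the IA32_SPEC_CTRL MSR (address 0x48).
SPEC_CTRL_IBRS = 1 << 0   # Indirect Branch Restricted Speculation
SPEC_CTRL_STIBP = 1 << 1  # Single Thread Indirect Branch Predictors
SPEC_CTRL_SSBD = 1 << 2   # Speculative Store Bypass Disable


def spec_ctrl_when_offline(current: int, kernel_ibrs: bool) -> int:
    """Hypothetical helper: value left in SPEC_CTRL when a CPU goes offline.

    Models the patch message: with kernel IBRS in use, every mitigation
    bit is cleared so the sibling thread is not slowed down. Nothing is
    saved, because the CPU is re-initialized when it comes back online.
    """
    if kernel_ibrs:
        return 0
    # Assumption of this model: without kernel IBRS, leave the value alone.
    return current


before = SPEC_CTRL_IBRS | SPEC_CTRL_SSBD
after = spec_ctrl_when_offline(before, kernel_ibrs=True)
print(f"SPEC_CTRL before offline: {before:#05b}, after: {after:#05b}")
```

The point of the model is the asymmetry: clearing happens on the way down, and no restore step is needed on the way back up.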
The patch series also introduces the intel_idle.ibrs_off module parameter. That patch, also in TIP's sched/core, notes some nice benefits of its own:
In the case of a Skylake server with max_cstate=1, this new ibrs_off option will likely increase the IRQ response latency as IRQ will now be disabled.
When running SPECjbb2015 with cstates set to C1 on a Skylake system.
First test when the kernel is booted with: "intel_idle.ibrs_off":
max-jOPS = 117828, critical-jOPS = 66047
Then retest when the kernel is booted without the "intel_idle.ibrs_off" added:
max-jOPS = 116408, critical-jOPS = 58958
That means booting with "intel_idle.ibrs_off" improves performance by:
max-jOPS: +1.2%, which could be considered noise range.
critical-jOPS: +12%, which is definitely a solid improvement.
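The percentages quoted for SPECjbb2015 can be reproduced from the four scores. This is a small sketch using only the quoted results; `percent_gain` is an illustrative helper name.

```python
# (booted with intel_idle.ibrs_off, booted without it), as quoted above.
RESULTS = {
    "max-jOPS": (117828, 116408),
    "critical-jOPS": (66047, 58958),
}


def percent_gain(with_opt: float, without_opt: float) -> float:
    """Relative improvement of the ibrs_off run over the default run, in %."""
    return (with_opt - without_opt) / without_opt * 100.0


gains = {name: percent_gain(*scores) for name, scores in RESULTS.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f}%")
# prints:
# max-jOPS: +1.2%
# critical-jOPS: +12.0%
```

The latency-sensitive critical-jOPS metric moves about ten times as much as max-jOPS, which matches the patch author's reading that only the former is a clear gain.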
With these patches now in a TIP.git branch, they are expected to be submitted when the Linux 6.7 merge window opens in about one month.