Linus Torvalds Comments On STIBP & He's Not Happy - STIBP Default Will End Up Changing
It turns out that Linus Torvalds himself was taken by surprise by the performance hit we've outlined on Linux 4.20, which stems from the introduction of STIBP (Single Thread Indirect Branch Predictors) for cross-hyperthread Spectre V2 protection, a change that has also already been back-ported to the stable series. He doesn't want this enabled in full by default.
All of the benchmarking I've been doing the past few days to shine a light on the Linux kernel's STIBP addition appears to be paying off. My tests have found Linux 4.20 to incur significant performance penalties in many workloads -- in fact, more so than some of the earlier Spectre and Meltdown mitigations -- and STIBP is already being back-ported to stable series like Linux 4.19.2. PHP, Python, Java, and many other workloads are measurably affected, and even gaming performance takes a hit to some extent.
Linus Torvalds posted to the kernel mailing list on Sunday under the title "STIBP by default.. Revert?". In it he wrote:
This was marked for stable, and honestly, nowhere in the discussion did I see any mention of just *how* bad the performance impact of this was.
When performance goes down by 50% on some loads, people need to start asking themselves whether it was worth it. It's apparently better to just disable SMT entirely, which is what security-conscious people do anyway.
So why do that STIBP slow-down by default when the people who *really* care already disabled SMT?
I think we should use the same logic as for L1TF: we default to something that doesn't kill performance. Warn once about it, and let the crazy people say "I'd rather take a 50% performance hit than worry about a theoretical issue".
Linus
He also followed up with, "I don't think the code needs to be reverted, but the *behavior* of just unconditionally enabling STIBP needs to be reverted. Because it was clearly way more expensive than people were told."
Longtime kernel developer Andi Kleen even argues now that the code should be reverted. "Actually I think it should be reverted. Yes of course opt-in is needed. But also when you opt-in it doesn't make sense to set STIBP when the sibling is running the same security context, which is actually a common case. So to even use it properly you would need some scheduler support to detect these cases and only enable it then with opt-in. These patches didn't even try to tackle this problem."
Intel Linux veteran Arjan van de Ven meanwhile chimed in that: "In the documentation, AMD officially recommends against this by default, and I can speak for Intel that our position is that as well: this really must not be on by default. STIBP and its friends are there as tools, and were created early on as big hammers because that is all that one can add in a microcode update.. expensive big hammers...Using these tools much more surgically is fine, if a paranoid task wants it for example, or when you know you are doing a hard core security transition. But always on? Yikes."
With some patches still under review, the default STIBP behavior will be to only enable it for tasks that request it via prctl or for non-dumpable processes like the OpenSSH daemon. But even the blanket STIBP protection of non-dumpable processes is raising some concerns, as it will likely end up affecting some performance-sensitive daemons, so we'll see what ends up happening.
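For those wondering what "request it via prctl" could look like from user-space, here is a minimal sketch in Python using ctypes. It assumes the PR_SET_SPECULATION_CTRL / PR_GET_SPECULATION_CTRL interface with the PR_SPEC_INDIRECT_BRANCH control as found in mainline kernel headers; the patches discussed here are still under review, so the final interface for a given kernel may differ. The helper names (describe_spec_state, request_indirect_branch_protection, and so on) are my own and purely illustrative.

```python
import ctypes
import os

# Values as defined in mainline <linux/prctl.h> (assumed to match your kernel).
PR_GET_SPECULATION_CTRL = 52
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_INDIRECT_BRANCH = 1      # the control covering indirect branch speculation

PR_SPEC_PRCTL = 1 << 0           # per-task control via prctl is possible
PR_SPEC_ENABLE = 1 << 1          # speculation enabled, i.e. mitigation off
PR_SPEC_DISABLE = 1 << 2         # speculation disabled, i.e. mitigation on
PR_SPEC_FORCE_DISABLE = 1 << 3   # like DISABLE, but cannot be undone


def describe_spec_state(value):
    """Turn a PR_GET_SPECULATION_CTRL return value into readable text."""
    if value == 0:
        return "not affected"
    parts = []
    if value & PR_SPEC_PRCTL:
        parts.append("per-task control available")
    if value & PR_SPEC_FORCE_DISABLE:
        parts.append("mitigation forced on")
    elif value & PR_SPEC_DISABLE:
        parts.append("mitigation on")
    elif value & PR_SPEC_ENABLE:
        parts.append("mitigation off")
    return ", ".join(parts) or "unknown"


def _prctl(option, arg2, arg3=0):
    """Thin wrapper around prctl(2); raises OSError when the call fails."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(ctypes.c_int(option), ctypes.c_ulong(arg2),
                    ctypes.c_ulong(arg3), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return rc


def query_indirect_branch_state():
    """Ask the kernel for this task's indirect branch speculation state."""
    return _prctl(PR_GET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH)


def request_indirect_branch_protection():
    """Opt the calling task in to the indirect branch mitigation."""
    _prctl(PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE)
```

A "paranoid task" in Arjan's sense would call request_indirect_branch_protection() early on and could then print describe_spec_state(query_indirect_branch_state()) to confirm the result. Note the inverted naming: PR_SPEC_DISABLE disables the speculation feature, which means the mitigation is turned on. The calls raise OSError on kernels that lack the control or are not configured for per-task opt-in.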
At least now with a lot of attention on STIBP, it looks like a sane approach for protecting system processes where necessary, while not killing Linux performance potential overall, will be achieved by the time Linux 4.20 ships at the end of December or early January. But it's unfortunate that these STIBP patches were already back-ported and released as part of Linux 4.19.2, so hopefully any changes will be back-ported there quickly too.
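If you are on one of the affected kernels and want to see what your system is currently doing, the kernel reports its Spectre V2 status under /sys/devices/system/cpu/vulnerabilities/spectre_v2, and kernels with SMT control expose /sys/devices/system/cpu/smt/control. Below is a rough sketch for pulling the STIBP state out of that status line. The exact wording of the line varies between kernel versions, so the parsing is an assumption on my part: it expects a comma-separated list containing either a bare "STIBP" token or "STIBP: <mode>". The function names are illustrative only.

```python
from pathlib import Path

SPECTRE_V2_PATH = Path("/sys/devices/system/cpu/vulnerabilities/spectre_v2")
SMT_CONTROL_PATH = Path("/sys/devices/system/cpu/smt/control")


def stibp_mode(status_line):
    """Return the STIBP mode found in a spectre_v2 status line, or None.

    A bare "STIBP" token (no mode given) is reported as "enabled".
    """
    for field in status_line.split(","):
        field = field.strip()
        if field == "STIBP":
            return "enabled"
        if field.startswith("STIBP:"):
            return field.split(":", 1)[1].strip()
    return None


def read_status(path):
    """Read a one-line sysfs file; return None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def report():
    """Print the Spectre V2 status line, the STIBP mode, and the SMT state."""
    line = read_status(SPECTRE_V2_PATH)
    if line is None:
        print("spectre_v2 status not available on this system")
    else:
        print("spectre_v2:", line)
        print("STIBP:", stibp_mode(line) or "not reported")
    print("SMT control:", read_status(SMT_CONTROL_PATH) or "not available")
```

Calling report() prints whatever the running kernel exposes. With SMT disabled, which is what Linus suggests the security-conscious do anyway, there is no sibling thread for STIBP to protect against.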