Linus Torvalds: "Let's Just Disable The Stupid [AMD] fTPM HWRND Thing"
Linux creator Linus Torvalds is growing frustrated with AMD fTPM hardware random number generator bugs that are plaguing the kernel on recent Ryzen systems, and he has expressed a desire to disable its use.
Recently there was a stuttering issue caused by the AMD fTPM random number generator that initially affected Windows users but turned out to affect Linux too. A fix was upstreamed and backported to earlier kernels, but some AMD fTPM RNG-related headaches have persisted, with some users still reporting stuttering problems.
A new bug report arrived last week about fTPM usage causing stuttering. This newest report comes from a system that appears to be running updated firmware, and it appears to be the first time the problem has been seen on a Rembrandt platform. Existing kernel patches haven't helped. It was also pointed out that some users have reached out to ASUS to obtain a special BIOS that appears to simply disable the fTPM.
Linus Torvalds meanwhile chimed in on the mailing list:
"Let's just disable the stupid fTPM hwrnd thing.
Maybe use it for the boot-time "gather entropy from different sources", but clearly it should *not* be used at runtime.
Why would anybody use that crud when any machine that has it supposedly fixed (which apparently didn't turn out to be true after all) would also have the CPU rdrand instruction that doesn't have the problem?
If you don't trust the CPU rdrand implementation (and that has had bugs too - see clear_rdrand_cpuid_bit() and x86_init_rdrand()), why would you trust the fTPM version that has caused even *more* problems?
So I don't see any downside to just saying "that fTPM thing is not working". Even if it ends up working in the future, there are alternatives that aren't any worse.
Linus"
Torvalds then added:
"So that would sound very unlikely [RDRAND use exhibiting the problem], but who knows... Microcode can obviously do pretty much anything at all, but at least the original fTPM issues _seemed_ to be about BIOS doing truly crazy things like SPI flash accesses.
I can easily imagine a BIOS fTPM code using some absolutely horrid global "EFI synchronization" lock or whatever, which could then cause random problems just based on some entirely unrelated activity.
I would not be surprised, for example, if wasn't the fTPM hwrnd code itself that decided to read some random number from SPI, but that it simply got serialized with something else that the BIOS was involved with. It's not like BIOS people are famous for their scalable code that is entirely parallel...
And I'd be _very_ surprised if CPU microcode does anything even remotely like that. Not impossible - HP famously screwed with the time stamp counter with SMIs, and I could imagine them - or others - doing the same with rdrand.
But it does sound pretty damn unlikely, compared to "EFI BIOS uses a one big lock approach".
So rdrand (and rdseed in particular) can be rather slow, but I think we're talking hundreds of CPU cycles (maybe low thousands). Nothing like the stuttering reports we've seen from fTPM.
Linus"
Hopefully the added pressure from Torvalds will bring some additional clarity and fixes for these AMD fTPM issues under Linux.
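For readers wondering whether their own system is using a TPM-backed hardware RNG at runtime, here is a minimal Python sketch that reads the Linux hw_random sysfs entries. It is only an illustration and not something from the mailing list thread: the helper names `read_hwrng_state` and `tpm_rng_is_current` are hypothetical, and the `tpm-rng` name prefix is an assumption about how the kernel's TPM driver labels its RNG, so check the output on your own machine.

```python
from pathlib import Path

# Usual sysfs location of the kernel's hw_random framework on Linux.
HW_RANDOM_DIR = Path("/sys/class/misc/hw_random")


def read_hwrng_state(base=HW_RANDOM_DIR):
    """Return (current, available) hwrng names, or (None, []) if unreadable."""
    base = Path(base)
    try:
        current = (base / "rng_current").read_text().strip()
        available = (base / "rng_available").read_text().split()
    except OSError:
        # No hw_random support, or the files are not readable here.
        return None, []
    return current or None, available


def tpm_rng_is_current(current, prefix="tpm-rng"):
    """True if the selected hwrng looks TPM-backed.

    The "tpm-rng" prefix is an assumption about the driver's naming.
    """
    return current is not None and current.startswith(prefix)


if __name__ == "__main__":
    current, available = read_hwrng_state()
    print("available:", ", ".join(available) or "(none)")
    print("current:  ", current or "(none)")
    if tpm_rng_is_current(current):
        print("A TPM-backed RNG is the currently selected hw_random source.")
```

Running the script prints the RNG sources the kernel knows about and which one is currently selected. It only reports state and does not change which source is in use.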