**Guest** · 03 October 2019, 05:35 AM

Originally posted by bug77

Why single out RDRAND? Let's check MOV, JMP, CMP & co while we're at it

Lol. In some ways a good point - if you cannot trust the CPU to work correctly, then that system is unusable.

**Guest** · 03 October 2019, 05:38 AM

Originally posted by stormcrow

So long as the output does change, that should be enough for the many cases where all you want is a different result each time you query it. For example, the problem that restarted the getrandom() discussion in the first place (GDM blocking boot indefinitely because of its over-engineered X connection token generation) could potentially use a working RDRAND instead of getrandom().

And I thought that discussion was all due to systemd's usage of RDRAND.

**stormcrow** · 03 October 2019, 08:16 AM

Originally posted by sandy8925

And I thought that discussion was all due to systemd's usage of RDRAND.

Not at all. Not all CPUs even have an RDRAND equivalent to begin with. There's an ongoing discussion (with a recent patch set from Linus) on whether getrandom() should block, toss an error, or ignore the (lack of) entropy and pass results regardless, letting userspace fall on its face for making wrong assumptions. This conversation has really been going on for a number of years and it has included Python, systemd, GDM, cryptographic key generation on initial startup like SSH keys, etc.
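As a rough illustration of the "toss an error" option as seen from userspace: the getrandom() wrapper accepts GRND_NONBLOCK, which makes the call fail with EAGAIN instead of blocking while the pool is still uninitialised. This is only a sketch, assuming Linux with glibc 2.25 or newer; the helper name `try_fill_random` is made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not from any real project.
 * Returns  1 if buf was completely filled,
 *          0 if the kernel pool is not initialised yet (EAGAIN),
 *         -1 on any other failure or a short read.
 * Intended for small requests (256 bytes or less).
 */
int try_fill_random(void *buf, size_t len)
{
    ssize_t n = getrandom(buf, len, GRND_NONBLOCK);

    if (n >= 0 && (size_t)n == len)
        return 1;
    if (n < 0 && errno == EAGAIN)
        return 0;   /* caller decides: wait, retry later, or fall back */
    return -1;
}
```

A caller that gets 0 back still has to decide what to do, which is exactly the policy question the kernel discussion is about.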

There is a recent write-up on LWN on the current state. If you look through the comments section towards the bottom, there's a link to Linus' patch set that partly mitigates the problem by adding timing jitter from schedule(). I highly recommend reading through the comments, as they're also enlightening on the topic, on problematic hardware, and on which assumptions are valid or invalid.

Quick link to the referenced patch sets:
Reverting the ext4fs extra interrupts removal since it apparently reduces entropy enough to be currently considered a "bad thing" -- Look for this to be changed/removed at a later date because SSDs are apparently moving away from using interrupts entirely (from the kernel discussion).
https://git.kernel.org/pub/scm/linux...70a7a1b65db72b

Adding the entropy source
https://git.kernel.org/pub/scm/linux...39da47ec689e55

**nivedita** · 03 October 2019, 10:56 AM

Originally posted by stormcrow

It really depends on what you're looking for in the result. If you're looking for a broken register that repeats the same result over and over again, that's sufficient. If you're looking for an easily guessed, weakly pseudorandom result, you would indeed need to apply one of the various methods of mathematical verification over a sufficient number of results. While the former might be good for catching a specific type of brokenness in a random unknown CPU with an RDRAND instruction, it's not enough to guarantee sufficient entropy where you want reasonably strong cryptographic computations (initial SSH private keys, GPG keys, etc). At that point you're getting into the ongoing argument over the behavior of the getrandom() syscall in early boot stages, when there's insufficient entropy to guarantee non-deterministic cryptographic output.

So long as the output does change, that should be enough for the many cases where all you want is a different result each time you query it. For example, the problem that restarted the getrandom() discussion in the first place (GDM blocking boot indefinitely because of its over-engineered X connection token generation) could potentially use a working RDRAND instead of getrandom().

You're missing the point of the comment. If one change is good enough, the loop should just break out once that happens, rather than continuing to loop and uselessly accumulating the number of changes.

**stormcrow** · 03 October 2019, 11:56 AM

Originally posted by nivedita

You're missing the point of the comment. If one change is good enough, the loop should just break out once that happens, rather than continuing to loop and uselessly accumulating the number of changes.

Yeah, you're right, I did overlook the point of the comment. I agree with the comment now that I've gone back and reread it. Comprehension fail. I kinda went off on a tangent on getrandom() as I'd just read an article on it and the discussion of the problems with the RDRAND instruction, and my brain made a link that wasn't there.

**F.Ultra** · 03 October 2019, 12:22 PM

Originally posted by arQon

Except that code's still not very good, unless I'm missing something, because all it's looking for is any change ever, but it still runs SANITY_CHECK_LOOPS instead of breaking out once it finds one...

While true, the calls to rdrand() are probably so quick that adding a break in the loop will #1 not improve performance in any meaningful way, even on a low-power system, and #2 might actually lower performance (since a break statement would add code inside the tight loop).
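For anyone following along, the two loop shapes being argued about look roughly like this. It's a minimal userspace sketch in C, not the kernel's actual code: `rng_fn`, `count_changes`, `saw_any_change` and the two fake sources are hypothetical names made up for illustration, and the loop count of 8 is taken from the article's description.

```c
#include <stdbool.h>
#include <stdint.h>

#define SANITY_CHECK_LOOPS 8

/* Hypothetical stand-in for the hardware instruction: returns true on
 * success and stores one value in *out. */
typedef bool (*rng_fn)(uint64_t *out);

/* Variant A: always run every iteration and count how often the value
 * changed. Returns -1 if the source itself reports failure. */
int count_changes(rng_fn rng)
{
    uint64_t prev = 0, cur = 0;
    int changed = 0;

    if (!rng(&prev))
        return -1;
    for (int i = 0; i < SANITY_CHECK_LOOPS; i++) {
        if (!rng(&cur))
            return -1;
        if (cur != prev)
            changed++;
        prev = cur;
    }
    return changed;
}

/* Variant B: stop as soon as one change has been seen. */
bool saw_any_change(rng_fn rng)
{
    uint64_t prev = 0, cur = 0;

    if (!rng(&prev))
        return false;
    for (int i = 0; i < SANITY_CHECK_LOOPS; i++) {
        if (!rng(&cur))
            return false;
        if (cur != prev)
            return true;    /* early exit: one change is enough */
    }
    return false;
}

/* Fake sources for trying the two variants out. */
bool stuck_source(uint64_t *out)
{
    *out = UINT64_MAX;      /* always the same value, like a broken unit */
    return true;
}

bool counter_source(uint64_t *out)
{
    static uint64_t n;
    *out = n++;             /* changes every call, but is obviously not random */
    return true;
}
```

Both variants flag `stuck_source` and both accept `counter_source`, so neither says anything about the quality of the output. They only differ in how many calls they make once a change has shown up.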

**Tomin** · 03 October 2019, 03:45 PM

Originally posted by phoronix

This new sanity check is calling RdRand eight times and ensuring the data has changed between calls. If the data never changed, it will now print to the dmesg output, "RDRAND gives funky smelling output, might consider not using it by booting with "nordrand"." This new sanity check will not disable RdRand but just point out to the user the likelihood of it being broken when successive RdRand calls return the same "random" data.

This reminds me of an apt Dilbert comic:

https://dilbert.com/strip/2001-10-25

**angrypie** · 03 October 2019, 04:52 PM

Originally posted by bug77

Why single out RDRAND? Let's check MOV, JMP, CMP & co while we're at it

Nobody could be so incompetent as to fuck up with those instructions, r-right guys?

**bug77** · 03 October 2019, 05:04 PM

Originally posted by angrypie

Nobody could be so incompetent as to fuck up with those instructions, r-right guys?

Sure, I picked the easy examples, but where do you draw the line?

**angrypie** · 03 October 2019, 05:20 PM

Originally posted by bug77

Sure, I picked the easy examples, but where do you draw the line?

Well, the core ISA is basically set in stone (no pun intended), but it's still a long shot to assume validation will catch everything before it hits production.

So there's no line to draw anywhere. Just "plug and pray."

Following Buggy AMD RdRand, The Linux Kernel Will Begin Sanity Checking Randomness At Boot Time