Linux Proposal Adding getrandom() To The vDSO For Better Performance
Jason Donenfeld of WireGuard fame, who has recently put considerable work into improving the Linux kernel's "random" code, has sent out a proposal adding getrandom() support to the vDSO. The aim is better performance that more fully addresses the needs of user-space developers.
While reading Phoronix, Jason Donenfeld learned of the GNU C Library adding the arc4random functions. Holding some differing views on the design and having questions about its purpose, Donenfeld started a discussion on the Glibc mailing list, writing:
I really wonder whether this is a good idea, whether this is something that glibc wants, and whether it's a design worth committing to in the long term.
Firstly, for what use cases does this actually help? As of recent changes to the Linux kernels -- now backported all the way to 4.9! -- getrandom() and /dev/urandom are extremely fast and operate over per-cpu states locklessly. Sure you avoid a syscall by doing that in userspace, but does it really matter? Who exactly benefits from this?
Seen that way, it seems like a lot of complexity for nothing, and complexity that will lead to bugs and various oversights eventually.
For example, the kernel reseeds itself when virtual machines fork using an identifier passed to the kernel via ACPI. It also reseeds itself on system resume, both from ordinary S3 sleep but also, more importantly, from hibernation. And in general, being the arbiter of entropy, the kernel is much better poised to determine when it makes sense to reseed.
Glibc, on the other hand, can employ some heuristics and make some decisions -- on fork, after 16 MiB, and the like -- but in general these are lacking, compared to the much wider array of information the kernel has.
You miss out on this with arc4random, and if that information _is_ to be exported to userspace somehow in the future, it would be awfully nice to design the userspace interface alongside the kernel one.
For that reason, past discussion of having some random number generation in userspace libcs has geared toward doing this in the vDSO, somehow, where the kernel can be part and parcel of that effort.
Seen from this perspective, going with OpenBSD's older paradigm might be rather limiting. Why not work together, between the kernel and libc, to see if we can come up with something better, before settling on an interface with semantics that are hard to walk back later?
As-is, it's hard to recommend that anybody really use these functions. Just keep using getrandom(2), which has mostly favorable semantics.
Yes, I get it: it's fun to make a random number generator, and so lots of projects figure out some way to make yet another one somewhere somehow. But the tendency to do so feels like a weird computer tinkerer disease rather something that has ever helped the overall ecosystem.
So I'm wondering: who actually needs this, and why? What's the performance requirement like, and why is getrandom(2) insufficient? And is this really the best approach to take? If this is something needed, how would you feel about working together on a vDSO approach instead? Or maybe nobody actually needs this in the first place?
And secondly, is there anyway that glibc can *not* do this, or has that ship fully sailed, and I really missed out by not being part of that discussion whenever it was happening?
That led to a lot of back-and-forth discussion between Donenfeld and GNU toolchain developers about their performance needs and about what prompted the work on the arc4random functions for Glibc, which have long been available on the BSDs.
To better address those random number performance needs, Donenfeld today proposed implementing getrandom() in the vDSO. The vDSO is the virtual dynamic shared object that the kernel automatically maps into the address space of every user-space process. With getrandom() in the vDSO, callers should be able to avoid much of the system call overhead and see considerably better performance. Jason wrote in that request for comments:
So far in my test results, performance is pretty stellar, and it seems to be working. But this is very, very young, immature code, suitable for an RFC and no more, so expect dragons.
See the mailing list for more technical details if interested.
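For context, what the RFC aims to speed up is the ordinary getrandom() call, which today enters the kernel on every invocation. Below is a minimal sketch of that per-call pattern, written in Python because its os.getrandom() wraps getrandom(2) directly on Linux. The helper names read_random and average_call_seconds are hypothetical, the os.urandom() fallback is an assumption for non-Linux systems, and the timing includes interpreter overhead, so treat any numbers as rough.

```python
import os
import time


def read_random(n):
    """Return n random bytes, using getrandom(2) where Python exposes it."""
    if hasattr(os, "getrandom"):
        # os.getrandom() may return fewer bytes than requested,
        # so keep calling until the buffer is full.
        buf = bytearray()
        while len(buf) < n:
            buf += os.getrandom(n - len(buf))
        return bytes(buf)
    # Assumption: on platforms without os.getrandom, os.urandom is acceptable.
    return os.urandom(n)


def average_call_seconds(calls=10000, size=4):
    """Roughly measure the average wall-clock cost of one small request."""
    start = time.perf_counter()
    for _ in range(calls):
        read_random(size)
    return (time.perf_counter() - start) / calls
```

Calling average_call_seconds() gives a rough feel for the fixed cost of each small request, which is the overhead a vDSO implementation would try to shrink.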
Additionally, Jason Donenfeld has simplified the arc4random design in the Glibc code, and that change has already been merged to Git. The simplification is meant to make the new arc4random functions safer by calling getrandom() every time rather than generating output from state buffered in user-space and reseeded only on events such as fork or after 16 MiB, so there is some performance overhead with this simplification, at least for the time being.