GNU C Library Finally Adds arc4random Functions For Linux
The arc4random, arc4random_buf, and arc4random_uniform functions have long been available on the BSDs, where they provide higher-quality random number generation than rand/random and similar functions. As of yesterday, the GNU C Library (Glibc) has finally added the arc4random functions for use on Linux!
The request to add arc4random to Glibc dates back to bug 4417 from 2007. Even then there were preliminary patches implementing arc4random for Glibc. The upstream response in 2007 was, "glibc is no dumping ground for arbitrary code. The existing code is just [fine]. Put your code in separate libraries."
But then in 2018 a Red Hat engineer began working on arc4random for Glibc. Now, four years later, Adhemerval Zanella Netto of Linaro has managed to get the arc4random family of functions across the finish line and into mainline Glibc.
The arc4random functions in Glibc aim to provide high-quality random number generation in a manner compatible with the long-available functions on the BSDs. This commit from Friday sums it up:
The implementation is based on scalar ChaCha20 with per-thread cache. It uses getrandom or /dev/urandom as fallback to get the initial entropy, and reseeds the internal state on every 16MB of consumed buffer.
To improve performance and lower memory consumption the per-thread cache is allocated lazily on first arc4random functions call, and if the memory allocation fails getentropy or /dev/urandom is used as fallback. The cache is also cleared on thread exit iff it was initialized (so if arc4random is not called it is not touched).
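The lazy per-thread allocation can be pictured with a sketch like the following. The struct layout and the names rng_cache, cache_get, and cache_release are assumptions made for illustration; the real Glibc code hooks into thread exit internally rather than exposing a release function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-thread state; the fields are illustrative only. */
struct rng_cache {
    unsigned char buf[512];   /* buffered keystream bytes */
    size_t used;              /* bytes of buf already handed out */
};

/* One pointer per thread, NULL until the thread first asks for randomness. */
static _Thread_local struct rng_cache *tls_cache;

/* Return this thread's cache, allocating it on the first call.
   May return NULL; the caller would then bypass the cache and read
   directly from the kernel instead. */
static struct rng_cache *cache_get(void)
{
    if (tls_cache == NULL) {
        tls_cache = malloc(sizeof *tls_cache);
        if (tls_cache != NULL) {
            memset(tls_cache, 0, sizeof *tls_cache);
            tls_cache->used = sizeof tls_cache->buf;   /* mark as empty */
        }
    }
    assert(tls_cache == NULL || tls_cache->used <= sizeof tls_cache->buf);
    return tls_cache;
}

/* What a thread-exit hook would do: wipe and free, but only if the
   cache was ever initialized. */
static void cache_release(void)
{
    if (tls_cache == NULL)
        return;               /* never used, nothing to touch */
    /* Production code needs a wipe the compiler cannot optimize away. */
    memset(tls_cache, 0, sizeof *tls_cache);
    free(tls_cache);
    tls_cache = NULL;
}
```

Threads that never request random numbers pay nothing under this scheme, which is the memory saving the commit message refers to.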
Although it is lock-free, arc4random is still not async-signal-safe (the per thread state is not updated atomically).
The ChaCha20 implementation is based on RFC8439, omitting the final XOR of the keystream with the plaintext because the plaintext is a stream of zeros. This strategy is similar to what OpenBSD arc4random does.
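For readers curious what "omitting the final XOR" means in practice, here is a small scalar sketch of the ChaCha20 block function following RFC 8439, section 2.3. It is not the Glibc source. The key and nonce are passed as little-endian 32-bit words, which is a simplification chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n)
{
    assert(n > 0 && n < 32);
    return (x << n) | (x >> (32 - n));
}

/* One ChaCha quarter round on four words of the working state. */
static void quarter_round(uint32_t x[16], int a, int b, int c, int d)
{
    x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 16);
    x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 12);
    x[a] += x[b]; x[d] = rotl32(x[d] ^ x[a], 8);
    x[c] += x[d]; x[b] = rotl32(x[b] ^ x[c], 7);
}

/* Produce one 64-byte keystream block as sixteen 32-bit words.
   A cipher would XOR this with plaintext; a random number generator
   hands the keystream out directly, so that step is absent here. */
static void chacha20_block(const uint32_t key[8], uint32_t counter,
                           const uint32_t nonce[3], uint32_t out[16])
{
    const uint32_t init[16] = {
        0x61707865, 0x3320646e, 0x79622d32, 0x6b206574, /* "expand 32-byte k" */
        key[0], key[1], key[2], key[3],
        key[4], key[5], key[6], key[7],
        counter, nonce[0], nonce[1], nonce[2]
    };
    uint32_t x[16];
    for (int i = 0; i < 16; i++)
        x[i] = init[i];

    /* 20 rounds: 10 iterations of a column round plus a diagonal round. */
    for (int i = 0; i < 10; i++) {
        quarter_round(x, 0, 4, 8, 12);
        quarter_round(x, 1, 5, 9, 13);
        quarter_round(x, 2, 6, 10, 14);
        quarter_round(x, 3, 7, 11, 15);
        quarter_round(x, 0, 5, 10, 15);
        quarter_round(x, 1, 6, 11, 12);
        quarter_round(x, 2, 7, 8, 13);
        quarter_round(x, 3, 4, 9, 14);
    }

    /* Add the original state back in; the result is the keystream. */
    for (int i = 0; i < 16; i++)
        out[i] = x[i] + init[i];
}
```

With the key, nonce, and counter from the test vector in RFC 8439 section 2.3.2, the first output word is 0xe4e7f110.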
The arc4random_uniform is based on previous work by Florian Weimer, where the algorithm is based on Jérémie Lumbroso's paper Optimal Discrete Uniform Generation from Coin Flips, and Applications (2013), who credits Donald E. Knuth and Andrew C. Yao, The complexity of nonuniform random number generation (1976), for solving the general case.
The main advantage of this method is that the unit of randomness is not the uniform random variable (uint32_t), but a random bit. It optimizes the internal buffer sampling by initially consuming a 32-bit random variable and then sampling byte per byte. Depending on the upper bound requested, it might lead to better CPU utilization.
Checked on x86_64-linux-gnu, aarch64-linux, and powerpc64le-linux-gnu.
Along with adding arc4random, arc4random_buf, and arc4random_uniform functions to the standard library, Friday's patch activity also added optimized ChaCha20 versions for AArch64, x86 SSE2, x86 AVX2, PowerPC64, and s390x.