Intel Further Speeds Up strnlen() In The GNU C Library For Recent Intel/AMD CPUs
Intel software engineers have contributed many of the x86_64 optimizations to the GNU C Library (Glibc) over the years. While they have extensively tuned many Glibc functions for peak performance on modern CPUs, it's a never-ending quest. Merged this week was another optimization to strnlen(), the function that returns the length of a string in bytes while examining no more than a caller-specified maximum number of bytes.
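To illustrate what strnlen() does from the caller's side, here is a minimal C sketch. It is not Glibc's EVEX implementation, only a thin wrapper around the standard call; the helper name field_length is hypothetical and chosen just for this example.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strnlen() in <string.h> */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the text stored in a fixed-size field
   that may or may not contain a terminating NUL byte. */
size_t field_length(const char *field, size_t field_size)
{
    assert(field != NULL);
    /* strnlen() inspects at most field_size bytes, so it does not read
       past the end of the field even when no terminator is present. */
    return strnlen(field, field_size);
}
```

Called on an 8-byte array holding "phoronix" with no terminator and a limit of 8, the helper returns 8. For the string "glibc" it returns 5 with a limit of 16 and 3 with a limit of 3.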
Matthew Sterrett of Intel unified Glibc's strnlen EVEX and EVEX512 implementations. The unified, optimized strnlen code for x86_64 Intel/AMD CPUs with EVEX support improves on the prior code, with the commit reporting geometric-mean new/old ratios of 0.881 for strnlen-evex and 0.953 for strnlen-evex512.
Sterrett wrote in the commit unifying the strnlen EVEX implementations:
x86: Unifies 'strnlen-evex' and 'strnlen-evex512' implementations.
This commit uses a common implementation 'strnlen-evex-base.S' for both 'strnlen-evex' and 'strnlen-evex512'
This patch serves both to reduce the number of implementations, and it also does some small optimizations that benefit strnlen-evex and strnlen-evex512.
All tests pass on x86.
Benchmarks were taken on [an Intel Core i9 7900X Skylake X CPU].
Geometric mean for strnlen-evex over all benchmarks (N=10) was (new/old) 0.881
Geometric mean for strnlen-evex512 over all benchmarks (N=10) was (new/old) 0.953
That improved code is merged in Glibc Git for the Glibc 2.41 release coming out as stable in February.