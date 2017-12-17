Earlier this month I wrote about how Intel engineers have been busy tuning glibc's performance with FMA and AVX optimizations. That work has continued, and other architectures are also tuning their GNU C Library performance ahead of the expected v2.27 update.
There has been a ton of optimization work this cycle, particularly on the Intel/x86_64 front. For those with newer Intel 64-bit processors, this next glibc release is shaping up to be a speedy update.
The latest from Intel's H.J. Lu this week is an FMA-optimized cosf() that is up to ~45% faster, as measured on a Skylake processor. Removing the older cosine code was another win.
But even if you aren't an x86_64 fan, improvements for other architectures are also ongoing. This week brought a POWER8 memcpy optimization, better strcmp performance for AArch64 / ARMv8 64-bit, and, in SPARC land, faster memcpy/mempcpy/memmove on the M7 CPU as well as memset/bzero.
While glibc has traditionally been criticized for being slow and bloated, it's good to see the GNU C Library getting faster, especially on x86_64.
