GNU C Library Tuning For AArch64 Helps Memset Performance By ~24%

Written by Michael Larabel in GNU on 10 September 2024 at 01:00 PM EDT.
A patch merged yesterday into the GNU C Library (glibc) codebase improves the memset() function's performance by about 24% in a random memset benchmark, as measured on an Arm Neoverse-N1 core.

Wilco Dijkstra of Arm has landed a memset optimization for the AArch64 code within the GNU C Library. In the patch, which adjusts the hand-tuned assembly code, Wilco explains:
"Improve small memsets by avoiding branches and use overlapping stores. Use DC ZVA for copies over 128 bytes. Remove unnecessary code for ZVA sizes other than 64 and 128. Performance of random memset benchmark improves by 24% on Neoverse N1."

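To illustrate the "overlapping stores" idea from the patch description, here is a minimal sketch in portable C. It is not the glibc code, which is hand-written AArch64 assembly; the function name small_memset_sketch and the 16-byte limit are assumptions made for this example. Instead of looping over each byte, it picks a store width from the length and issues one store at the start and one that ends exactly at the last byte, letting the two overlap in the middle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only, not glibc's implementation.
 * Fills n bytes (0 <= n <= 16) at dst with the byte value c. */
static void small_memset_sketch(unsigned char *dst, int c, size_t n)
{
    assert(n <= 16);

    /* Replicate the fill byte into all 8 bytes of a 64-bit word. */
    uint64_t v = 0x0101010101010101ULL * (unsigned char)c;

    if (n >= 8) {
        /* 8..16 bytes: one 8-byte store at the start and one ending at
         * the last byte. The two stores overlap whenever n < 16. */
        memcpy(dst, &v, 8);
        memcpy(dst + n - 8, &v, 8);
    } else if (n >= 4) {
        /* 4..7 bytes: the same trick with 4-byte stores. */
        memcpy(dst, &v, 4);
        memcpy(dst + n - 4, &v, 4);
    } else if (n >= 1) {
        /* 1..3 bytes: three single-byte stores that may hit the same
         * address; the indices 0, n/2 and n-1 cover every byte. */
        dst[0] = (unsigned char)c;
        dst[n / 2] = (unsigned char)c;
        dst[n - 1] = (unsigned char)c;
    }
    /* n == 0: nothing to store. */
}
```

The fixed-size memcpy calls are typically compiled down to single store instructions, so each size class costs a couple of stores and no per-byte loop. The sketch does not cover the DC ZVA path for larger sizes mentioned in the patch description.
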
It will be interesting to see the memset performance impact of this optimization on other Arm cores as well.

Ampere Altra server

The Neoverse-N1 is found in the Ampere Altra / Ampere Altra Max servers, among other SoCs, so it will be nice to see this optimization roll out in the next Glibc release. That next release will be Glibc 2.41 and should be out around February.
About The Author
Michael Larabel

Michael Larabel is the principal author of Phoronix.com and founded the site in 2004 with a focus on enriching the Linux hardware experience. Michael has written more than 20,000 articles covering the state of Linux hardware support, Linux performance, graphics drivers, and other topics. Michael is also the lead developer of the Phoronix Test Suite, Phoromatic, and OpenBenchmarking.org automated benchmarking software. He can be followed via Twitter, LinkedIn, or contacted via MichaelLarabel.com.