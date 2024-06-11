
Intel's Glibc Non-Temporal Stores Memset Optimization Extended To AMD CPUs
Intel toolchain engineer Noah Goldstein last month introduced the "glibc.cpu.x86_memset_non_temporal_threshold" tunable, which sets the threshold for using non-temporal stores in memset. The x86_memset_non_temporal_threshold documentation explains:
"The glibc.cpu.x86_memset_non_temporal_threshold tunable allows the user to set threshold in bytes for non temporal store in memset. Non temporal stores give a hint to the hardware to move data directly to memory without displacing other data from the cache. This tunable is used by some platforms to determine when to use non temporal stores memset."
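Glibc tunables are normally passed to a program through the GLIBC_TUNABLES environment variable, which is read when the process starts. As a minimal sketch of how that might look, the Python helper below builds an environment with this tunable set. The helper name `env_with_memset_threshold` is hypothetical, and any threshold value passed to it is purely illustrative rather than a recommendation.

```python
import os

TUNABLE = "glibc.cpu.x86_memset_non_temporal_threshold"


def env_with_memset_threshold(threshold_bytes, base_env=None):
    """Return a copy of the environment with the memset tunable set.

    Hypothetical helper for illustration; it is not part of glibc.
    """
    env = dict(os.environ if base_env is None else base_env)
    # GLIBC_TUNABLES holds colon-separated name=value pairs. Keep any other
    # tunables already present and replace an existing setting of this one.
    items = [
        item
        for item in env.get("GLIBC_TUNABLES", "").split(":")
        if item and not item.startswith(TUNABLE + "=")
    ]
    items.append(f"{TUNABLE}={int(threshold_bytes)}")
    env["GLIBC_TUNABLES"] = ":".join(items)
    return env
```

The returned dictionary can be handed to a child process, for example `subprocess.run(["./my_benchmark"], env=env_with_memset_threshold(4 * 1024 * 1024))`, where `./my_benchmark` stands in for whatever program is being tested. Because the variable is read at startup, changing it inside an already running process would not be expected to have any effect.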
This memset non-temporal tunable was artificially limited to Intel processors, given that is where it was tested and found to be of performance benefit. After all, it was an Intel engineer leading the change.
Merged to Glibc Git on Monday, though, is a change extending this tunable to AMD processors. Fastly's Joe Damato did the testing, and his benchmarks across Zen 2, Zen 3, and Zen 4 hardware show the non-temporal memset to be beneficial on AMD processors as well. The data for those interested can be found via this Google Docs spreadsheet for the various AMD Zen CPUs as well as the Intel numbers.
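For readers wanting a rough feel for large memset throughput on their own system, here is a small sketch that times `ctypes.memset` over a few buffer sizes. It is not the methodology behind the numbers above. It assumes that `ctypes.memset` ends up in the C library's memset on the system in question, and the timings include Python call overhead, so a C harness would be more precise. The function name `time_memset` and the buffer sizes are arbitrary choices for illustration.

```python
import ctypes
import time


def time_memset(size_bytes, repeats=5):
    """Return the best wall-clock time in seconds for one memset call."""
    buf = ctypes.create_string_buffer(size_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        ctypes.memset(buf, 0xAB, size_bytes)  # fill the whole buffer
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    # Illustrative sizes only; pick ones that straddle the threshold of interest.
    for size in (1 << 20, 16 << 20, 64 << 20):
        seconds = time_memset(size)
        if seconds > 0:
            print(f"{size >> 20:3d} MiB: {size / seconds / 1e9:.2f} GB/s")
```

Running the script once as-is and once with GLIBC_TUNABLES set to a different glibc.cpu.x86_memset_non_temporal_threshold value in the launching shell gives two sets of figures to compare.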
This commit now in Glibc allows for this tunable to work on AMD platforms.