Linux 6.13 Adding "slab_strict_numa" SLAB Option For Helping ARM Performance
Among the SLAB (SLUB) allocator updates pending for the upcoming Linux 6.13 cycle is a new "slab_strict_numa" option that is reported to further help ARM Linux performance, such as on Ampere Computing servers.
The new "slab_strict_numa" boot parameter enforces memory policies per object, on top of the folio-level policies SLUB already applies. Enabling slab_strict_numa reduces remote memory accesses at the cost of some extra slab allocation overhead.
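Since the option is a boot parameter, one way to see whether a running system was booted with it is to look for the word on the kernel command line. The following is a minimal sketch in C, not kernel code: the helper names cmdline_has_param() and running_with_slab_strict_numa() are hypothetical, the 4096-byte buffer size is an assumption, and a match only shows that the option was requested at boot.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if `param` appears in `cmdline` as a
 * whole word delimited by whitespace or the string boundaries.
 */
bool cmdline_has_param(const char *cmdline, const char *param)
{
    size_t len = strlen(param);
    const char *p = cmdline;

    if (len == 0)
        return false;

    while ((p = strstr(p, param)) != NULL) {
        char end = p[len];
        bool starts_word = (p == cmdline) || p[-1] == ' ' ||
                           p[-1] == '\t' || p[-1] == '\n';
        bool ends_word = end == '\0' || end == ' ' ||
                         end == '\t' || end == '\n';

        if (starts_word && ends_word)
            return true;
        p++; /* keep scanning after a partial match */
    }
    return false;
}

/*
 * Hypothetical helper: reads /proc/cmdline (Linux only) and reports whether
 * slab_strict_numa was passed at boot. Returns false if the file cannot be
 * read. The buffer size is an assumption, not a kernel-defined limit.
 */
bool running_with_slab_strict_numa(void)
{
    char buf[4096];
    size_t n;
    FILE *f = fopen("/proc/cmdline", "r");

    if (f == NULL)
        return false;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';

    return cmdline_has_param(buf, "slab_strict_numa");
}
```

Calling running_with_slab_strict_numa() from a small test program on a 6.13 or newer kernel would be one way to confirm a benchmark run actually used the option.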
This new SLAB strict NUMA option was devised by Christoph Lameter of Ampere Computing. He explained in the patch, which adds fewer than four dozen lines of new code:
"The old SLAB allocator used to support memory policies on a per allocation bases. In SLUB the memory policies are applied on a per page frame / folio bases. Doing so avoids having to check memory policies in critical code paths for kmalloc and friends.
This worked on general well on Intel/AMD/PowerPC because the interconnect technology is mature and can minimize the latencies through intelligent caching even if a small object is not placed optimally.
However, on ARM we have an emergence of new NUMA interconnect technology based more on embedded devices. Caching of remote content can currently be ineffective using the standard building blocks / mesh available on that platform. Such architectures benefit if each slab object is individually placed according to memory policies and other restrictions.
This patch adds another kernel parameter
slab_strict_numa
If that is set then a static branch is activated that will cause the hotpaths of the allocator to evaluate the current memory allocation policy. Each object will be properly placed by paying the price of extra processing and SLUB will no longer defer to the page allocator to apply memory policies at the folio level.
This patch improves performance of memcached running on Ampere Altra 2P system (ARM Neoverse N1 processor) by 3.6% due to accurate placement of small kernel objects."
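To make the difference between folio-level and per-object policy checks more concrete, here is a toy user-space model in C. It is only a sketch under stated assumptions and not the kernel's implementation: struct toy_cache, toy_alloc(), count_misplaced() and OBJS_PER_SLAB are invented for illustration, a plain boolean stands in for the static branch, and the toy simply abandons a partially used slab when it switches nodes.

```c
#include <assert.h>
#include <stdbool.h>

#define OBJS_PER_SLAB 4 /* assumed slab capacity for the toy model */

/* Toy stand-in for a slab cache; none of these names exist in the kernel. */
struct toy_cache {
    bool strict_numa; /* plays the role of the static branch */
    int slab_node;    /* NUMA node backing the current slab, -1 if none */
    int free_objs;    /* objects still free in the current slab */
};

void toy_cache_init(struct toy_cache *c, bool strict_numa)
{
    c->strict_numa = strict_numa;
    c->slab_node = -1;
    c->free_objs = 0;
}

/*
 * Allocate one object and return the node it was placed on.
 * Without strict_numa the policy is consulted only when a new slab is
 * needed; with strict_numa it is consulted on every allocation.
 */
int toy_alloc(struct toy_cache *c, int policy_node)
{
    bool need_new_slab = (c->free_objs == 0);

    assert(policy_node >= 0);

    /* Per-object check: the extra work done in the allocation hot path. */
    if (c->strict_numa && c->slab_node != policy_node)
        need_new_slab = true;

    if (need_new_slab) {
        /* Simplification: the old slab is dropped, not kept for reuse. */
        c->slab_node = policy_node;
        c->free_objs = OBJS_PER_SLAB;
    }
    c->free_objs--;
    return c->slab_node;
}

/*
 * Count objects that land on a node other than the one the policy asked
 * for, when the requested node alternates between 0 and 1.
 */
int count_misplaced(bool strict_numa, int nr_allocs)
{
    struct toy_cache c;
    int misplaced = 0;

    toy_cache_init(&c, strict_numa);
    for (int i = 0; i < nr_allocs; i++) {
        int wanted = i % 2;

        if (toy_alloc(&c, wanted) != wanted)
            misplaced++;
    }
    return misplaced;
}
```

In this model, eight allocations with an alternating policy leave four objects on the wrong node in folio-level mode and none in strict mode. The strict mode pays for that with a policy check on every allocation and more slab churn, which mirrors the trade-off described in the patch.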
This slab_strict_numa option is good news for Ampere Altra and AmpereOne servers, especially dual-socket systems. It will be fun to benchmark Linux 6.13 to see which workloads beyond memcached benefit.
See the SLAB pull request for the SLUB updates intended for Linux 6.13. Besides the slab_strict_numa patch for per-object memory policies, the rest of the SLUB changes this cycle focus on fixes.