Linux 6.10 SLUB Optimization To Reduce Memory Consumption In Extreme Scenarios
A patch to the Linux kernel's SLUB allocator has been queued ahead of the upcoming Linux 6.10 merge window to help reduce memory consumption in extreme scenarios.
The patch comes from Huawei engineer Chen Jun, who explained the work:
When kmalloc_node() is called without __GFP_THISNODE and the target node lacks sufficient memory, SLUB allocates a folio from a node other than the requested one, instead of taking a partial slab from it.
However, since the allocated folio does not belong to the requested node, on the following allocation it is deactivated and added to the partial slab list of the node it belongs to.
This behavior can result in excessive memory usage when the requested node has insufficient memory, as SLUB will repeatedly allocate folios from other nodes without reusing the previously allocated ones.
To prevent memory wastage, when a preferred node is indicated (not NUMA_NO_NODE) but __GFP_THISNODE is not set:
1) try to get a partial slab from target node only by having __GFP_THISNODE in pc.flags for get_partial()
2) if 1) failed, try to allocate a new slab from target node with GFP_NOWAIT | __GFP_THISNODE opportunistically.
3) if 2) failed, retry with the original gfpflags, which allows get_partial() to try the partial lists of other nodes before potentially allocating a new page from other nodes.
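For readers who want to see the fallback order spelled out, below is a minimal user-space sketch of the three-step policy described above. It is not the actual SLUB code: the function names (try_partial_node(), alloc_slab_node(), alloc_any_node()), the node state arrays, and the flag values are hypothetical stand-ins used only to illustrate the order in which the requested node's partial list, the requested node's free memory, and then the other nodes are tried.

```c
/* Toy user-space model of the three-step node-preference fallback described
 * in the patch. All names and flag values here are hypothetical stand-ins,
 * not the real SLUB functions or GFP bits. */
#include <stdbool.h>
#include <stdio.h>

#define GFP_NOWAIT      0x1   /* toy flag values, not the kernel's */
#define __GFP_THISNODE  0x2

/* Stand-ins for "does node N have a partial slab / free memory?" */
static bool node_has_partial[4] = { false, false, true, true };
static bool node_has_memory[4]  = { false, true,  true, true };

static bool try_partial_node(int node) { return node_has_partial[node]; }

static bool alloc_slab_node(int node, unsigned flags)
{
    (void)flags;
    return node_has_memory[node];
}

static bool alloc_any_node(unsigned flags)
{
    (void)flags;
    for (int n = 0; n < 4; n++)
        if (node_has_partial[n] || node_has_memory[n])
            return true;
    return false;
}

/* Mirrors steps 1-3 from the commit message for a kmalloc_node() call that
 * names a preferred node but did not pass __GFP_THISNODE. */
static bool allocate(int node, unsigned gfpflags)
{
    /* 1) only a partial slab from the requested node */
    if (try_partial_node(node))
        return true;
    /* 2) opportunistically grab a new slab from that node */
    if (alloc_slab_node(node, GFP_NOWAIT | __GFP_THISNODE))
        return true;
    /* 3) fall back to the original flags: other nodes' partial lists,
     *    then a fresh page from any node */
    return alloc_any_node(gfpflags);
}

int main(void)
{
    printf("node 0 alloc %s\n", allocate(0, 0) ? "succeeded" : "failed");
    printf("node 2 alloc %s\n", allocate(2, 0) ? "succeeded" : "failed");
    return 0;
}
```

In this toy setup the first call succeeds only through the final fallback (node 0 has neither a partial slab nor free memory), while the second is satisfied from node 2's own partial list, which is the reuse the patch aims to encourage.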
When testing on a QEMU VM with four NUMA nodes, each having 1GB of memory, and carrying out a simple test case, Chen found that the number of allocated objects reported via /proc/slabinfo dropped from 13,519,712 down to 4,200,768 objects. Or just 31% of the original number of allocated objects as found with current Linux kernels in this extreme case.
The patch is in the SLAB.git repository's "for-next" branch ahead of the Linux 6.10 merge window coming in mid-May.