NVIDIA's Proactive Memory Compaction Work Revised For The Linux Kernel
A few weeks back I wrote about NVIDIA's Nitin Gupta working on proactive memory compaction for the Linux kernel. The idea is to compact memory ahead of time rather than on-demand, since on-demand compaction can lead to high latencies for applications needing lots of huge pages.
At the time that proactive compaction work was flying under a "request for comments" flag, but following continued work by Nitin and feedback from other developers, he has now published a revised patch series that is no longer RFC.
Nitin summed up the motivation as follows: "For some applications we need to allocate almost all memory as hugepages. However, on a running system, higher order allocations can fail if the memory is fragmented. Linux kernel currently does on-demand compaction as we request more hugepages but this style of compaction incurs very high latency. Experiments with one-time full memory compaction (followed by hugepage allocations) shows that kernel is able to restore a highly fragmented memory state to a fairly compacted memory state within <1 sec for a 32G system. Such data suggests that a more proactive compaction can help us allocate a large fraction of memory as hugepages keeping allocation latencies low."
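To get a rough feel for the fragmentation being described, one option is to look at /proc/buddyinfo, which lists the number of free blocks per allocation order for each memory zone. The Python sketch below parses that format and computes what share of free memory sits in blocks at or above a given order. The helper names and the metric itself are illustrative assumptions and are not the calculation used by the patches; order 9 corresponds to a 2 MB huge page only under the assumption of 4 KB base pages.

```python
def parse_buddyinfo(text):
    """Return {(node, zone): [free block counts for order 0, 1, 2, ...]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal 4 2 1 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_fraction_at_or_above(counts, order):
    """Share of free pages held in free blocks of at least the given order.

    Illustrative metric only (an assumption, not taken from the patches).
    A free block of order o spans 2**o base pages.
    """
    total = sum(n << o for o, n in enumerate(counts))
    usable = sum(n << o for o, n in enumerate(counts) if o >= order)
    return usable / total if total else 0.0


if __name__ == "__main__":
    try:
        with open("/proc/buddyinfo") as f:
            info = parse_buddyinfo(f.read())
    except OSError:
        info = {}  # not on Linux, or the file is unreadable
    for (node, zone), counts in info.items():
        # Order 9 assumes 4 KB base pages and 2 MB huge pages.
        share = free_fraction_at_or_above(counts, 9)
        print(f"node {node} zone {zone}: {share:.1%} of free memory in order >= 9 blocks")
```

A low share on a system with plenty of free memory hints that huge page allocations would have to wait on compaction first.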
Compared to the earlier RFC patches, the new version has a lone sysfs tunable: /sys/kernel/mm/compaction/node-n/hpage_compaction_effort. That value is used for determining the thresholds for external fragmentation, replacing the multiple tunables of the older patches that just made it more complicated to use.
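For those wanting to poke at the tunable on a kernel built with these patches, here is a minimal Python sketch for reading and writing it. The path layout comes from the patch series as described above. Substituting a NUMA node number for "n" is an assumption, as are the helper names, and the range of accepted values isn't covered here, so the code does not validate the value. On a kernel without the patches the file simply won't exist and the helpers report that.

```python
from pathlib import Path
from typing import Optional

# Path layout as given for the patch series; replacing "n" with a
# NUMA node number is an assumption made for this sketch.
SYSFS_ROOT = Path("/sys/kernel/mm/compaction")


def effort_path(node: int, root: Path = SYSFS_ROOT) -> Path:
    """Build the per-node tunable path (hypothetical helper)."""
    return root / f"node-{node}" / "hpage_compaction_effort"


def read_effort(node: int, root: Path = SYSFS_ROOT) -> Optional[str]:
    """Return the current value as text, or None if it cannot be read."""
    try:
        return effort_path(node, root).read_text().strip()
    except OSError:
        return None  # file missing (unpatched kernel) or not readable


def write_effort(node: int, value: int, root: Path = SYSFS_ROOT) -> bool:
    """Write a new value; returns False if the tunable is absent or not writable.

    No range checking is done since the accepted range is not known here.
    """
    path = effort_path(node, root)
    if not path.exists():
        return False
    try:
        path.write_text(f"{value}\n")
    except OSError:
        return False
    return True


if __name__ == "__main__":
    current = read_effort(0)
    if current is None:
        print("hpage_compaction_effort not available for node 0")
    else:
        print(f"node 0 hpage_compaction_effort = {current}")
```

Writing to sysfs normally requires root, so write_effort() returning False on a patched kernel most likely means a permissions issue.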
More details on this proactive compaction work are available via the kernel mailing list. Given the timing it's too late to see it in Linux 5.5, but perhaps we'll see it ship with Linux 5.6 early next year.