Linux 5.20 To Enable THP SWAP On 64-bit Arm For Better Swapping Performance
The "THP_SWAP" option for the Linux kernel allows swapping out a transparent huge page (THP) in one piece without first splitting it into base pages. With Linux 5.20, the 64-bit Arm kernel (ARM64 / AArch64) will support this option as a performance optimization.
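As a rough illustration, the sketch below checks whether a kernel was built with the option. It assumes the Kconfig symbol is exposed as CONFIG_THP_SWAP and that the running kernel's configuration is readable at /proc/config.gz, which is only present when the kernel was built with CONFIG_IKCONFIG_PROC; many distributions ship the config elsewhere. The function names are hypothetical helpers, not part of any kernel tooling.

```python
import gzip
from pathlib import Path
from typing import Optional


def kconfig_value(config_text: str, option: str) -> Optional[str]:
    """Return the value of a Kconfig option ('y', 'm', ...) or None if unset.

    Lines such as '# CONFIG_THP_SWAP is not set' do not match and
    therefore yield None.
    """
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def read_running_kernel_config() -> Optional[str]:
    """Best-effort read of /proc/config.gz; None if it is unavailable."""
    path = Path("/proc/config.gz")  # assumption: CONFIG_IKCONFIG_PROC is enabled
    try:
        return gzip.decompress(path.read_bytes()).decode("utf-8", "replace")
    except OSError:
        return None


if __name__ == "__main__":
    text = read_running_kernel_config()
    if text is None:
        print("kernel config not available at /proc/config.gz")
    else:
        print("CONFIG_THP_SWAP =", kconfig_value(text, "CONFIG_THP_SWAP"))
```

If /proc/config.gz is missing, the same kconfig_value() helper can be pointed at the text of whatever config file the distribution provides.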
Queued in ARM64's for-next/mm branch is a change enabling the THP_SWAP option for 64-bit Arm kernel builds. The change in the "-next" code, which was queued last week, explains: "THP_SWAP has been proven to improve the swap throughput significantly on x86_64...As long as arm64 uses 4K page size, it is quite similar with x86_64 by having 2MB PMD THP. THP_SWAP is architecture-independent, thus, enabling it on arm64 will benefit arm64 as well."
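The 4K page size condition in that quote follows from simple arithmetic, sketched below. The sketch assumes 8-byte page table entries, so one table page holds (page size / 8) entries and a single PMD entry covers that many base pages. The function name is a hypothetical helper for illustration.

```python
PTE_SIZE_BYTES = 8  # assumption: 64-bit page table entries


def pmd_huge_page_size(base_page_size: int) -> int:
    """Size in bytes of the region mapped by one PMD-level entry."""
    entries_per_table = base_page_size // PTE_SIZE_BYTES
    return entries_per_table * base_page_size


if __name__ == "__main__":
    for page_size in (4096, 16384, 65536):
        size_mib = pmd_huge_page_size(page_size) // (1024 * 1024)
        print(f"{page_size // 1024}K base pages: {size_mib} MiB PMD huge page")
```

With 4K pages this gives the same 2MB huge page as x86_64, while 16K and 64K base pages give much larger PMD-level huge pages, which is why the quoted comparison is limited to the 4K configuration.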
That THP_SWAP improvement for x86_64 was noted by an Intel engineer back in 2017: "In this patch, splitting transparent huge page (THP) during swapping out is delayed from after adding the THP into the swap cache to after swapping out finishes. After the patch, more operations for the anonymous THP reclaiming, such as writing the THP to the swap device, removing the THP from the swap cache could be batched. So that the performance of anonymous THP swapping out could be improved...With the patchset, the swap out throughput improves 42% (from about 5.81GB/s to about 8.25GB/s) in the vm-scalability swap-w-seq test case with 16 processes. At the same time, the IPI (reflect TLB flushing) reduced about 78.9%."
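The quoted 42% figure can be checked against the two throughput numbers given in the same quote. This is only a sanity check on the published values, using a hypothetical helper name.

```python
def percent_improvement(before: float, after: float) -> float:
    """Relative gain of `after` over `before`, expressed in percent."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    # Throughput values in GB/s as quoted for the swap-w-seq test case.
    gain = percent_improvement(5.81, 8.25)
    print(f"swap-out throughput gain: {gain:.1f}%")
```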
A simple swapping test on a Rockchip quad-core Cortex-A55 platform saw a 22% improvement with this queued kernel change.
THP_SWAP support for ARM64 and other enhancements are coming with the Linux 5.20 kernel, whose merge window kicks off next week.