Google Moves Forward With HugeTLB HGM For The Linux Kernel
HugeTLB HGM (High Granularity Mapping) allows HugeTLB pages to be mapped at high granularity, similar to how transparent hugepages (THPs) can be PTE-mapped. Google's motivation is mapping HugeTLB pages at the kernel's PAGE_SIZE, which has useful implications for VM live migration and memory failure handling.
Some of the key benefits detailed in the HugeTLB HGM patch series:
- Live Migration
Being able to unpause a vCPU 100x quicker is helpful for guest stability, and being able to use 1G pages at all can significantly improve steady-state guest performance.
After fully copying a hugepage over the network, we will want to collapse the mapping down to what it would normally be (e.g., one PUD for a 1G page). Rather than having the kernel do this automatically, we leave it up to userspace to tell us to collapse a range (via MADV_COLLAPSE).
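As a rough illustration of the userspace side of that collapse step, here is a minimal sketch in C. It assumes a Linux system; MADV_COLLAPSE has been in the kernel UAPI since Linux 6.1 for THP, while its use on HugeTLB ranges is what this patch series proposes, so on kernels without that support the call may simply fail. The helper names `hugepage_start` and `collapse_range` are hypothetical and not taken from the patches.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Older libc headers may not define MADV_COLLAPSE; 25 is the value used by
 * the kernel's generic mman-common.h UAPI header. */
#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25
#endif

/* Hypothetical helper: round an address down to the start of the hugepage
 * that contains it. hugepage_size must be a power of two. */
uintptr_t hugepage_start(uintptr_t addr, size_t hugepage_size)
{
    assert(hugepage_size != 0 && (hugepage_size & (hugepage_size - 1)) == 0);
    return addr & ~((uintptr_t)hugepage_size - 1);
}

/* Hypothetical helper: ask the kernel to collapse [addr, addr + len) back
 * to its normal mapping. Returns 0 on success or a negative errno value
 * (for example -EINVAL on a kernel that does not support the request). */
int collapse_range(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_COLLAPSE) == 0)
        return 0;
    return -errno;
}
```

A migration tool would call something like `collapse_range()` once every piece of a hugepage has arrived, and would treat a negative return as "leave the fine-grained mapping in place" rather than as a fatal error.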
- Memory Failure
When a memory error is found within a HugeTLB page, it would be ideal if we could unmap only the PAGE_SIZE section that contained the error. This is what THPs are able to do. Using high-granularity mapping, we could do this, but this isn't tackled in this patch series.
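To make the granularity difference concrete, the following sketch works out which PAGE_SIZE section of a 1G page contains a faulty address. It is plain address arithmetic, not kernel code: the 4 KiB base page size is an assumption (the usual x86_64 value), and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BASE_PAGE_SIZE ((uint64_t)4096)    /* assumed x86_64 PAGE_SIZE */
#define HUGEPAGE_1G    ((uint64_t)1 << 30) /* size of a 1G HugeTLB page */

/* Hypothetical description of the single base-page section to unmap. */
struct poisoned_section {
    uint64_t index; /* which base page inside the hugepage */
    uint64_t start; /* first address of that base page */
    uint64_t end;   /* one past the last address of that base page */
};

/* Locate the base-page-sized section of a 1G page containing error_addr. */
struct poisoned_section section_for_error(uint64_t hugepage_base,
                                          uint64_t error_addr)
{
    /* The hugepage must be 1G-aligned and the error must lie inside it. */
    assert(hugepage_base % HUGEPAGE_1G == 0);
    assert(error_addr >= hugepage_base &&
           error_addr - hugepage_base < HUGEPAGE_1G);

    struct poisoned_section s;
    s.index = (error_addr - hugepage_base) / BASE_PAGE_SIZE;
    s.start = hugepage_base + s.index * BASE_PAGE_SIZE;
    s.end = s.start + BASE_PAGE_SIZE;
    return s;
}
```

Under these assumptions a 1G page holds 262144 base pages, so unmapping only the affected section would leave all but 4 KiB of the gigabyte usable.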
The initial user of the proposed user-space API for this kernel addition is high-granularity userfaultfd post-copy for HugeTLB handling.
Initially this HugeTLB High Granularity Mapping support is x86_64-only, but there are plans for AArch64 and potentially other CPU architectures too. More details on the HugeTLB HGM support can be found in today's patch series.