NVIDIA Sends Out 1GB THP Support For Linux x86_64
NVIDIA software engineer Zi Yan, who specializes in the Linux kernel memory management subsystem, today sent out a set of patches proposing the addition of 1GB THP support for the Linux kernel.
The NVIDIA engineer is proposing 1GB transparent hugepage (THP) support for Linux on x86_64 hardware to allow more flexibility in reducing address translation overhead and to improve the performance of applications that have a large memory footprint. Unlike HugeTLB, Linux's existing huge page code for supporting large memory footprints, the 1GB THP approach doesn't require application changes.
More details on the NVIDIA 1GB THP proposal for the Linux kernel via this patch series.
Design
=======
The 1GB THP implementation looks similar to the existing THP code, except for some new designs for the additional page table level.
1. Page table deposit and withdraw using a new pagechain data structure: instead of one PTE page table page, 1GB THP requires 513 page table pages (one PMD page table page and 512 PTE page table pages) to be deposited at page allocation time, so that we can split the page later. Currently, the page table deposit uses ->lru, so only one page can be deposited. A new pagechain data structure is added to enable multi-page deposit.
2. Triple-mapped 1GB THP: a 1GB THP can be mapped by a combination of PUD, PMD, and PTE entries. Mixing PUD and PTE mappings can be achieved with the existing PageDoubleMap mechanism. To add PMD mapping, PMDPageInPUD and sub_compound_mapcount are introduced. PMDPageInPUD is the 512-aligned base page in a 1GB THP, and sub_compound_mapcount counts the PMD mappings by using page[N*512 + 3].compound_mapcount.
3. Using CMA allocation for 1GB THP: instead of bumping MAX_ORDER, it is more sane to use something less intrusive. So all 1GB THPs are allocated from reserved CMA areas shared with hugetlb. At page splitting time, the bitmap for the 1GB THP is cleared, as the resulting pages can be freed via the normal page free path. We can fall back to alloc_contig_pages for 1GB THP if necessary.
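The figures in points 1 and 2 follow from the x86_64 page table geometry. The sketch below is only an illustration of that arithmetic, not code from the patch series: it assumes 4 KiB base pages and 512 entries per page table, and the function names are hypothetical helpers invented for this example.

```python
# Illustrative arithmetic only; these helpers are hypothetical and do not
# appear in the patch series. Assumes x86_64 with 4 KiB base pages.

PAGE_SHIFT = 12            # 4 KiB base page
ENTRIES_PER_TABLE = 512    # each page table level indexes 9 bits

BASE_PAGE_SIZE = 1 << PAGE_SHIFT
PMD_SIZE = BASE_PAGE_SIZE * ENTRIES_PER_TABLE   # 2 MiB, mapped by one PMD entry
PUD_SIZE = PMD_SIZE * ENTRIES_PER_TABLE         # 1 GiB, mapped by one PUD entry

BASE_PAGES_PER_PUD = PUD_SIZE // BASE_PAGE_SIZE


def deposited_page_tables():
    """Page table pages needed to split one PUD mapping all the way to PTEs."""
    pmd_tables = 1                    # one PMD table takes the place of the PUD entry
    pte_tables = ENTRIES_PER_TABLE    # one PTE table for each of the 512 PMD entries
    return pmd_tables + pte_tables


def is_pmd_page_in_pud(index):
    """True if base page `index` starts a 2 MiB chunk (the 512-aligned page)."""
    return index % ENTRIES_PER_TABLE == 0


def sub_compound_mapcount_index(n):
    """Base page index described as holding the PMD mapcount for chunk n."""
    if not 0 <= n < ENTRIES_PER_TABLE:
        raise ValueError("a 1GB THP contains exactly 512 PMD-sized chunks")
    return n * ENTRIES_PER_TABLE + 3


if __name__ == "__main__":
    print("page tables to deposit:", deposited_page_tables())
    print("base pages per 1GB THP:", BASE_PAGES_PER_PUD)
    print("mapcount page for chunk 2:", sub_compound_mapcount_index(2))
```

The 513 deposited pages are what allow a later split without allocating memory at split time, which is why the single-page ->lru deposit is described as insufficient.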