Memory Folios Look For Inclusion In Linux 5.16
After memory folios failed to make it into Linux 5.15, this low-level change to the kernel memory management code that has possible performance implications is looking to land for Linux 5.16.
Ahead of the Linux 5.16 merge window that may open as soon as tomorrow, Matthew Wilcox sent in his pull request for introducing folios to the kernel. Here is the main excerpt from the pull request for those not familiar with folios or who have forgotten the details over the months that this feature has been in the works:
The point of all this churn is to allow filesystems and the page cache to manage memory in larger chunks than PAGE_SIZE. The original plan was to use compound pages like THP does, but I ran into problems with some functions expecting only a head page while others expect the precise page containing a particular byte. The folio type allows a function to declare that it's expecting only a head page. Almost incidentally, this allows us to remove various calls to VM_BUG_ON(PageTail(page)) and compound_head().
This pull request converts just parts of the core MM and the page cache. For 5.17, we intend to convert various filesystems (XFS and AFS are ready; other filesystems may make it) and also convert more of the MM and page cache to folios. For 5.18, multi-page folios should be ready.
The multi-page folios offer some improvement to some workloads. The 80% win is real, but appears to be an artificial benchmark (postgres startup, which isn't a serious workload). Real workloads (eg building the kernel, running postgres in a steady state, etc) seem to benefit between 0-10%. I haven't heard of any performance losses as a result of this series. Nobody has done any serious performance tuning; I imagine that tweaking the readahead algorithm could provide some more interesting wins. There are also other places where we could choose to create large folios and currently do not, such as writes that are larger than PAGE_SIZE.
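To make the head-page point from the excerpt more concrete, here is a minimal user-space sketch in C. It is not kernel code: every toy_-prefixed name and the struct layout are hypothetical stand-ins chosen for illustration, only loosely modeled on what the excerpt describes. It contrasts a page-based helper that has to resolve the head page itself with a folio-based helper whose parameter type already promises a head page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's real definitions. */
struct toy_page {
    struct toy_page *head;  /* NULL on a head (or single) page, else the head page */
    unsigned int order;     /* meaningful on head pages: spans 1 << order pages */
    int refcount;           /* meaningful on head pages only */
};

/* A folio wraps a page that is guaranteed not to be a tail page. */
struct toy_folio {
    struct toy_page page;
};

static bool toy_page_is_tail(const struct toy_page *page)
{
    return page->head != NULL;
}

/* Stand-in for compound_head(): map any page to its head page. */
static struct toy_page *toy_compound_head(struct toy_page *page)
{
    return page->head ? page->head : page;
}

/* The one place where a page is converted to its folio. The cast relies on
 * toy_folio having a toy_page as its first and only member. */
static struct toy_folio *toy_page_folio(struct toy_page *page)
{
    return (struct toy_folio *)toy_compound_head(page);
}

/* Page-based style: each helper must cope with being handed a tail page. */
static void toy_get_page(struct toy_page *page)
{
    page = toy_compound_head(page);
    page->refcount++;
}

/* Folio-based style: the type says "head page", so no lookup or assertion. */
static void toy_folio_get(struct toy_folio *folio)
{
    folio->page.refcount++;
}

static size_t toy_folio_nr_pages(const struct toy_folio *folio)
{
    return (size_t)1 << folio->page.order;
}

/* Build a compound page of 1 << order pages in a caller-supplied array. */
static void toy_make_compound(struct toy_page *pages, unsigned int order)
{
    size_t n = (size_t)1 << order;

    pages[0].head = NULL;
    pages[0].order = order;
    pages[0].refcount = 1;
    for (size_t i = 1; i < n; i++) {
        pages[i].head = &pages[0];
        pages[i].order = 0;
        pages[i].refcount = 0;
    }
}
```

In this sketch a caller converts a page to a folio once, at the boundary, and everything past that point can take the folio without re-checking for tail pages. That is the kind of cleanup the excerpt alludes to when it mentions dropping calls to VM_BUG_ON(PageTail(page)) and compound_head().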
For end users, this means possible performance benefits, with the functionality around memory folios being built up over succeeding kernel releases.
See the pull request for more details. Now to see if Linus Torvalds decides to pull these ~2k+ lines of changes or if any other new/renewed objections are raised over the addition.