Facebook Engineer Proposing New Slab Memory Controller For Linux - Saves Lots Of RAM
Roman Gushchin of Facebook's Linux kernel engineering team has proposed a new slab memory controller for the Linux kernel.
Roman found what he refers to as a "very serious flaw" in the existing slab memory controller that leads to low slab utilization when cgroups are in use. "The real reason why the existing design leads to a low slab utilization is simple: slab pages are used exclusively by one memory cgroup. If there are only few allocations of certain size made by a cgroup, or if some active objects (e.g. dentries) are left after the cgroup is deleted, or the cgroup contains a single-threaded application which is barely allocating any kernel objects, but does it every time on a new CPU: in all these cases the resulting slab utilization is very low. If kmem accounting is off, the kernel is able to use free space on slab pages for other allocations."
The new slab memory controller under review aims to provide better utilization via sharing slab pages between multiple memory cgroups.
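To make the utilization argument concrete, here is a minimal toy model in Python. It is not the kernel's allocator and not code from the patch series: the cgroup names, object counts, and the figure of 32 objects per slab page are all made-up assumptions, and it ignores per-CPU and per-node caches, fragmentation, and object lifetimes. It only compares how many pages are needed when each cgroup keeps its slab pages to itself versus when pages are shared.

```python
import math

# Hypothetical value: how many objects of one size class fit on a slab page.
OBJECTS_PER_PAGE = 32


def pages_exclusive(objects_by_cgroup, objects_per_page=OBJECTS_PER_PAGE):
    """Pages needed when every cgroup owns its slab pages exclusively.

    Each cgroup rounds up to whole pages on its own, so a cgroup with a
    single live object still pins an entire page.
    """
    return sum(
        math.ceil(count / objects_per_page)
        for count in objects_by_cgroup.values()
        if count > 0
    )


def pages_shared(objects_by_cgroup, objects_per_page=OBJECTS_PER_PAGE):
    """Pages needed when objects from all cgroups may share slab pages."""
    total = sum(objects_by_cgroup.values())
    return math.ceil(total / objects_per_page)


def utilization(objects_by_cgroup, pages, objects_per_page=OBJECTS_PER_PAGE):
    """Fraction of object slots on the allocated pages that are in use."""
    if pages == 0:
        return 0.0
    return sum(objects_by_cgroup.values()) / (pages * objects_per_page)


# Made-up workload: a few cgroups that each hold only a handful of objects,
# including leftovers from a cgroup that was already deleted.
demo = {"web": 3, "db": 5, "dns": 1, "deleted-cgroup": 2}

exclusive = pages_exclusive(demo)
shared = pages_shared(demo)

print(f"exclusive: {exclusive} pages, utilization {utilization(demo, exclusive):.1%}")
print(f"shared:    {shared} pages, utilization {utilization(demo, shared):.1%}")
```

In this toy workload the four cgroups each pin a mostly empty page under the exclusive model, while the shared model packs the same objects onto a single page. Real savings depend on the workload, which is why the measured numbers from production matter more than any model like this.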
In Facebook's internal testing, this code saved "hefty amounts of memory": up to 650~700 MB for a web front-end, 750~800 MB for a database cache, and around 700 MB for a DNS server. Overall it should save 30~40% of slab memory compared to the existing implementation.
More details on the technical implementation are available via this patch series. The new controller is under a "request for comments" flag, so we'll see where this leads. Roman says they have not encountered any notable regressions, but more widespread testing will be needed before it can be considered for mainline. If all goes well, hopefully we'll see this in the mainline kernel in 2020.