New MGLRU Linux Patches Look To Improve The Scalability Of Global Reclaim
Among the many exciting new features in Linux 6.1 is the merging of the Multi-Gen LRU "MGLRU" code, which has shaped up to be one of the best kernel innovations of 2022 for overhauling the Linux kernel's page reclamation code. The performance results are already very promising, and MGLRU is being used successfully at Google and other large deployments. The work on further advancing this area of the kernel isn't over, though.
MGLRU is looking great for Linux 6.1 and continues to evolve. Google engineer Yu Zhao, who has been leading the MGLRU patches for the upstream Linux kernel, last week sent out a new set of enhancements.
Yu Zhao's latest patches cover the memcg LRU. Here's how he sums up this additional feature work:
An memcg LRU is a per-node LRU of memcgs. It is also an LRU of LRUs, since each node and memcg combination has an LRU of folios (see mem_cgroup_lruvec()).
Its goal is to improve the scalability of global reclaim, which is critical to system-wide memory overcommit in data centers. Note that memcg reclaim is currently out of scope.
Its memory bloat is a pointer to each LRU vector and negligible to each node. In terms of traversing memcgs during global reclaim, it improves the best-case complexity from O(n) to O(1) and does not affect the worst-case complexity O(n). Therefore, on average, it has a sublinear complexity in contrast to the current linear complexity.
...
In terms of global reclaim, it has two distinct features:
1. Sharding, which allows each thread to start at a random memcg (in the old generation) and improves parallelism;
2. Eventual fairness, which allows direct reclaim to bail out and reduces latency without affecting fairness over some time.
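To make the quoted description a bit more concrete, below is a rough userspace C sketch of the general idea: a per-node list of memcgs that a global-reclaim pass walks from a random starting point (sharding) and can leave early once it has made enough progress (the bail-out that keeps direct reclaim latency down). All of the structure names, page counts, and the reclaim loop here are invented purely for illustration; this is not the kernel's actual memcg LRU implementation.

/*
 * Illustrative sketch only: each node keeps an "LRU of memcgs", and a
 * global-reclaim pass starts at a random memcg and may bail out early.
 * Names and structures are made up; they are NOT the kernel's code.
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define NR_MEMCGS 8

struct memcg {
	int id;
	long reclaimable_pages;		/* stands in for a per-memcg LRU of folios */
};

struct node {
	struct memcg *lru[NR_MEMCGS];	/* per-node "LRU of memcgs" */
	int nr;
};

/* Reclaim up to 'target' pages, starting at a random memcg (sharding). */
static long global_reclaim(struct node *node, long target)
{
	long reclaimed = 0;
	int start = rand() % node->nr;	/* sharding: each thread starts elsewhere */

	for (int i = 0; i < node->nr; i++) {
		struct memcg *m = node->lru[(start + i) % node->nr];
		long take = m->reclaimable_pages < 64 ? m->reclaimable_pages : 64;

		m->reclaimable_pages -= take;
		reclaimed += take;
		if (reclaimed >= target)	/* bail out early instead of scanning all memcgs */
			break;
	}
	return reclaimed;
}

int main(void)
{
	struct node node = { .nr = NR_MEMCGS };
	struct memcg memcgs[NR_MEMCGS];

	srand((unsigned)time(NULL));
	for (int i = 0; i < NR_MEMCGS; i++) {
		memcgs[i] = (struct memcg){ .id = i, .reclaimable_pages = 256 };
		node.lru[i] = &memcgs[i];
	}

	printf("reclaimed %ld pages\n", global_reclaim(&node, 200));
	return 0;
}

The sketch only conveys why a random starting point helps parallelism and why an early bail-out reduces latency; the real patches additionally organize memcgs into generations per node, which is how fairness is preserved over time.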
The test results shared so far are limited, but the patches come with a sample test script for measuring the effectiveness of this work, and overall MGLRU remains in very good shape.
He expects to have more benchmark results to share soon. See the memcg LRU patches for more details on this latest MGLRU work. Given the timing of these patches, though, this work isn't expected to land in the upcoming v6.2 cycle.