Meta's Transparent Memory Offloading Saves 20% To 32% Of Memory Per Linux Server
Meta describes Transparent Memory Offloading as:
A new Linux kernel mechanism that measures the lost work due to resource shortage across CPU, memory, and I/O in real time. Guided by this information and without any prior application knowledge, TMO automatically adjusts the amount of memory to offload to a heterogeneous device, such as compressed memory or an SSD. It does so according to the device’s performance characteristics and the application’s sensitivity to slower memory accesses. TMO holistically identifies offloading opportunities from not only the application containers but also the sidecar containers that provide infrastructure-level functions.
TMO has been running in production for more than a year, and has saved 20 percent to 32 percent of total memory across millions of servers in our expansive data center fleet. We have successfully upstreamed TMO’s OS components into the Linux kernel.
On the kernel side, TMO builds on Pressure Stall Information (PSI), which is already upstream in the Linux kernel, while in user-space "Senpai" serves as the agent that acts on those pressure metrics.
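PSI exposes per-resource stall metrics under `/proc/pressure/` (and per-cgroup via files such as `memory.pressure` in cgroup v2). A minimal Python sketch of parsing that text format follows; the sample values are illustrative, not real measurements:

```python
# Parse Linux PSI (Pressure Stall Information) output, as found in
# /proc/pressure/memory or a cgroup v2 memory.pressure file.
def parse_psi(text):
    """Return {"some": {...}, "full": {...}} mapping metric names to numbers."""
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()          # "some" or "full", then k=v pairs
        metrics = {}
        for field in fields:
            key, value = field.split("=")
            # "total" is cumulative stall time in microseconds (integer);
            # the avg* fields are percentages (floats).
            metrics[key] = int(value) if key == "total" else float(value)
        result[kind] = metrics
    return result

# Illustrative sample in the kernel's PSI format:
sample = """\
some avg10=0.12 avg60=0.08 avg300=0.05 total=123456
full avg10=0.00 avg60=0.01 avg300=0.00 total=7890
"""
psi = parse_psi(sample)
print(psi["some"]["avg10"])  # → 0.12
```

In production one would read the live file (e.g. `open("/proc/pressure/memory").read()`) instead of a canned string.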
The offloading is typically done to NVMe solid-state drives, which are far cheaper per gigabyte than server memory. Upcoming server platforms supporting Compute Express Link (CXL) also hold a lot of potential for Transparent Memory Offloading.
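The Senpai agent drives that offloading by nudging each cgroup's `memory.high` limit down while measured memory pressure stays below a target, forcing the kernel to reclaim and offload cold pages. A simplified sketch of that proportional idea, assuming hypothetical constants and function names that are not Senpai's real ones:

```python
# Illustrative Senpai-style limit adjustment (NOT Meta's actual algorithm):
# tighten the cgroup memory.high limit while pressure is under target,
# relax it when pressure climbs too high.
def next_memory_limit(current_limit, pressure_avg10, target_pressure=0.1,
                      shrink_step=0.01, grow_step=0.05):
    """Return the next memory.high value in bytes for one control-loop tick."""
    if pressure_avg10 < target_pressure:
        # Application is not stalling: squeeze out more cold memory.
        return int(current_limit * (1 - shrink_step))
    # Pressure above target: back off so the workload is not slowed down.
    return int(current_limit * (1 + grow_step))

limit = 8 * 1024**3  # hypothetical 8 GiB starting memory.high
limit = next_memory_limit(limit, pressure_avg10=0.02)  # low pressure → shrink
```

A real agent would run this loop periodically, reading the cgroup's `memory.pressure` and writing the new value to `memory.high`.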
Those interested in learning more about the Facebook/Meta Transparent Memory Offloading (TMO) effort can find all the interesting technical details on the Meta engineering blog.