Linux Kernel Live Patching Working Fairly Well For Millions Of Meta Servers

Written by Michael Larabel in Linux Kernel on 14 September 2022 at 08:02 AM EDT.
Meta/Facebook has turned to kernel live-patching (KLP), using Red Hat's Kpatch and the Linux kernel's livepatch infrastructure, to handle live updates to "several million servers". During this week's Linux Plumbers Conference, Meta engineers shared the successes they have had with it as well as the troubles encountered along the way.

As with most organizations looking at kernel live-patching, they turned to it to reduce server downtime on kernel updates, primarily for the never-ending flow of security updates. Fully rebooting servers, with their often lengthy POST times, can be rather problematic, whereas with kernel live-patching they can near-seamlessly move to the updated kernel code when everything goes according to plan.

Livepatching allows for kernel functions to be safely patched in-place at run-time. Beyond the livepatch infrastructure within the kernel, Meta went with Red Hat's Kpatch while SUSE continues to maintain kGraft and Oracle also has Ksplice.
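For readers who want to see what live-patching looks like on a running system, here is a minimal sketch (not Meta's tooling) that lists loaded live patches through the kernel's sysfs interface. It assumes the layout described in the kernel's livepatch sysfs ABI documentation: one directory per patch under /sys/kernel/livepatch, each with `enabled` and `transition` files and one subdirectory per patched object containing `<function>,<sympos>` directories. The helper names `read_flag` and `list_livepatches` are hypothetical and chosen only for illustration.

```python
from pathlib import Path
from typing import Dict, List, Optional


def read_flag(path: Path) -> Optional[bool]:
    """Read a sysfs 0/1 attribute; return None if it is missing or unreadable."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


def list_livepatches(root: str = "/sys/kernel/livepatch") -> List[Dict]:
    """Summarize loaded live patches found under the given sysfs root.

    Assumed layout: <root>/<patch>/{enabled,transition} plus
    <root>/<patch>/<object>/<function,sympos>/ directories.
    """
    base = Path(root)
    if not base.is_dir():
        # Livepatch support is not built in, or no sysfs is mounted here.
        return []

    patches = []
    for patch_dir in sorted(p for p in base.iterdir() if p.is_dir()):
        objects = {}
        for obj_dir in sorted(p for p in patch_dir.iterdir() if p.is_dir()):
            # Function directories are named "<function>,<sympos>";
            # keep only the function name.
            objects[obj_dir.name] = sorted(
                f.name.split(",")[0] for f in obj_dir.iterdir() if f.is_dir()
            )
        patches.append(
            {
                "name": patch_dir.name,
                "enabled": read_flag(patch_dir / "enabled"),
                "transition": read_flag(patch_dir / "transition"),
                "objects": objects,
            }
        )
    return patches


if __name__ == "__main__":
    for patch in list_livepatches():
        print(patch)
```

On a machine without any live patch loaded, the script simply prints nothing. A patch whose `transition` flag is still set has not yet been applied to every task on the system.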


Meta: Kernel Live Patching at Scale


Along the way in using Linux live-patching on "millions of servers", they have run into problems with tracing as well as some performance issues. The reported performance issues are possible 1~2 second spikes during live-patching, seen as higher I/O and fsync latency as well as higher TCP re-transmit rates.

Meta engineers continue working on dealing with corner cases, better handling for cases like Clang-compiled PGO-optimized kernel builds, and other items to increase robustness.

Those curious about Meta's kernel live-patching at scale work can see the LPC 2022 slide deck and the video recording of the talk.

