Linux 5.15 Addressing Scalability Issue That Caused Huge IBM Servers To Take 30+ Minutes To Boot
Very large IBM mainframes/servers were taking 30+ minutes to boot the Linux kernel... No, not just for POST'ing the system with memory training and the like, but for loading Linux itself. Fortunately, with the Linux 5.15 kernel there is a set of scalability enhancements that allow these large IBM systems to boot in around five minutes.
Among the driver core changes for Linux 5.15 is a set of patches enhancing the performance of Kernfs, the functionality underpinning pseudo file-systems like sysfs. Leading to these Kernfs locking and concurrency improvements, engineers found that large IBM Power systems with "several hundred CPUs and 64TB of RAM" were taking 30+ minutes just to boot the Linux kernel. Extra kernel parameters were also needed to keep the kernel from timing out during boot.
The extremely long boot times on these modern, high-end servers were found to be the result of many path look-ups for non-existent files and extreme locking contention within the VFS code.
Making matters worse, with 64TB of system memory and IBM Power dividing it into 256MB logical memory blocks exposed via sysfs, a heck of a lot of sysfs nodes get created.
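For a rough sense of scale, here is a quick back-of-the-envelope calculation, assuming exactly 64TB of RAM and the 256MB block size mentioned above:

```c
#include <stdio.h>

int main(void)
{
	/* 64TB of RAM expressed in MB, divided into 256MB memory blocks */
	unsigned long long ram_mb   = 64ULL * 1024 * 1024;
	unsigned long long block_mb = 256;

	/* Each block appears as its own memoryN directory under
	 * /sys/devices/system/memory, so this is also the node count. */
	printf("%llu memory blocks\n", ram_mb / block_mb); /* prints 262144 */
	return 0;
}
```

Over a quarter-million directories for memory blocks alone goes a long way toward explaining the lookup and locking pressure described above.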
With the Kernfs scalability improvements benefiting sysfs that are part of the driver core changes for Linux 5.15, these IBM systems go from taking 30+ minutes to boot to under five minutes. The changes involve switching the Kernfs mutex to a read-write semaphore so node searches can run in parallel, improving path resolution, and making use of VFS negative dentry caching.
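To illustrate the general locking pattern at play, here is a minimal sketch of a mutex-to-rwsem conversion using the kernel's rw_semaphore API. This is not the actual kernfs patch; find_child() and attach_child() are hypothetical stand-ins for kernfs's internal tree operations:

```c
#include <linux/rwsem.h>
#include <linux/kernfs.h>

/* Previously a single mutex serialized every node lookup; a
 * read-write semaphore lets many lookups proceed concurrently
 * while modifications still get exclusive access. */
static DECLARE_RWSEM(kernfs_rwsem);

static struct kernfs_node *lookup_child(struct kernfs_node *parent,
					const char *name)
{
	struct kernfs_node *kn;

	down_read(&kernfs_rwsem);	/* many readers may hold this at once */
	kn = find_child(parent, name);	/* hypothetical tree search */
	up_read(&kernfs_rwsem);
	return kn;
}

static void add_child(struct kernfs_node *parent, struct kernfs_node *kn)
{
	down_write(&kernfs_rwsem);	/* writers remain exclusive */
	attach_child(parent, kn);	/* hypothetical tree insert */
	up_write(&kernfs_rwsem);
}
```

Lookups vastly outnumber modifications in a workload like boot-time sysfs traversal, so letting readers run concurrently removes most of the contention a single mutex would cause.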
These Kernfs improvements and more can be found via the driver core PR that was merged today for the 5.15 merge window.