Poking At A Big NUMA Benchmark Regression In Linux 5.18 Git

Written by Michael Larabel in Linux Kernel on 31 March 2022 at 09:30 AM EDT. 10 Comments

There are still a few days left in the Linux 5.18 merge window, but I've already started running benchmarks of this new kernel on a handful of desktops and servers. One benchmark in particular has been showing a staggering performance drop on Linux 5.18 across multiple systems, though overall Linux 5.18 has been working out well in my testing thus far.

One of the systems on which I started running Linux 5.18 Git benchmarks during this second week of the merge window was the AMD Ryzen 9 5950X.

Linux 5.18 on the Ryzen 9 5950X has been largely stable compared to recent kernel series...

While most benchmarks didn't show any measurable change from Linux 5.15 through 5.18 Git this week, Stress-NG was an outlier, and in particular its NUMA benchmark.

The stress-ng load/stress program does a good job of stressing Linux systems, and its NUMA test case in particular regressed heavily with Linux 5.18. Other tested stress-ng stressors were unaffected. With Linux 5.18 Git, the NUMA bogo ops per second performance nosedived.
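For readers wanting to repeat the measurement outside of a benchmark harness, here is a minimal Python sketch of running the stress-ng NUMA stressor and pulling out the bogo ops per second figure. The worker count (`0`, meaning one per CPU), the 30 second timeout, and the helper names are assumptions for illustration, not the configuration used for the results above. The parser assumes the tabular layout printed by `--metrics-brief`, which can differ between stress-ng versions.

```python
import re
import subprocess
from typing import Optional

# Hypothetical invocation: worker count and duration are placeholders,
# not the settings behind the numbers reported in this article.
STRESS_NG_CMD = ["stress-ng", "--numa", "0", "--timeout", "30s", "--metrics-brief"]

# Matches a metrics row of the assumed form:
#   stress-ng: info:  [pid] <stressor> <bogo ops> <real s> <usr s> <sys s> <ops/s real> ...
_ROW = re.compile(
    r"\]\s+(?P<name>[\w-]+)\s+(?P<ops>\d+)\s+(?P<real>[\d.]+)"
    r"\s+[\d.]+\s+[\d.]+\s+(?P<rate>[\d.]+)"
)


def bogo_ops_per_sec(output: str, stressor: str = "numa") -> Optional[float]:
    """Return bogo ops/s (real time) for the given stressor, or None if absent."""
    for line in output.splitlines():
        match = _ROW.search(line)
        if match and match.group("name") == stressor:
            return float(match.group("rate"))
    return None


def run_numa_stressor() -> Optional[float]:
    """Run stress-ng and parse its metrics; requires stress-ng to be installed."""
    proc = subprocess.run(STRESS_NG_CMD, capture_output=True, text=True)
    # Search both streams, since where the metrics are logged may vary.
    return bogo_ops_per_sec(proc.stdout + proc.stderr)
```

Running `run_numa_stressor()` once per booted kernel and comparing the returned values gives a quick sanity check of whether a given build shows the drop.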

While this is a synthetic test case, I was able to reproduce the Stress-NG NUMA nosedive on an Intel Core i9 12900K desktop too, where Linux 5.16/5.17 were steady but then Linux 5.18 fell sharply...

Ouch... Seems to be the first major regression seen so far in my early #Linux 5.18 Git testing.

At least should be a quick, easy fun one to track down. pic.twitter.com/U5nsDeqvtM
— Phoronix (@phoronix) March 30, 2022

So back on the Ryzen 9 5950X system I went through to bisect this regression affecting Linux 5.18 Git:

The regression traced back to the memory management changes merged last week for Linux 5.18... In particular, this commit was identified by Git bisect as the first bad commit, where the stress-ng NUMA performance collapsed.
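For anyone wanting to repeat such a bisect, the good/bad decision at each step can be reduced to a small predicate. The Python sketch below is only an illustration under stated assumptions: the function name, the 50% tolerance, and the idea of comparing against a known-good rate are hypothetical and not a description of how the bisect above was driven. The return values follow the `git bisect run` exit code convention, where 0 marks a commit good, 125 skips it, and other codes from 1 to 127 mark it bad.

```python
from typing import Optional

# Exit codes as interpreted by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def bisect_verdict(rate: Optional[float], good_rate: float,
                   tolerance: float = 0.5) -> int:
    """Map a measured bogo ops/s value to a bisect verdict.

    rate       measured bogo ops/s on the kernel under test (None if the
               kernel failed to build or boot, or the benchmark gave no number)
    good_rate  bogo ops/s measured on a known-good kernel
    tolerance  assumed fraction of good_rate still counted as good
    """
    if rate is None or rate <= 0:
        return SKIP  # cannot judge this commit, let bisect try a neighbour
    if rate >= good_rate * tolerance:
        return GOOD
    return BAD
```

Since each kernel bisect step normally involves building and booting a new kernel, the verdict would typically be passed by hand to `git bisect good`, `git bisect bad`, or `git bisect skip` after each reboot, unless the reboot cycle is automated. A generous tolerance suits a regression of this size, as run-to-run noise is small next to the drop.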

With the various real-world workloads I've benchmarked so far on the Ryzen 9 5950X, Core i9 12900K, and a few other systems, I have yet to see any significant difference with Linux 5.18. At least stress-ng is a quick and easy test case to run. With perpetual time/resource limitations, though, that's where I ended this testing; anyone can now pick it up from here with the information available.

I'll be looking at other areas of Linux 5.18 performance and on more hardware as the merge window draws to a close. If you enjoy the Linux benchmarks and other work, consider showing your support by joining Phoronix Premium or sending a PayPal tip to allow for more time/resources for such investigative benchmarking.
