IBM Turns To More Optimizations For Linux On POWER10
The big one to call out this week is a set of wake_affine improvements to sched/fair. After IBM found "the benchmark numbers on POWER10 were lesser than expected", they traced part of the shortfall back to the Linux scheduling code.
Due to the POWER10 L2 cache being at the core level, some tuning to sched/fair was done for POWER10, including preferring idle CPU cores over cache affinity. This set of patches builds on an earlier patch series from the start of April, which ensured that the L2 cache is correctly discovered and set the last-level cache (LLC) domain to the SMT sched-domain.
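For readers who want to see what "L2 cache at the core level" looks like from user space, here is a minimal sketch that reads the kernel's sysfs cache topology and reports which logical CPUs share an L2 cache with a given CPU. It is not part of IBM's patches and does not show how the scheduler builds its domains. The function names `parse_cpu_list` and `cpus_sharing_cache` are made up for illustration, and the sketch assumes the usual `/sys/devices/system/cpu/cpuN/cache/indexM/` layout is present on the running kernel.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list string such as '0-3,8' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_sharing_cache(cpu=0, level=2, sysfs_root="/sys/devices/system/cpu"):
    """Return the CPUs sharing the given cache level with `cpu`, or None if unknown.

    Hypothetical helper: it only reads what the kernel exports in sysfs.
    """
    cache_dir = Path(sysfs_root) / f"cpu{cpu}" / "cache"
    if not cache_dir.is_dir():
        return None
    for index in sorted(cache_dir.glob("index*")):
        try:
            if int((index / "level").read_text()) != level:
                continue
            # Skip instruction-only caches; data and unified caches are of interest.
            if (index / "type").read_text().strip() == "Instruction":
                continue
            return parse_cpu_list((index / "shared_cpu_list").read_text())
        except (OSError, ValueError):
            continue
    return None


if __name__ == "__main__":
    print("CPUs sharing L2 with cpu0:", cpus_sharing_cache(0, 2))
```

On a machine where the L2 is private to each core, the list returned for a CPU should contain only the SMT sibling threads of that core, which is the situation the LLC domain change above is meant to reflect.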
These patches appear to be paying off, with cases like the Java DayTrader benchmark showing 44% higher throughput. Synthetic scheduling benchmarks also improved. But these patches still need further review and haven't yet been tested against existing POWER9 hardware to ensure there are no regressions. They are too late for Linux 5.13 but could perhaps be ready for the 5.14 kernel later this year.
There have also been other smaller patches throughout the Linux/open-source ecosystem in recent days and weeks, such as Glibc optimizing strlen for POWER10. Yes, some nice improvements even for the string length function.
IBM POWER10 systems are expected to begin reaching customers at the end of the calendar year, so expect more tuning in the months ahead.