Intel Has Another Series Optimizing Linux Performance With PCP High Auto-Tuning
Intel's open-source software engineers are known for many great performance optimizations to the Linux kernel. Over the years Intel has contributed countless optimizations to the kernel and related Linux components, yielding significant improvements not only for Intel hardware but for x86_64 as a whole and, at times, CPU-architecture-independent gains. One of their newest performance-optimizing patch series concerns Per-CPU Pageset (PCP) high auto-tuning.
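The new patch series focuses on automatically tuning the per-CPU pageset "high" value on each CPU (the limit on how many free pages a CPU's local list may hold before pages are returned to the buddy allocator) to optimize page allocation performance.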
The end result: Linux kernel build time on an Intel Sapphire Rapids test system was reduced by 5% with lower lock contention, Netperf performance improved by up to 7%, and an lmbench test case scored 196% of the base score.
Different workloads often have different page allocation performance requirements. So, we need to tune the PCP (Per-CPU Pageset) high on each CPU automatically to optimize the page allocation performance.
The patches in the series are as follows:
1 mm, pcp: avoid to drain PCP when process exit
2 cacheinfo: calculate per-CPU data cache size
3 mm, pcp: reduce lock contention for draining high-order pages
4 mm: restrict the pcp batch scale factor to avoid too long latency
5 mm, page_alloc: scale the number of pages that are batch allocated
6 mm: add framework for PCP high auto-tuning
7 mm: tune PCP high automatically
8 mm, pcp: decrease PCP high if free pages < high watermark
9 mm, pcp: avoid to reduce PCP high unnecessarily
10 mm, pcp: reduce detecting time of consecutive high order page freeing
Patches 1/2/3 optimize PCP draining for consecutive high-order page freeing.
Patches 4/5 optimize batch freeing and allocating.
Patches 6/7/8/9 implement and optimize a PCP high auto-tuning method.
Patch 10 optimizes PCP draining for consecutive high-order page freeing based on PCP high auto-tuning.
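For readers who want to see the values this series tunes, the per-CPU pageset count, high, and batch numbers are exposed per zone in /proc/zoneinfo. The following is a minimal Python sketch for reading them, not part of the patch series. The function name parse_pcp is hypothetical, and the parser assumes the usual zoneinfo layout of a "cpu: N" line followed by "count:", "high:", and "batch:" lines, which may differ between kernel versions.

```python
import os
import re
from collections import defaultdict

# Assumed /proc/zoneinfo layout: "Node N, zone NAME" headers, then per-CPU
# "cpu: N" entries followed by "count:", "high:" and "batch:" lines.
ZONE_RE = re.compile(r"^Node\s+(\d+),\s+zone\s+(\S+)")
CPU_RE = re.compile(r"^\s*cpu:\s+(\d+)")
# The colon matters: the zone watermark line "high  <n>" has no colon,
# so it is not confused with the per-CPU pageset "high:" line.
FIELD_RE = re.compile(r"^\s*(count|high|batch):\s+(\d+)")


def parse_pcp(zoneinfo_text):
    """Return {(node, zone): {cpu: {"count": n, "high": n, "batch": n}}}."""
    result = defaultdict(dict)
    zone = None
    cpu = None
    for line in zoneinfo_text.splitlines():
        m = ZONE_RE.match(line)
        if m:
            zone = (int(m.group(1)), m.group(2))
            cpu = None
            continue
        m = CPU_RE.match(line)
        if m and zone is not None:
            cpu = int(m.group(1))
            result[zone][cpu] = {}
            continue
        m = FIELD_RE.match(line)
        if m and zone is not None and cpu is not None:
            result[zone][cpu][m.group(1)] = int(m.group(2))
    return dict(result)


if __name__ == "__main__" and os.path.exists("/proc/zoneinfo"):
    with open("/proc/zoneinfo") as f:
        pcp = parse_pcp(f.read())
    for (node, zone_name), cpus in pcp.items():
        for cpu_id, fields in sorted(cpus.items()):
            print(node, zone_name, cpu_id, fields)
```

Running this before and after a page-allocation-heavy workload on a kernel with the series applied would be one way to watch the high value change per CPU. On kernels without auto-tuning the value is expected to stay fixed unless adjusted through sysctl.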
The patch series is now out for testing. It's exciting to see the relentless optimizations to the Linux kernel driven in large part by Intel.