POWER9 On Linux Will See Faster Context Switching, Other Optimizations
The POWER architecture changes have been submitted for the in-development Linux 4.20~5.0 kernel cycle, including more optimizations on the POWER9 front for these latest-generation IBM CPUs.
The IBM POWER SLB (Segment Lookaside Buffer) miss-handling code has been rewritten in clean C rather than assembly and then improved upon from there. This cleaned-up and optimized C code for the SLB entries yields a 27% speed-up in context-switching performance on one of IBM's internal POWER9 benchmarks.
This next kernel also brings improved handling of SLB multi-hit errors, THP migration for Book3S POWER7/8/9 hardware, support for up to 2PB of physical memory in the linear mapping on 64-bit Book3S, stack protector support for both 32-bit and 64-bit, recognition of POWER9 "big cores" (two SMT4 cores presented as a single SMT8 core), and various other code improvements.
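To put the 2PB linear-mapping figure in perspective, here is a small arithmetic sketch. It assumes "PB" is meant in the binary sense (2^50 bytes per unit), which the pull request summary does not spell out; the helper name `address_bits` is purely illustrative and not anything from the kernel.

```python
def address_bits(size_bytes: int) -> int:
    """Return the number of address bits needed to index size_bytes bytes."""
    # The highest byte address is size_bytes - 1; its bit length is the answer.
    return (size_bytes - 1).bit_length()


# Assumption: "PB" here means a binary petabyte (2**50 bytes).
PIB = 1 << 50
linear_map_limit = 2 * PIB

print(f"2PB = 2**{address_bits(linear_map_limit)} bytes")
```

Under that assumption, 2PB works out to 2^51 bytes, so covering it takes 51 bits of physical address.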
A complete look at the POWER CPU changes for this next kernel cycle can be found via this pull request.
It's great timing, as this coming week I will finally have my hands on POWER9 hardware: the folks at Raptor Computing Systems are kindly sending over one of their dual 22-core POWER9 Talos II systems for running a lot more benchmarks.