Another Look At The Performance Impact Of IBM's POWER9 L1d Flushing Change
Last week I provided some benchmarks looking at the IBM POWER9 mitigation that flushes the L1 data cache upon entering the kernel and on user accesses, in response to a recently disclosed vulnerability. POWER9 allows speculatively operating on validated data in the L1 cache, but when incompletely validated data is paired with other side channels, local users could potentially obtain improper access to data in the L1 data cache. On a POWER9 4c/16t CPU the overall impact was fairly modest. Since then I have also run benchmarks on a large POWER9 server with 44 cores / 176 threads to see the performance impact of this default Linux kernel change.
Using a Raptor Talos II with two 22-core POWER9 CPUs, for a combined 44 cores / 176 threads, I ran some benchmarks before and after this Linux kernel change that happened in November. Linux 5.9.8 was compared against 5.9.10, the span in which this IBM POWER9-specific security change landed. By default the change clears the L1d cache when entering the kernel and on user accesses. As previously outlined, this POWER9 change can also be disabled via two new kernel options to maintain the old behavior of not clearing the L1d cache.
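For anyone wanting to confirm which behavior a given boot is using, the following is a minimal Python sketch that checks the kernel command line for the opt-out options. It assumes the two options are the `no_entry_flush` and `no_uaccess_flush` boot parameters listed in the upstream kernel parameter documentation for PowerPC. The article does not name them, so treat those names as an assumption. The helper name `l1d_flush_opt_outs` is hypothetical and only for illustration.

```python
from pathlib import Path

# Assumed option names (not given in the article); believed to match the
# PowerPC entries in the upstream kernel-parameters documentation.
FLUSH_OPT_OUT_FLAGS = ("no_entry_flush", "no_uaccess_flush")


def l1d_flush_opt_outs(cmdline):
    """Return a dict mapping each opt-out flag to whether it appears as a
    whitespace-separated token in the given kernel command line string."""
    tokens = set(cmdline.split())
    return {flag: flag in tokens for flag in FLUSH_OPT_OUT_FLAGS}


if __name__ == "__main__":
    proc_cmdline = Path("/proc/cmdline")
    # /proc/cmdline only exists on Linux; skip quietly elsewhere.
    if proc_cmdline.exists():
        for flag, present in l1d_flush_opt_outs(proc_cmdline.read_text()).items():
            state = "set (flush disabled)" if present else "not set (default flush)"
            print(f"{flag}: {state}")
```

This only reports what was passed at boot. It does not verify that the running kernel actually includes the L1d flush change.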
The Raptor Talos II server was running Ubuntu 20.10, and a wide variety of benchmarks were run with the open-source Phoronix Test Suite benchmarking software.