LLVM Clang 14 Lands An "Amazing" Performance Optimization
LLVM developer Djordje Todorovic recently landed an improvement to LLVM's Loop Invariant Code Motion (LICM) pass that lets it hoist a load out of a loop even when the matching store cannot be sunk. The patch explains, "When doing load/store promotion within LICM, if we cannot prove that it is safe to sink the store we won't hoist the load, even though we can prove the load could be dereferenced and moved outside the loop. This patch implements the load promotion by moving it in the loop preheader by inserting proper PHI in the loop. The store is kept as is in the loop. By doing this, we avoid doing the load from a memory location in each iteration." The change addresses a bug report about missed opportunities for register promotion.
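To make the idea concrete, here is a rough source-level sketch of the transformation the patch describes. It is not taken from the patch or its test cases: the function names and the counting loop are hypothetical, and the real optimization operates on LLVM IR rather than on C source. The first function reloads `*count` on every hit; the second is a hand-written approximation of what the pass aims for, with the load done once ahead of the loop and the store left in place.

```c
#include <stddef.h>

/* Hypothetical "before" shape: every time the branch is taken, *count is
 * loaded from memory, incremented, and stored back. */
int count_nonzero_naive(const int *data, size_t n, int *count) {
    for (size_t i = 0; i < n; ++i) {
        if (data[i] != 0) {
            *count = *count + 1; /* load and store inside the loop */
        }
    }
    return *count;
}

/* Hypothetical "after" shape, written by hand for illustration: the load
 * happens once before the loop (roughly where the preheader would be) and
 * the running value lives in a local, which plays the role of the PHI.
 * The store stays inside the conditional, so no store is introduced on
 * paths that did not have one. Assumes count is always dereferenceable,
 * even when n is 0. */
int count_nonzero_hoisted(const int *data, size_t n, int *count) {
    int cached = *count; /* single load, hoisted out of the loop */
    for (size_t i = 0; i < n; ++i) {
        if (data[i] != 0) {
            cached = cached + 1;
            *count = cached; /* store kept as is in the loop */
        }
    }
    return cached;
}
```

Both functions return the same value and leave `*count` in the same state; the second simply avoids re-reading memory on each iteration, which is the saving the patch description refers to.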
But for those not into compiler internals and just interested in the net gain, Todorovic shared some benchmark results and commentary:
Wow, the numbers (by using the @OpenBenchmark; X86_64 -O3) after https://t.co/AN6QsRSQG3 look amazing: pic.twitter.com/4MGgaFdwfK
— Djordje Todorovic (@djtodoro) December 3, 2021
In our PostgreSQL benchmark, he is seeing roughly 12% higher performance with this load hoisting patch, and a variety of other workloads, from XZ compression to C-Ray to MrBayes, are generally seeing improvements of a few percent.
This improvement and countless other patches will be part of LLVM Clang 14.0, which, going by the project's usual release cadence, should surface as stable around March.