LLVM Clang 14 Lands An "Amazing" Performance Optimization

Written by Michael Larabel in LLVM on 6 December 2021 at 01:54 PM EST.

While the performance of LLVM/Clang has improved a lot over the years and on x86_64 and AArch64 can be neck-and-neck with the GCC compiler, the fierce performance battle is not over. LLVM/Clang 14.0, due out in the early months of 2022, will bring more performance work, with one recent commit in particular showing a lot of promise.

LLVM developer Djordje Todorovic recently landed an improvement to LLVM's Loop Invariant Code Motion (LICM) pass that lets it hoist a load out of a loop even when the matching store cannot be moved. The patch explains, "When doing load/store promotion within LICM, if we cannot prove that it is safe to sink the store we won't hoist the load, even though we can prove the load could be dereferenced and moved outside the loop. This patch implements the load promotion by moving it in the loop preheader by inserting proper PHI in the loop. The store is kept as is in the loop. By doing this, we avoid doing the load from a memory location in each iteration." The improvement to this pass helps address a bug report about missed opportunities for register promotion.
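To make the quoted description more concrete, here is a rough hand-written C sketch of the kind of loop this could apply to. It is not taken from the patch or its test cases: the global `counter` and both function names are hypothetical, and the example assumes the compiler cannot prove that sinking the conditional store out of the loop is safe. The first function reloads `counter` from memory whenever it updates it. The second shows roughly what the promoted form looks like: the load happens once before the loop, a local variable plays the role of the PHI, and the store stays where it was.

```c
#include <stddef.h>

/* Hypothetical global; stands in for any memory location that is known
   to be safe to load from before the loop runs. */
int counter;

/* Original shape: `counter` is loaded and conditionally stored inside
   the loop. The store only happens on some iterations, so (by assumption)
   the compiler cannot simply move it after the loop. */
void count_nonzero(const int *values, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (values[i] != 0)
            counter = counter + 1;   /* load + store in the loop */
    }
}

/* Hand-written approximation of the transformed loop: the load is done
   once up front, the running value lives in a local (the PHI in LLVM IR
   terms), and the store is kept as is inside the loop. */
void count_nonzero_hoisted(const int *values, size_t n) {
    int cached = counter;            /* single load before the loop */
    for (size_t i = 0; i < n; i++) {
        if (values[i] != 0) {
            cached = cached + 1;
            counter = cached;        /* store remains in the loop */
        }
    }
}
```

Both functions leave the same value in `counter` for the same input; the second just avoids reading it back from memory on every update, which is the saving the patch description refers to.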

But for those not into compiler internals and just interested in the net gain, Todorovic shared some benchmark results and commentary:

Wow, the numbers (by using the @OpenBenchmark; X86_64 -O3) after https://t.co/AN6QsRSQG3 look amazing: pic.twitter.com/4MGgaFdwfK
— Djordje Todorovic (@djtodoro) December 3, 2021

In our PostgreSQL benchmark he is seeing roughly 12% higher performance with this load hoisting patch, while a variety of other workloads, from XZ compression to C-Ray to MrBayes, are generally seeing improvements of a few percent.

This improvement and countless other patches will be part of LLVM Clang 14.0, which, going by the project's usual release cadence, should surface as stable around March.
