Facebook's BOLT Nearing Mainline LLVM For Optimizing Binaries
Facebook's BOLT project for optimizing the performance of compiled binaries is nearing the point of being added to LLVM's official source tree, its mono repository.
BOLT, the Binary Optimization and Layout Tool, has been an engineering project at Facebook for years and is aimed at speeding up Linux binaries. It optimizes large applications by improving their code layout for greater efficiency, based on a collected execution profile generated via Linux perf or similar.
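To make the idea of profile-driven layout concrete, here is a deliberately simplified sketch. It is not BOLT's actual algorithm, and the function names and sample counts are hypothetical. It only shows the general principle of using profile samples to place the most frequently executed functions next to each other.

```python
from collections import Counter


def hot_first_layout(samples, functions):
    """Order functions so the most frequently sampled come first.

    `samples` is an iterable of function names, one per profile sample
    (a hypothetical stand-in for data gathered with a tool like perf).
    `functions` is the original order of functions in the binary.
    """
    counts = Counter(samples)
    # sorted() is stable, so functions with equal sample counts
    # keep their original relative order.
    return sorted(functions, key=lambda name: -counts[name])


# Hypothetical program: most of the time is spent in handle_request.
original = ["init", "parse", "handle_request", "log_error", "shutdown"]
samples = ["handle_request"] * 90 + ["parse"] * 8 + ["init"] * 2

layout = hot_first_layout(samples, original)
print(layout)
```

Grouping hot code together is meant to improve instruction cache and TLB locality. A real tool like BOLT works on machine code, considers basic blocks as well as functions, and uses far more sophisticated heuristics than a plain sort.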
BOLT leverages LLVM, and for the past year Facebook engineers have been looking to upstream this optimizer into LLVM. Generally, BOLT'ing an application can net a performance improvement of several percentage points, while in some cases the gains can reach double digits. BOLT is complementary to a compiler's LTO (link-time optimization) and PGO (profile-guided optimization) but, like PGO, it requires first collecting a profile of the binary to be optimized.
BOLT is designed to work with large and complex applications and services, and it is already in use on large production workloads within Facebook to squeeze out greater performance. In a 2019 paper, Facebook engineers reported a 7% speed-up for their data-center applications on top of the gains already achieved by feedback-directed optimizations (FDO) and LTO. In some cases BOLT can even speed up binaries by ~20%, or by up to 50% if FDO/LTO are not used.
BOLT is now nearing the point of being added to LLVM's source tree, its mono repository. This mailing list thread was started to work through the remaining technicalities and to ensure no outstanding issues remain before it's ultimately merged. It looks like BOLT will land within the LLVM repository soon. Facebook hopes that having BOLT upstream will encourage more contributions to the tool. Currently BOLT does not work on Windows.
BOLT benchmarks are coming up soon on Phoronix.