Facebook's BOLT Is An Effort To Speed Up Linux Binaries
BOLT, the Binary Optimization and Layout Tool, is a Facebook Incubator project for speeding up Linux x86-64/AArch64 ELF binaries.
BOLT is a post-link optimizer designed to speed up large applications: it optimizes a program's code layout based on an execution profile generated by the Linux perf utility. BOLT leverages LLVM but can also work with binaries built by GCC.
More details on this early-stage BOLT open-source project can be found via Facebook Incubator on GitHub.
I'll likely give it a whirl as time allows... It should be possible to plumb it into a Phoronix Test Suite module, similar to the PGO benchmarking module, that automatically gathers the necessary perf trace, feeds it back through the optimizer (in this case BOLT), and then looks at the impact on performance.
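As a rough illustration of that profile-and-feed-back loop, here is a minimal Python sketch that only builds the three command lines involved (record a profile with perf, convert it with perf2bolt, rewrite the binary with llvm-bolt). The command names and flags follow the workflow shown in the BOLT README as I understand it, but they vary between BOLT versions, so treat them as assumptions to check against your install. The helper name `bolt_pipeline`, the `./myapp` binary, and the `--bench` argument are hypothetical.

```python
import os
import shlex


def bolt_pipeline(binary, workload_args=(), workdir="."):
    """Build (but do not run) the perf -> perf2bolt -> llvm-bolt commands.

    Hypothetical helper for illustration; flag names are assumptions based
    on the BOLT README and may differ between BOLT versions.
    """
    perf_data = os.path.join(workdir, "perf.data")
    fdata = os.path.join(workdir, "perf.fdata")
    optimized = os.path.join(workdir, os.path.basename(binary) + ".bolt")

    # Step 1: sample the running program. "-j any,u" asks perf for branch
    # records, which needs hardware support (e.g. LBR on Intel CPUs).
    record = ["perf", "record", "-e", "cycles:u", "-j", "any,u",
              "-o", perf_data, "--", binary, *workload_args]

    # Step 2: convert the perf data into the profile format BOLT reads.
    convert = ["perf2bolt", "-p", perf_data, "-o", fdata, binary]

    # Step 3: rewrite the binary with a profile-guided code layout.
    optimize = ["llvm-bolt", binary, "-o", optimized,
                f"-data={fdata}",
                "-reorder-blocks=ext-tsp",
                "-reorder-functions=hfsort",
                "-dyno-stats"]

    return [record, convert, optimize]


if __name__ == "__main__":
    for cmd in bolt_pipeline("./myapp", ["--bench"]):
        print(shlex.join(cmd))
```

A benchmarking module could run each command with `subprocess.run(cmd, check=True)` and then time the original binary against the `.bolt` output; the sketch stops at building the commands so that nothing is executed by accident.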
BOLT disassembles functions and reconstructs the control flow graph (CFG) before it runs optimizations. Since this is a nontrivial task, especially when indirect branches are present, the project relies on certain heuristics to accomplish it. These heuristics have been tested on code generated by the Clang and GCC compilers. The main requirement for C/C++ code is that it not rely on code layout properties, such as function pointer deltas. Assembly code can be processed too, provided it keeps code and data clearly separated, with data objects placed into data sections/segments. If indirect jumps are used for intra-function control transfer (e.g. jump tables), the code patterns should match those generated by Clang/GCC.