The New Compiler Features Of LLVM 10.0 / Clang 10.0

Written by Michael Larabel in LLVM on 8 March 2020 at 10:26 AM EDT.

After slipping past last month's planned release date and needing an extra release candidate, LLVM 10.0 should be released in the coming weeks along with its sub-projects, most notably the Clang 10.0 C/C++ compiler. Here is a look at the big-ticket items of LLVM/Clang 10.0.

The release of LLVM/Clang 10.0 is expected in the coming days, while GCC 10 is due in the next few weeks. As for what changed in this half-year update to this innovative compiler infrastructure, the LLVM 10.0 highlights include:

- For Intel AVX-512 CPUs, -mprefer-vector-width=256 is now the default, limiting the use of 512-bit registers because of the downclocking that AVX-512 can trigger. This now matches GCC's behavior. Those wanting the previous behavior can pass -mprefer-vector-width=512 to increase the use of 512-bit registers, at the risk of lower performance from the AVX-512 frequency impact.

- AMD Znver2 (Zen 2) improvements.

- An option to help with the JCC microcode erratum impact.

- Support for Arm's Cortex-A65, A65AE, Neoverse N1, and Neoverse E1 cores.

- Octeon+ MIPS CPUs are now supported, and support for existing Octeon processors has been improved.

- IBM z15 target support.

- Besides new Arm CPU targets, the ARM back-end for LLVM 10 also has more optimized ARMv8.1-M code generation, auto-vectorization for the ARMv8.1-M MVE vector extension, and other improvements.

- IBM POWER has seen a number of improvements too, including better register pressure estimates, an improved cost model for the vectorizer, vectorization of math routines using the IBM MASSV library, and other enhancements.

- LLVM's WebAssembly target has much better SIMD support, working thread-local storage (TLS), and other improvements.

- Many improvements to RISC-V's LLVM support.

- LLDB can now handle debugging Windows ARM/ARM64 binaries and also has better support for being built by MinGW.

- MLIR landed as the promising new IR being picked up by an increasing number of projects.

- Numerous AMDGPU LLVM back-end improvements.
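As a rough illustration of the AVX-512 vector-width default described above, here is a minimal sketch of the kind of loop the auto-vectorizer typically handles. The `saxpy` function, the file name, and the choice of `-march=skylake-avx512` in the build comments are assumptions made for this example, not something taken from the release notes. Whether a given loop ends up using 256-bit or 512-bit registers depends on the compiler's cost model and is best confirmed by inspecting the generated assembly.

```cpp
// Build sketch (the -march value and file name are assumptions for illustration):
//   clang++ -O3 -march=skylake-avx512 saxpy.cpp
//       -> LLVM/Clang 10 default, prefers 256-bit vectors
//   clang++ -O3 -march=skylake-avx512 -mprefer-vector-width=512 saxpy.cpp
//       -> opts back in to preferring 512-bit vectors
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel: y[i] = a * x[i] + y[i].
// A simple, dependency-free loop like this is a typical candidate
// for auto-vectorization at -O2/-O3.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Comparing the assembly from both builds (for example with `-S`) is one way to see whether ZMM registers appear in the vectorized loop.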

The Clang 10.0 C/C++ front-end, meanwhile, offers:

- Expanded C++20 support, including C++ Concepts and other features, though the support is not yet complete.

- A variety of diagnostics improvements, continuing to make Clang's warnings more useful and more accurate.

- Skylake-AVX512/Icelake/Cascadelake/Cooperlake targets will now default to not using 512-bit ZMM registers in vectorized code unless 512-bit intrinsics are used in the source code, due to the AVX-512 frequency hit that can lead to lower performance. Similar to the LLVM change, -mprefer-vector-width=512 can be used to restore the previous behavior.

- When building for WebAssembly, Clang will run wasm-opt, if it is found on the system, to help reduce the generated code size.

- Various other changes to better match GCC's behavior for different commands and outputs.

- Various minor improvements to Clang's OpenCL C/C++ kernel language support.

- Expanded OpenMP 5.0 support, including range-based for loops, collapsing of imperfectly nested loops, unified shared memory for NVIDIA NVPTX, and other additions.
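To make the OpenMP 5.0 range-based loop item above more concrete, here is a small sketch of a range-based `for` loop used directly under a worksharing pragma. The `mean_value` function and the build line are hypothetical and chosen only for illustration. Without `-fopenmp` the pragma is ignored and the loop simply runs serially.

```cpp
// Build sketch (file name is an assumption): clang++ -O2 -fopenmp mean.cpp
#include <cassert>
#include <vector>

// Hypothetical helper: arithmetic mean of a non-empty vector.
double mean_value(const std::vector<double>& values) {
    assert(!values.empty());
    double total = 0.0;
    // OpenMP 5.0 permits a range-based for loop as the loop associated
    // with the directive; earlier versions required an index or
    // iterator loop in canonical form.
    #pragma omp parallel for reduction(+ : total)
    for (double v : values) {
        total += v;
    }
    return total / static_cast<double>(values.size());
}
```

The `reduction(+ : total)` clause gives each thread a private partial sum and combines them at the end, which avoids a data race on `total`.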
