The New Compiler Features Of LLVM 10.0 / Clang 10.0
LLVM/Clang 10.0 is expected to be released in the coming days, while GCC 10 should follow in the next few weeks. The highlights of this half-year update to this innovative compiler infrastructure include:
- For Intel AVX-512 CPUs, -mprefer-vector-width=256 is now the default behavior, limiting the use of 512-bit registers because of the AVX-512 downclocking that can occur. This now matches GCC's behavior. Those wanting the previous behavior can pass -mprefer-vector-width=512 to increase the use of 512-bit registers, though with possible performance implications from the AVX-512 frequency impact.
- AMD Znver2 (Zen 2) improvements.
- An option to help mitigate the performance impact of the Intel JCC (Jump Conditional Code) microcode erratum.
- Support for Arm's Cortex-A65, A65AE, Neoverse N1, and Neoverse E1 cores.
- Octeon+ MIPS CPUs are now supported, along with improved support for existing Octeon processors.
- IBM z15 target support.
- Besides the new Arm CPU targets, the 32-bit ARM back-end in LLVM 10 also has more optimized ARMv8.1-M code generation, auto-vectorization for the ARMv8.1-M MVE vector extension, and other improvements.
- IBM POWER has seen a number of improvements too, including better register pressure estimates, an improved cost model for the vectorizer, vectorization of math routines using the IBM MASSV library, and other enhancements.
- LLVM's WebAssembly target has much better SIMD support, working thread-local storage (TLS), and other improvements.
- Many improvements to RISC-V's LLVM support.
- LLDB can now handle debugging Windows ARM/ARM64 binaries and also has better support for being built by MinGW.
- MLIR has landed as a promising new IR that is being picked up by an increasing number of projects.
- Numerous AMDGPU LLVM back-end improvements.
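To give a rough sense of what the AVX-512 default change affects, the loop below is the kind of code the auto-vectorizer typically handles. This is only a sketch: the function name saxpy is a hypothetical example, and the actual code generated depends on the target CPU and the vectorizer's cost model.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: y[i] = a * x[i] + y[i].
// A plain counted loop like this is a common auto-vectorization candidate.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    // Only process the elements that both vectors have.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Building this with something like `clang++ -O3 -march=skylake-avx512` should, with the new default, favor 256-bit YMM registers in the vectorized loop, while adding `-mprefer-vector-width=512` asks the compiler to prefer 512-bit ZMM registers again. Whether the wider vectors are actually faster depends on the workload and the frequency behavior of the CPU, so it is worth benchmarking both.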
The Clang 10.0 C/C++ front-end, meanwhile, offers:
- Expanded C++20 support, including C++ Concepts and other features, though C++20 support is not yet complete.
- A variety of diagnostics improvements, continuing to make the compiler's warnings more useful and accurate.
- Skylake-AVX512/Icelake/Cascadelake/Cooperlake targets will now default to not using 512-bit ZMM registers in vectorized code unless 512-bit intrinsics are used in the source code, due to the AVX-512 frequency hit that can lead to lower performance. Similar to the LLVM change, -mprefer-vector-width=512 can be used to restore the previous behavior.
- When building for WebAssembly, wasm-opt will be run if found on the system to help reduce the generated code size.
- Various other changes to better match GCC's behavior for different commands and outputs.
- Various minor improvements to Clang's OpenCL C/C++ kernel language support.
- Expanded OpenMP 5.0 support, including support for range-based for loops, collapsing of imperfectly nested loops, unified shared memory for NVIDIA NVPTX, and other additions.
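As an illustration of the OpenMP 5.0 range-based loop support, a work-sharing pragma can be placed directly on a C++ range-based for loop. The sketch below uses a hypothetical function name, scale_all, and is not taken from the release notes.

```cpp
#include <vector>

// Hypothetical example: multiply every element in place.
// With OpenMP enabled, the iterations are divided among threads; each
// iteration only touches its own element, so no synchronization is needed.
// Without OpenMP, the pragma is ignored and the loop runs serially.
void scale_all(std::vector<double>& values, double factor) {
#pragma omp parallel for
    for (double& v : values) {
        v *= factor;
    }
}
```

This would be built with `clang++ -fopenmp`. Depending on the compiler's default OpenMP version, `-fopenmp-version=50` may also be needed for the range-based form to be accepted. That is an assumption to verify against the compiler in use.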