Initial Benchmarks Of The LLVM/Clang 3.3 Compiler
While LLVM and Clang (and related LLVM projects) remain in heavy development for the 3.3 cycle, here are some initial compiler benchmarks of LLVM/Clang 3.3 SVN compared to the current stable release, LLVM/Clang 3.2.
Motivated in part by the loop vectorizer improvements that have already been committed to the SVN code-base, I ran some early LLVM/Clang 3.2 vs. 3.3 SVN benchmarks using the code as of Monday morning (18 February). Testing was done on an AMD FX-8350 "Bulldozer2" (Vishera) system running Ubuntu 13.04 with the Linux 3.8 kernel.
Aside from the loop vectorizer enhancements, LLVM 3.3 will also feature the AMD R600 GPU back-end and the ARM 64-bit AArch64 back-end for future ARM Cortex processors. There are also improvements to the x86 and ARM cost models, reworked attribute classes, and much more.
With the LLVM 3.3 release still some months away, more changes and new features will surely pile in, along with enhancements to the Clang C/C++ front-end. The benchmarks shared today are very early, preliminary numbers meant to whet the appetites of those interested in compiler performance.
Full results for the initial LLVM 3.3 loop vectorizer benchmarks, toggling the -fno-vectorize and -fvectorize compiler flags, can be found in the 1302189-FO-LLVM33VEC37 result file on OpenBenchmarking.org. The result file also has all of the software/hardware details, logs, and other information for interested readers. This AMD FX-8350 system was additionally tested with the -fslp-vectorize compiler flag, which enables the basic block vectorizer within LLVM.
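For readers who have not tried these flags, here is a minimal sketch of the kind of loop a loop vectorizer targets. The saxpy function and the saxpy.c file name are hypothetical illustrations, not code from the benchmarked test profiles, and whether a given Clang build actually vectorizes this loop depends on its cost model and target.

```c
#include <stddef.h>

/* Hypothetical kernel for illustration only; it is not part of any
 * benchmark discussed in this article.
 *
 * Each iteration reads x[i] and y[i] and writes y[i], with no dependence
 * on other iterations. The restrict qualifiers promise the compiler that
 * x and y do not overlap, which removes one obstacle to vectorization.
 *
 * Assumed build commands for comparing the two settings:
 *   clang -O3 -fvectorize    -c saxpy.c
 *   clang -O3 -fno-vectorize -c saxpy.c
 */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Comparing the generated assembly (for example by adding -S to either command) is one way to check whether packed SSE/AVX instructions were emitted for the loop body.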
The LLVM vectorizers in the SVN code as of Monday yield only a small performance benefit for the HMMer real-world scientific workload. For LLVM 3.3 it's expected that the loop vectorizer will be enabled by default.
Not all workloads will benefit from the vectorizers, of course. For more details on what the LLVM vectorizers are capable of, read the LLVM.org documentation on the current vectorizers.
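One common reason a loop gains nothing from vectorization is a loop-carried dependency. The sketch below is a hypothetical example (the prefix_sum name is made up for illustration) and is not a claim about what limits any of the benchmarks in this article.

```c
#include <stddef.h>

/* Hypothetical in-place running sum, for illustration only.
 *
 * Iteration i reads a[i - 1], which iteration i - 1 has just written.
 * Because each step needs the previous step's result, the iterations
 * cannot simply be executed several at a time in SIMD lanes, so a loop
 * vectorizer would typically be expected to leave this loop scalar.
 */
void prefix_sum(size_t n, float *a)
{
    for (size_t i = 1; i < n; i++)
        a[i] += a[i - 1];
}
```

Code dominated by loops of this shape, or by branches and pointer chasing, would be expected to perform about the same with -fvectorize as with -fno-vectorize.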
The LLVM Loop Vectorizer led to a small performance regression within Himeno.
Unsurprisingly, enabling the LLVM vectorizers increases compile times.
More LLVM/Clang 3.3 compiler benchmarks were uploaded separately in the 1302186-FO-LLVM33FIR21 result file. That result file has all of the details comparing the LLVM/Clang 3.2 stable performance to the LLVM/Clang 3.3 SVN state as of yesterday. Testing was done on the same AMD FX-8350 system running Ubuntu Linux.
More LLVM/Clang 3.3 benchmarks will come as the official release approaches in the coming months. If you are interested in more open-source compiler benchmarks until then, check out the recent articles "PathScale EKOPath 5.0 Beta Compiler Performance" and "Benchmarking The New Optimization Level In GCC 4.8".