OpenBLAS 0.3.14 Released With Performance Improvements For AMD Ryzen, POWER10
On x86_64, OpenBLAS 0.3.14 adds an optimized BFloat16 GEMM kernel for Intel Cooper Lake processors and auto-detection for Rocket Lake and Tiger Lake, while AMD Ryzen processors see improved performance from the SASUM / DASUM / SROT / DROT kernels. The x86_64 code also fixes its detection of AMD's Clang-based AOCC compiler, adds support for the BLAS/CBLAS tests on Windows, and carries other fixes.
Outside of x86_64, on the POWER front there are now optimized POWER10 kernels for SSCAL / DSCAL / CSCAL / ZSCAL / SROT / DROT / CDOT / SASUM / DASUM. Performance of other existing kernels on IBM POWER10 is improved as well. The POWER code can also now be compiled with NVIDIA's HPC compiler.
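For readers less familiar with the Level 1 BLAS names above, here is a minimal pure-Python sketch of what the real double-precision variants compute. The helper names `dscal`, `dasum`, and `drot` are illustrative only and are not OpenBLAS's API; the real routines operate in place on strided arrays, whereas this sketch returns new lists for clarity.

```python
# Reference semantics of three Level 1 BLAS operations (real, double precision).
# These are hypothetical helpers for illustration, not OpenBLAS functions.

def dscal(alpha, x):
    """SCAL: scale every element of x by alpha."""
    return [alpha * xi for xi in x]

def dasum(x):
    """ASUM: sum of the absolute values of the elements of x."""
    return sum(abs(xi) for xi in x)

def drot(x, y, c, s):
    """ROT: apply a plane rotation to the point pairs (x[i], y[i]).

    x'[i] = c * x[i] + s * y[i]
    y'[i] = c * y[i] - s * x[i]
    """
    x_rot = [c * xi + s * yi for xi, yi in zip(x, y)]
    y_rot = [c * yi - s * xi for xi, yi in zip(x, y)]
    return x_rot, y_rot
```

Each of these is a single pass over the vector with very little arithmetic per element, which is why architecture-specific kernels such as the ones in this release can make a measurable difference.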
On the ARM64 front there is support for compiling with the NVIDIA HPC and NAG Fortran compilers. A RISC-V compilation fix, several new CBLAS interfaces (CROTG, ZROTG, CSROT, and ZDROT), and various other compiler fixes round out this release.
More details and downloads for the OpenBLAS 0.3.14 release are available via GitHub.