OpenBLAS 0.3.8 Brings More AVX2/AVX512 Kernels, Other Optimizations
Written by Michael Larabel in Programming on 10 February 2020 at 10:04 AM EST.
For those using OpenBLAS as their BLAS (Basic Linear Algebra Subprograms) implementation, OpenBLAS 0.3.8 was released this weekend, bringing more AVX2/AVX-512 kernels and other optimizations.

OpenBLAS continues striving to compete with Intel's MKL and other optimized BLAS implementations, and the additional AVX2 and AVX-512 kernels should help performance on the latest Intel and AMD CPUs. There is now an AVX-512 DGEMM kernel, the AVX-512 SGEMM kernel was "significantly" improved, and there are new AVX-512 optimized kernels for CGEMM and ZGEMM. On the AVX2 front, the kernels for STRMM, SGEMM, and CGEMM are said to have been significantly sped up, along with new kernels for CGEMM3M and ZGEMM3M.
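For readers less familiar with the routine names: GEMM is the BLAS general matrix-matrix multiply, which computes C = alpha*A*B + beta*C, and the S/D/C/Z prefixes select single-precision real, double-precision real, single-precision complex, and double-precision complex data. The snippet below is only a minimal reference sketch of that operation in plain Python, not OpenBLAS code; the name `gemm_reference` is hypothetical, and the optimized AVX2/AVX-512 kernels compute the same result far faster using vector instructions.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices stored as lists of rows.

    Illustrative only: this shows what a GEMM call computes, not how an
    optimized BLAS kernel implements it.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape (rows of a) x (columns of b)")

    result = []
    for i in range(m):
        out_row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            dot = sum(a[i][p] * b[p][j] for p in range(k))
            out_row.append(alpha * dot + beta * c[i][j])
        result.append(out_row)
    return result


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm_reference(2.0, a, b, 0.5, c)
print(result)
```

Because Python numbers can be complex, the same sketch also covers the complex case that CGEMM and ZGEMM handle; the precision distinctions between the S/D and C/Z variants are not modeled here.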

OpenBLAS 0.3.8 also adds QEMU virtual CPU detection, Intel Goldmont Plus CPU auto-detection, ARMv8 performance optimizations, various POWER optimizations, integration of LAPACK 3.9.0, CMake build system improvements, and other general optimizations. There is also GCC 10 compiler support and improved compilation with g95 and non-GNU versions of the LD linker. Rounding out the release is official NetBSD support.

More details on the OpenBLAS 0.3.8 release are available via GitHub.