OpenBLAS 0.3.8 Brings More AVX2/AVX512 Kernels, Other Optimizations
For those using OpenBLAS as their BLAS (Basic Linear Algebra Subprograms) implementation, OpenBLAS 0.3.8 was released this weekend, bringing more AVX2/AVX-512 kernels and other optimizations.
OpenBLAS continues striving to compete with Intel's MKL and other optimized BLAS implementations, and the additional AVX2 and AVX-512 kernels should help performance on the latest Intel and AMD CPUs. There is now an AVX-512 DGEMM kernel, the AVX-512 SGEMM kernel was "significantly" improved, and there are new AVX-512 optimized kernels for CGEMM and ZGEMM. On the AVX2 front, the kernels for STRMM, SGEMM, and CGEMM are said to have been significantly sped up, along with new kernels for CGEMM3M and ZGEMM3M.
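For readers less familiar with the routine names: in BLAS naming, the leading letter gives the data type (S for single-precision real, D for double-precision real, C for single-precision complex, Z for double-precision complex), and GEMM is the general matrix-matrix multiply that computes C = alpha*A*B + beta*C. The sketch below is a naive pure-Python reference for that operation, only to show what the optimized kernels compute. It is not how OpenBLAS implements it, the function name `gemm` is made up for this illustration, and it ignores the transpose options of the real routine.

```python
def gemm(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for matrices given as nested lists.

    Illustrative reference only: a hypothetical helper, not an OpenBLAS API.
    a is m x k, b is k x n, c is m x n (row-major lists of rows).
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must have shape m x n")

    # Start from the scaled copy of c, then accumulate alpha * a * b into it.
    out = [[beta * c[i][j] for j in range(n)] for i in range(m)]
    for i in range(m):
        for p in range(k):
            scaled = alpha * a[i][p]
            for j in range(n):
                out[i][j] += scaled * b[p][j]
    return out


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(2.0, a, b, 0.5, c)
print(result)  # [[38.5, 44.5], [86.5, 100.5]]
```

An optimized kernel produces the same result but blocks the loops for cache reuse and uses wide vector instructions such as AVX2 or AVX-512 for the inner accumulation, which is where the speed-ups in this release come from.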
OpenBLAS 0.3.8 also adds QEMU virtual CPU detection, Intel Goldmont Plus CPU auto-detection, ARMv8 performance optimizations, various POWER optimizations, integration of LAPACK 3.9.0, CMake build system improvements, and other general optimizations. There is also GCC 10 compiler support and improved compilation with g95 and non-GNU versions of the ld linker. Rounding out the release is official NetBSD support.
More details on the OpenBLAS 0.3.8 release via GitHub.