OpenBLAS 0.3.9 Released With More AVX-512 Tuning, Arm Neoverse N1 Support
OpenBLAS 0.3.8 was released just shy of a month ago, and this popular Basic Linear Algebra Subprograms implementation has now been succeeded by OpenBLAS 0.3.9.
OpenBLAS 0.3.9 continues optimizing for x86_64 and other CPU architectures. On the x86_64 front there are a few fixes for long-standing bugs, corrected CPU detection code for Goldmont+ and Ice Lake, a fix for Skylake-X compilation on MinGW, and continued AVX work. The latest on the Advanced Vector Extensions front is improved AVX-512 GEMM3M code, an AVX-512 kernel for STRMM, and better AVX2 GEMM kernel performance.
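For readers unfamiliar with the GEMM3M name: it refers to complex matrix multiplication carried out with three real matrix products instead of the usual four. The pure-Python sketch below illustrates only that general idea. The function names (`matmul`, `gemm3m` and the helpers) are hypothetical, and this is not OpenBLAS's AVX-512 kernel or its API.

```python
# Illustrative sketch of the "3M" trick for complex matrix multiplication.
# All names here are hypothetical; this is not OpenBLAS code.

def matmul(a, b):
    """Plain real matrix product of two lists-of-lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def gemm3m(ar, ai, br, bi):
    """Compute (ar + i*ai) @ (br + i*bi) using three real products.

    Returns the real and imaginary parts of the result separately.
    """
    t1 = matmul(ar, br)                    # real * real
    t2 = matmul(ai, bi)                    # imag * imag
    t3 = matmul(add(ar, ai), add(br, bi))  # (real + imag) * (real + imag)
    cr = sub(t1, t2)                       # real part: t1 - t2
    ci = sub(sub(t3, t1), t2)              # imag part: t3 - t1 - t2
    return cr, ci

# Small example: real and imaginary parts of two 2x2 complex matrices.
ar, ai = [[1, 3], [0, 2]], [[2, -1], [1, 0]]
br, bi = [[2, 0], [1, 1]], [[0, 1], [-1, 3]]
cr, ci = gemm3m(ar, ai, br, bi)
```

The trade-off is one fewer matrix multiplication in exchange for extra additions and somewhat different floating-point rounding behavior, which is why it is typically offered as a separate routine rather than a drop-in replacement for regular complex GEMM.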
ARM support has seen growing OpenBLAS work given the higher-performing chips coming to market. With OpenBLAS 0.3.9 there is now support for the Arm Neoverse N1 and Ampere's eMAG 8180, better performance of the blas_lock code, a performance fix for TSV110 servers, and some fixes for the older ARMv7 support.
OpenBLAS 0.3.9 is rounded out by fixes for MIPS64 and POWER too. The complete list of OpenBLAS 0.3.9 changes is available via GitHub.