OpenBLAS 0.3.28 Brings More Optimizations, Meteor Lake & Emerald Rapids Support

Written by Michael Larabel in Programming on 8 August 2024 at 09:06 PM EDT.
OpenBLAS 0.3.28 was released today. This open-source optimized BLAS library caters to a wide range of processors spanning various architectures, and the new release brings yet more optimizations and new CPU-optimized code paths.

OpenBLAS 0.3.28 reworks the "HUGETLB" implementation inherited from GotoBLAS, improves multi-threaded GEMM performance for certain matrices, improves BLAS3 performance on large multi-core systems via enhanced parallelism, improves the performance of initial memory allocation, and includes a range of other common optimizations and fixes.
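
For readers less familiar with the terminology, GEMM is the BLAS Level 3 general matrix multiply, which computes C = alpha*A*B + beta*C. Below is a minimal pure-Python sketch of that operation, for illustration only. The function name `gemm` and the list-of-lists matrix layout are assumptions made for this example; this is not OpenBLAS code or its API. OpenBLAS performs the same computation with hand-tuned, multi-threaded kernels selected per CPU.

```python
def gemm(alpha, a, b, beta, c):
    """Reference GEMM: return alpha * (a x b) + beta * c.

    Hypothetical helper for illustration; matrices are lists of rows.
    a is m x k, b is k x n, c is m x n.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")

    out = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            row.append(alpha * acc + beta * c[i][j])
        out.append(row)
    return out


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    c = [[1.0, 1.0], [1.0, 1.0]]
    print(gemm(2.0, a, b, 3.0, c))
```

The triple loop does on the order of m*n*k multiply-adds. That cost is why optimized libraries put so much effort into blocking, vectorization, and threading for this one routine.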

OpenBLAS 0.3.28 also brings official support for Intel Xeon Emerald Rapids and Intel Core Ultra (Meteor Lake) processors. There is also now auto-detection of Zhaoxin KX-7000 CPUs, a fix to auto-detection of old Intel Prescott CPUs, improved compiler options for CMake and LLVM builds on AVX-512 capable targets, and other x86_64 optimizations.

On the ARM64 side, there is improved GEMM performance on the Arm Neoverse V1, new optimized kernels for the A64FX, and other changes. This BLAS library update also includes a number of LoongArch, RISC-V, and POWER optimizations.

Downloads and more details on OpenBLAS 0.3.28, this leading open-source BLAS implementation, are available via GitHub.