OpenBLAS 0.3.29 Brings Auto-Detection For Intel Granite Rapids, Apple M4 & AMD Zen 5

Written by Michael Larabel in Programming on 12 January 2025 at 10:58 AM EST.
OpenBLAS 0.3.29 is out today as a big update for this widely-used, open-source implementation of the Basic Linear Algebra Subprograms (BLAS) and LAPACK APIs.

OpenBLAS 0.3.29 brings improved thread scaling for multi-threaded SBGEMV and TRTRI, various multi-threading fixes, improved documentation, and other general fixes.

When it comes to CPU/platform-specific work, there is initial support for detecting Apple M4 SoCs, various ARM64 performance optimizations, a number of x86_64 improvements, improved CGEMM and ZGEMM kernels for POWER10, many LoongArch 64-bit improvements, and some tuning/optimizations for RISC-V.

On the x86_64 side for OpenBLAS 0.3.29 there is CPU auto-detection for Intel Granite Rapids processors, auto-detection for AMD Zen 5 series processors, optimized SOMATCOPY_CT for AVX-capable targets, and a variety of other fixes/optimizations.
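
For readers who want to check what the auto-detection picked on their own system, here is a minimal sketch. It assumes a shared OpenBLAS library is installed where the system loader can find it, and that the build exports the unsuffixed openblas_get_corename() extension function declared in OpenBLAS's cblas.h. The helper name detected_openblas_core is made up for this illustration. Builds with renamed or suffixed symbols will simply yield None here.

```python
import ctypes
import ctypes.util


def detected_openblas_core():
    """Return the core name OpenBLAS reports at runtime, or None if unavailable.

    Hypothetical helper for illustration; not part of OpenBLAS itself.
    """
    # find_library returns None when no system-wide OpenBLAS can be located.
    path = ctypes.util.find_library("openblas")
    if path is None:
        return None
    try:
        lib = ctypes.CDLL(path)
        # Assumption: the build exports the unsuffixed symbol name.
        get_corename = lib.openblas_get_corename
    except (OSError, AttributeError):
        # The library failed to load, or the symbol is missing or renamed.
        return None
    # The C function returns a NUL-terminated char*.
    get_corename.restype = ctypes.c_char_p
    name = get_corename()
    return name.decode("ascii", "replace") if name else None


if __name__ == "__main__":
    print(detected_openblas_core())
```

The string returned is the name of the kernel target OpenBLAS selected, which is not necessarily the marketing name of the CPU. On builds with runtime dispatch (DYNAMIC_ARCH), setting the OPENBLAS_VERBOSE=2 environment variable should also make the library print the chosen core at load time.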

Downloads and more details on OpenBLAS 0.3.29 are available via GitHub.
