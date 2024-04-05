
OpenBLAS 0.3.27 Adds C-SKY Arch, Improved GEMM For AMD Zen & Sapphire Rapids Fixes
OpenBLAS 0.3.27 brings initial support for the C-SKY architecture, caps the maximum number of threads for GEMM / GETRF / POTRF to avoid under-utilized or idle threads, improves multi-threaded POTRF performance on all platforms, speeds up OpenMP thread management, and delivers various other multi-threading and general enhancements to this great BLAS library.
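For a rough idea of why capping threads helps, here is a minimal Python sketch of a workload-based cap. It is an illustration only and not OpenBLAS's actual heuristic: the function name `capped_thread_count`, the `min_flops_per_thread` parameter, and its default value are all hypothetical, chosen just to show the principle that small problems should not wake up every available thread.

```python
def capped_thread_count(m, n, k, available_threads, min_flops_per_thread=1_000_000):
    """Hypothetical cap on worker threads for an (m x k) * (k x n) GEMM.

    NOT OpenBLAS's real logic: the threshold is an assumed value used only
    to illustrate keeping threads from sitting idle on small problems.
    """
    # A GEMM of these dimensions performs roughly 2*m*n*k floating-point operations.
    flops = 2 * m * n * k
    # Number of threads that would each receive at least the assumed minimum work.
    useful_threads = max(1, flops // min_flops_per_thread)
    # Never exceed what the machine offers, never go below one thread.
    return min(available_threads, useful_threads)


if __name__ == "__main__":
    for dims in [(8, 8, 8), (100, 100, 100), (4096, 4096, 4096)]:
        print(dims, capped_thread_count(*dims, available_threads=64))
```

Under these assumptions a tiny 8x8x8 multiply stays on a single thread, while a 4096x4096x4096 multiply is free to use all 64 threads.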
This release has changes for both AMD and Intel processors.
OpenBLAS 0.3.27 also has a number of x86_64 fixes, including corrected LLVM compiler options for Intel Sapphire Rapids and improved fallbacks for Sapphire Rapids. On the AMD side, there is improved GEMM performance for AMD Zen targets.
Besides the x86_64 and C-SKY work, the new OpenBLAS release has further ARM tuning, including initial support for the Cortex-A76 processor cores and Neoverse-V2 support within the DYNAMIC_ARCH builds. There are also DGEMM and SGEMM performance optimizations for IBM POWER, X280 CPU support in the RISC-V space, various LoongArch 64-bit optimizations, a few MIPS fixes, and more. This is a rather big release for the OpenBLAS library.
Downloads and the full list of OpenBLAS 0.3.27 changes are available via GitHub.