Intel oneDNN 3.0 Being Prepared With More Performance Optimizations
Intel oneDNN is the oneAPI library for building deep learning applications, optimized for CPUs, GPUs, and other XPUs. On Friday afternoon, Intel engineers issued the release candidate for the upcoming oneDNN 3.0.
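For those unfamiliar with the library, below is a minimal sketch of what driving oneDNN looks like from its C++ API in the 3.x era, where primitive descriptors are created directly from the engine plus the operation parameters. The tensor shape and ReLU operation here are just placeholder choices for illustration.

```cpp
#include <oneapi/dnnl/dnnl.hpp>

int main() {
    using namespace dnnl;

    // Pick a CPU engine and an execution stream on it.
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    // Describe a simple f32 NCHW tensor and allocate source/destination memory.
    memory::desc md({1, 3, 224, 224}, memory::data_type::f32,
                    memory::format_tag::nchw);
    memory src(md, eng), dst(md, eng);

    // In oneDNN 3.0 the primitive descriptor is built straight from the engine
    // and operation parameters (the separate op-descriptor step was dropped).
    eltwise_forward::primitive_desc relu_pd(eng, prop_kind::forward_inference,
            algorithm::eltwise_relu, md, md, /*alpha=*/0.f);
    eltwise_forward relu(relu_pd);

    // Execute the primitive and wait for completion.
    relu.execute(strm, {{DNNL_ARG_SRC, src}, {DNNL_ARG_DST, dst}});
    strm.wait();
    return 0;
}
```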
Intel oneDNN 3.0 is bringing more performance optimizations for 4th Gen Xeon Scalable "Sapphire Rapids" processors, beyond the various Sapphire Rapids optimizations introduced during the prior 2.x series. There are also FP16 support improvements and initial optimizations in oneDNN 3.0 for Intel Xeon Scalable "Granite Rapids" processors.
In addition to the Intel CPU optimizations, oneDNN 3.0 continues optimizing for Intel's new Max Series hardware among other Intel and non-Intel devices: there are performance improvements for the Intel Data Center GPU Max Series "Ponte Vecchio" as well as improved performance for Intel Arc Graphics and the Intel Data Center GPU Flex Series.
The oneDNN library has supported 64-bit Arm for some time, and the oneDNN 3.0 release improves AArch64 performance for CPUs with the Scalable Vector Extension (SVE), adds various SVE 512 optimizations, and improves FP16 performance when using the Arm Compute Library.
Also seeing some love with oneDNN 3.0 is better INT8 GEMM performance on 64-bit IBM Power hardware. The oneDNN 3.0 release is also bringing improvements atop the existing NVIDIA and AMD GPU support.
Aside from the CPU/GPU performance optimizations, the oneDNN 3.0 release is also introducing a new quantization scheme, experimental Graph API support, support for the Intel DPC++/C++ Compiler 2023, and other improvements. With the v3.0 milestone also comes the removal of previously deprecated oneDNN APIs.
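The new quantization scheme is one of the more visible API changes: per-argument scales are now supplied at execution time rather than being baked into the primitive descriptor as constant output scales. Below is a rough sketch of that runtime-scales idea, assuming an INT8 matmul with a single scale on the weights; the function name and memory handles are hypothetical, and the attribute/argument usage reflects my reading of the 3.0 API rather than anything quoted from the release notes.

```cpp
#include <oneapi/dnnl/dnnl.hpp>

using namespace dnnl;

// Illustrative helper: INT8 matmul where the weights scale is a run-time value.
void int8_matmul_example(engine &eng, stream &strm,
                         memory &src_s8, memory &wei_s8, memory &dst_f32,
                         memory &wei_scale_f32) {
    // Declare that the weights argument will carry a single (mask = 0) scale,
    // with the actual value provided later at execution time.
    primitive_attr attr;
    attr.set_scales_mask(DNNL_ARG_WEIGHTS, 0);

    matmul::primitive_desc pd(eng, src_s8.get_desc(), wei_s8.get_desc(),
                              dst_f32.get_desc(), attr);
    matmul mm(pd);

    // The scale is passed as its own memory object among the execution arguments.
    mm.execute(strm, {{DNNL_ARG_SRC, src_s8},
                      {DNNL_ARG_WEIGHTS, wei_s8},
                      {DNNL_ARG_DST, dst_f32},
                      {DNNL_ARG_ATTR_SCALES | DNNL_ARG_WEIGHTS, wei_scale_f32}});
    strm.wait();
}
```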
Downloads and more details on the new Intel oneDNN 3.0 release candidate can be found via GitHub.