Intel NPU Library v1.2 Adds Int4 Support & Performance Optimizations
Intel released a new version of its NPU Acceleration Library, the user-space Python library for leveraging the Neural Processing Unit (NPU) found within its Core Ultra "Meteor Lake" laptops as well as upcoming Lunar Lake and Arrow Lake hardware.
The Intel NPU Library makes it easy to exploit the potential of the NPU hardware on systems with the IVPU kernel driver present. Thanks to this library, in just a few lines of Python code it's possible to perform matrix multiplication on the NPU, compile PyTorch models for the NPU, and even run Tiny Llama on the NPU.
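As a rough illustration of the "few lines of Python" claim, a matrix multiplication could be dispatched to the NPU along these lines. The `MatMul` backend class and its constructor arguments are assumptions based on the library's published examples, and the sketch falls back to a plain NumPy matmul when the library (or the NPU) is unavailable:

```python
# Sketch of offloading a matmul via intel_npu_acceleration_library.
# The MatMul class and its (inC, outC, batch) signature are assumptions
# drawn from the library's examples, not a verified API reference.
import numpy as np

batch, inC, outC = 16, 128, 64  # arbitrary illustrative shapes
X1 = np.random.uniform(-1, 1, (batch, inC)).astype(np.float16)
X2 = np.random.uniform(-1, 1, (outC, inC)).astype(np.float16)

try:
    from intel_npu_acceleration_library.backend import MatMul
    mm = MatMul(inC, outC, batch)
    result = mm.run(X1, X2)  # executed on the NPU
except ImportError:
    result = X1 @ X2.T  # CPU fallback with the same semantics

print(result.shape)
```

The same pattern extends to whole models: the library exposes a `compile()` entry point that takes a PyTorch model and returns an NPU-targeted version of it.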
With today's v1.2 update, there is now int4 support plumbed into this library. In addition to now working with int4 data types, the Intel NPU Library 1.2 offers new backend performance optimizations, Scaled Dot Product Attention (SDPA) NPU kernel support, and persistent compilation handling.
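The appeal of int4 is memory: two 4-bit weights fit in a single byte, halving storage versus int8 quantization. The packing sketch below illustrates that idea generically; it is not the library's internal quantization code:

```python
# Generic int4 packing/unpacking illustration (two weights per byte),
# showing why int4 halves weight storage versus int8. Not taken from
# the Intel NPU Library's internals.
import numpy as np

def pack_int4(values: np.ndarray) -> np.ndarray:
    """Pack signed int4 values (range -8..7) two per byte."""
    u = (values.astype(np.int8) & 0x0F).astype(np.uint8)  # two's-complement nibbles
    return (u[0::2] | (u[1::2] << 4)).astype(np.uint8)

def unpack_int4(packed: np.ndarray) -> np.ndarray:
    """Recover the signed int4 values from their packed byte form."""
    lo = (packed & 0x0F).astype(np.int8)
    hi = ((packed >> 4) & 0x0F).astype(np.int8)
    out = np.empty(packed.size * 2, dtype=np.int8)
    out[0::2], out[1::2] = lo, hi
    # Sign-extend: nibble values above 7 represent negatives
    return np.where(out > 7, out - 16, out).astype(np.int8)

w = np.array([-8, -1, 0, 7], dtype=np.int8)
packed = pack_int4(w)
assert packed.nbytes == w.size // 2      # half the bytes of int8 storage
assert np.array_equal(unpack_int4(packed), w)
```

Real int4 inference also carries per-tensor or per-group scale factors to map these integers back to floating point, but the storage saving shown here is the core of it.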
Downloads and more details on the Intel NPU Library 1.2 release are available via GitHub.