Intel Releases OpenVINO 2024.1 With More Gen AI & LLM Features
Intel engineers have just released OpenVINO 2024.1, the newest feature release of this excellent open-source AI toolkit. The release continues expanding OpenVINO's features and capabilities, particularly around generative AI (GenAI) and large language models (LLMs).
On the generative AI front, OpenVINO 2024.1 adds Mixtral and URLNet models optimized for Intel Xeon CPUs, optimizes Stable Diffusion 1.5, ChatGLM3-6B, and Qwen-7B for faster performance on Intel Core Ultra "Meteor Lake" processors with their Arc Graphics, and adds support for the Falcon-7B-Instruct LLM.
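For those wanting to try one of these LLMs with OpenVINO, here is a minimal sketch using the optimum-intel integration with Hugging Face Transformers (the prompt and generation parameters are illustrative; `export=True` converts the Hugging Face weights to OpenVINO IR on the fly):

```python
from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

# Falcon-7B-Instruct is one of the models this release adds support for
model_id = "tiiuae/falcon-7b-instruct"
model = OVModelForCausalLM.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```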
OpenVINO 2024.1 also reduces large language model compilation time on Intel processors with Intel Advanced Matrix Extensions (AMX) support, improves LLM compression and performance with oneDNN, INT4, and INT8 on Intel Arc Graphics GPUs, and significantly reduces memory use for smaller GenAI models on Intel Core Ultra processors.
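The INT4/INT8 weight compression mentioned above is exposed through NNCF's `compress_weights` API. A minimal sketch of compressing an already-exported OpenVINO IR model (the file paths, compression ratio, and group size here are illustrative assumptions, not values from the release notes):

```python
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("llm.xml")  # hypothetical exported IR file

# 4-bit symmetric weight compression; ratio controls what fraction of
# weights use INT4, with the remainder falling back to INT8 for accuracy
compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    ratio=0.8,
    group_size=128,
)
ov.save_model(compressed, "llm_int4.xml")
```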
OpenVINO 2024.1 also pulls the Neural Processing Unit (NPU) plug-in for Intel Core Ultra "Meteor Lake" processors into the main GitHub repository rather than relying on an external PyPI package. OpenVINO's JavaScript API is also now more accessible via the npm registry. For OpenVINO on Arm processors, FP16 inference is now supported by default for convolutional neural networks.
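With the NPU plug-in now bundled, targeting the Meteor Lake NPU should be a matter of standard device selection through the core OpenVINO API. A minimal sketch (the model path is illustrative):

```python
import openvino as ov

core = ov.Core()
# On a Meteor Lake system with the plug-in working this should list
# something like ['CPU', 'GPU', 'NPU']
print(core.available_devices)

model = core.read_model("model.xml")  # hypothetical IR file
device = "NPU" if "NPU" in core.available_devices else "CPU"
compiled = core.compile_model(model, device)
```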
Overall OpenVINO 2024.1 is looking like a great release. I look forward to trying out OpenVINO 2024.1 and running some fresh OpenVINO benchmarks, especially if the Core Ultra NPU plug-in is now in good shape. The OpenVINO 2024.1 toolkit can be downloaded from GitHub.