Intel Releases OpenVINO 2024.2 With Llama 3 Optimizations, More AVX2 & AVX-512 Optimizations

Written by Michael Larabel in Intel on 17 June 2024 at 02:28 PM EDT.
Intel today released OpenVINO 2024.2, the newest version of its open-source AI toolkit for optimizing and deploying deep learning (AI) inference models across a range of AI frameworks and a broad set of hardware types.

With OpenVINO 2024.2, Intel has continued optimizing for Meta's Llama 3 large language model. OpenVINO 2024.2 brings more Llama 3 optimizations for execution across CPUs, integrated GPUs, and discrete GPUs to further enhance performance while also making memory use more efficient.
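For those unfamiliar with how such models are run through OpenVINO, below is a minimal sketch using the OpenVINO GenAI LLMPipeline API. It assumes a Llama 3 model has already been converted to OpenVINO IR format (e.g. via optimum-intel); the local directory name and device string are placeholders, not anything specified in this release.

```python
# Minimal sketch: running a Llama 3 model with OpenVINO GenAI.
# Assumes the model was already exported to OpenVINO IR format into
# the hypothetical local directory "./llama-3-8b-instruct-ov".
import openvino_genai as ov_genai

# The device can be "CPU", "GPU" (integrated or discrete), etc.
pipe = ov_genai.LLMPipeline("./llama-3-8b-instruct-ov", "CPU")

# Generate up to 100 new tokens for the given prompt.
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```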

OpenVINO 2024.2 also adds support for Phi-3-mini AI models, broader large language model support, support for the Intel Atom Processor X Series, preview support for Intel Xeon 6 processors, and more AVX2/AVX-512 tuning. Intel is seeing a "significant improvement" in second token latency and memory footprint of FP16-weight LLMs when leveraging small batch sizes, via AVX2 on Intel Core CPUs and AVX-512 on Intel Xeon processors.
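The AVX2/AVX-512 code paths are selected automatically by the OpenVINO CPU plugin rather than exposed as a user flag, but the small-batch, latency-sensitive scenario Intel describes maps onto the runtime's standard performance hints. As a loose sketch under that assumption (the model path here is a placeholder):

```python
# Minimal sketch: compiling a model for latency-oriented, small-batch
# CPU inference with the OpenVINO core runtime. "model.xml" is a
# hypothetical IR file; PERFORMANCE_HINT is a standard OpenVINO property.
import openvino as ov

core = ov.Core()
print(core.available_devices)  # e.g. ['CPU', 'GPU']

model = core.read_model("model.xml")
compiled = core.compile_model(
    model,
    "CPU",
    {"PERFORMANCE_HINT": "LATENCY"},  # favor per-token latency over throughput
)
```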

[Image: Intel OpenVINO diagram]


Downloads and more details on the OpenVINO 2024.2 release are available via GitHub.