Llamafile 0.7 Brings AVX-512 Support: 10x Faster Prompt Eval Times For AMD Zen 4

Written by Michael Larabel in Programming on 31 March 2024 at 10:00 AM EDT.
A new release of Llamafile is available this Easter Sunday from the Mozilla Ocho group. Llamafile packages a large language model (LLM) into a single file for distribution and execution, making LLMs much easier for developers and end-users to share and run. Llamafile remains one of the more interesting non-browser projects to come out of Mozilla in recent times and so far appears to have a bright future.

Llamafile builds on Llama.cpp to deliver an entire LLM as a single-file executable that works on most systems and can leverage both CPU and GPU execution, which makes large language models much more convenient to deploy.

With Llamafile 0.7 out today there is finally AVX-512 support! Those testing out Llamafile 0.7 on AVX-512 enabled CPUs like AMD Zen 4 are finding around 10x faster prompt evaluation times with this support. It's a very nice Easter gift for those with AVX-512 capable processors who use Llamafile to run large language models on the CPU.
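For those unsure whether their processor exposes AVX-512, here is a minimal sketch for Linux that reads the kernel's feature flags from /proc/cpuinfo. The helper name `avx512_flags` is made up for illustration and has nothing to do with how Llamafile itself detects CPU features; it simply filters the `flags` line for entries beginning with `avx512`.

```python
from pathlib import Path


def avx512_flags(cpuinfo_text: str) -> set[str]:
    """Return the AVX-512 feature flags found in /proc/cpuinfo-style text.

    Hypothetical helper for illustration only; it looks at the first
    "flags" line, which is how x86 Linux kernels list CPU features.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return {flag for flag in value.split() if flag.startswith("avx512")}
    # No "flags" line (e.g. non-x86 systems): report nothing.
    return set()


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        found = avx512_flags(cpuinfo.read_text())
        if found:
            print("AVX-512 flags:", " ".join(sorted(found)))
        else:
            print("No AVX-512 flags reported by the kernel")
    else:
        print("/proc/cpuinfo not available on this system")
```

An empty result only means the kernel does not report any AVX-512 flags; it says nothing about how much of a speed-up a given CPU would see.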

AMD AM5 CPUs for Easter with Korbinian Starkbier

I've been running Llamafile benchmarks for a few months and look forward to trying out Llamafile 0.7 to measure its performance gains on AVX-512 capable Intel and AMD processors.

Llamafile 0.7 also brings BF16 CPU support, a security fix, various Windows improvements, around 8x faster prompt evaluation on the Raspberry Pi 5 with F16 weights, and various other improvements.

Downloads and more information on Llamafile 0.7 via GitHub.