Llamafile 0.8.7 Brings Fixes, Better ARM Performance & Preps For New Server

Written by Michael Larabel in Mozilla on 24 June 2024 at 11:11 AM EDT.
Llamafile has been one of the better new initiatives out of Mozilla in recent years. Llamafile makes it easy to distribute and run large language models as a single file, supports both CPU and GPU execution, and makes LLMs much more approachable for end-users overall. Out today is Llamafile 0.8.7 with more performance optimizations and new features.

After recent Llamafile releases tuned Intel/AMD AVX performance, today's Llamafile 0.8.7 release brings some ARM performance improvements. There is better performance on Arm for legacy and K-quants, along with optimized matrix multiplication for I-quants on AArch64.

Llamafile 0.8.7 also fixes some AMD GPU issues on Windows by now always using tinyBLAS there, improves CPU brand detection, and includes other fixes.

Moving forward, a new Llamafile server is preparing to roll out. Justine Tunney mentioned in the v0.8.7 release announcement on GitHub:
"It should be noted that, in future releases, we plan to introduce a new server for llamafile. This new server is being designed for performance and production-worthiness. It's not included in this release, since the new server currently only supports a tokenization endpoint. However the endpoint is capable of doing 2 million requests per second whereas with the current server, the most we've ever seen is a few thousand."

The patch adding the new Llamafile server notes that it is not only much faster than before but also designed to be crash-proof, reliable, and preempting.

Llamafile continues to look great for easily distributing and running large language models. Learn more about this open-source project via Llamafile.ai.