
Llamafile 0.8 Releases With LLaMA3 & Grok Support, Faster F16 Performance



    Phoronix: Llamafile 0.8 Releases With LLaMA3 & Grok Support, Faster F16 Performance

    Llamafile has been quite an interesting project out of Mozilla's Ocho group in the era of AI. Llamafile makes it easy to run and distribute large language models (LLMs) that are self-contained within a single file. Building off Llama.cpp, it lets an entire LLM ship as a single file with both CPU and GPU execution support. Llamafile 0.8 is out now to join in on the LLaMA3 fun, as well as delivering other model support and enhanced CPU performance...
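
    Since a llamafile bundles the weights and the llama.cpp runtime into one self-contained executable, running a model locally is essentially a two-step affair: mark the downloaded file executable, then launch it. A minimal sketch, assuming the model file named in the comment below has already been downloaded (the existence check keeps the script harmless when it has not):

    ```shell
    # Hypothetical usage sketch: the model file name is taken from a comment
    # in this thread; the download itself is not shown here.
    MODEL=llava-v1.5-7b-q4.llamafile

    if [ -f "$MODEL" ]; then
      chmod +x "$MODEL"   # a llamafile is itself an executable
      ./"$MODEL"          # starts the bundled llama.cpp runtime
    else
      echo "download $MODEL first, then re-run this script"
    fi
    ```

    Per the article, the same file supports both CPU and GPU execution, so no separate per-backend builds are needed.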


  • #2
    I tried two llamafiles (llava-v1.5-7b-q4.llamafile and mistral-7b-instruct-v0.2.Q4_0.llamafile), both about 4 GB in size.
    (They run fine on a Ryzen 9 and on an i3-10100, without GPU support.)

    And I'm both surprised and shocked at what these small LLMs can do.
