Llamafile 0.8.5 Delivers Greater Performance: Tiny Models 2x Faster On Threadripper

  • #1
    Phoronix: Llamafile 0.8.5 Delivers Greater Performance: Tiny Models 2x Faster On Threadripper

    The Mozilla Ocho group has published the newest version of Llamafile, the open-source project that makes it very easy to distribute and run large language models (LLMs) as a single file. Llamafile is an excellent solution for easily sharing and running LLMs, supporting both speedy CPU-based execution and GPU acceleration where available...
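    To see the single-file angle in practice: a launched llamafile serves a llama.cpp-style OpenAI-compatible HTTP API on localhost:8080 by default, so a short script can talk to it. A minimal sketch, assuming a llamafile is already running locally (the startup command, model name, and prompt are illustrative):

    ```python
    # Minimal sketch: query a running llamafile over its local
    # OpenAI-compatible endpoint. Assumes a llamafile was started first,
    # e.g. `./some-model.llamafile --server` (filename illustrative),
    # listening on the default port 8080.
    import json
    import urllib.request

    payload = {
        "model": "local",  # llamafile serves one model; the name is a placeholder
        "messages": [{"role": "user", "content": "Summarize llamafile in one sentence."}],
    }
    req = urllib.request.Request(
        "http://localhost:8080/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
    ```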

  • #2
    Does llamafile depend on the software stack provided by the hardware vendor for compute?

    i.e. ROCm from AMD

  • #3
    Originally posted by Jedibeeftrix View Post
    Does llamafile depend on the software stack provided by the hardware vendor for compute?

    i.e. ROCm from AMD
    For GPU acceleration, yes. I think you can also use ROCr; on NVIDIA you need CUDA.

    CPU performance isn't that bad on the smaller models, though (but they suck).

  • #4
    I look forward to the time when workstations like the one pictured are dirt cheap on eBay.

  • #5
    What size counts as a tiny model? 7B or even smaller?

  • #6
    Originally posted by pWe00Iri3e7Z9lHOX2Qx View Post
    I look forward to the time when workstations like the one pictured are dirt cheap on eBay.
    I see you're playing the long game, then.

    Respect

  • #7
    Originally posted by Jedibeeftrix View Post
    Does llamafile depend on the software stack provided by the hardware vendor for compute?

    i.e. ROCm from AMD
    To my knowledge, yes. If you want a portable solution that can also run on fully open drivers, KoboldCpp is worth checking out. Just like llamafile, it's a fork of llama.cpp and comes with its own more user-friendly UI, API servers, and various backends in one file.

    You could run a K-quant GGUF on Vulkan if you wish to avoid ROCm or CUDA. But if you do use CUDA, all you need is the proprietary blob. I managed to run it on the Manjaro live CD before without installing any packages.

    It also includes a --benchmark option, which may be interesting for the test suite, considering the benchmark can output to CSV and the various backends are easy to launch: at least on the NVIDIA side it's all a single binary. Only ROCm relies on a separate fork.
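    To make that concrete, here is a minimal sketch of scripting such a benchmark run; the binary name, model path, and the exact --benchmark CSV behavior are taken from the comment above rather than verified, so treat them all as assumptions:

    ```python
    # Hypothetical wrapper around the --benchmark mode described above.
    # Assumes ./koboldcpp takes a model path and appends benchmark rows
    # to the given CSV file (binary name, flags, and paths are illustrative).
    import csv
    import subprocess

    subprocess.run(
        ["./koboldcpp", "--model", "model.Q4_K_M.gguf", "--benchmark", "results.csv"],
        check=True,
    )

    # Print whatever columns the benchmark emitted.
    with open("results.csv", newline="") as f:
        for row in csv.DictReader(f):
            print(row)
    ```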

  • #8
    Just a nitpick, but "tiny large language model" sounds off. Maybe they should label it a "portable language model"?
