Llamafile 0.8.7 Brings Fixes, Better ARM Performance & Preps For New Server


  • Phoronix: Llamafile 0.8.7 Brings Fixes, Better ARM Performance & Preps For New Server

    Llamafile has been one of the better new initiatives out of Mozilla in recent years. Llamafile makes it convenient to distribute and run large language models as a single file, supports both CPU and GPU execution, and all around makes AI LLMs much more approachable for end users. Out today is Llamafile 0.8.7 with more performance optimizations and new features...
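The single-file workflow the teaser describes can be sketched as follows. The model filename below is a placeholder, and the flags shown (`-p`, `-ngl`) follow llama.cpp's CLI conventions as inherited by llamafile; this is a usage sketch, not an exact transcript of any particular release:

```shell
# Hypothetical filename; any published .llamafile works the same way.
# A llamafile is a self-contained executable: model weights plus the
# runtime packed into one file, so there is nothing else to install.
chmod +x mistral-7b.llamafile

# One-shot prompt from the CLI; -ngl 999 asks it to offload all layers
# to the GPU when one is available, otherwise it falls back to CPU.
./mistral-7b.llamafile -p "Explain GGUF in one sentence." -ngl 999
```

Running the file with no arguments instead starts the built-in local web chat UI.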


  • #2
    Phoronix, I would love to see a mega CPU shootout around llamafile. Any plans for it?


  • #3
    What would be the point? Maybe only finding which CPU is the smallest/cheapest one able to run llamafile at acceptable performance.


  • #4
    Still confused why Llamafile is getting all the attention. What about llama.cpp, which it is based on? Or the other forks and products like KoboldCpp?


  • #5
    This isn't related to Llamafile specifically, but not totally off topic: one of the uses for Llama I didn't realize until I saw it in action is keyboard recommendations. FUTO Keyboard uses it for transformer predictions, and it works pretty well, even on my LG G7 ThinQ: https://gitlab.futo.org/alex/keyboar.../FUTO-Keyboard


  • #6
    Originally posted by Henk717 View Post
    Still confused why Llamafile is getting all the attention. What about llama.cpp, which it is based on? Or the other forks and products like KoboldCpp?
    Those did get a lot of attention when llama.cpp was released... like a year ago? More?

    Anyway, llama.cpp is to the Llama model as llamafile is to every other ML model. Jan AI is the closest to what llamafile is about, but even that requires some manual labor to get a model supported, or hoping someone else submits a PR. KoboldAI? I guess you can put that on this list, but its model list is limited and the UI is horrendous, requiring expert and dev knowledge...

    llamafile is really just for PoCs or as a CLI wrapper for your one-liner queries; it's not a super useful project for 99.9% of use cases, but then again neither is llama.cpp or KoboldAI.


  • #7
    Llamafile is a fork and KoboldCpp is a fork; that's why I drew the comparison. UI is personal taste, but on model support you are completely wrong. KoboldCpp has the most model support of all the llama.cpp-based applications because the fork kept compatibility with the older formats. It will run every GGUF that is no newer than the latest release (roughly weekly schedule) and still runs the older GGML files. It runs Stable Diffusion models and it runs whisper.cpp. For language models it even has a mode that benchmarks to a CSV.

    If you dislike the UI, it can be paired with any other KoboldAI-API- or OpenAI-API-compatible UI.
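Pairing a different front end (or a short script) with a local server along these lines comes down to a plain JSON POST against an OpenAI-style chat completions endpoint. A minimal sketch using only the Python standard library; the port, URL path, and model name are assumptions for illustration, not documented defaults of KoboldCpp or llamafile:

```python
import json
import urllib.request

# Hypothetical local endpoint; adjust host/port to wherever your
# OpenAI-API-compatible server (KoboldCpp, llamafile, etc.) is listening.
API_URL = "http://localhost:5001/v1/chat/completions"

def build_chat_request(prompt, model="local-model", max_tokens=128):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt):
    """POST the request to the local server and return the reply text."""
    body = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        API_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # Standard OpenAI-style response shape: first choice's message content.
    return data["choices"][0]["message"]["content"]
```

Any UI that speaks this request/response shape can sit in front of the same local server, which is the point the comment above is making.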
