AMD Releases AMD-135M: An Open-Source Small Language Model

Written by Michael Larabel in AMD on 27 September 2024 at 02:05 PM EDT.
AMD today announced "AMD-135M", the company's first publicly released small language model (SLM). AMD-135M is fully open-source -- the training code, dataset, and weights are all being released -- to help in the development of other SLMs and LLMs.

AMD-135M features speculative decoding and was trained from scratch on 670 billion tokens using AMD Instinct MI250 accelerators; training across four MI250 nodes took six days. There is also an AMD-Llama-135M-code variant trained on an additional 20 billion tokens of code data. AMD-135M is based on the LLaMA2 model architecture.
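For those unfamiliar with speculative decoding, the idea is that a small, fast draft model (such as an SLM like AMD-135M) proposes several tokens ahead, and a larger target model then verifies them in a single pass, keeping the longest agreeing prefix. A minimal toy sketch of that accept/reject loop (hypothetical stand-in models, not AMD's actual code):

```python
def target_next(seq):
    # Toy stand-in for the large "target" model: deterministic next-token rule.
    return (seq[-1] + 1) % 10

def draft_propose(seq, k):
    # Toy stand-in for the small "draft" model: mostly agrees with the
    # target, but deliberately disagrees on its 3rd proposal.
    out, cur = [], seq[-1]
    for i in range(k):
        cur = (cur + 1) % 10
        if i == 2:
            cur = (cur + 5) % 10  # injected disagreement
        out.append(cur)
    return out

def speculative_step(prefix, draft, target, k=4):
    # Draft model proposes k tokens cheaply; the target model verifies
    # each one and we keep tokens until the first mismatch, at which
    # point we substitute the target model's own token and stop.
    accepted = []
    for tok in draft(prefix, k):
        expected = target(prefix + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)
            break
    return accepted

# One step yields three tokens instead of one per target-model pass.
print(speculative_step([0], draft_propose, target_next, k=4))
```

In a real deployment the draft and target are neural models and acceptance is probabilistic rather than an exact match, but the payoff is the same: multiple tokens can be committed per expensive target-model evaluation.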

AMD is making all of the AMD-135M model assets open-source in hopes of helping other AI development -- and, for AMD's part, in hopes that the training and inferencing happen on AMD hardware.

More details on the AMD-135M SLM via the AMD blog. AMD-135M is available via HuggingFace and GitHub.