Rusticl Capable Of Running Tinygrad For LLaMA Model
Mesa's Rusticl OpenCL implementation, written in Rust, turns out to already be capable of running the Tinygrad open-source deep learning software with its OpenCL back-end for running the LLaMA model.
Tinygrad is the deep learning framework out of Tiny Corp, which is run by George Hotz. He has been working on Tinygrad with an emphasis on using the AMD ROCm compute stack, but it turns out Mesa's Rusticl OpenCL state tracker for Gallium3D is good enough for Tinygrad too.
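For those wanting to try the combination themselves, a minimal sketch follows. It assumes a Mesa build with Rusticl compiled in; `RUSTICL_ENABLE` is Mesa's documented switch for enabling Rusticl per Gallium driver, while `GPU=1` is the environment variable tinygrad uses to select its OpenCL back-end. The `examples/llama.py` path refers to the LLaMA example script in the tinygrad repository; check the repo for the current invocation and model-weight requirements.

```shell
# Rusticl is disabled by default; enable it for the desired Gallium driver.
# radeonsi covers AMD GPUs such as the Radeon RX 6700 XT used in this testing.
export RUSTICL_ENABLE=radeonsi

# GPU=1 tells tinygrad to use its OpenCL back-end, which will then pick up
# Rusticl's OpenCL platform.
GPU=1 python3 examples/llama.py
```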
Red Hat's David Airlie has begun testing and profiling Tinygrad both with the ACO compiler back-end and the AMDGPU LLVM back-end. In the coming week Airlie is also likely to test Rusticl against the ROCm OpenCL back-end to see how Mesa's performance compares to AMD's official open-source Linux GPU compute stack.
Airlie's testing thus far has been with a Radeon RX 6700 XT graphics card. He has found that using Rusticl with ACO for Tinygrad is around four times faster at compiling than the AMDGPU LLVM back-end, but the run-time performance is lower and less optimized -- at least until the ACO experts get around to optimizing it further for compute.
Those interested can find more details on Airlie's blog.
Those wanting to learn more about Tinygrad can do so via tinygrad on GitHub.