Lisa Su Reaffirms Commitment To Improving AMD ROCm Support, Engaging The Community
Prominent software engineer George Hotz's Tiny Corp, the company behind the tinygrad neural network framework, is building the "tinybox": a $15,000 HPC/AI-focused system aiming for 738 FP16 TFLOPS using AMD EPYC CPUs and, hopefully, AMD GPUs. But for AMD GPUs to work out in the tinybox, the open-source AMD compute stack needs improvements so that AMD hardware can run and perform well for MLPerf.
George Hotz has been quite vocal about the problems currently facing the AMD ROCm compute stack and wanted to personally work on improvements to it. A few weeks ago he had a falling out with AMD's compute efforts and decided to no longer pursue AMD GPU compute options. But it appears there has been another change of course, and he is back to focusing on improving AMD GPU compute support with the open-source stack.
Hotz tweeted on Friday that Tiny Corp is back to working on the "get AMD on MLPerf plan" after speaking with Lisa Su. Following that conversation, he believes "things will get better and AMD will begin to develop in public."
Lisa Su later tweeted to reaffirm her commitment to working with the community and improving AMD's support around ROCm on Radeon.
Thanks for connecting @realGeorgeHotz. Appreciate the work you and tiny corp are doing. We are committed to working with the community and improving our support. More to come on ROCm on @radeon soon. Lots of work ahead but excited about what we can do together.
— Lisa Su (@LisaSu) June 16, 2023
Hopefully this will include ROCm support on more hardware than just AMD's workstation graphics cards and Instinct accelerators. One of the biggest frustrations right now is that while NVIDIA CUDA has been broadly supported across all of NVIDIA's hardware for generations, ROCm is officially supported on just a small subset of cards. For consumer cards, RDNA/RDNA2/RDNA3 support has been slow to materialize. ROCm's software ecosystem has been improving over the years but still isn't as easy or straightforward as the mature CUDA landscape. Many in the community would also like to see AMD address bugs more quickly and make it easier to deploy the ROCm stack outside of the few enterprise Linux distributions officially endorsed by the company.