New "SCALE" Software Allows Natively Compiling CUDA Apps For AMD GPUs
While there have been various efforts like HIPIFY to help translate CUDA source code to portable C++ code for AMD GPUs, and then the previously-AMD-funded ZLUDA to allow CUDA binaries to run on AMD GPUs via a drop-in replacement for the CUDA libraries, there's a new contender in town: SCALE. SCALE is now public as a GPGPU toolchain that allows CUDA programs to run natively on AMD graphics processors.
SCALE has been seven years in the making by UK firm Spectral Compute. SCALE is a "clean room" implementation of CUDA that leverages some open-source LLVM components to natively compile CUDA sources for AMD GPUs without modification -- a big benefit over alternative projects that only assist in code translation by transpiling to another "portable" language or that require other manual developer steps.
SCALE takes CUDA programs as-is and can even handle CUDA programs relying on inline NVPTX assembly. The SCALE compiler is also a drop-in replacement for NVIDIA's nvcc compiler and has a runtime that "impersonates" the NVIDIA CUDA Toolkit.
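To make "CUDA programs relying on inline assembly" concrete, here is a minimal sketch of the kind of source file that description refers to. It is ordinary CUDA written for illustration, not code taken from SCALE or its documentation; the names add_with_ptx and add_arrays are made up for this example, and whether SCALE accepts this particular PTX instruction is an assumption rather than something tested here.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: adds two integers with an inline PTX instruction
// instead of the C++ `+` operator.
__device__ int add_with_ptx(int a, int b) {
    int result;
    asm("add.s32 %0, %1, %2;" : "=r"(result) : "r"(a), "r"(b));
    return result;
}

// Hypothetical kernel: element-wise sum of two arrays, one thread per element.
__global__ void add_arrays(const int* a, const int* b, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = add_with_ptx(a[i], b[i]);
    }
}

int main() {
    const int n = 256;
    const size_t bytes = n * sizeof(int);
    int host_a[n], host_b[n], host_out[n];
    for (int i = 0; i < n; ++i) {
        host_a[i] = i;
        host_b[i] = 2 * i;
    }

    // Allocate device buffers and copy the inputs over.
    int *dev_a = nullptr, *dev_b = nullptr, *dev_out = nullptr;
    cudaMalloc(&dev_a, bytes);
    cudaMalloc(&dev_b, bytes);
    cudaMalloc(&dev_out, bytes);
    cudaMemcpy(dev_a, host_a, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dev_b, host_b, bytes, cudaMemcpyHostToDevice);

    // Launch enough 128-thread blocks to cover all n elements.
    add_arrays<<<(n + 127) / 128, 128>>>(dev_a, dev_b, dev_out, n);

    // Copy the result back; this also waits for the kernel to finish.
    cudaError_t status = cudaMemcpy(host_out, dev_out, bytes, cudaMemcpyDeviceToHost);
    if (status != cudaSuccess) {
        std::printf("CUDA error: %s\n", cudaGetErrorString(status));
        return 1;
    }

    // Verify every element against the expected sum i + 2*i.
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        if (host_out[i] != 3 * i) ++mismatches;
    }
    std::printf("mismatches: %d\n", mismatches);

    cudaFree(dev_a);
    cudaFree(dev_b);
    cudaFree(dev_out);
    return mismatches == 0 ? 0 : 1;
}
```

The point of the drop-in claim is that a file like this would be built with the same compiler invocation a project already uses for nvcc, with no HIP port of the kernel or the runtime calls.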
SCALE has been successfully tested with software like Blender, llama.cpp, XGBoost, FAISS, GOMC, stdgpu, Hashcat, and even NVIDIA Thrust. Spectral Compute has been testing SCALE across RDNA2 and RDNA3 GPUs along with basic testing on RDNA1, while Vega support is still a work-in-progress.
At its heart, SCALE consists of an nvcc-compatible compiler that can compile CUDA code for AMD GPUs, implementations of the CUDA runtime and driver APIs for AMD GPUs, and open-source wrapper libraries that in turn interface with AMD's ROCm libraries.
While ZLUDA, for instance, was quietly funded by AMD, I'm told by Spectral Compute that they've been funding this development themselves since 2017 via their consulting business. The only immediate downside seen with SCALE is that it is not itself open-source software, but at least there is a free edition license available for users.
Those wishing to learn more about this very promising SCALE effort for compiling and running CUDA code on AMD GPUs can see the announcement on scale-lang.com. Head straight to the documentation if you want to try out SCALE. It's compatible with ROCm 6 and I look forward to trying it out for benchmarking as time allows.