Red Hat Developers Working Towards A Vendor-Neutral Compute Stack To Take On NVIDIA's CUDA

Written by Michael Larabel in Red Hat on 17 November 2018 at 11:00 AM EST.

At this week's Linux Plumbers Conference, David Airlie began talking about the possibility of a vendor-neutral compute stack across Intel, Radeon, and NVIDIA GPU platforms that could potentially take on NVIDIA's CUDA dominance.

Work on open-source NVIDIA (Nouveau) SPIR-V compute support has been under way all year and is ongoing, though it has not yet reached mainline Mesa. That effort has largely been the work of Karol Herbst and Rob Clark, both open-source GPU driver developers at Red Hat. Other compute-motivated open-source driver/infrastructure work has come out of Red Hat as well, such as Jerome Glisse's ongoing kernel work on Heterogeneous Memory Management (HMM). There is also the Radeon RADV driver, which Red Hat's David Airlie co-founded and continues to contribute to significantly. Beyond that, there have been other graphics/compute contributions, with Red Hat remaining one of the largest upstream contributors to the ecosystem.

Pair that work with the vibrant and rapidly growing ecosystem around SPIR-V/Vulkan and it turns out all of this could potentially culminate in something very interesting... A GPU compute stack that is vendor-neutral, open-source, and largely shared across GPU vendors, and that could potentially take on the dominance of NVIDIA's CUDA for GPU computing, particularly in HPC.

Longtime open-source graphics driver contributor and DRM subsystem maintainer, David Airlie of Red Hat, presented at this week's Linux Plumbers Conference on such a compute stack. "Until now the clear market leader has been the CUDA stack from NVIDIA, which is a closed source solution that runs on Linux. Open source applications like tensorflow (AI/ML) rely on this closed stack to utilise GPUs for acceleration...This talk will discuss the possibility of creating a vendor neutral reference compute stack based around open source technologies and open source development models that could execute compute tasks across multiple vendor GPUs. Using SYCL/OpenCL/Vulkan and the open-source Mesa stack, as the basis for a future task that development of tools and features on top of as part of a desktop OS."

His talk focused not only on NVIDIA's closed-source Compute Unified Device Architecture (CUDA) but also on how AMD's ROCm/HIP compute stack is open-source but vendor-specific. There is also Intel's NEO compute driver project, which implements OpenCL but is likewise focused just on Intel's own hardware. There is a lot of fragmentation in the GPU compute space, even among open-source projects, with not a whole lot of common/shared code. Each project also usually relies upon its own LLVM/Clang implementation.

The proposed GPU compute stack would be vendor neutral, offer a shared code-base as much as possible, be based upon open (Khronos) standards, offer a common API and intermediate representation (IR), and offer common tooling.

Under this proposed stack, C++ with Khronos' SYCL would be the means by which programmers could target this compute environment. SYCL is the Khronos single-source programming model based upon C++: it serves as an abstraction layer for both host and device code and is a heterogeneous framework for targeting OpenCL and other systems.

Targeting SYCL would also allow for running the C++ code on the CPU via OpenMP for multi-threading. The C++ standards body is also moving to take on possible GPU offloading/programming in a future revision to the C++ programming language, which would need an execution environment that this new theoretical compute stack could potentially fulfill.
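To give a rough idea of what the CPU route means in practice, here is a minimal sketch of a data-parallel loop in plain C++ with an OpenMP pragma. This is not SYCL code and is not taken from the talk; the function name vector_add is made up for illustration, and how an actual SYCL host path would lower a kernel is an assumption here. It only shows the kind of multi-threaded CPU loop such a path could end up running.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper for illustration only (not from the talk or any SYCL
// implementation): element-wise sum of two equally sized vectors.
// When built with OpenMP enabled (e.g. -fopenmp on GCC/Clang), the loop
// iterations are divided among CPU threads. Without OpenMP the pragma is
// ignored and the loop runs serially, producing the same result.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());
    std::vector<float> out(a.size());
    const long n = static_cast<long>(a.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];  // each iteration is independent of the others
    }
    return out;
}
```

Each iteration writes to a distinct element, so the loop can be split across threads without any locking. That independence is the same property that makes a kernel like this a candidate for GPU offload.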

With this proposed stack, the Clang SYCL front-end would feed one of two routes. One is to run on the CPU using the host compiler and OpenMP. The other, thanks to projects centering around LLVM, is to emit SPIR-V that could then be consumed by the device drivers while still being distributed to users as an ELF object file.

The open-source Mesa compute code including the long-standing "Clover" OpenCL Gallium state tracker would fill the role as part of the run-time before hitting the SPIR-V/hardware drivers for execution. Using Mesa as a run-time would reduce the amount of GPU specific code and provide a clear API.

There isn't any such compute stack ready today, but all of the necessary bits are moving in this direction. It will be interesting to see if/when it fully materializes and what adoption it receives, and whether there is enough momentum around the open Khronos standards with SYCL/SPIR-V for it to take on NVIDIA CUDA for HPC programming, where OpenCL has failed to really take off. Even for Linux desktop applications, very few are actually using OpenCL for GPGPU... Hopefully C++ SYCL can become a more attractive target for compute on both CPUs and GPUs to better utilize today's hardware.

The LPC 2018 slide deck can be viewed here.

Update: David Airlie has clarified that at least at this time it is not an official Red Hat project while "all the other work done by Karol, Rob and Jerome has a real world goal." A video of the presentation is said to be coming soon that should shed more light on the proposal.
