Date: 2018-11-13

It's becoming clearer why Red Hat hired a Nouveau developer to work on SPIR-V/compute support for the open-source NVIDIA Linux driver, even though that reverse-engineered driver performs very poorly on Maxwell and newer GPUs due to re-clocking / power management limitations. This appears to be part of a broader effort to pursue a vendor-neutral compute stack across Intel, Radeon, and NVIDIA GPU platforms that could potentially take on NVIDIA's CUDA dominance.

Work on open-source NVIDIA (Nouveau) SPIR-V compute support has been under way all year and is ongoing, though it has not yet reached mainline Mesa. That effort has largely been carried out by Karol Herbst and Rob Clark, both open-source GPU driver developers at Red Hat. There has also been other compute-motivated open-source driver/infrastructure work out of Red Hat, like Jerome Glisse's ongoing kernel work around Heterogeneous Memory Management (HMM). There's also the Radeon RADV driver, which Red Hat's David Airlie co-founded and continues to contribute to significantly. And there have been other graphics/compute contributions too, with Red Hat remaining one of the largest upstream contributors to the ecosystem.

Pair that work with the vibrant and rapidly growing ecosystem around SPIR-V/Vulkan and it turns out all of this could culminate in something very interesting... A GPU compute stack that is vendor-neutral, open-source, and largely shared across GPU vendors, and that could potentially take on the dominance of NVIDIA's CUDA for GPU computing, particularly in HPC. Red Hat obviously already has a vested interest in the High Performance Computing space, with most of the leading supercomputers relying upon Red Hat Enterprise Linux, but this could offer them another opportunity.

Longtime open-source graphics driver contributor and DRM subsystem maintainer David Airlie of Red Hat presented at this week's Linux Plumbers Conference on such a compute stack. "Until now the clear market leader has been the CUDA stack from NVIDIA, which is a closed source solution that runs on Linux. Open source applications like tensorflow (AI/ML) rely on this closed stack to utilise GPUs for acceleration... This talk will discuss the possibility of creating a vendor neutral reference compute stack based around open source technologies and open source development models that could execute compute tasks across multiple vendor GPUs. Using SYCL/OpenCL/Vulkan and the open-source Mesa stack, as the basis for a future task that development of tools and features on top of as part of a desktop OS."

His talk focused not only on NVIDIA's closed-source Compute Unified Device Architecture (CUDA) but also on how AMD's ROCm/HIP compute stack is open-source yet vendor-specific. There is also Intel's NEO compute driver project, which is built around OpenCL but likewise focused just on Intel's own hardware. There is a lot of fragmentation in the GPU compute space, even among open-source projects, with not a whole lot of common/shared code. Each project also usually relies upon its own LLVM/Clang implementation.

The proposed GPU compute stack would be vendor-neutral, share as much of its code-base as possible, be based upon open (Khronos) standards, and offer a common API, a common intermediate representation (IR), and common tooling.

Under this proposed stack, C++ with Khronos' SYCL would be the means by which programmers target this compute environment. SYCL is Khronos' single-source programming model based upon C++; it serves as an abstraction layer for both host and device code and is a heterogeneous framework for targeting OpenCL and other systems.

Targeting SYCL would also allow for running the C++ code on the CPU via OpenMP for multi-threading. The C++ standards body is also moving to take on possible GPU offloading/programming in a future revision to the C++ programming language, which would need an execution environment that this new theoretical compute stack could potentially fulfill.

With this proposed stack, the Clang SYCL front-end would be used to go one of two routes: run on the CPU using the host compiler and OpenMP, or, thanks to projects centering around LLVM, emit SPIR-V that could then be consumed by the device drivers while still being distributed to users as an ELF object file.
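
To make the CPU route a bit more concrete, here is a minimal sketch in plain C++ of what "run the same kernel on the host via OpenMP" can look like. It is not SYCL code and not anything shown in the talk: `host_parallel_for` and `vector_add` are hypothetical names made up for illustration, and the only real mechanism used is the standard OpenMP `parallel for` pragma. A SYCL implementation's host path would be considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of SYCL or any real API): calls kernel(i)
// for every index in [0, n) on the host. When built with OpenMP enabled
// (e.g. -fopenmp) the iterations are spread across CPU threads; otherwise
// the pragma is ignored and the loop simply runs serially.
template <typename Kernel>
void host_parallel_for(std::size_t n, Kernel kernel) {
    #pragma omp parallel for
    for (long long i = 0; i < static_cast<long long>(n); ++i) {
        kernel(static_cast<std::size_t>(i));
    }
}

// Element-wise vector addition, with the per-element work written once
// as a kernel lambda. Each iteration writes a distinct element of `out`,
// so no synchronization between threads is needed.
std::vector<float> vector_add(const std::vector<float>& a,
                              const std::vector<float>& b) {
    assert(a.size() == b.size());  // this sketch assumes equal lengths
    std::vector<float> out(a.size());
    host_parallel_for(a.size(), [&](std::size_t i) {
        out[i] = a[i] + b[i];
    });
    return out;
}
```

The point of the sketch is only that the kernel body is ordinary C++: the same lambda could, in principle, be handed to a device back-end instead of the host loop, which is the single-source idea the proposed stack builds on.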