Announcement

**tildearrow** · 10 June 2022, 04:13 PM

What's the current status of compute acceleration? A mess!

NVIDIA:
- CUDA
- OpenCL (an old version)
- Rusticl OpenCL (Nouveau, poor performance)
- PoCL
- Vulkan Compute
- SYCL (via hipSYCL)
- SYCL (via ComputeCpp)
- DirectCompute (on Windows)

AMD:
- APP OpenCL (original implementation before ROCm era)
- Clover OpenCL
- ROCm OpenCL
- Rusticl OpenCL
- ROCm HIP
- CUDA (partial, via ROCm HIPIFY)
- CUDA (partial, via SYCLomatic)
- PoCL
- Vulkan Compute (with Mesa RADV)
- Vulkan Compute (with AMDVLK)
- Vulkan Compute (with proprietary driver)
- Metal Performance Shaders (on macOS)
- SYCL (via hipSYCL)
- SYCL (via ComputeCpp)
- DirectCompute (on Windows)

Intel:
- Probably PoCL too
- Vulkan Compute (with ANV)
- Vulkan Compute (with proprietary driver on Windows)
- Metal Performance Shaders (on macOS)
- NEO OpenCL
- Beignet OpenCL
- intel_clc OpenCL
- Rusticl OpenCL
- oneAPI Level Zero
- CUDA (partial, via SYCLomatic)
- CUDA (partial, via ZLUDA)
- SYCL (via oneAPI DPC++)
- SYCL (via ComputeCpp)
- DirectCompute (on Windows)

CPU:
- SYCL (via hipSYCL)
- SYCL (via ComputeCpp)
- PoCL
- Vulkan Compute (using Lavapipe)

However, for some reason, CUDA has over 90% of the usage share, despite thousand of efforts to liberate ourselves from it being made!

**LinAGKar** · 10 June 2022, 05:18 PM

tildearrow, don't forget that DPC++ also has a CPU backend, and experimental CUDA and HIP backends. That (or maybe ComputeCpp) seems like the most promising option for making cross compatible binaries, although they do it by plugging in different backends at runtime (CUDA/HIP/Level Zero/CPU) rather than by targeting a a vendor independent driver API. It's a mess.

And of course, ROCm doesn't even work on all AMD GPUs.

**OneTimeShot** · 11 June 2022, 03:58 AM

Khronos aren't too good at bundling their APIs. They have Audio APIs, they have OpenCL, video codec APIs, video playback APIs, then all the AI and computer vision stuff...

...but you never know what is going to work on any one computer. OpenGL 4.0, but no OpenCL. Vulkan, but no video playback. 3d but no video compression. They should occasionally draw a line in the sand and say "in order to be Vulkan 1.2 compatible, it *must* also support these APIs: OpenAL, OpenCL 1.2, ...".

Even if they are older library versions or just software implementations, they should at least be mandatory to be present for compatibility. This is really why Nvidia won the API wars: you can have a fully Vulkan certified graphics driver, but that doesn't guarantee that anything else is usable... Khronos needs to target all GPU features.

**coder** · 11 June 2022, 05:09 PM

Originally posted by tildearrow View Post

What's the current status of compute acceleration? A mess!

Is that a list you're maintaining? If you copied it from somewhere, you should include a link.

I don't really get why the list includes Windows and MacOS, other than to pad it out and make it seem even more complicated than it is. Same thing with obsolete options, like Beignet. One could get the sense that you're really trying to sew FUD, here.

I also doubt Vulkan compute is supported by all of those backends. I think you're wrong to assume every Vulkan implementation supports Vulkan Compute. It has a different SPIR-V.

Originally posted by tildearrow View Post

for some reason, CUDA has over 90% of the usage share, despite thousand of efforts to liberate ourselves from it being made!

According to whom?

Anyway, the problem of failing to coalesce around a good alternative is largely down to most of the big players deciding to push their own solutions - Apple, Google, and Microsoft. Google had the clout to make OpenCL happen, if they'd used as the base of their compute stack and made it a requirement for Android. Instead, they banned it!

And without an API being ubiquitous, it can't really gain a lot of traction among app developers. If OpenCL support were as universal as OpenGL support, a lot more apps would be using it. POCL holds the potential to help make that a reality. Rusticle will also help with that.

**coder** · 11 June 2022, 05:17 PM

Originally posted by OneTimeShot View Post

Khronos aren't too good at bundling their APIs. They have Audio APIs, they have OpenCL, video codec APIs, video playback APIs, then all the AI and computer vision stuff...

They're separate standards for a reason. If they were meant to be coupled, then they'd all be a single standard.

Originally posted by OneTimeShot View Post

They should occasionally draw a line in the sand and say "in order to be Vulkan 1.2 compatible, it *must* also support these APIs: OpenAL, OpenCL 1.2, ...".

That's not their job. That's something a higher-level entity should do, like how Google sets API support requirements in Android. We could have the same thing on Linux, if one or more of the big distros would simply decide to do it.

Originally posted by OneTimeShot View Post

you can have a fully Vulkan certified graphics driver, but that doesn't guarantee that anything else is usable...

Vulkan on one GPU isn't the same as Vulkan on another. It has so many optional features that it's basically its own mini-version of the API support problem you're complaining about. That's why Vulkan 1.3 had to add the concept of Profiles.

Vulkan 1.3 Released With Dynamic Rendering In Core, New Roadmap Guidance For Modern GPUs - Phoronix

https://www.phoronix.com/scan.php?page=article&item=vulkan-13-2022

Phoronix, Linux Hardware Reviews, Linux hardware benchmarks, Linux server benchmarks, Linux benchmarking, Desktop Linux, Linux performance, Open Source graphics, Linux How To, Ubuntu benchmarks, Ubuntu hardware, Phoronix Test Suite

**piotrj3** · 11 June 2022, 08:25 PM

Originally posted by tildearrow View Post

What's the current status of compute acceleration? A mess!

NVIDIA:
- CUDA
- OpenCL (an old version)
- Rusticl OpenCL (Nouveau, poor performance)
- PoCL
- Vulkan Compute
- SYCL (via hipSYCL)
- SYCL (via ComputeCpp)
- DirectCompute (on Windows)

AMD:
- APP OpenCL (original implementation before ROCm era)
- Clover OpenCL
- ROCm OpenCL
- Rusticl OpenCL
- ROCm HIP
- CUDA (partial, via ROCm HIPIFY)
- CUDA (partial, via SYCLomatic)
- PoCL
- Vulkan Compute (with Mesa RADV)
- Vulkan Compute (with AMDVLK)
- Vulkan Compute (with proprietary driver)
- Metal Performance Shaders (on macOS)
- SYCL (via hipSYCL)
- SYCL (via ComputeCpp)
- DirectCompute (on Windows)

Intel:
- Probably PoCL too
- Vulkan Compute (with ANV)
- Vulkan Compute (with proprietary driver on Windows)
- Metal Performance Shaders (on macOS)
- NEO OpenCL
- Beignet OpenCL
- intel_clc OpenCL
- Rusticl OpenCL
- oneAPI Level Zero
- CUDA (partial, via SYCLomatic)
- CUDA (partial, via ZLUDA)
- SYCL (via oneAPI DPC++)
- SYCL (via ComputeCpp)
- DirectCompute (on Windows)

CPU:
- SYCL (via hipSYCL)
- SYCL (via ComputeCpp)
- PoCL
- Vulkan Compute (using Lavapipe)

However, for some reason, CUDA has over 90% of the usage share, despite thousand of efforts to liberate ourselves from it being made!

Very simply, OpenCL is a mess, CUDA was earlier and was better and had way better documentation and way better support from Nvidia engineers. Scientists was using anyway workstations that had nvidia gpus anyway, so OpenCL widespread was not useful. Simply saying, Nvidia alone did more for CUDA, then rest of the world for OpenCL.

Remaining technologies are simply too fresh, to judge (Vulkan compute is relativly fresh but also it is more made for sake of usage in game engines along the rendering) or are simply CUDA copycats that want to achieve compability with CUDA.

**coder** · 12 June 2022, 03:49 AM