A Big Patch Could Yield Big Performance Benefits For GPU Offloading With LLVM
A new LLVM patch can yield big performance benefits for GPU offloading, at least in some benchmarks.
Johannes Doerfert, formerly of Saarland University and now at Argonne National Laboratory working on LLVM as part of the DOE Exascale Computing Project, published code this week for OpenMP GPU "SPMD-zation". The code builds upon their earlier proposal from months ago to let more GPU-targeted code execute in SPMD (Single Program, Multiple Data) mode, with lightweight "guarded" modes where appropriate, in order to overcome some bottlenecks in LLVM's existing OpenMP GPU offloading code.
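To make the distinction concrete, here is a minimal sketch in C with OpenMP of the two shapes of target region involved. It is not taken from the patch: the function names are hypothetical, and the comments describe the general idea rather than exactly what LLVM emits. The first function uses a combined construct with no sequential code, so every GPU thread can start on the loop right away. The second has a sequential statement ahead of the parallel loop, which is the kind of code the "guarded" approach would aim to run on a single thread while the rest of the region still executes SPMD-style.

```c
#include <stddef.h>

/* Hypothetical example: a combined construct with no sequential code.
   All GPU threads can begin executing loop iterations immediately,
   which is the SPMD-friendly shape. */
void scale_spmd(float *a, int n, float factor) {
    #pragma omp target teams distribute parallel for map(tofrom: a[0:n])
    for (int i = 0; i < n; ++i)
        a[i] *= factor;
}

/* Hypothetical example: sequential code precedes the parallel loop
   inside the target region. Only one thread should compute f, so this
   statement is what a guard would need to protect if the region were
   to run in SPMD mode. */
void scale_squared_generic(float *a, int n, float factor) {
    #pragma omp target map(tofrom: a[0:n])
    {
        float f = factor * factor;   /* sequential part */
        #pragma omp parallel for
        for (int i = 0; i < n; ++i)
            a[i] *= f;
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and both functions simply run serially on the host, so the results are the same either way; only the execution strategy on the device differs.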
In basic testing so far, they are seeing 30% improvements on some tests in the Rodinia benchmark suite once SPMD mode is enabled automatically. Further optimizations are also still possible.
More technical details on their vision for improving OpenMP GPU offloading, along with the original proposal, can be found on the LLVM mailing list.
While this code is interesting, the critiques so far come down to it only being tested against the NVIDIA NVPTX target and the patch simply being huge. At thousands of lines of code by itself, it's difficult and time-consuming to review in one go. At least some LLVM developers would like to see the work broken up and evolved over time rather than landing in one big shot.
Anyhow, we'll see where this work leads. Interest in GPU/device offloading with LLVM is growing, and more opportunities are opening up with SYCL support, continued OpenCL work, OpenMP continuing to advance and pick up better device offloading abilities, etc.