LLVM 11 Flips On NVIDIA CUDA Offloading From 64-Bit ARM
The latest LLVM 11 development code has enabled support for NVIDIA CUDA GPU device offloading from 64-bit ARM.
LLVM's build system now allows CUDA offloading from 64-bit ARM (AArch64) hosts. Until now this was not enabled, but it turns out to work and has been passing all of the OpenMP offload tests.
The enablement for CUDA offloading on AArch64 was merged at the end of last week.
The patch mentions that the 64-bit ARM CUDA offloading was tested on Wombat, a single-rack cluster at Oak Ridge National Laboratory for exploring the 64-bit ARM architecture. ORNL.gov describes Wombat as having a total of eight NVIDIA GPUs and two 28-core Cavium ThunderX2 processors.
LLVM supports compiling CUDA code with Clang, among other use cases, through its NVPTX back-end, which generates PTX code that the proprietary NVIDIA driver can then consume.
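For readers unfamiliar with what the OpenMP offload tests exercise, here is a minimal sketch of an OpenMP target region in C. It is not taken from the patch or from LLVM's test suite; the function name `saxpy` and the array sizes are purely illustrative.

```c
#include <stddef.h>

/* Illustrative example (not from the LLVM patch): computes y = a * x + y.
 * When built with OpenMP offloading enabled, the loop is sent to the
 * default target device (e.g. an NVIDIA GPU). When OpenMP is not enabled,
 * the pragma is ignored and the loop simply runs on the host CPU. */
void saxpy(int n, float a, const float *x, float *y) {
    /* map(to:) copies x to the device; map(tofrom:) copies y there and back. */
#pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Assuming a Clang build that includes the CUDA offload plugin and an installed CUDA toolkit, compiling a file like this with flags such as `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` should let the loop run on the GPU, on an AArch64 host just as on x86_64. Without `-fopenmp`, the same source still compiles and runs entirely on the host.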