CUDA vs. OpenCL GPGPU Performance On NVIDIA's Pascal

Written by Michael Larabel in NVIDIA on 12 June 2016 at 12:03 PM EDT.
Following yesterday's Deep Learning and CUDA Benchmarks On The GeForce GTX 1080 Under Linux, one Phoronix reader asked about OpenCL vs. CUDA performance on the GTX 1080: is one GPGPU compute API faster than the other with NVIDIA's proprietary driver? Here are some side-by-side benchmarks.

We've run both OpenCL and CUDA benchmarks on the GeForce GTX 1080 in multiple articles, but I hadn't posted any of the results side-by-side for those curious about these competing APIs. To make some comparisons, I ran the SHOC test profile (Scalable HeterOgeneous Computing) this morning. SHOC implements many of the same GPGPU micro-tests in both CUDA and OpenCL, which makes such comparisons possible, assuming the two versions are implemented roughly the same way by a developer experienced with both APIs.
CUDA vs. OpenCL NVIDIA Pascal GPU Computing

The tests were run on a GeForce GTX 1080 using the NVIDIA 367.18 beta driver. CUDA 8.0 RC1 was installed, while OpenCL 1.2 remains the latest version of the Khronos compute API supported by the NVIDIA proprietary driver.

In many of these SHOC micro-benchmarks, OpenCL and CUDA performance was nearly the same.

With the Triad test, though, CUDA was noticeably faster.
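
For readers unfamiliar with Triad, here is a minimal plain-Python sketch of the STREAM-style operation it is named after, plus the usual way a bandwidth figure is derived from it. This is not SHOC's code: the function names are hypothetical, the timing in the usage lines is a made-up placeholder rather than a measurement, and counting three arrays per element follows the STREAM convention, which is an assumption about how SHOC scores the test.

```python
def triad(b, c, scalar):
    """STREAM-style Triad: a[i] = b[i] + scalar * c[i] (hypothetical helper)."""
    if len(b) != len(c):
        raise ValueError("input arrays must have the same length")
    return [bi + scalar * ci for bi, ci in zip(b, c)]


def triad_bandwidth_gbs(n_elements, bytes_per_element, seconds):
    """Effective bandwidth in GB/s under the STREAM convention.

    Each element reads b[i] and c[i] and writes a[i], so three
    arrays' worth of bytes are counted as moved.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    bytes_moved = 3 * n_elements * bytes_per_element
    return bytes_moved / seconds / 1e9


# Tiny correctness check of the operation itself.
print(triad([1.0, 2.0], [10.0, 20.0], 0.5))

# Placeholder timing (NOT a benchmark result): 1,000,000 single-precision
# elements processed in 0.5 seconds.
print(triad_bandwidth_gbs(1_000_000, 4, 0.5))
```

Because the kernel does only one multiply and one add per element, a test like this is normally limited by memory traffic rather than arithmetic, so a gap between APIs here points at data movement more than compute.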

The single-precision FFT test was also much faster with CUDA.
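
As a rough guide to reading FFT scores, the sketch below shows the common convention of crediting a complex FFT of N points with 5 * N * log2(N) floating-point operations and dividing by the run time. Whether SHOC reports its FFT result exactly this way is an assumption on my part, the function name is hypothetical, and the timing in the usage line is a placeholder, not a measurement.

```python
import math


def fft_gflops(n_points, seconds, batch=1):
    """Nominal GFLOPS for `batch` complex FFTs of `n_points` points each.

    Uses the conventional 5 * N * log2(N) operation count, which is a
    nominal figure and not the exact number of instructions executed.
    """
    if n_points < 2 or seconds <= 0 or batch < 1:
        raise ValueError("need n_points >= 2, seconds > 0, batch >= 1")
    flops = 5.0 * n_points * math.log2(n_points) * batch
    return flops / seconds / 1e9


# Placeholder timing (NOT a benchmark result): one 1024-point FFT in 1 second.
print(fft_gflops(1024, 1.0))
```

Since the operation count is fixed by N, two implementations of the same transform differ only in the time term, so a higher score simply means the same nominal work finished sooner.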

These differences could come down to SHOC's CUDA code paths being better implemented than the corresponding OpenCL code, but generally this matches what I've seen in similar comparisons in the past and what I've heard from those experienced in GPGPU programming: well-tuned CUDA code tends to outperform matching OpenCL code on NVIDIA hardware. That said, hopefully we'll see more improvements to NVIDIA's OpenCL driver this year, as we are still waiting for OpenCL 2.0/2.1+ support!