
CUDA vs. OpenCL GPGPU Performance On NVIDIA's Pascal


  • CUDA vs. OpenCL GPGPU Performance On NVIDIA's Pascal

    Phoronix: CUDA vs. OpenCL GPGPU Performance On NVIDIA's Pascal

    Following yesterday's Deep Learning and CUDA Benchmarks On The GeForce GTX 1080 Under Linux, one Phoronix reader inquired about OpenCL vs. CUDA performance on the GTX 1080... Is one GPGPU compute API faster than the other with NVIDIA's proprietary driver? Here are some side-by-side benchmarks...

    http://www.phoronix.com/scan.php?pag...OpenCL-vs-CUDA

  • #2
    How does AMD perform in these tests?



    • #3
      Originally posted by oleid View Post
      How does AMD perform in these tests?
      http://www.phoronix.com/scan.php?pag...gtx-1080&num=2



      • #4
        It is quite sad (but typical) to see NVidia neglecting the open standard OpenCL (no OpenCL 2.x support, a less optimized runtime compared to CUDA) - instead, they push their proprietary CUDA.

        What makes me wonder: instead of worrying about being dependent on a single supplier, the HPC world seems to be quite happy buying NVidia Teslas without a second thought.



        • #5
          Originally posted by Linuxhippy View Post
          It is quite sad (but typical) to see NVidia neglecting the open standard OpenCL (no OpenCL 2.x support, a less optimized runtime compared to CUDA) - instead, they push their proprietary CUDA.

          What makes me wonder: instead of worrying about being dependent on a single supplier, the HPC world seems to be quite happy buying NVidia Teslas without a second thought.
          I guess between vendor neutrality and this: http://www.phoronix.com/scan.php?pag...gtx-1080&num=2
          the choice is pretty clear cut when you're after performance.



          • #6
            Originally posted by Linuxhippy View Post
            What makes me wonder: instead of worrying about being dependent on a single supplier, the HPC world seems to be quite happy buying NVidia Teslas without a second thought.
            Yes, it is a shame - but the performance we get from K80s is just too nice at the moment. At least there is some hope that OpenACC (or maybe OpenMP) extensions in C/C++ will be usable on other offloading compute resources like the Xeon Phi 2, so that we can reuse our code.

            It is really a pity that vendor politics seem to prevent a common interface for shared-memory parallelism on offloaded resources.

            So, being someone from that HPC world, I assure you: there are a lot of second thoughts, but no alternatives at the moment - only design decisions that might help us be more independent in the future.
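            For readers unfamiliar with the directive-based portability mentioned above, here is a minimal, hedged sketch of what OpenMP target offload looks like in C. The pragma is only a hint: a compiler with offload support may run the loop on a GPU or a Phi, while any other compiler simply ignores it and runs the loop on the host, which is exactly the code-reuse property being hoped for.

            ```c
            #include <assert.h>
            #include <math.h>
            #include <stdio.h>

            #define N 1024

            /* saxpy: y = a*x + y. With an offload-capable OpenMP compiler the
             * loop can run on an accelerator; otherwise the pragma is ignored
             * and the loop executes on the host with identical results. */
            void saxpy(float a, const float *x, float *y, int n)
            {
                #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
                for (int i = 0; i < n; ++i)
                    y[i] = a * x[i] + y[i];
            }

            int main(void)
            {
                float x[N], y[N];
                for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

                saxpy(3.0f, x, y, N);

                for (int i = 0; i < N; ++i)
                    assert(fabsf(y[i] - 5.0f) < 1e-6f);  /* 3*1 + 2 = 5 */
                printf("saxpy ok\n");
                return 0;
            }
            ```

            The same source compiles unchanged for host, GPU, or Phi targets; only the compiler flags differ, which is the whole appeal over maintaining separate CUDA and OpenCL kernels.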



            • #7
              I work with both CUDA and OpenCL in HPC, and these benchmarks definitely don't represent real-world performance! OpenCL on NVIDIA is horribly lagging behind CUDA in terms of feature support (a hint at the reason: https://twitter.com/jrprice89/status/667466444355993600). They don't support a lot of the features that allow actual scientific/HPC codes to run fast (as opposed to synthetic benchmarks with who-knows-how-efficient implementations), e.g. warp shuffle, just to name one.

              Our kernels run at least 2x slower in OpenCL compared to CUDA, and that's not a fluke or ill-optimized OpenCL.

              So Michael, please pick some more relevant/representative benchmarks.
              [Edit/plug: for a start you could consider our code, GROMACS, a widely used open-source molecular simulation package. Besides CUDA and OpenCL support (on NVIDIA and AMD GPUs) it also has SIMD kernels for a dozen or more processor architectures, as well as OpenMP multi-threading and MPI.]
              Last edited by pszilard; 06-14-2016, 09:12 PM.
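              For readers wondering what warp shuffle buys you: CUDA's `__shfl_down_sync` lets the 32 lanes of a warp exchange register values directly, so a warp-wide reduction needs no shared memory at all. The following is a hedged, host-side C model of that reduction pattern (the array stands in for the 32 lanes; on a real GPU each "lane" would be a thread doing `val += __shfl_down_sync(0xffffffff, val, offset)`):

              ```c
              #include <assert.h>
              #include <stdio.h>

              #define WARP_SIZE 32

              /* Host-side model of a CUDA shuffle-based warp reduction: at each
               * step every "lane" adds the value held by the lane `offset`
               * positions above it; after log2(32) = 5 steps, lane 0 holds the
               * sum of all 32 lane values. */
              static int warp_reduce_sum(const int *lane_val)
              {
                  int cur[WARP_SIZE], nxt[WARP_SIZE];
                  for (int i = 0; i < WARP_SIZE; ++i) cur[i] = lane_val[i];

                  for (int offset = WARP_SIZE / 2; offset > 0; offset /= 2) {
                      for (int lane = 0; lane < WARP_SIZE; ++lane)
                          nxt[lane] = cur[lane] +
                                      (lane + offset < WARP_SIZE ? cur[lane + offset] : 0);
                      for (int lane = 0; lane < WARP_SIZE; ++lane) cur[lane] = nxt[lane];
                  }
                  return cur[0]; /* only lane 0 holds the complete sum */
              }

              int main(void)
              {
                  int v[WARP_SIZE];
                  for (int i = 0; i < WARP_SIZE; ++i) v[i] = i; /* 0+1+...+31 = 496 */
                  assert(warp_reduce_sum(v) == 496);
                  printf("%d\n", warp_reduce_sum(v));
                  return 0;
              }
              ```

              Without shuffle, an OpenCL kernel on NVIDIA has to route the same exchange through local (shared) memory with extra barriers, which is part of why identical algorithms can run measurably slower under OpenCL.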



              • #8

                Originally posted by Linuxhippy View Post
                It is quite sad (but typical) to see NVidia neglecting the open standard OpenCL (no OpenCL-2.x support, less optimized runtime compared to CUDA) - instead they push their proprietary CUDA.
                Sad it is, but it is also a very strong vendor-bias/lock-in campaign.

                Originally posted by Linuxhippy View Post
                What makes me wonder: Instead of worrying about being depending on a single supplier, the HPC world seems to be quite happy buying NVidia Teslas without a second thought.
                What's the alternative? Sadly, there isn't really one. If you're lucky, AMD's GPUs can keep up with NVIDIA's, but good luck fighting the compiler, the runtime, and the lack of exposed features. Their hardware is good IMO, but the combination of a poor software stack and dev-tools, plus the inherent challenge of having to deal with such a huge and aggressive competitor, renders the situation very difficult for AMD. And the issues that come with relying on the relatively slow evolution of an open standard don't make it any easier for them to compete.
                Last edited by pszilard; 06-14-2016, 09:14 PM.



                • #9

                  Originally posted by Foolou View Post
                  Yes, it is a shame - but the performance we get from K80s is just too nice at the moment. At least there is some hope that OpenACC (or maybe OpenMP) extensions in C/C++ will be usable on other offloading compute resources like the Xeon Phi 2, so that we can reuse our code.

                  It is really a pity that vendor politics seem to prevent a common interface for shared-memory parallelism on offloaded resources.
                  OpenACC on Intel? Not a chance, have you seen this? :-/
                  https://www.youtube.com/watch?v=RBFPBaxl_Jw

                  Originally posted by Foolou View Post
                  So being someone from that HPC world I assure you: there is a lot of second thoughts, but no alternatives at the moment - only design decisions that might help to be more independent in the future.
                  Let's be honest: many of us have happily jumped on the CUDA train and have not looked back much. Most have not even made an attempt to port to OpenCL, file bugs with NVIDIA, and complain loudly that what they are doing is not fair. Sure, it takes effort, but without sobering up, realizing that NVIDIA's vendor lock-in efforts are working very well, and doing one's best to counteract them, if with nothing else than strong feedback, not much will change.



                  • #10
                    Originally posted by pszilard View Post
                    They don't support a lot of the features that allow actual scientific/HPC codes to run fast (as opposed to synthetic benchmarks with who-knows-how-efficient implementations), e.g. warp shuffle, just to name one.

                    Our kernels run at least 2x slower in OpenCL compared to CUDA and that's not a fluke or ill-optimized OpenCL.
                    Would this situation change if/when NVIDIA released an OpenCL 2.0 or even 2.1 driver? Or are the missing features and/or performance independent of the available OpenCL version? (And did you use OpenCL 1.2 for NVIDIA, or still 1.1?)

