NVIDIA Officially Releases CUDA 6

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • NVIDIA Officially Releases CUDA 6

    Phoronix: NVIDIA Officially Releases CUDA 6

    We've been talking about NVIDIA's work on CUDA 6 since last November, but today the sixth generation of the Compute Unified Device Architecture has finally been officially released...

    http://www.phoronix.com/vr.php?view=MTY2NTE

  • #2
    They'd better support OpenCL in parallel.
    It's too obvious they're after a vendor lock-in.
    Well, it's NVidia - nothing else to expect from them...


    • #3
      There is some value there

      Originally posted by entropy View Post
      They'd better support OpenCL in parallel.
      It's too obvious they're after a vendor lock-in.
      Well, it's NVidia - nothing else to expect from them...
      I've always thought the OpenCL route was the best way to go for those wanting to do computations on the GPU, but after following the link provided by Phoronix and a few more clicks (https://developer.nvidia.com/cuda-toolkit), you quickly realize that CUDA does offer some value. For starters, NVIDIA offers a lot of higher-level libraries, such as:

      cuFFT – Fast Fourier Transforms Library
      cuBLAS – Complete BLAS library
      cuSPARSE – Sparse Matrix library
      cuRAND – Random Number Generator
      NPP – Thousands of Performance Primitives for Image & Video Processing
      Thrust – Templated Parallel Algorithms & Data Structures
      CUDA Math Library of high performance math routines

      I could not find anything similar offered by OpenCL itself. CUDA also looks to be a cleaner and more intuitive API than OpenCL. I may be all wrong here, as I have never done OpenCL or CUDA programming; I just have a passing fancy that one day I might.


      • #4
        Originally posted by DarkCloud View Post
        I've always thought the OpenCL route was the best way to go for those wanting to do computations on the GPU, but after following the link provided by Phoronix and a few more clicks (https://developer.nvidia.com/cuda-toolkit), you quickly realize that CUDA does offer some value. For starters, NVIDIA offers a lot of higher-level libraries, such as:

        cuFFT – Fast Fourier Transforms Library
        cuBLAS – Complete BLAS library
        cuSPARSE – Sparse Matrix library
        cuRAND – Random Number Generator
        NPP – Thousands of Performance Primitives for Image & Video Processing
        Thrust – Templated Parallel Algorithms & Data Structures
        CUDA Math Library of high performance math routines

        I could not find anything similar offered by OpenCL itself. CUDA also looks to be a cleaner and more intuitive API than OpenCL. I may be all wrong here, as I have never done OpenCL or CUDA programming; I just have a passing fancy that one day I might.
        http://developer.amd.com/tools-and-s...ath-libraries/

        Accelerated Parallel Processing Math Libraries (APPML) is now open-source!

        clMath is the new name of this open-source project. The source is available on GitHub at: https://github.com/clMathLibraries. Please read our blog to learn more about this exciting development. We will continue to support binary releases of clMath on developer.amd.com.
        Overview
        AMD Accelerated Parallel Processing Math Libraries are software libraries containing FFT and BLAS functions written in OpenCL and designed to run on AMD GPUs. The libraries support running on CPU devices to facilitate debugging and multicore programming. APPML 1.10 is the most current generally available version of the library. Example programs are included to illustrate usage of these libraries. Additional sample programs are available separately (see list of files for download below).

        http://streamcomputing.eu/blog/2014-...bra-libraries/


        It should come as no surprise that Nvidia offers only a pittance of OpenCL equivalents to its CUDA libraries. It is the only outlier still supporting the old OpenCL 1.0/1.1 stacks while pushing CUDA. The rest of the industry is moving on to OpenCL. If Nvidia wants to keep selling those GPGPUs, it had better update its OpenCL support.

        https://github.com/search?q=OpenCL+F...=searchresults

        You can find all sorts of equivalents in OpenCL for AMD GPGPUs, ImgTec GPGPUs, Altera FPGAs, etc.

        You're looking for officially sanctioned OpenCL equivalents from Nvidia, which wants OpenCL to die. Good luck.


        • #5
          Originally posted by DarkCloud View Post
          I've always thought the OpenCL route was the best way to go for those wanting to do computations on the GPU, but after following the link provided by Phoronix and a few more clicks (https://developer.nvidia.com/cuda-toolkit), you quickly realize that CUDA does offer some value. For starters, NVIDIA offers a lot of higher-level libraries, such as:

          cuFFT – Fast Fourier Transforms Library
          cuBLAS – Complete BLAS library
          cuSPARSE – Sparse Matrix library
          cuRAND – Random Number Generator
          NPP – Thousands of Performance Primitives for Image & Video Processing
          Thrust – Templated Parallel Algorithms & Data Structures
          CUDA Math Library of high performance math routines

          I could not find anything similar offered by OpenCL itself. CUDA also looks to be a cleaner and more intuitive API than OpenCL. I may be all wrong here, as I have never done OpenCL or CUDA programming; I just have a passing fancy that one day I might.
          Marc already answered you, but I'd like to add, just for future research: going to a competitor's website for information about a product may not be the best way to get an unbiased view. Do you go to the Windows dev site to learn about OpenGL?
          One more thing: OpenCL can also be used on the CPU alone for scaling (which reminds me that I've meant to look at how it compares with OpenMP), and on DSPs, IIRC.


          • #6
            The whole industry is moving where?

            Originally posted by Marc Driftmeyer View Post
            http://developer.amd.com/tools-and-s...ath-libraries/




            http://streamcomputing.eu/blog/2014-...bra-libraries/


            It should come as no surprise that Nvidia offers only a pittance of OpenCL equivalents to its CUDA libraries. It is the only outlier still supporting the old OpenCL 1.0/1.1 stacks while pushing CUDA. The rest of the industry is moving on to OpenCL. If Nvidia wants to keep selling those GPGPUs, it had better update its OpenCL support.

            https://github.com/search?q=OpenCL+F...=searchresults

            You can find all sorts of equivalents in OpenCL for AMD GPGPUs, ImgTec GPGPUs, Altera FPGAs, etc.

            You're looking for officially sanctioned OpenCL equivalents from Nvidia, which wants OpenCL to die. Good luck.
            AMD is not, and neither are any of its friends listed on this website:
            http://developer.amd.com/resources/h...hitecture-hsa/

            NVIDIA can easily support OpenCL, since it offers a subset of the features already available in CUDA.

            Moreover, the OpenCL standard is a nightmare: 5 different consistency models, plenty of optional features (bye bye portability), compiling the code and loading it at runtime... Have you ever programmed in OpenCL? It is not targeted at application developers. CUDA is.


            • #7
              Originally posted by klim8 View Post
              AMD is not, and neither are any of its friends listed on this website:
              http://developer.amd.com/resources/h...hitecture-hsa/

              NVIDIA can easily support OpenCL, since it offers a subset of the features already available in CUDA.

              Moreover, the OpenCL standard is a nightmare: 5 different consistency models, plenty of optional features (bye bye portability), compiling the code and loading it at runtime... Have you ever programmed in OpenCL? It is not targeted at application developers. CUDA is.
              Did you read that link? They say they are working on optimizing OpenCL and C++ AMP for HSA. And if OpenCL isn't designed for application developers, who is it designed for?
              The problems you listed would apply at least as much to CUDA, since Nvidia still doesn't have an HSA-equivalent solution available.


              • #8
                OpenCL is not HSA.

                http://developer.amd.com/resources/h...hitecture-hsa/

                The team compared a CPU/GPU implementation in OpenCL™ against an HSA implementation. The HSA version seamlessly shares data between CPU and GPU, without memory copies or cache flushes because it assigns each part of the workload to the most appropriate processor with minimal dispatch overhead. The net result was a 2.3x relative performance gain at a 2.4x reduced power level*.
                Neither OpenCL nor C++ AMP can exploit the features exposed by HSA. New languages will be needed, so there is no industry standard right now. HSA is similar to PTX + UVM for NVIDIA. The difference is that the latter is already implemented and working, while there is not even a draft of the HSA specification (http://www.hsafoundation.com/standards).

                OpenCL in its current state is a nightmare to use, while CUDA has an integrated toolchain that allows much simpler programs. OpenCL might make sense for frameworks and libraries (developed by expert programmers) that need to be portable across architectures, although it is not clear to me that using OpenCL in the host code gives any benefit, since different kernel versions for the different devices are needed to achieve reasonable performance. Have you spent some time programming in CUDA/OpenCL? I have extensive experience in both languages.

                Originally posted by liam View Post
                Did you read that link? They say they are working on optimizing OpenCL and C++ AMP for HSA. And if OpenCL isn't designed for application developers, who is it designed for?
                The problems you listed would apply at least as much to CUDA, since Nvidia still doesn't have an HSA-equivalent solution available.


                • #9
                  OpenCL 1.2

                  At last there is some hard evidence of forthcoming OpenCL 1.2 support from Nvidia.

                  The recently released CUDA 6 toolkit (for Linux) includes an OpenCL stub library, libOpenCL.so, which exports all the new OpenCL 1.2 functions.

                  Specifically, the new libOpenCL.so stub library adds the following functions compared to the 337.12 driver release:

                  clCompileProgram (OpenCL 1.2)
                  clCreateFromGLTexture (OpenCL 1.2, OpenGL sharing)
                  clCreateImage (OpenCL 1.2)
                  clCreateProgramWithBuiltInKernels (OpenCL 1.2)
                  clCreateSubDevices (OpenCL 1.2)
                  clEnqueueBarrierWithWaitList (OpenCL 1.2)
                  clEnqueueFillBuffer (OpenCL 1.2)
                  clEnqueueFillImage (OpenCL 1.2)
                  clEnqueueMarkerWithWaitList (OpenCL 1.2)
                  clEnqueueMigrateMemObjects (OpenCL 1.2)
                  clGetExtensionFunctionAddressForPlatform (OpenCL 1.2)
                  clGetKernelArgInfo (OpenCL 1.2)
                  clLinkProgram (OpenCL 1.2)
                  clReleaseDevice (OpenCL 1.2)
                  clRetainDevice (OpenCL 1.2)
                  clUnloadPlatformCompiler (OpenCL 1.2)

                  Unfortunately, the CUDA 6 toolkit does not contain the actual OpenCL implementation library, libnvidia-opencl.so. The libnvidia-opencl.so shipped with the 337.12 driver release does not seem to support the new functions: when I tried to use clCreateImage() in place of clCreateImage2D(), my program crashed.
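For anyone who wants to repeat the comparison, here is a minimal sketch of how the exported entry points of a stub library could be checked with Python's ctypes. The helper names (probe_symbols, probe_library) are made up for illustration, and the library path is whatever your installation uses. Keep in mind that this only tells you whether a symbol is exported by the library you load; as the crash above shows, an exported symbol in the stub says nothing about whether the driver actually implements it.

```python
import ctypes

# Entry points the post lists as new in the CUDA 6 libOpenCL.so stub.
NEW_ENTRY_POINTS = [
    "clCompileProgram",
    "clCreateFromGLTexture",
    "clCreateImage",
    "clCreateProgramWithBuiltInKernels",
    "clCreateSubDevices",
    "clEnqueueBarrierWithWaitList",
    "clEnqueueFillBuffer",
    "clEnqueueFillImage",
    "clEnqueueMarkerWithWaitList",
    "clEnqueueMigrateMemObjects",
    "clGetExtensionFunctionAddressForPlatform",
    "clGetKernelArgInfo",
    "clLinkProgram",
    "clReleaseDevice",
    "clRetainDevice",
    "clUnloadPlatformCompiler",
]


def probe_symbols(lib, names):
    """Map each name to True if `lib` exposes it as an attribute, else False.

    For a ctypes.CDLL object, attribute lookup resolves the symbol in the
    shared library and raises AttributeError when it is not exported, so
    hasattr() works as an "is this symbol exported?" check.
    """
    return {name: hasattr(lib, name) for name in names}


def probe_library(path, names=NEW_ENTRY_POINTS):
    """Load the shared library at `path` and probe it.

    Returns None when the library cannot be loaded at all.
    """
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return None
    return probe_symbols(lib, names)
```

Calling probe_library("libOpenCL.so") (path assumed, adjust as needed) returns a dictionary you can print or diff between the toolkit's stub and the one installed by the driver.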


                  • #10
                    Originally posted by klim8 View Post
                    AMD is starting this process by delivering HSA optimized programming tools for today's most widely available heterogeneous languages: OpenCL and C++ AMP.
                    What's more, the two key characteristics of HSA, shared virtual memory and device-initiated work (dynamic parallelism), are both supported by OpenCL 2.0 (the marketing link you referenced says they targeted version 1.1).

                    I snipped the last paragraph as I felt I addressed it with the above.
