
AMD's Background On The ROCm OpenCL Stack


#21
So this "direct-to-ISA" path is there on Kaveri?



#22
I think so, but I'm not 100% sure.


#23
Thank you!



#24
bridgman how do you plan to support TensorFlow et al? The "Internets" doesn't offer a coherent answer.



#25
Originally posted by amehaye View Post
bridgman how do you plan to support TensorFlow et al? The "Internets" doesn't offer a coherent answer.

On ROCm, using CUDA paths ported via HIP and then presumably pushed back upstream (I'm not sure about that last part, but it seems likely). The ported code should run on either AMD or NVIDIA hardware.

A number of the apps have been built around vendor-specific libraries - not sure if TensorFlow was one of those, but our corresponding library (MIOpen) was published as part of ROCm 1.6.
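The "source level, not binary" nature of HIP porting can be illustrated with a toy sketch. This is not AMD's actual hipify-perl/hipify-clang tooling (which does far more, including header and kernel-launch rewriting); it only shows the flavor of the textual API renaming involved, using a hand-picked subset of real CUDA-runtime-to-HIP-runtime name mappings:

```python
import re

# A small subset of the real CUDA-runtime -> HIP-runtime renames.
# (Toy illustration only; AMD's hipify tools handle far more than this.)
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
}

def toy_hipify(source: str) -> str:
    """Rename CUDA runtime calls to their HIP equivalents.

    Longest names first, so 'cudaMemcpyHostToDevice' is matched
    before its prefix 'cudaMemcpy'.
    """
    pattern = re.compile(
        "|".join(re.escape(k) for k in sorted(CUDA_TO_HIP, key=len, reverse=True))
    )
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)

cuda_line = "cudaMalloc(&d_buf, n); cudaMemcpy(d_buf, h_buf, n, cudaMemcpyHostToDevice);"
print(toy_hipify(cuda_line))
# -> hipMalloc(&d_buf, n); hipMemcpy(d_buf, h_buf, n, hipMemcpyHostToDevice);
```

Because the HIP runtime API mirrors the CUDA runtime API, the renamed source compiles with nvcc on NVIDIA hardware and with AMD's HIP compiler on AMD hardware, which is what lets one ported codebase target both vendors.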


#26
Originally posted by bridgman View Post

On ROCm, using CUDA paths ported via HIP and then presumably pushed back upstream (I'm not sure about that last part, but it seems likely). The ported code should run on either AMD or NVIDIA hardware.

A number of the apps have been built around vendor-specific libraries - not sure if TensorFlow was one of those, but our corresponding library (MIOpen) was published as part of ROCm 1.6.

Thanks for answering! I still fail to understand the exact details.

To give some background, I've been training custom "deep" neural network models for the past several months using TensorFlow on NVIDIA's Pascal Titan X, and the performance is relatively good. However, AMD's Vega cards are very interesting for my needs (training deep networks) since they have good 16-bit float support at a price competitive with NVIDIA's offerings. NVIDIA has good 16-bit float support only on the quite pricey P100 and the to-be-released-sometime V100, while IIUC all Vega cards have it. However, theoretical GFLOPS and half-precision support don't mean much when you can't use them.

On NVIDIA GPUs TensorFlow mainly uses cuDNN, NVIDIA's deep neural network *binary* library. So if I understand your answer correctly, AMD is going to binary-translate cuDNN into something that can be consumed by AMD GPUs? Isn't that going to be sub-optimal performance-wise? Or is MIOpen the AMD alternative to cuDNN, with hopefully better performance?
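The appeal of Vega's fp16 rate comes from "rapid packed math", which doubles the fp32 rate for half precision. A quick back-of-the-envelope check, using the standard peak-throughput formula (2 FLOPs per FMA × shader count × clock) and Vega 64's published air-cooled boost specs (4096 stream processors, ~1.546 GHz boost; the sustained-boost assumption is mine):

```python
# Peak-throughput estimate: FLOPS = 2 (FMA counts as 2 ops) * shaders * clock.
# Spec figures are Vega 64 air-cooled boost values; assuming boost is sustained.
shaders = 4096
boost_clock_ghz = 1.546

fp32_tflops = 2 * shaders * boost_clock_ghz / 1000   # ~12.7 TFLOPS
fp16_tflops = 2 * fp32_tflops                        # rapid packed math: 2x fp32 rate

print(f"fp32 peak: {fp32_tflops:.1f} TFLOPS")
print(f"fp16 peak: {fp16_tflops:.1f} TFLOPS")
```

Consumer Pascal parts like the Titan X instead run fp16 at a tiny fraction of the fp32 rate, which is why only the P100/V100 tier is attractive for half-precision training on the NVIDIA side.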



#27
Originally posted by amehaye View Post
On NVIDIA GPUs TensorFlow mainly uses cuDNN, NVIDIA's deep neural network *binary* library. So if I understand your answer correctly, AMD is going to binary-translate cuDNN into something that can be consumed by AMD GPUs? Isn't that going to be sub-optimal performance-wise? Or is MIOpen the AMD alternative to cuDNN, with hopefully better performance?

My understanding is the latter - MIOpen is an open-source alternative to cuDNN. I wasn't sure whether TensorFlow was built on cuDNN or not (my focus these days is the lower levels of the stack). We do not do any binary translation anywhere in the ROCm stack AFAIK - everything is source-level.


#28
Just to close off on this - we published a version of TensorFlow 1.3 running on the ROCm stack shortly after the last post in this thread, and after several incremental updates we recently released TensorFlow 1.8 running on ROCm.