AMD's Background On The ROCm OpenCL Stack
-
Originally posted by amehaye:
On nVidia GPUs TensorFlow is mainly using cuDNN, nVidia's deep neural network *binary* library. So if I understand your answer correctly, AMD is going to binary-translate cuDNN to something that can be consumed by AMD GPUs? Isn't that going to be sub-optimal performance wise? Or is MIOpen the AMD alternative to cuDNN, with hopefully better performance?

Just to close off on this: we published a version of TensorFlow 1.3 running on the ROCm stack shortly after the last post in this thread, and after several incremental updates we recently released TensorFlow 1.8 running on ROCm.
-
Originally posted by bridgman:
On ROCm, using CUDA paths ported via HIP and then presumably pushed back upstream (I'm not sure about that last part but it seems likely). The ported code should run on either AMD or NVidia hardware.
A number of the apps have been built around vendor-specific libraries - not sure if TensorFlow was one of those but our corresponding library (MIOpen) was published as part of ROCm 1.6.
To give some background, I've been training custom "deep" neural network models for the past several months using TensorFlow on nVidia's Pascal Titan X, and the performance is relatively good. However, AMD's Vega cards are very interesting for my needs (training deep networks) since they have good 16-bit float support at a price relatively competitive with nVidia's offering. nVidia has good 16-bit float support only on the quite pricey P100 and the to-be-released-sometime V100, while IIUC all Vega cards have it. However, theoretical GFlops and half-precision support don't mean much when you can't use them.
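The half-precision appeal is easy to quantify with a toy NumPy sketch; the matrix size and error bound below are illustrative, not taken from my actual models:

```python
import numpy as np

# Illustrative weight matrix for a dense layer (the size is arbitrary).
w32 = np.random.rand(1024, 1024).astype(np.float32)
w16 = w32.astype(np.float16)

# Half precision halves the memory traffic per parameter...
print(w32.nbytes, w16.nbytes)  # 4194304 2097152

# ...at the cost of precision: float16 has a 10-bit mantissa, so values
# in [0, 1) round with error below 2^-11 (about 4.9e-4).
max_err = np.abs(w32 - w16.astype(np.float32)).max()
print(max_err < 1e-3)  # True
```

Halving bytes per parameter is exactly why FP16 throughput matters for training: the same memory bandwidth moves twice as many weights.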
On nVidia GPUs TensorFlow is mainly using cuDNN, nVidia's deep neural network *binary* library. So if I understand your answer correctly, AMD is going to binary-translate cuDNN to something that can be consumed by AMD GPUs? Isn't that going to be sub-optimal performance wise? Or is MIOpen the AMD alternative to cuDNN, with hopefully better performance?
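For reference, HIP itself is a source-level port rather than the binary translation I'm asking about: the hipify tools textually rewrite CUDA API calls into their HIP equivalents. A toy Python sketch of that source-rewriting idea, with just three illustrative mappings (the real tools cover far more of the API):

```python
import re

# Toy sketch of source-level CUDA-to-HIP translation. The real hipify
# tools handle far more of the API; these mappings are just illustrative.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
}

def toy_hipify(source: str) -> str:
    """Rewrite known CUDA API names to their HIP equivalents."""
    pattern = re.compile("|".join(re.escape(name) for name in CUDA_TO_HIP))
    return pattern.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)

print(toy_hipify("cudaMalloc(&d_a, n); cudaFree(d_a);"))
# hipMalloc(&d_a, n); hipFree(d_a);
```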
-
A number of the apps have been built around vendor-specific libraries - not sure if TensorFlow was one of those but our corresponding library (MIOpen) was published as part of ROCm 1.6.
-
Not exactly. We have released open source OpenCL running over ROCm, and we did do the early HSA/ROCm development on Kaveri (and pushed the work upstream) so most of the big pieces are there. That said, the OpenCL-over-ROCm path has not been tested on Kaveri recently AFAIK and so would probably require a non-trivial amount of effort to get it up to production quality.