So this "direct-to-ISA" path is there on Kaveri?
AMD's Background On The ROCm OpenCL Stack
Originally posted by bridgman View Post
On ROCm, using CUDA paths ported via HIP and then presumably pushed back upstream (I'm not sure about that last part but it seems likely). The ported code should run on either AMD or NVidia hardware.
A number of the apps have been built around vendor-specific libraries - not sure if TensorFlow was one of those but our corresponding library (MIOpen) was published as part of ROCm 1.6.
To give some background: I've been training custom "deep" neural network models for the past several months using TensorFlow on nVidia's Pascal TitanX, and the performance is relatively good. However, AMD's Vega cards are very interesting for my needs (training deep networks), since they have good 16-bit float support at a price competitive with nVidia's offerings. nVidia offers good 16-bit float support only on the quite pricey P100 and the to-be-released-sometime V100, while IIUC all Vega cards have it. However, theoretical GFlops and half-precision support don't mean much when you can't actually use them.
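As a side note on why half-precision is a double-edged sword: fp16 halves memory and bandwidth per value, but its narrow mantissa and range are exactly why mixed-precision training usually keeps master weights in fp32. A minimal sketch of those limits, using NumPy's IEEE-754 binary16 type as a stand-in for GPU fp16 (this is an illustration I'm adding, not something from the thread):

```python
# Illustrative sketch: numeric limits of IEEE-754 half precision (fp16),
# using NumPy's float16 as a stand-in for GPU half-precision arithmetic.
import numpy as np

# fp16 has a 10-bit mantissa, so integers are exact only up to 2048.
# Beyond that, adjacent representable values are 2 apart.
a = np.float16(2048)
b = np.float16(1)
print(a + b)  # 2048.0 -- the +1 is lost to rounding

# Largest finite fp16 value; activations or gradients beyond this
# overflow to inf, which is why loss scaling / fp32 master weights
# are common in mixed-precision training.
print(np.finfo(np.float16).max)  # 65504.0
```

So "good 16-bit float support" in hardware only pays off when the software stack (cuDNN on nVidia, MIOpen on AMD) handles these pitfalls for you.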
On nVidia GPUs TensorFlow is mainly using cuDNN, nVidia's deep neural network *binary* library. So if I understand your answer correctly, AMD is going to binary-translate cuDNN to something that can be consumed by AMD GPUs? Isn't that going to be sub-optimal performance-wise? Or is MIOpen the AMD alternative to cuDNN, with hopefully better performance?