Announcement

**msroadkill612** · 07 May 2019, 11:44 AM

"

A CUSTOM high-bandwidth, low-latency coherent Infinity Fabric, connecting four (hbm cache) AMD Radeon Instinct GPUs to one AMD EPYC CPU per node;

"

Thats quite a pimping. It wont only be doe who want that.

Intel just cant do stuff like that.

It reflects how cute amdS modular architecture is.

In the exciting and growing custom embedded market, we are seeing amd practically own design wins lately. Even now they make ~$500m pa from it.

**zxy_thf** · 07 May 2019, 11:50 AM

Originally posted by L_A_G View Post

As much as I prefer CUDA over OpenCL as a programmer that knows and has used both semi-professionally (as a university researcher), I still ideologically prefer OpenCL over CUDA and thus it's nice to see it gain at least some wins in a market so badly dominated by CUDA.

The good news might be deep learning guys are not using CUDA directly.
As long as AMD made their cards perform well under various Torch/TF workloads this product would be fine.

**torsionbar28** · 07 May 2019, 12:21 PM

Originally posted by Rayniac View Post

What does the department of energy need a supercomputer for?

Lol, the DoE is the biggest US buyer of supercomputers, and has been for a very long time. All the biggest supercomputers in the US are DoE owned. The more savvy question, is what don't they need them for.

**L_A_G** · 07 May 2019, 01:10 PM

Originally posted by zxy_thf View Post

The good news might be deep learning guys are not using CUDA directly.
As long as AMD made their cards perform well under various Torch/TF workloads this product would be fine.

My understanding of the situation is somewhat different. While they do use open stuff like TensorFlow, they still build and optimize things so heavily for Nvidia's implementations that the difference between using CUDA directly and building so heavily around those components written with CUDA really is just a distinction without a difference.

Also, AFAIK AMD hasn't really put anywhere near the amount of resources into machine learning work as Nvidia has. Thus while their hardware tends to be faster at basic number crunching*, nvidia can overcome that with really well optimized hardware-specific libraries and various hardware customizations.

*When broken down to it's basic operations most machine learning tasks consist mostly of just an enormous amount of matrix multiplications and additions. Because of this dedicated machine learning chips really tend to spend a crazy large portion of their silicon area on dedicated MAC (Multiply-Accumulate) ASICs.

**zxy_thf** · 07 May 2019, 01:23 PM

Originally posted by L_A_G View Post

My understanding of the situation is somewhat different. While they do use open stuff like TensorFlow, they still build and optimize things so heavily for Nvidia's implementations that the difference between using CUDA directly and building so heavily around those components written with CUDA really is just a distinction without a difference.

Also, AFAIK AMD hasn't really put anywhere near the amount of resources into machine learning work as Nvidia has. Thus while their hardware tends to be faster at basic number crunching*, nvidia can overcome that with really well optimized hardware-specific libraries and various hardware customizations.

*When broken down to it's basic operations most machine learning tasks consist mostly of just an enormous amount of matrix multiplications and additions. Because of this dedicated machine learning chips really tend to spend a crazy large portion of their silicon area on dedicated MAC (Multiply-Accumulate) ASICs.

TF heavily exploits NVIDIA's cuDNN, and I agree with you there isn't any equivalent library for AMD's GPU.

**xxmitsu** · 07 May 2019, 01:34 PM

Originally posted by zxy_thf View Post

TF heavily exploits NVIDIA's cuDNN, and I agree with you there isn't any equivalent library for AMD's GPU.

They have something in the MiOpen/hipdnn area: https://github.com/ROCmSoftwarePlatf...b=repositories

**LinAGKar** · 07 May 2019, 04:58 PM

Typo:

Originally posted by phoronix View Post

Phoronix: AMD EPYC + Radeon Instrinct To Power "Frontier" 1.5 Exaflop Supercomputer

**bridgman** · 07 May 2019, 07:39 PM

Originally posted by zxy_thf View Post

TF heavily exploits NVIDIA's cuDNN, and I agree with you there isn't any equivalent library for AMD's GPU.

MIOpen is our cuDNN equivalent library for AMD GPUs (except for being open source)... it's been available for a couple of years now:

Site not found · GitHub Pages

https://rocmsoftwareplatform.github.io/MIOpen/doc/html/

**microcode** · 07 May 2019, 08:08 PM

2019: Year of the Linux Warhead.

I like it.

All joking aside, though, I hope they can do something productive with this thing. Are they going to revisit the molten salt reactor?

**microcode** · 07 May 2019, 08:21 PM

Originally posted by bridgman View Post

MIOpen is our cuDNN equivalent library for AMD GPUs (except for being open source)... it's been available for a couple of years now:

https://rocmsoftwareplatform.github.io/MIOpen/doc/html/

When I looked for how to set up Tensorflow to use my AMD GPU, it wasn't easy to follow how exactly it's hooked up. Is there a document somewhere explaining how to get ROCm Tensorflow running on a generic Linux system? I use Archlinux and it seems the few press releases from GPUOpen/AMD simply assume you're running year old Ubuntu. I have MIOpen (HIP) and rocm set up but the rest is ???

NVIDIA gets to rely on the network effects of being first to market with good libraries for this stuff, AMD needs to hire people to make up for that.

Announcement

AMD EPYC + Radeon Instinct To Power

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment