**Danny3** · 21 August 2020, 02:13 PM

Originally posted by Hibbelharry

Compare this to one of your last posts in a different thread:

If you didn't talk (rant) in extremes all the time, often stating the opposite of what you said before, your opinion on things might have more worth and impact.

I see no conflict.
I stopped using Nvidia a long time ago because I hate their shitty attitude towards open source.
I started using AMD, but I don't much like how bad / very late their software support is compared to Nvidia / Intel.
So what, I cannot complain about both?
Yeah, AMD is the lesser evil of the two, but their software support totally sucks!
Just because they are better than Nvidia in some cases (not all, because Nvidia at least has a control panel and high-performance compute) doesn't mean that we shouldn't be allowed to complain about them.

In any case, this is feedback on why I'm not buying their Navi GPUs: unlike others, I care about more than just gaming.

**andrei_me** · 21 August 2020, 03:40 PM

Originally posted by skeevy420

Does anyone else pronounce "ROCm" like "Rock Em" from "Rock Em Sock Em Robots"?

I do, but I speak Portuguese. It sounds a little different, something like "hock emê", where the 'e' has a closed sound, like in "mirage".

**MadeUpName** · 21 August 2020, 05:11 PM

Originally posted by oleid

It would seem that only kernel 5.3 is supported officially. I know that 5.4 (LTS) still works with their dkms module. That being said, I just tried Arch Linux's current kernel, which is 5.8.1 at the moment. While 5.7 didn't work for me (but did for others), it would seem 5.8 has enough things upstreamed that I can run their tensorflow/docker image on Arch without any kernel changes and train a model on my Radeon RX 580X!

Thanks for posting. I'm running an RX580 on this machine with Fedora, and there was a time when it worked as well. Then it stopped, and eventually I had to remove dkms because it kept crashing my system. Once Fedora moves from 5.7 to 5.8 I will reinstall it and see if it works. Or I may go fishing in the FC33 repo.

**bridgman** · 21 August 2020, 06:23 PM

Originally posted by oleid

It would seem that only kernel 5.3 is supported officially. I know that 5.4 (LTS) still works with their dkms module. That being said, I just tried Arch Linux's current kernel, which is 5.8.1 at the moment. While 5.7 didn't work for me (but did for others), it would seem 5.8 has enough things upstreamed that I can run their tensorflow/docker image on Arch without any kernel changes and train a model on my Radeon RX 580X!

We actually support both slow-moving enterprise distros (with the DKMS kernel driver) and faster-moving distros with a sufficiently new upstream kernel. I hadn't heard about problems running ROCm userspace over a 5.7 kernel but will ask around.

There are a few features that you won't get with an upstream kernel, primarily the ability to hard-pin video memory for direct access by high-speed NICs, but that is because the functionality is not allowed upstream. It is possible to get RDMA working without hard-pinning video memory, but that requires NICs with recoverable page fault support and driver support for something like HMM, and that hasn't arrived as quickly as I was expecting.

Inside AMD most people pronounce it "rock-em" (as in rock-em sock-em), but there are some folks who pronounce it more like rock-emm ("roc m").

**Djip007** · 22 August 2020, 06:10 AM

I can confirm the tensorflow/docker image works on Fedora 32 with the stock 5.7 kernel and an RX580... and podman...

Code:

podman run --rm --network=host --device=/dev/kfd --device=/dev/dri --ipc=host -v $PWD:/root/home rocm/tensorflow:rocm3.3-tf2.1-dev bash -c "python3 -m pip install jupyterlab pillow matplotlib && jupyter lab --allow-root"

podman run --rm --network=host --device=/dev/kfd --device=/dev/dri --ipc=host -v $PWD:/root/home rocm/tensorflow:rocm3.7-tf2.3-dev bash -c "python3 -m pip install jupyterlab pillow matplotlib && jupyter lab --allow-root"

I have used rocm3.3 for some time... and am now testing the new rocm3.7.

(Note: the bash -c "..." part installs and starts JupyterLab inside the container image... not everyone needs it.)
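Before starting the container, it can help to check that the device nodes handed over with --device in the podman commands above actually exist on the host. A minimal sketch, not from the original post: the helper name missing_devices is made up for illustration, and a missing node only suggests (it does not prove) that the amdgpu/KFD kernel driver is not loaded.

```shell
#!/bin/sh
# Hypothetical helper (an assumption, not part of the original post):
# prints every path given as an argument that does not exist on the host
# and returns 1 if at least one of them is missing.
missing_devices() {
    _rc=0
    for _dev in "$@"; do
        if [ ! -e "$_dev" ]; then
            echo "missing: $_dev"
            _rc=1
        fi
    done
    return "$_rc"
}

# Same device nodes as passed via --device in the podman commands.
if missing_devices /dev/kfd /dev/dri; then
    echo "device nodes present (kernel $(uname -r))"
else
    echo "device nodes not found (kernel $(uname -r))"
fi
```

If either path is reported missing, the container would most likely start without GPU access, so it is worth checking the kernel driver before blaming the image.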

**3diStan** · 22 August 2020, 10:39 AM

What's really missing for me is ROCm support for Ryzen APUs (e.g. to be able to do development and testing with it on my notebook… even if I later use bigger machines with discrete GPUs to do the actual number crunching)

And inb4 anyone suggests the private closed-source project/build by Frederik Bruhn (Bruhnspace): no, that's not an option in any way. As someone put it quite spot-on in the heise forum (German, I'll translate the relevant bits):

Apart from the fact that that project (2) would be insta-dead if Frederik loses his interest in it tomorrow or gets run over by a bus (3), it's:

* a set of distribution-specific packages for Ubuntu 18.04.3 (with additional binding restrictions of the versions e.g. for the Kernel)
* a purely binary build
* and the guy explicitly refuses to share his patch:

"Q: I am looking for the patches to enable ROCm on APUs. Can you share them?
A: Unfortunately not. Bruhnspace cannot share the patches."

Weird: the guy does a patch against RadeonOpenCompute/ROCm ("RadeonOpenCompute/ROCm: ROCm - Open Source Platform for HPC and Ultrascale GPU Computing")… but then doesn't want to share his patch but only a distribution-specific and version-bound pure binary build. WTF?

If he shared the sources of his patch, then at least users could build it for any distribution of their choice and wouldn't be subject to the bus factor risk if the project stops for whatever reason. But like this, closed source, it's not an option.

(2): https://bruhnspace.com/en/bruhnspace-rocm-for-amd-apus/
(3): https://en.wikipedia.org/wiki/Bus_factor

**RavFX** · 22 August 2020, 12:28 PM

Well well, I'm really thinking about ditching my XT-ified RX5700 on the local market and just installing my 580-ified RX480 again.

**bridgman** · 22 August 2020, 03:26 PM

Originally posted by 3diStan

What's really missing for me is ROCm support for Ryzen APUs (e.g. to be able to do development and testing with it on my notebook… even if I later use bigger machines with discrete GPUs to do the actual number crunching)

Agreed, and I think it's fair to say that view is gaining broader acceptance internally.

As with Navi, we are already implementing, testing and upstreaming support in the lower-level components. What is not being supported officially yet is the upper-level components, primarily HIP and the libraries that run over HIP. OpenCL should be in pretty good shape already, and we are testing the lower-level ROCm components in the AMDGPU-PRO packaged stack as a prelude to using them for OpenCL as well.

**oleid** · 23 August 2020, 01:37 AM

Originally posted by bridgman

We actually support both slow-moving enterprise distros (with the DKMS kernel driver) and faster moving distros with a sufficiently new upstream kernel. I hadn't heard about problems running ROCm userspace over 5.7 kernel but will ask around.

Training didn't start back then. This time, it started fine. But as it turned out later, problems occurred during operation. I guess I was so excited that it finally worked that I was a little hasty. For what it's worth, this is my issue report:

OP_REQUIRES failed at reshape_op.h:57 : Invalid argument: Size 0 must be non-negative, not -1113764467 · Issue #1089 · ROCm/tensorflow-upstream

https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/issues/1089

System information I'm using the official rocm/tensorflow docker container on ArchLinux with their kernel 5.8 - so I'm on upstream driver. GPU is Radeon 480RX on a first generation Ryzen 1800X. I'm...

Not sure who's to blame, though - the tensorflow port, MIOpen or ROCm.

**erkki** · 23 August 2020, 07:37 PM

ROCm 3.7 still doesn't mention any official support for Navi/GFX10, but I have heard that the code should be working, albeit unofficially.

I have confirmed that ROCm 3.7 works with my 5700XT: https://rigtorp.se/notes/rocm/

Radeon ROCm 3.7 Release Enables OpenMP 5.0 By Default In AOMP