Radeon ROCm 4.1 Released - Still Without RDNA GPU Support

Grinness replied

24 March 2021, 07:29 AM
Originally posted by bridgman

Good point - I don't think that all the fixes from 20.45 have worked their way into the ROCm release stream yet but we should definitely mention that once the functionality is there.

It's possible that we may have to work through at least conceptually separating "the upstream for our compute components" from "our datacenter releases" as a pre-requisite since right now the ROCm releases kinda serve as both. That is going to be an increasing problem as we expand support to consumer hardware.

Thanks !

Bridgman,

is there an effort to simplify/streamline the adoption of ROCm for end users?
Conceptually ROCm/HIP is fascinating and very promising, but it does not compare with the simplicity of installation (and third-party support/implementation) of NVIDIA.

Years ago I used to do CUDA on NVIDIA.
Now I am on Polaris (Arch Linux), and I just spent 4 days (probably more, between the AMD website, git repos, and the Arch AUR) trying to get PyTorch, torchvision, and torchtext to run.
I tried everything from native ROCm on Arch to Docker images. An ugly mess!

Finally, last night, by creating my own PKGBUILD (and a lot of kicking and screaming), I got everything installed
... only to find out that the GPU computing stalls: no errors, nothing in dmesg, just from trying a Jupyter notebook tutorial from torchtext.

How do I debug?
How do I ask for help? (I assume the first response will be "Arch is not supported", followed by "oh, Polaris is not officially supported".)

On top of that, ROCm 4.1 is out and I have to recompile EVERYTHING ....

Come on. ...
Likes 6
Paradigm Shifter replied

24 March 2021, 07:11 AM
Originally posted by raun0

For your information, AMD has two architectural lines: GCN and RDNA.

RDNA is for people who have empty energy drink cans on their desk and the main color lights flashing all over the place.

GCN is for supercomputers and servers.

You mean all those people who had Radeon 7000 series cards, or R9 2x0/3x0 series cards, or RX400 series cards... all of which were GCN?

I thought it was RDNA for desktops, CDNA for HPC?
Likes 3
raun0 replied

24 March 2021, 06:52 AM
Originally posted by sweetsuicide

I still remember when someone on this forum told me I was wrong that ROCm would take an incredibly long time to come to RDNA, but, hey, here we are.

Please read my comment above.
sweetsuicide replied

24 March 2021, 04:56 AM
I still remember when someone on this forum told me I was wrong that ROCm would take an incredibly long time to come to RDNA, but, hey, here we are.
Likes 1
raun0 replied

24 March 2021, 04:23 AM
Originally posted by phoronix_is_awesome

ROCm is dead. It is amazing that many years after the release of RDNA 1, AMD still doesn't have ROCm support for it, yet AMD wants to sell an overclocked 192-bit RDNA2 chip without compute at Nvidia Ampere 256-bit GA104 prices by intentionally busting its 8GB VRAM buffer because it is "designed for gaming at max 1440P settings". What a joke. Milan, as reviewed by AnandTech, is also partly a regression in idle power due to subpar IO Hub chip L3 cache design. What a disappointment.

For your information, AMD has two architectural lines: GCN and RDNA.

RDNA is for people who have empty energy drink cans on their desk and the main color lights flashing all over the place.

GCN is for supercomputers and servers.

Last edited by raun0; 24 March 2021, 04:25 AM.
Likes 1
oleid replied

24 March 2021, 02:47 AM
Originally posted by bridgman

It depends on the userspace version - we do some testing of release candidate userspace against recent upstream kernels but over time newer kernels are going to be needed. Or is the question specifically for the 4.1 release ?

In general, latest userspace vs latest kernel. I was wondering if the required bits are mostly frozen.
Likes 1
bridgman replied

24 March 2021, 02:29 AM
Originally posted by oleid

Can you confirm that no DKMS driver is required for mainline kernels starting with 5.9?

It depends on the userspace version - we do some testing of release candidate userspace against recent upstream kernels but over time newer kernels are going to be needed. Or is the question specifically for the 4.1 release ?
oleid replied

24 March 2021, 02:04 AM
Originally posted by bridgman

The kernel version restrictions only apply to the rock-dkms packaged kernel driver. If you install the rocm-dev metapackage over your existing kernel driver that should give you what you need.

https://rocmdocs.amd.com/en/latest/I...r-AMD-GPU.html

And yes that information is much harder to find than it should be. Trying to get that improved.

The dkms driver from the 20.50 amdgpu packaged driver includes support for the 5.8 kernel and is tested with the ROCm components up to OpenCL, but that only shipped a couple of days ago and hasn't made it into the ROCm stack releases yet.

Can you confirm that no DKMS driver is required for mainline kernels starting with 5.9?
Likes 1
phoronix_is_awesome replied

24 March 2021, 12:54 AM
Originally posted by bridgman

I'll skip over how 20 months becomes "many years" but I do need to point out that the IO chip does not include L3, just data fabric and memory controllers. The IO hub is actually pretty much the same between Zen2 and Zen3.

I missed a comma between IO chip and L3 cache. Without shrinking the IO die, Milan is a very small incremental upgrade. AnandTech is right: the 19% IPC uplift is partially nerfed by the increase in power consumption of the IO die (40% of socket power).

Two years is technically an eternity to go without releasing ROCm compute support. The most disgusting part of "NAVY FLOUNDERS" is the lie that it is designed for 1440 MAX SETTINGS, when your own presentation shows that 1440P max settings require 9.5-10GB of VRAM, thus forcing Nvidia GA104 8GB to swap data over the PCI-e bus. Then you guys overclock the 192-bit chip to 220W TGP (do you realize that 250W used to be Nvidia's 384-bit TDP range?). You are doing it only because you wanted to price the chip in the 256-bit tier to make up for the 100mm^2 design mistake called Infinity Cache. This is the most insidious design and marketing attempt by AMD in the past 5 years. And you refuse to enable ROCm support simply by policy. A single third-party engineer even made a few lines of modifications to enable ROCm on APUs:
https://bruhnspace.com/en/bruhnspace-rocm-for-amd-apus
So it is clear that ROCm support is a political decision, not a technical one. Bruhnspace is not opening up his code to enable ROCm, and the code base is not up to date. Make compute work on your weaker GPU design; otherwise the RX6700XT is worth $299 in my mind without ROCm support when your competitor prices the 3060 at $329 (eventually you will be able to buy one at MSRP, all we need to do is fukk Bitcoin to $1k where it belongs).
bridgman replied

24 March 2021, 12:17 AM
Originally posted by atomsymbol

If you had tried to run OpenCL apps on an RDNA card with ROCm 4 then you would know that it works fairly well. bridgman, in my opinion, it is a mistake that the partial/unofficial support for RDNA GPUs isn't mentioned in ROCm README files.

Good point - I don't think that all the fixes from 20.45 have worked their way into the ROCm release stream yet but we should definitely mention that once the functionality is there.

It's possible that we may have to work through at least conceptually separating "the upstream for our compute components" from "our datacenter releases" as a pre-requisite since right now the ROCm releases kinda serve as both. That is going to be an increasing problem as we expand support to consumer hardware.

Thanks !

Last edited by bridgman; 24 March 2021, 12:19 AM.
Likes 1
