**bridgman** · 24 March 2021, 03:59 PM

Originally posted by oleid

In general, latest userspace vs latest kernel. I was wondering if the required bits are mostly frozen.

There is a problem specific to Vega20 consumer cards in the 4.1 release that is interfering with running over the upstream kernel. Digging into it now.

**Mathias** · 24 March 2021, 04:41 PM

I got OpenCL to work for the first time on my Navi10 5700 today! Yay... OK, Blender renders the default cube in ~43s on the GPU and <10s on my 3700X.
But BMW: CPU 3m25s, GPU 1m22s. So I increased the tile size -> default cube in <7s on the GPU. Nice. It looks like the difference is more noticeable for bigger scenes.

I think one of the biggest problems for AMD/ROCm/OpenCL is the installation and its image.

**smitty3268** · 24 March 2021, 04:43 PM

Originally posted by Spacefish

IO Die
The IO Die was probably one of the first chips developed for the platform surrounding the AM4 socket / Zen in general. AMD was linked up with Global Foundries back then and did not have a large R&D budget, as they had very uncompetitive CPU offerings.
It can be used as a PCH (X570 chipset) and/or as the IO Die to link the chiplets and offer all the integrated USB, Ethernet, SATA and PCIe connectivity.
This chip was developed for GF 12nm and is still produced there. The design probably did not change much from Zen to Zen 3, as there is no reason for that; R&D is expensive for little gain.

AMD still has contracts with Global Foundries dating back to their split. They have to purchase a bunch of silicon from Global Foundries each year, or else they end up paying a large financial penalty - essentially paying GF for nothing, in that case. So it's not as simple as just redoing the I/O die on a better TSMC node, which would be beneficial in terms of power usage - they have to send a bunch of money GF's way one way or another.

**pete910** · 24 March 2021, 05:17 PM

Originally posted by bridgman

Got it - thanks.

Just curious, which packaged driver are you using? I'm asking because 20.45 and up use ROCm paths for OpenCL even on the packaged drivers.

Package build is here https://aur.archlinux.org/packages/rocm-opencl-runtime

Looks to be building the 4.1 branch.

On testing, my VII is not playing with rocm-opencl. Even rocminfo is not showing it!

**pete910** · 24 March 2021, 05:19 PM

Originally posted by smitty3268

AMD still has contracts with Global Foundries dating back to their split. They have to purchase a bunch of silicon from Global Foundries each year, or else they end up paying a large financial penalty - essentially paying GF for nothing, in that case. So it's not as simple as just redoing the I/O die on a better TSMC node, which would be beneficial in terms of power usage - they have to send a bunch of money GF's way one way or another.

I seem to recall reading that it was terminated last year, IIRC, but I may be completely wrong.

**pete910** · 24 March 2021, 05:34 PM

Originally posted by bridgman

There is a problem specific to Vega20 consumer cards in the 4.1 release that is interfering with running over the upstream kernel. Digging into it now.

Looks to be that issue: according to rocminfo, it's disabling the card and I need to upgrade AMDGPU.

Code:

rocminfo
ROCk module is loaded
HSA Error:  Incompatible kernel and userspace, Vega 20 [Radeon VII] disabled. Upgrade amdgpu.
=====================
HSA System Attributes
=====================
Runtime Version:         1.1
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE
System Endianness:       LITTLE

==========
HSA Agents
=========
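
For anyone wanting to catch this condition in a script rather than by eye, here is a minimal sketch that scans captured rocminfo output for `HSA Error:` lines. It assumes the error line keeps the exact format shown in the paste above; the helper name `find_hsa_errors` and the `SAMPLE` string are made up for illustration and are not part of ROCm.

```python
import re

# Excerpt of the rocminfo output pasted above (assumed format, not an official spec).
SAMPLE = """ROCk module is loaded
HSA Error:  Incompatible kernel and userspace, Vega 20 [Radeon VII] disabled. Upgrade amdgpu.
Runtime Version:         1.1
"""


def find_hsa_errors(output):
    """Return the message part of every line that starts with 'HSA Error:'.

    Hypothetical helper: it only does text matching on output that was
    already captured, it does not call rocminfo itself.
    """
    pattern = re.compile(r"^HSA Error:[ \t]*(.*)$", re.MULTILINE)
    return [match.group(1).strip() for match in pattern.finditer(output)]


errors = find_hsa_errors(SAMPLE)
for message in errors:
    print("rocminfo reported:", message)
```

In practice the text would come from running `rocminfo` and capturing its standard output; an empty list means no `HSA Error:` line was found, not that the GPU is necessarily usable.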

**smitty3268** · 24 March 2021, 05:42 PM

Originally posted by pete910

I seem to recall reading that it was terminated last year, IIRC, but I may be completely wrong.

The last I saw they have an agreement with specific targets and prices through the end of this year. Then there are additional agreements through March 1st, 2024, which seem a little vaguer and are possibly still up for some negotiation.

**bridgman** · 24 March 2021, 06:58 PM

Originally posted by pete910

Looks to be that issue: according to rocminfo, it's disabling the card and I need to upgrade AMDGPU.

Yep... the message is a bit misleading though... what it is supposed to be saying is "you can't run 4.1 userspace without the 4.1 kernel driver".

The required kernel driver changes are on their way upstream but not there yet AFAIK, so just going to a newer kernel isn't going to help. I was going to say "if you can go back to 4.0 userspace in the short term that would probably be best" but you already said that you had problems with 4.0. Do you happen to remember what problems you saw?

**raun0** · 25 March 2021, 03:29 AM

Originally posted by skeevy420

For all intents and purposes, CDNA is basically headless RDNA; both are the successor to GCN.

https://www.amd.com/system/files/doc...whitepaper.pdf

The classic GCN compute cores contain a variety of pipelines optimized for scalar and vector instructions. In particular, each CU contains a scalar register file, a scalar execution unit, and a scalar data cache to handle instructions that are shared across the wavefront, such as common control logic or address calculations. Similarly, the CUs also contain four large vector register files, four vector execution units that are optimized for FP32, and a vector data cache. Generally, the vector pipelines are 16-wide and each 64-wide wavefront is executed over four cycles.

The AMD CDNA architecture builds on GCN’s foundation of scalars and vectors and adds matrices as a first class citizen while simultaneously adding support for new numerical formats for machine learning and preserving backwards compatibility for any software written for the GCN architecture. These Matrix Core Engines add a new family of wavefront-level instructions, the Matrix Fused Multiply-Add or MFMA. The MFMA family performs mixed-precision arithmetic and operates on KxN matrices using four different types of input data: 8-bit integers (INT8), 16-bit half-precision FP (FP16), 16-bit brain FP (bf16), and 32-bit single-precision (FP32). All MFMA instructions produce either 32-bit integer (INT32) or FP32 output, which reduces the likelihood of overflowing during the final accumulation stages of a matrix multiplication.
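
The last sentence of the quote (narrow inputs, wide accumulator) can be illustrated with a small sketch. This is plain Python integer arithmetic, not real MFMA code; the helpers `wrap_signed` and `dot_accumulate` and the input values are invented for the example. It assumes two's-complement wraparound on overflow, which is one common behaviour but not something the whitepaper excerpt specifies.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the two's-complement range of the given bit width."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def dot_accumulate(a, b, acc_bits):
    """Dot product of two integer lists with an accumulator of acc_bits bits.

    Hypothetical model: the running sum is wrapped after every
    multiply-add, as a fixed-width register would do.
    """
    acc = 0
    for x, y in zip(a, b):
        acc = wrap_signed(acc + x * y, acc_bits)
    return acc


# All inputs fit in INT8 (range -128..127).
a = [100, 100, 100, 100]
b = [100, 100, 100, 100]

narrow = dot_accumulate(a, b, 8)   # accumulator as narrow as the inputs
wide = dot_accumulate(a, b, 32)    # INT32 accumulator, as in the quote

print("8-bit accumulator: ", narrow)
print("32-bit accumulator:", wide)
```

The 32-bit accumulator holds the exact sum (40000), while the 8-bit one has already wrapped after the first product, which is the overflow the wider output type is meant to avoid.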

**pete910** · 25 March 2021, 03:41 AM

Originally posted by bridgman

Yep... the message is a bit misleading though... what it is supposed to be saying is "you can't run 4.1 userspace without the 4.1 kernel driver".

The required kernel driver changes are on their way upstream but not there yet AFAIK, so just going to a newer kernel isn't going to help. I was going to say "if you can go back to 4.0 userspace in the short term that would probably be best" but you already said that you had problems with 4.0. Do you happen to remember what problems you saw?

The problem was that rocminfo listed both GPUs and the CPU fine, but when something like F@H, LuxMark, etc. tried to use any GPU, it failed on both GPUs.
The CPU worked fine.

Radeon ROCm 4.1 Released - Still Without RDNA GPU Support
