**finalzone** · 20 December 2020, 02:22 AM

Originally posted by bridgman

That's the first I have heard of this - are you talking about the string that appears in lspci, or the renderer string, or something else?

That is with AMDGPU-Pro OpenCL running on Raven Ridge, in my case a Ryzen 5 2500U. Here is the clinfo output:

Code:

clinfo
Number of platforms 1
Platform Name AMD Accelerated Parallel Processing
Platform Vendor Advanced Micro Devices, Inc.
Platform Version OpenCL 2.1 AMD-APP (3180.7)
Platform Profile FULL_PROFILE
Platform Extensions cl_khr_icd cl_amd_event_callback cl_amd_offline_devices
Platform Host timer resolution 1ns
Platform Extensions function suffix AMD

Platform Name AMD Accelerated Parallel Processing
Number of devices 1
Device Name gfx902
Device Vendor Advanced Micro Devices, Inc.
Device Vendor ID 0x1002
Device Version OpenCL 2.0 AMD-APP (3180.7)
Driver Version 3180.7 (PAL,HSAIL)
Device OpenCL C Version OpenCL C 2.0
Device Type GPU
Device Board Name (AMD) Unknown AMD GPU
Device Topology (AMD) PCI-E, 03:00.0
Device Profile FULL_PROFILE
Device Available Yes
Compiler Available Yes
Linker Available Yes
Max compute units 8
SIMD per compute unit (AMD) 4
SIMD width (AMD) 16
SIMD instruction width (AMD) 1
Max clock frequency 1100MHz
Graphics IP (AMD) 9.2
Device Partition (core)
Max number of sub-devices 8
Supported partition types None
Supported affinity domains (n/a)
Max work item dimensions 3
Max work item sizes 1024x1024x1024
Max work group size 256
Preferred work group size (AMD) 256
Max work group size (AMD) 1024
Preferred work group size multiple 64
Wavefront width (AMD) 64
Preferred / native vector sizes
char 4 / 4
short 2 / 2
int 1 / 1
long 1 / 1
half 1 / 1 (cl_khr_fp16)
float 1 / 1
double 1 / 1 (cl_khr_fp64)
Half-precision Floating-point support (cl_khr_fp16)
Denormals No
Infinity and NANs No
Round to nearest No
Round to zero No
Round to infinity No
IEEE754-2008 fused multiply-add No
Support is emulated in software No
Single-precision Floating-point support (core)
Denormals No
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Correctly-rounded divide and sqrt operations Yes
Double-precision Floating-point support (cl_khr_fp64)
Denormals Yes
Infinity and NANs Yes
Round to nearest Yes
Round to zero Yes
Round to infinity Yes
IEEE754-2008 fused multiply-add Yes
Support is emulated in software No
Address bits 64, Little-Endian
Global memory size 2684354560 (2.5GiB)
Global free memory (AMD) 2551712 (2.434GiB)
Global memory channels (AMD) 4
Global memory banks per channel (AMD) 4
Global memory bank width (AMD) 256 bytes
Error Correction support No
Max memory allocation 912680550 (870.4MiB)
Unified memory for Host and Device Yes
Shared Virtual Memory (SVM) capabilities (core)
Coarse-grained buffer sharing Yes
Fine-grained buffer sharing Yes
Fine-grained system sharing No
Atomics No
Minimum alignment for any data type 128 bytes
Alignment of base address 2048 bits (256 bytes)
Preferred alignment for atomics
SVM 0 bytes
Global 0 bytes
Local 0 bytes
Max size for global variable 821412352 (783.4MiB)
Preferred total size of global vars 2684354560 (2.5GiB)
Global Memory cache type Read/Write
Global Memory cache size 16384 (16KiB)
Global Memory cache line size 64 bytes
Image support Yes
Max number of samplers per kernel 16
Max size for 1D images from buffer 134217728 pixels
Max 1D or 2D image array size 2048 images
Base address alignment for 2D image buffers 256 bytes
Pitch alignment for 2D image buffers 256 pixels
Max 2D image size 16384x16384 pixels
Max 3D image size 2048x2048x2048 pixels
Max number of read image args 128
Max number of write image args 64
Max number of read/write image args 64
Max number of pipe args 16
Max active pipe reservations 16
Max pipe packet size 912680550 (870.4MiB)
Local memory type Local
Local memory size 65536 (64KiB)
Local memory syze per CU (AMD) 65536 (64KiB)
Local memory banks (AMD) 32
Max number of constant args 8
Max constant buffer size 912680550 (870.4MiB)
Preferred constant buffer size (AMD) 16384 (16KiB)
Max size of kernel argument 1024
Queue properties (on host)
Out-of-order execution No
Profiling Yes
Queue properties (on device)
Out-of-order execution Yes
Profiling Yes
Preferred size 262144 (256KiB)
Max size 8388608 (8MiB)
Max queues on device 1
Max events on device 1024
Prefer user sync for interop Yes
Number of P2P devices (AMD) 0
P2P devices (AMD) <printDeviceInfo:147: get number of CL_DEVICE_P2P_DEVICES_AMD : error -30>
Profiling timer resolution 1ns
Profiling timer offset since Epoch (AMD) 1608419596369206987ns (Sat Dec 19 15:13:16 2020)
Execution capabilities
Run OpenCL kernels Yes
Run native kernels No
Thread trace supported (AMD) Yes
Number of async queues (AMD) 4
Max real-time compute queues (AMD) 1
Max real-time compute units (AMD) 0
printf() buffer size 4194304 (4MiB)
Built-in kernels (n/a)
Device Extensions cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_fp16 cl_khr_gl_sharing cl_khr_gl_depth_images cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes cl_amd_copy_buffer_p2p

The issue affects all Linux operating systems. Interestingly enough, AMDGPU-Pro OpenCL runs fine with Blender even though the Vega 8 is listed as "Unknown AMD GPU".

I don't think it is neglected on the open source side, at least for GPU. There were some CPU / OS issues with the first Raven parts but I think we're past that now.

The focus is on the

Not sure what you mean by "official AMD driver" since upstream is the primary official AMD driver - guessing you're talking about the prebuilt driver packages on amd.com, either the all-open stack built from an upstream fork or the hybrid workstation drivers with open source kernel/libdrm/multimedia and closed source OpenGL/Vulkan?

To clarify, I mean the prebuilt packages from amd.com for CentOS 8. After installing OpenCL, the GPU is listed as "Unknown AMD GPU". Granted, that was on 20.40, and I have yet to test 20.45 and the newer ROCm OpenCL. It would be great to properly list the GPU part of the Raven Ridge APU as Vega instead of the generic gfx902.

**bridgman** · 20 December 2020, 03:21 PM

Originally posted by finalzone

The AMDGPU-Pro version is unofficially supported for that APU, but the GPU part is listed as unknown, thus preventing DaVinci Resolve from starting.

Originally posted by finalzone

Interestingly enough, AMDGPU-Pro OpenCL runs fine with Blender even though the Vega 8 is listed as "Unknown AMD GPU".

Do you have some rationale for believing that the "GPU unknown" string is causing DaVinci Resolve not to work? I'm not saying that is impossible, but it seems unlikely, and I don't think we have run across cases where an application makes runtime decisions based on the Device Board Name string. It's not unusual to switch on Device Name, though.
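For reference, the two strings being compared here can be pulled out of clinfo text like the listing above. This is only a sketch under stated assumptions: it assumes each line starts with the field label followed by whitespace and the value, as in the pasted output. The helper `parse_clinfo_fields` and the `SAMPLE` text are hypothetical and are not part of clinfo or any driver.

```python
# Hypothetical helper: extract selected fields from clinfo-style text.
# Assumes each line begins with the field label, then whitespace, then the value.

SAMPLE = """\
Device Name gfx902
Device Vendor Advanced Micro Devices, Inc.
Device Board Name (AMD) Unknown AMD GPU
Graphics IP (AMD) 9.2
"""

FIELDS = ("Device Name", "Device Board Name (AMD)")


def parse_clinfo_fields(text, labels=FIELDS):
    """Return {label: value} for the first line that starts with each label."""
    found = {}
    for line in text.splitlines():
        stripped = line.strip()
        for label in labels:
            # Require a space after the label so "Device Name" cannot match
            # a longer label that merely shares the same prefix.
            if label not in found and stripped.startswith(label + " "):
                found[label] = stripped[len(label):].strip()
    return found


if __name__ == "__main__":
    for label, value in parse_clinfo_fields(SAMPLE).items():
        print(f"{label}: {value}")
```

On the listing above, this would report `gfx902` as the Device Name and `Unknown AMD GPU` as the Device Board Name, which are the two values an application could in principle switch on.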

Originally posted by finalzone

It would be great to properly list the GPU part of the Raven Ridge APU as Vega instead of the generic gfx902.

We could change it, I guess, but (a) gfx902 is the most precise indicator of device capabilities we have and (b) we can put generic strings upstream before launch but are not allowed to expose marketing names until after launch, so the string would have to change post-launch, which would interfere with any vendor's use of the field if they are in fact relying on it. If we can confirm that the Device Board Name string is *not* the cause of issues with Resolve then I guess we could change it post-launch, although that is still something we really try to avoid.

**coder** · 20 December 2020, 04:28 PM

Originally posted by Spacefish

At least if games don't suddenly start to use bf16 and such,

BFloat16 doesn't make a lot of sense for 3D or imaging. Half is a better compromise (which is no coincidence, since that's what it was designed for).

Originally posted by Spacefish

we won't see consumer cards based on CDNA IMHO.

That much seems clear, but it's yet to be seen how long they'll stick with GCN for APUs.

Originally posted by Spacefish

CDNA: Lacks raytracing units,

It's tricky to extrapolate from an example of 1, but it's looking like CDNA will be compute-only.

I expect they'll sell workstation- and datacenter-oriented RDNA cards for visualization and cloud-based application hosting.

**coder** · 20 December 2020, 04:33 PM

Originally posted by bridgman

If we can confirm that the Device Board Name string is *not* the cause of issues with Resolve then I guess we could change it post-launch, although that is still something we really try to avoid.

Isn't there someone at DaVinci you can reach out to? It seems a shame to play a guessing game with this sort of thing.

**coder** · 20 December 2020, 04:36 PM

Originally posted by Rakot

we can develop and test our software stack on both mobile and HPC video cards. This is quite handy.

Totally agree. I think this is one reason Nvidia doesn't cripple CUDA on its consumer cards. They realize that a lot of devs are developing on consumer hardware, even if they're deploying on proper cloud hardware.

**bridgman** · 20 December 2020, 04:41 PM

Originally posted by coder

Isn't there someone at DaVinci you can reach out to? It seems a shame to play a guessing game with this sort of thing.

It's not just DaVinci though... if we were going to shift policy and start changing driver strings between pre- and post-launch we would need to go out and check with pretty much every software developer out there.

It's also a bit of a tough sell pushing our ISV relations group to go out and bug DaVinci about supporting Resolve on distros that they don't even support. My understanding is that it already works OK on RHEL/CentOS.

**bridgman** · 20 December 2020, 04:52 PM

Originally posted by Rakot

AMD is partnering with Kokkos team but it is not enough if we cannot even test already available hardware.

In fairness, we have been supporting Polaris and Vega consumer dGPUs from the start. Lack of Navi support is awkward, I agree; see the next point for that.

Originally posted by Rakot

On the Nvidia side, despite terrible open source support and a number of problems in general, like the recent "GPL condom" incident, we can develop and test our software stack on both mobile and HPC video cards. This is quite handy.

Originally posted by coder

Totally agree. I think this is one reason Nvidia doesn't cripple CUDA on its consumer cards. They realize that a lot of devs are developing on consumer hardware, even if they're deploying on proper cloud hardware.

One of the things that has always baffled me is that this point never seems to get mentioned to our sales/marketing/product management folks by our datacenter customers, who all seem happy with developers working and testing on the same server systems (or previous generation) that will be used for deployment.

It's tough to promote the importance of something internally if our customers are saying "nah we don't need it" to our customer-facing teams. I don't know how to fix that, but in the meantime our developers do talk directly with customer developers enough to understand how it would help. The challenge though is that since those discussions end up being "developer to developer" it still appears internally as if it is AMD developers pushing for this rather than customers.

Anyways, I think we are making progress on this (making OpenCL-over-ROCm default for the packaged drivers was a big step) and things will continue to improve... and in the meantime there are a lot of Vega cards out there which are fully supported already. I did get confirmation that Renoir is using GPUVM paths by default rather than ATC/IOMMUv2, so that's a start.

**coder** · 20 December 2020, 04:58 PM

Originally posted by bridgman

It's not just DaVinci though... if we were going to shift policy and start changing driver strings between pre- and post-launch we would need to go out and check with pretty much every software developer out there.

I had in mind to go the other way - start a discussion about HW compatibility and then maybe you can understand their needs and help them find a better way of doing HW detection.

BTW, a hack might be to somehow expose a parameter (kernel boot option?) that lets users manually override this value. That sounds like begging for trouble, though, as there will certainly be some users who set it without really understanding what they're doing, then forget about it and run into problems with other software. So, I don't really see an easy way around having ISVs do HW detection properly.
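As a rough illustration of what "doing HW detection properly" could look like, the sketch below accepts or rejects a device based on reported capabilities (extensions and memory size) and never consults the Board Name string. Everything in it is hypothetical: the `DeviceInfo` type, the required extension set, and the memory threshold are invented for the example, and nothing here reflects how Resolve or any other application actually detects hardware.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceInfo:
    """Subset of what an OpenCL runtime reports for a device (values as in clinfo)."""
    name: str                 # e.g. "gfx902"
    board_name: str           # e.g. "Unknown AMD GPU"; deliberately ignored below
    global_mem_bytes: int
    extensions: set = field(default_factory=set)


# Hypothetical requirements for an imaginary application.
REQUIRED_EXTENSIONS = {"cl_khr_fp16", "cl_khr_image2d_from_buffer"}
MIN_GLOBAL_MEM_BYTES = 2 * 1024**3


def is_usable(dev):
    """Return (ok, reason), judging capabilities rather than the marketing string."""
    missing = REQUIRED_EXTENSIONS - dev.extensions
    if missing:
        return False, "missing extensions: " + ", ".join(sorted(missing))
    if dev.global_mem_bytes < MIN_GLOBAL_MEM_BYTES:
        return False, "not enough global memory"
    return True, "ok"


if __name__ == "__main__":
    # Values taken from the clinfo listing earlier in the thread.
    raven = DeviceInfo(
        name="gfx902",
        board_name="Unknown AMD GPU",
        global_mem_bytes=2684354560,
        extensions={"cl_khr_fp16", "cl_khr_fp64", "cl_khr_image2d_from_buffer"},
    )
    print(is_usable(raven))
```

With a check of this shape, a generic board name would not affect the outcome, so the string could change between pre- and post-launch without breaking the application.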

Originally posted by bridgman

It's also a bit of a tough sell pushing our ISV relations group to go out and bug DaVinci about supporting Resolve on distros that they don't even support. My understanding is that it already works OK on RHEL/CentOS.

I get that. It'd need to be done diplomatically, but that's how I'd probably try to approach it. Every time I get a request from customers, sales, or product management that doesn't make sense, I always try to find out what's behind it and either solve the underlying problem or find a better approach.

**coder** · 20 December 2020, 05:02 PM

Originally posted by bridgman

It's tough to promote the importance of something internally if our customers are saying "nah we don't need it" to our customer-facing teams. I don't know how to fix that, but in the meantime our developers do talk directly with customer developers enough to understand how it would help. The challenge though is that since those discussions end up being "developer to developer" it still appears internally as if it is AMD developers pushing for this rather than customers.

Isn't there anyone doing university relations, or anything like that? Maybe they could do more student outreach.

Also, it's a bit paradoxical to talk only to one's existing customers, if one is interested in expanding the customer base. You really ought to be talking to the people who are not your customers, and seeing why not.

**extremesquared** · 20 December 2020, 06:50 PM

Originally posted by coder

Isn't there anyone doing university relations, or anything like that? Maybe they could do more student outreach.

Also, it's a bit paradoxical to talk only to one's existing customers, if one is interested in expanding the customer base. You really ought to be talking to the people who are not your customers, and seeing why not.

I suspect it's hit the level of "common knowledge" at this point that compsci students need Nvidia cards. That's going to take more than outreach. It will likely take a few years of undoing.

Radeon ROCm 4.0 Released With CDNA GPU Support (Instinct MI100)
