**Rakot** · 20 December 2020, 07:12 PM

Originally posted by bridgman

In fairness, we have been supporting Polaris and Vega consumer dGPUs from the start. Lack of Navi support is awkward though, I agree. See next point for that though.

One of the things that has always baffled me is that this point never seems to get mentioned to our sales/marketing/product management folks by our datacenter customers, who all seem happy with developers working and testing on the same server systems (or previous generation) that will be used for deployment.

It's tough to promote the importance of something internally if our customers are saying "nah we don't need it" to our customer-facing teams. I don't know how to fix that, but in the meantime our developers do talk directly with customer developers enough to understand how it would help. The challenge though is that since those discussions end up being "developer to developer" it still appears internally as if it is AMD developers pushing for this rather than customers.

Anyways, I think we are making progress on this (making OpenCL-over-ROCm default for the packaged drivers was a big step) and things will continue to improve... and in the meantime there are a lot of Vega cards out there which are fully supported already. I did get confirmation that Renoir is using GPUVM paths by default rather than ATC/IOMMUv2, so that's a start.

There is a big difference though between HPC and data centers. Up until recently, CUDA was the main choice for developing HPC software that utilizes GPGPU.
In addition, we use only Dell hardware: workstations and laptops. Therefore, we never really encountered AMD graphics cards because (1) they don't support CUDA, (2) they were not widespread in DOE clusters, and (3) it was quite difficult to find a good laptop with an AMD graphics card. I've also never heard of any discussions between our team and anyone from AMD, whereas we got in touch with Nvidia developers from time to time. As a result, most of the development was done on Intel/Nvidia systems.

Now, with the DOE announcements and AMD's newest CPU and GPU offerings, the situation is quite different. The performance of AMD CPUs and GPUs is quite competitive and, more importantly, your hardware will be part of the new clusters. I myself bought a gaming laptop with an AMD APU and an Nvidia GPU in order to develop and test HPC software for both platforms using either Kokkos or HIP directly, anticipating that, similar to the current state of radeonsi and RADV/AMDVLK, there would be good support for APUs in ROCm. However, the situation with the compute stack on APUs is quite different from my expectations. I am quite satisfied with the performance and state of the APUs on both the CPU and graphics side. I take my hat off to Alex, Marek and the rest of your graphics team. This is my third laptop with an AMD graphics card and this is the best experience in terms of stability and features so far. The only missing feature is an adequate compute stack that is easy to install and start using.

**bridgman** · 20 December 2020, 08:47 PM

Originally posted by coder

I had in mind to go the other way - start a discussion about HW compatibility and then maybe you can understand their needs and help them find a better way of doing HW detection.

BTW, a hack might be to somehow expose a parameter (kernel boot option?) that lets users manually override this value. That sounds like begging for trouble, as there will certainly be some users who set it without really understanding what they're doing and end up forgetting about it and running into problems with other software. So, I don't really see an easy way around having ISVs do HW detection properly.

The question I'm still trying to understand is why you think it is a hardware detection problem rather than a functionality problem.

Originally posted by coder

I get that. It'd need to be done diplomatically, but that's how I'd probably try to approach it. Every time I get a request from customers, sales, or product management that doesn't make sense, I always try to find out what's behind it and either solve the underlying problem or find a better approach.

Sorry, I didn't say that right. What I was trying to say was that it's tough to get our people to spend time on having us support Resolve on a distro that Blackmagic doesn't even support when we still have work to do in other areas.

**finalzone** · 21 December 2020, 02:23 AM

Originally posted by bridgman

Do you have some rationale for believing that the "GPU unknown" string is causing DaVinci Resolve not to work? I'm not saying that is impossible but it seems unlikely and I don't think we have run across cases where an application makes runtime decisions based on the Device Board Name string. It's not unusual to switch on Device Name though.

The issue occurred on DaVinci Resolve 16 with the same hardware. After a clean install of Fedora 33 on an HP Envy x360 with a Ryzen 2500U, I downloaded the 17 Beta version and was pleased to see the full interface running on AMD GPU unknown. Perhaps the bug was related more to DaVinci Resolve than to the driver. Sorry for the noise.

We could change it I guess but (a) gfx0902 is the most precise indicator of device capabilities we have and (b) we can put generic strings upstream before launch but are not allowed to expose marketing names until after launch, which would interfere with any vendor use of the field IFF they are in fact doing that.

(b) would be great from the start, without interference to the vendors. gfx902 has several versions, notably Vega11, Vega8 and Vega6, which can lead to confusion.

If we can confirm that the Device Board Name string is *not* the cause of issues with Resolve then I guess we could change it post-launch, although that is still something we really try to avoid.

It turned out the Device Board Name string is not the cause, as tested on Resolve 17 Beta6. The gfx902 Vega8 is listed as a discrete AMD unknown GPU by AMDGPU-Pro OpenCL. I have yet to test the newer ROCm OpenCL.