Radeon "GFX90A" Added To LLVM As Next-Gen CDNA With Full-Rate FP64
To support this hardware in Mesa OpenCL, the new target needs to be exposed in the libclc bitcode files in LLVM, which, as far as I have seen, the changes don't touch.
Heck, even my new Renoir laptop is unsupported in libclc+mesa-libOpenCL in Fedora 33, clinfo complained about a missing bc file/symlink.
Notably, the gfx907, gfx908 and gfx909 symlinks are missing from libclc. Fixing that upstream won't help with Fedora, though, since it ships an outdated libclc.
After adding the missing symlinks though (gfx909 in my case), clinfo suddenly recognized my Renoir iGPU.
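For anyone wanting to try the same workaround, here is a minimal sketch of it in Python. The base file name (`tahiti`), the `-amdgcn-mesa-mesa3d.bc` suffix and the helper name `add_missing_aliases` are assumptions for illustration, not something confirmed in this thread; check what your installed libclc directory actually contains before pointing it at a real path. The demo only touches a temporary directory.

```python
import os
import tempfile


def add_missing_aliases(clc_dir, base_name, aliases,
                        suffix="-amdgcn-mesa-mesa3d.bc"):
    """Create <alias><suffix> -> <base_name><suffix> symlinks in clc_dir.

    Aliases that already have a file or symlink are left alone.
    Returns the list of aliases for which a symlink was created.
    The file naming scheme is an assumption; verify it on your system.
    """
    base_file = base_name + suffix
    if not os.path.exists(os.path.join(clc_dir, base_file)):
        raise FileNotFoundError(base_file)

    created = []
    for alias in aliases:
        link_path = os.path.join(clc_dir, alias + suffix)
        if os.path.lexists(link_path):
            continue  # already present, do not overwrite
        # Relative link, so it stays valid if the directory is moved.
        os.symlink(base_file, link_path)
        created.append(alias)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory with an empty stand-in bitcode file.
    with tempfile.TemporaryDirectory() as demo_dir:
        open(os.path.join(demo_dir, "tahiti-amdgcn-mesa-mesa3d.bc"), "wb").close()
        made = add_missing_aliases(demo_dir, "tahiti", ["gfx908", "gfx909"])
        print("created symlinks for:", made)
```

On a real system this would need root and the actual libclc directory, and a package update may overwrite or remove the links again.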
bridgman mareko Please officially add the missing gfx generation support to the libclc subproject in LLVM.
Last edited by zboszor; 20 February 2021, 03:30 AM.
I wonder how such an updated GFX9 card with 64 CUs with 8GB HBM2e would perform on 7nm TSMC process. Such a Vega 64 v3 could still be a decent gaming and prosumer card.
We know the answer: often worse than Radeon VII.
What you describe is a fully-enabled Vega 20 with 2 stacks of 1200 MHz HBM2e memory, instead of its 4 stacks of 1000 MHz HBM2 memory. If your goal is to have a 4k card, then you want that extra bandwidth and capacity. If you're running it at 1440p, then maybe you can do well enough with 600 GB/sec of BW, but the extra 4 CUs are just going to make it more likely that you end up BW-constrained.
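The bandwidth figures in that comparison can be checked with some rough arithmetic. The sketch below assumes the standard 1024-bit interface per HBM stack and double data rate signalling (two transfers per clock); the function name is just for illustration.

```python
def hbm_bandwidth_gb_s(stacks, clock_mhz, bus_bits_per_stack=1024):
    """Peak HBM bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes double data rate (2 transfers per clock) and a
    1024-bit interface per stack.
    """
    transfers_per_s = 2 * clock_mhz * 1e6
    bytes_per_transfer = stacks * bus_bits_per_stack / 8
    return bytes_per_transfer * transfers_per_s / 1e9


if __name__ == "__main__":
    # Radeon VII: 4 stacks of HBM2 at 1000 MHz
    print(f"4 x HBM2  @ 1000 MHz: {hbm_bandwidth_gb_s(4, 1000):.1f} GB/s")
    # Hypothetical card discussed above: 2 stacks of HBM2e at 1200 MHz
    print(f"2 x HBM2e @ 1200 MHz: {hbm_bandwidth_gb_s(2, 1200):.1f} GB/s")
```

Under those assumptions the 4-stack configuration comes out at 1024 GB/s and the 2-stack one at roughly 614 GB/s, so the hypothetical card would have about 60% of Radeon VII's peak bandwidth while feeding more CUs.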
I wish they would just use some of the otherwise defective CDNA chips, disable 3/4 of the cores and release them as an entry-level CDNA card.
Basically like what Nvidia did with their Titan V. It was a fully-enabled GV100, except for one stack of HBM2 being disabled. They didn't even cripple the fp64!
Maybe even allow all working CUs to work and clock them down, or nerf the cards in some other way so enterprise customers don't want to buy them. Like restricting the use of multiple of these cards in one system, or the like.
Well, Radeon VII had half of its fp64 units disabled and its PCIe limited to PCIe 3.0. I don't know if the Radeon Pro VII supports over-the-top connectivity, or if that's just the special Mac Pro version, but Radeon VII has no special Crossfire-like functionality.
That's exactly along the lines of my thinking, a card you can work with during the daytime and play with at night.
Won't happen. Radeon VII was something of a one-off. That's because Vega 20 was still a graphics chip, at heart. CDNA, on the other hand, looks destined to be a compute-only architecture. Arcturus has no 3D fixed-function units nor display controller blocks, and that looks to be the way they're going, with CDNA.
Contrary to what the article implies, CDNA != GCN any more than RDNA is GCN.
That's nonsense. CDNA is a direct descendant of GCN, whereas RDNA merely has a GCN-compatibility mode. RDNA is a distinctly different ISA with even a different SIMD width!
What you describe is a fully-enabled Vega 20 with 2 stacks of 1200 MHz HBM2e memory, instead of its 4 stacks of 1000 MHz HBM2 memory. If your goal is to have a 4k card, then you want that extra bandwidth and capacity. If you're running it at 1440p, then maybe you can do well enough with 600 GB/sec of BW, but the extra 4 CUs are just going to make it more likely that you end up BW-constrained.
I would take the higher bandwidth of 4 HBM2e stacks as a consumer, too. And you are certainly right that such a configuration would make more sense for prosumer and workstation users from now on. As AMD reworked and refined the CUs of Vega in Renoir, I'd like to see the impact of that in a higher-end configuration with HBM memory. I also expected that by 2021 they would have figured out a way to drive costs down, but HBM3 is still not talked about in much detail.
You are right about Arcturus, but that product was designed without 3D fixed-function units and display units because it was meant as an HPC accelerator card. That doesn't mean all future CDNA-based products will lack these features. In fact, if they wanted to target the workstation market, they would need to bring something like what I outlined above to combat Nvidia in more markets.
believe it or not but they work on this right now... but even better than you think.
Zen4 in 5nm + RDNA3 + Xilinx FPGA + HBM3 + Infinity Cache + SSD, all connected with xGMI
but be sure "dual socket" this will not have a socket at all. soldered directly to the board
also 200 watt?... wrong such a mainboard will be ~600watt and all water cooled.
but no water will be used instead 3M Novec LIQUID is used.
I know at least some of what you're saying has been reported elsewhere, in particular the way the FPGA will be integrated. Do you have any links for the rest? I would like to give it a read.
Just remember AMD is hiring talent right now and has been since they cleaned up their financial mess. It takes a while to do things right. Being an RDNA card owner, I can say that I'm a bit disappointed that ROCm support never seems to come. On the other hand, I understand what they are trying to do with the resources they have.
I had high hopes that usable OpenCL would make it into the Fedora 34 release when they finally updated the rocm-runtime in Koji and the package maintainer said he was going to package the other parts needed to make it work. Now it appears to be abandonware. I have to be able to log into my desktop as 3 different people. Closing and opening all that to reboot into Red Hat is a real pain in the ass. AMD wrote all the needed bits long ago and they have been open-sourced, but I guess Red Hat is waiting to get paid by AMD to package their stuff, so they spend all their time sucking up to Nvidia instead. I will never go Ubuntu, but if Debian can make it all work I will be forced to switch. In 2021 any OS without working compute is just a toy. I'm shocked Red Hat doesn't get that.