Radeon "GFX90A" Added To LLVM As Next-Gen CDNA With Full-Rate FP64
It looks like open-source driver support for the next-generation CDNA GPU, the successor to the MI100 "Arcturus", is on the way. Hitting mainline AMDGPU LLVM is a new "GFX90A" target adding interesting new features for compute.
The AMD GFX90A target is a big addition and was quickly and quietly merged this week... So quickly that it drew concerns and criticism in the review from other upstream LLVM developers: the merge request was open for only a short time (a little more than one hour) before being merged, not allowing sufficient time for code review of such a large patch. One of the responses to that was "we needed this upstreamed and no time was given to him to break it up into reasonably sized piece [across multiple patches that are easier for code review]." Code outside of the AMDGPU LLVM back-end isn't touched, but understandably some of the upstream developers are put off by a rushed process that didn't allow for any open-source code review prior to landing such a massive addition.
GFX90A is another iteration of Vega/CDNA, while GFX10 covers the newer RDNA/RDNA2 graphics processors. The "Arcturus" support was under the GFX908 name for what became the AMD Instinct MI100 that launched last year.
With this new GFX90A, one of the differences is that most FP64 instructions are now full-rate. AMD GPU FP64 arithmetic performance has tended to be half the rate of FP32 arithmetic, but for this next-gen CDNA it looks like full-rate FP64 will be a big-ticket feature.
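To put the half-rate versus full-rate difference in concrete terms, here is a minimal Python sketch of the arithmetic. The 20 TFLOPS FP32 figure is a made-up placeholder, not a specification of GFX90A or any other GPU, and the function name is purely illustrative.

```python
def peak_fp64_tflops(peak_fp32_tflops: float, fp64_to_fp32_ratio: float) -> float:
    """Return the peak FP64 rate implied by an FP32 peak and an FP64:FP32 ratio."""
    if peak_fp32_tflops < 0:
        raise ValueError("peak FP32 throughput must be non-negative")
    if not 0 < fp64_to_fp32_ratio <= 1:
        raise ValueError("ratio must be in the range (0, 1]")
    return peak_fp32_tflops * fp64_to_fp32_ratio


# Hypothetical FP32 peak, chosen only to make the arithmetic easy to follow.
HYPOTHETICAL_FP32_TFLOPS = 20.0

half_rate = peak_fp64_tflops(HYPOTHETICAL_FP32_TFLOPS, 0.5)  # FP64 at half the FP32 rate
full_rate = peak_fp64_tflops(HYPOTHETICAL_FP32_TFLOPS, 1.0)  # FP64 at the full FP32 rate

print(f"half-rate FP64: {half_rate} TFLOPS")
print(f"full-rate FP64: {full_rate} TFLOPS")
```

In other words, at the same FP32 peak, moving from half-rate to full-rate FP64 doubles the theoretical double-precision throughput. Real-world gains depend on which instructions are actually full-rate, on memory bandwidth, and on clocks.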
The GFX90A target also adds a new "thread group split" (TgSplit) feature, additional matrix fused multiply-add (MFMA) instructions, support for a Data Parallel Primitives (DPP) extension, extended image intrinsics, and other changes spotted from a quick look through the new code.
The GFX90A target amounts to around seventy thousand lines of new code for the AMDGPU LLVM compiler back-end, including test cases and some other redundant bits. But seeing full-rate FP64 and the other additions makes us all the more intrigued by this next-gen CDNA part. The new target landed this week via this commit.
So far we haven't seen any Linux kernel patches or other open-source driver enablement to go along with the new GFX90A compiler support, but I'll be on the lookout as always. This should be a powerful compute card if the ROCm support is in good shape for the hardware, both for tackling modern workloads directly and for getting more CUDA code-bases migrated over to the Radeon Open eCosystem.
LLVM main is now tracking what will be LLVM 13.0, which won't be released as stable until the fall. So the rushed timing for landing GFX90A is a bit odd considering that the LLVM 12.0 branching happened last month and had already been missed by the time this merge request was opened... Given that no AMDGPU kernel support has yet been published and that it has also missed the now-open Linux 5.12 merge window, it won't be until Q3-2021 at least before this new GPU is well supported by the mainline Linux driver stack in stable/released form. Given the timing of the MI100 launch, it's likely this new GPU will debut late in the year or potentially even next year given how long the Arcturus/GFX908 support was baking. In any case, with this being a workstation/HPC-focused offering where the packaged Radeon Software for Linux driver is more commonly used atop enterprise Linux distributions, the timing of the upstream open-source support is less of a concern compared to the consumer Radeon GPU launches.