AMDGPU LLVM Backend Prepares FP8 Instructions For GFX940 / Next-Gen CDNA
In addition to working on RDNA3 graphics support for their open-source Linux graphics driver stack, AMD engineers have concurrently been enabling GFX940 as their next-generation CDNA part, presumably to launch as the AMD Instinct MI300 if naming traditions hold.
Since March, AMD has been publishing various GFX940 patches for the AMDGPU shader compiler back-end in LLVM, and with their usual block-by-block approach has also been upstreaming elements of this next-gen CDNA part into the AMDGPU kernel driver in the upstream Linux kernel.
We've already seen some interesting elements of this next-gen accelerator, like WMMA mixed-precision matrix multiply-accumulate operations for GPU matrix cores. New floating-point atomic instructions have also been added.
Merged today into LLVM Git is support for the native FP8 instructions being introduced with GFX940. This next-gen professional accelerator brings native FP8 and BF8 instructions to the GPU to help with AI / neural network performance.
NVIDIA's GH100 Hopper architecture supports eight-bit FP8 floating point as well, and Intel's new Habana Labs Gaudi2 similarly has native FP8 format support. FP8 for the Instinct MI300 was previously rumored, and seeing the FP8 instructions land in the AMDGPU LLVM back-end all but confirms it. FP8 will only become more important for AI workloads moving forward.
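For readers unfamiliar with what an eight-bit float looks like, here is a minimal sketch of one common FP8 layout (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, often called "E4M3"). This is an illustration of the format class only, not a statement about GFX940's exact encodings, which the commits don't spell out here; NaN handling is also ignored for brevity.

```python
def fp8_e4m3_decode(code: int) -> float:
    """Decode an 8-bit E4M3-style code to its real value.

    Subnormals (exponent field 0) lack the implicit leading 1;
    the special NaN encodings of real-world E4M3 are ignored here.
    """
    sign = -1.0 if code & 0x80 else 1.0
    exp = (code >> 3) & 0xF
    mant = code & 0x7
    if exp == 0:  # subnormal range
        return sign * (mant / 8) * 2.0 ** (1 - 7)
    return sign * (1 + mant / 8) * 2.0 ** (exp - 7)

def fp8_e4m3_encode(x: float) -> int:
    """Round to the nearest representable value by exhaustive search
    over all 256 codes -- slow, but obviously correct for a sketch."""
    return min(range(256), key=lambda c: abs(fp8_e4m3_decode(c) - x))

# With only 3 mantissa bits, 0.1 is quantized to a nearby value, not stored exactly.
print(fp8_e4m3_decode(fp8_e4m3_encode(0.1)))
```

The coarse precision shown here is exactly why FP8 is paired with higher-precision accumulation in matrix hardware: individual operands tolerate the quantization error, while sums are kept in wider formats.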
So far these three commits (as of writing) have begun plumbing the FP8 instruction support into the AMDGPU shader compiler back-end and preparing conversion support to and from other formats.
This GFX940 work is landing in LLVM Git for LLVM 15.0, which is set to be released as stable in September, with the feature freeze / branching starting next week.
FP8 enablement has begun for the GFX940 GPU with LLVM.