The Anticipated Linux Driver Requirements For The Radeon Instinct MI50 / MI60 (Vega 20)
The Radeon Instinct MI60 is a beast: it is the first 7nm GPU/accelerator, and it offers HBM2 memory promising 1TB/s of memory bandwidth, new deep learning instructions, PCIe 4.0 and xGMI/Infinity Fabric Link support, and 7.4 TFLOPS of double precision compute power.
While it looks to be a few months before these Vega 7nm (Vega 20) accelerators begin shipping to customers, here's a look at what we know about their Linux software support, which has already been in the works for many months.
Linux 4.20+ - The current Linux 4.20 development cycle has a lot of Vega 20 support code in place for the AMDGPU DRM driver, including for the xGMI interconnect and related bits. It appears Linux 4.20 will be the base requirement for Radeon Instinct MI50/MI60 support, but that could shift depending on whether any workarounds or other unannounced additions are required. As with most new GPUs though, generally the newer the kernel the better. By the time these 7nm Vega cards hit the stores, Linux 4.21 could be out and will surely have more Vega 20 additions.
Mesa 18.3+ - While these cards are intended as accelerators for deep learning and HPC, should you want the Mesa driver components, the base support landed in Mesa 18.3, which will debut as stable in the weeks ahead. Like the kernel side though, you'll want the newest components for new GPUs, and by the time these cards begin shipping the latest release will likely be Mesa 19.0.
LLVM 7.0 / 8.0 - LLVM 7.0 has initial Vega 20 bits, but the AMDGPU compiler back-end work has continued making its way upstream. This week there have been additions around ECC memory, the formal product strings, and other changes in what will be released as LLVM 8.0 in early 2019. Generally for best performance, riding LLVM SVN is where it's at.
ROCm 2.0 - The most important user-space bit for these accelerators is the Radeon Open eCosystem stack. AMD confirmed this week ROCm 2.0 will ship before year's end. I'm quite excited for this next ROCm compute stack update in general and it will certainly be necessary for most MI50/MI60 deployments.
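For those wanting to script a quick sanity check against the anticipated minimums listed above, here is a minimal Python sketch. The `MINIMUMS` table and the `parse_version` / `meets_minimum` helpers are hypothetical names made up for illustration, not part of any AMD or ROCm tooling, and the version numbers are simply the ones anticipated in this article, which could shift before launch. Only the running kernel is detected automatically; version strings for Mesa, LLVM, and ROCm would need to be supplied by whatever means your distribution offers.

```python
import platform
import re

# Anticipated minimum versions taken from the list above (subject to change).
MINIMUMS = {
    "linux": (4, 20),
    "mesa": (18, 3),
    "llvm": (7, 0),
    "rocm": (2, 0),
}


def parse_version(text):
    """Extract the leading major.minor pair from a version string."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no major.minor version found in {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(component, version_text):
    """Return True if version_text is at least the anticipated minimum."""
    return parse_version(version_text) >= MINIMUMS[component]


if __name__ == "__main__":
    # platform.release() reports the running kernel release on Linux,
    # e.g. a string beginning with "4.20.0".
    if platform.system() == "Linux":
        release = platform.release()
        try:
            verdict = "OK" if meets_minimum("linux", release) else "too old"
        except ValueError:
            verdict = "unrecognized version format"
        print(f"kernel {release}: {verdict}")
```

Comparing (major, minor) tuples rather than raw strings avoids the classic mistake of treating 4.9 as newer than 4.20.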
Given the DL/HPC focus, AMD will certainly have a supported Radeon Software / AMDGPU-PRO release out by the time these cards are shipping. This should suit the majority of workstation users wanting a supported driver stack that will work on Ubuntu 18.04 LTS, SUSE Linux Enterprise 15, and Red Hat Enterprise Linux 7 / CentOS 7. AMD also makes the ROCm compute stack easy to deploy on the major Linux distributions, and it is open-source and now easier to run thanks to the mainline AMDKFD kernel fusion driver support.
The open-source driver requirements are more pressing at launch for the consumer cards, where users are likely running a variety of non-enterprise distributions not supported by the Radeon Software packages. There, riding the Mesa/AMDGPU Git packages is often necessary for game workarounds and other OpenGL/Vulkan improvements. But for those curious how the open-source pre-launch support has been shaping up for Vega 20, now you know.
It's great that a majority of the bits are now in place months before the hardware ships, and hopefully it will be that way for Navi in 2019. AMD has managed timely dGPU support for a while now, and their punctual open-source support and launch-day functionality have generally been improving with each successive launch.