2020-02-28

A new version of the Radeon Open Compute "ROCm" stack is available today, but it still doesn't deliver on Navi support.
Radeon Open Compute 3.1 is the new release that now versions its default installation directory structure, adds RAS support for 7nm Vega, and also introduces SLURM support.
The Reliability, Availability, and Serviceability (RAS) capabilities cover HBM ECC memory error handling, GFX/MMHUB ECC errors, and PCIe uncorrectable errors. When such uncorrectable errors occur, the RAS behavior is to perform a GPU reset using BACO. This 7nm Vega work is presumably still under the microscope for the Vega-based "Arcturus" compute accelerator coming this year.
The other new feature of ROCm 3.1 is AMD GPU support for SLURM, the Simple Linux Utility for Resource Management. This cluster management and job scheduling system for Linux clusters can now interact with AMD GPUs. The SLURM support is a useful addition given the increasing number of supercomputing wins with Radeon GPUs and other larger AMD GPU deployments.
The ROCm 3.1 downloads and more details are available via GitHub. Unfortunately, there are still no signs of GFX10/Navi support.
