More AMD Heterogeneous System Patches Queued Ahead Of Linux 6.5
A few new AMD heterogeneous system patches have been queued via TIP.git ahead of the upcoming Linux 6.5 kernel merge window. These newest AMD Linux patches are focused on proper heterogeneous system enumeration for AMD data center systems sporting the Instinct MI200 and newer accelerators.
The AMD EDAC (Error Detection and Correction) driver is being extended with support for AMD Heterogeneous Family 19h Model 30h-3Fh processors. The patch, which adds 300+ lines of new code, explains:
"AMD Family 19h Model 30h-3Fh systems can be connected to AMD MI200 accelerator/GPU devices such that the CPU and GPU data fabrics are connected together. In this configuration, the CPU manages error logging and reporting for MCA banks located on the GPUs. This includes HBM memory errors reported from Unified Memory Controllers (UMCs) on the GPUs. The GPU memory errors are handled like CPU memory errors.
AMD CPU UMC support in EDAC can be re-used for GPU UMC support. However, keeping them separate means drastic changes in one path (e.g. to support newer products) should have less impact on the other path."
This follows other recent AMD Linux kernel patches for extending their EDAC driver for GPUs.
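For those wanting to see what the EDAC driver reports on a given machine, the kernel exposes per-memory-controller error counters through sysfs. Below is a minimal sketch in Python that reads them. It assumes the standard EDAC layout under /sys/devices/system/edac/mc with mcN directories containing mc_name, ce_count, and ue_count files; the function name read_edac_counts is just an illustrative helper, and how GPU memory controllers are named or numbered on a particular EPYC + Instinct system is not something this sketch assumes.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {"mcN": {"name", "ce", "ue"}} for each memory controller under root.

    Hypothetical helper for illustration; it only reads files and
    returns an empty dict if the EDAC sysfs tree is not present.
    """
    base = Path(root)
    results = {}
    if not base.is_dir():
        return results  # EDAC driver not loaded or not supported here

    def read_text(path, default):
        # Missing or unreadable attributes fall back to a default value.
        try:
            return path.read_text().strip()
        except OSError:
            return default

    def read_int(path):
        try:
            return int(read_text(path, "0"))
        except ValueError:
            return 0

    for mc in sorted(base.glob("mc[0-9]*")):
        if not mc.is_dir():
            continue
        results[mc.name] = {
            "name": read_text(mc / "mc_name", "unknown"),
            "ce": read_int(mc / "ce_count"),  # corrected errors
            "ue": read_int(mc / "ue_count"),  # uncorrected errors
        }
    return results


if __name__ == "__main__":
    for mc, info in read_edac_counts().items():
        print(f"{mc}: {info['name']} CE={info['ce']} UE={info['ue']}")
```

Run as a script, it prints one line per memory controller that EDAC has registered; on a system without EDAC support it prints nothing.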
Another patch queued ahead of Linux 6.5 adds more documentation around the AMD heterogeneous system enumeration with EPYC CPUs and Instinct GPUs.
Other patches related to this work have also been collected into TIP.git's ras/core branch ahead of the Linux 6.5 merge window opening in a few weeks. It's good to see more of the AMD EPYC + Instinct heterogeneous compute capabilities working their way into the mainline Linux kernel.