Linux 5.14 Lands Changes For On-Package HBM Xeons, More Intel CPUs With In-Band ECC
The Linux 5.14 RAS (Reliability, Availability and Serviceability) and EDAC (Error Detection And Correction) changes have landed with several improvements this time around on the Intel side.
The RAS/EDAC changes this cycle include support for on-package high bandwidth memory (HBM) on Xeon Sapphire Rapids. That work was previously covered on Phoronix, and Intel officially confirmed earlier this week that various Sapphire Rapids SKUs will indeed have HBM. The Xeon HBM bits here concern EDAC support for memory error checking and reporting.
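For readers unfamiliar with how EDAC reporting surfaces in userspace, here is a minimal sketch that reads the per-memory-controller corrected and uncorrected error counters the EDAC subsystem exposes under sysfs. It is not specific to the HBM support in this pull request; the helper name `read_edac_counts` is made up for illustration, and it assumes an EDAC driver is loaded and the usual `/sys/devices/system/edac/mc` layout is present.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: (corrected, uncorrected)} for each mc* directory.

    Hypothetical helper for illustration; returns an empty dict when no
    EDAC memory-controller directory exists (e.g. no EDAC driver loaded).
    """
    counts = {}
    base = Path(root)
    if not base.is_dir():
        return counts
    for mc in sorted(base.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            # Skip controllers whose counters are missing or unreadable.
            continue
        counts[mc.name] = (ce, ue)
    return counts
```

Calling `read_edac_counts()` on a machine with an EDAC driver loaded yields one entry per memory controller; on a machine without one it simply returns an empty dictionary.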
This pull request also adds Ice Lake Neural Network Processor for Deep Learning Inference (ICL-NNPI) support to the igen6_edac driver. Tiger Lake and Alder Lake are added as well, for the cases where those CPUs support in-band ECC. These SoCs, Alder Lake included, all have the same in-band ECC (IBECC) capabilities as the Elkhart Lake SoC, except that TGL and ADL have two memory controllers rather than one.
Also on the Intel front is a fix for Xeon Ice Lake and Sapphire Rapids for reporting "near" and "far" devices for errors in 2LM configurations. There is also a fix to not attempt to load the Intel EDAC drivers when running as a virtualized guest.
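As a rough userspace analogue of the "don't load when running as a guest" check, the sketch below looks for the `hypervisor` CPU flag that x86 Linux lists in `/proc/cpuinfo` when running under a hypervisor. This is not the kernel's code and the article does not describe how the drivers perform the check; the function name `running_under_hypervisor` is hypothetical.

```python
def running_under_hypervisor(cpuinfo_text):
    """Return True if the first 'flags' line lists the 'hypervisor' flag.

    Hypothetical helper: takes the text of /proc/cpuinfo as a string so it
    can be exercised without touching the real file.
    """
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return "hypervisor" in value.split()
    return False
```

On a real system this could be called as `running_under_hypervisor(open("/proc/cpuinfo").read())`.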
The only AMD RAS/EDAC change worth noting this cycle is officially marking AMD's Yazen Ghannam as the maintainer of their EDAC drivers. Yazen has been working on the AMD RAS/EDAC code for years but lacked formal recognition as the upstream driver maintainer.
As mentioned yesterday, there are AMD EDAC preparations for AMD heterogeneous servers with Aldebaran multi-die GPUs, but that work was published too late and so will not land until Linux 5.15.
The list of RAS/EDAC patches for Linux 5.14 can be found in this PR.