AMD Threadripper 3900 Series MCE Fix Queued In RAS/Core But Not Yet Mainlined
As noted in the launch day article, AMD developers proposed a patch days ahead of launch to address the MCE issue with the new Threadripper systems. But with the Linux 5.4 stable release coming just days later and immediately followed by the Linux 5.5 merge window, that fix has yet to be merged into the mainline kernel or back-ported to any stable series.
Last week the patch was picked up into the ras/core tree, though as of writing it isn't in mainline. It's also a bit amusing that the AMD MCE patches continue to be routed through Borislav Petkov thanks to his current employment at SUSE (where he continues to work on the upstream kernel, including maintainership of various areas of the kernel), whereas he previously was an open-source AMD developer years ago before AMD closed its "operating system research center" and let Linux developers go almost a decade ago... Given their current CPU successes, hopefully in 2020 we'll see AMD ramp up its Linux support to avoid launch-day support blunders like this again. As it stands now, AMD's Linux engineering team is still very lean compared to Intel's Linux engineering resources.
Anyhow, unfortunately this patch hasn't been sent in as a "fix" for the ongoing Linux 5.5 cycle. Even though the patch fixes a boot issue and addresses "unexpected behavior" from the system, it hasn't made it in yet, nor does the commit message CC stable for immediate back-porting.
So with the fix quite possibly waiting until Linux 5.6, this is just a reminder for anyone who may find themselves with a new Threadripper system this holiday season that there is an easy workaround to boot the system in the interim: just disable MCE on affected systems. If the fix doesn't land until Linux 5.6, hopefully distribution vendors like Ubuntu will patch their kernels in the meantime (Ubuntu 20.04 LTS will likely be riding Linux 5.5) to improve the out-of-the-box experience.
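For those applying the workaround, here is a minimal sketch for confirming that a booted system actually has MCE disabled. It assumes the workaround is applied via the x86 kernel boot option `mce=off` (the article itself doesn't spell out the parameter) and that the running kernel's command line is readable from `/proc/cmdline`. The helper names `mce_disabled` and `running_kernel_cmdline` are made up for illustration.

```python
from pathlib import Path


def mce_disabled(cmdline: str) -> bool:
    """Return True if the kernel command line carries the mce=off option.

    Assumption: the workaround is applied with the x86 boot option "mce=off".
    Only an exact, whitespace-separated token counts as a match.
    """
    return "mce=off" in cmdline.split()


def running_kernel_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    return Path(path).read_text().strip()


if __name__ == "__main__":
    try:
        cmdline = running_kernel_cmdline()
    except OSError:
        print("Could not read /proc/cmdline (not a Linux system?)")
    else:
        state = "disabled" if mce_disabled(cmdline) else "enabled"
        print(f"MCE appears to be {state} for this boot")
```

Keep in mind that disabling machine check handling also hides genuine hardware error reports, so this is best treated as a stopgap until a patched kernel is available.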