Linux 5.2+ Hit By AVX Register Corruption Bug - Affecting At Least Golang Programs
The Linux 5.2 kernel and newer appear to be suffering from an AVX register corruption bug stemming from signal delivery. The corruption is showing up at least in Golang programs and has led to a variety of bug reports from systems running Linux 5.2 through at least the newly-minted Linux 5.4.
Golang developers have traced many recent bug reports back to a register corruption issue on Linux 5.2 and newer when the kernel is built with the GCC 9 compiler (building with GCC 8 does not seem to bring this kernel bug to light). Golang developers and users are seeing run-time errors over invalid memory addresses / nil pointer dereferences, segmentation violations, and related issues.
It's not a Golang bug: the developers have written a simple C test case that performs basic AVX computations while being bombarded with signals, and with that they opened the upstream kernel bug report today. The original bisect points to this commit in the x86 FPU signal code from May as the start of the issue, and the corruption also reproduces with other combinations of Linux 5.3 and 5.4 when using GCC 9.
This super bug is tracking the corruption issues as they pertain to Golang; the individual bug reports mention I/O errors, unexpected signals during runtime execution, random compile errors, and other panics. The issue has been deemed a release blocker for the upcoming Go 1.14 release.
One workaround being considered, depending on how long it takes kernel developers to address the underlying issue, is for upstream Golang to blacklist current kernel versions from AVX usage. Avoiding AVX, however, could lead to performance regressions.
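As a rough illustration of what version-based blacklisting could look like, here is a minimal Go sketch that decides from a `uname -r` style release string whether a kernel falls in the range reported as affected (5.2 or newer). It is not the Go runtime's actual code: the function names `parseMajorMinor` and `kernelAffected` are hypothetical, and the open-ended "5.2 or newer" cutoff is an assumption taken from the reports above, since no fixed kernel version is known yet.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMajorMinor (hypothetical helper) extracts the leading "major.minor"
// pair from a kernel release string such as "5.3.0-24-generic".
func parseMajorMinor(release string) (major, minor int, err error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("unrecognized kernel release %q", release)
	}
	major, err = strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("bad major version in %q: %v", release, err)
	}
	// The minor field may carry a suffix (e.g. "4-rc8"), so keep only
	// its leading digits before converting.
	digits := parts[1]
	for i, r := range digits {
		if r < '0' || r > '9' {
			digits = digits[:i]
			break
		}
	}
	minor, err = strconv.Atoi(digits)
	if err != nil {
		return 0, 0, fmt.Errorf("bad minor version in %q: %v", release, err)
	}
	return major, minor, nil
}

// kernelAffected (hypothetical) reports whether the release is Linux 5.2 or
// newer. Assumption: there is no upper bound, because no fixed kernel
// version has been identified in the reports.
func kernelAffected(release string) (bool, error) {
	major, minor, err := parseMajorMinor(release)
	if err != nil {
		return false, err
	}
	return major > 5 || (major == 5 && minor >= 2), nil
}

func main() {
	for _, rel := range []string{"4.19.0", "5.1.21", "5.2.0", "5.4.0-rc8"} {
		affected, err := kernelAffected(rel)
		if err != nil {
			fmt.Println(rel, "error:", err)
			continue
		}
		fmt.Printf("%s: avoid AVX = %v\n", rel, affected)
	}
}
```

A check like this only looks at the version number. It cannot tell whether the running kernel was built with GCC 9, so it would also disable AVX on unaffected 5.2+ kernels, which is where the performance cost comes from.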