Linux 5.2+ Hit By AVX Register Corruption Bug - Affecting At Least Golang Programs


  • FireBurn
    replied
    Originally posted by pomac View Post
    With:
    gcc (Gentoo 9.2.0-r2 p3) 9.2.0
    Copyright (C) 2019 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    On 5.4.0

    ---
    ./a.out
    input = bb cb 00 00 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
    output = 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    mismatch
    child process failed
    ---


    Bugger


  • pomac
    replied
    With:
    gcc (Gentoo 9.2.0-r2 p3) 9.2.0
    Copyright (C) 2019 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions. There is NO
    warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

    On 5.4.0

    ---
    ./a.out
    input = bb cb 00 00 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
    output = 00 00 80 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    mismatch
    child process failed
    ---




  • mlau
    replied
    I ran the reproducer a few times on 5.3.13 built with gcc-9.2.1, but it didn't trigger anything. Maybe it's already fixed in gcc or the latest -stable kernels, or requires a special kernel config option I didn't set.
    EDIT: it requires CONFIG_RETPOLINE=y to trigger.
    Last edited by mlau; 28 November 2019, 03:27 AM.


  • jabl
    replied
    Originally posted by HadrienG View Post
    Not sure if it's a compiler bug, from a look at the code it could also be a missing compiler optimization barrier in the kernel. Cohabitation of optimizing compilers that cache the contents of memory in registers, with low-level constructs like context switches that randomly change the contents of memory without warning the compiler about it, is fundamentally hard...
    Indeed, from the bug report it seems that GCC 9 just optimizes better: it manages to keep a variable in a register instead of reloading it from memory, and that exposes the bug. So it looks like a bug in the kernel, either a missing compiler barrier, as you write, or a missing READ_ONCE() to force the variable to be reloaded.
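
To make the register-caching point concrete, here is a minimal C sketch. It is not the kernel's code and not the actual fix: `MY_READ_ONCE`, `my_barrier`, `state_loaded`, and the three functions are hypothetical names invented for illustration, and the macros are simplified stand-ins (GCC/Clang syntax assumed) for the kernel's `READ_ONCE()` and `barrier()`.

```c
/* Simplified stand-ins for the kernel's READ_ONCE() and barrier().
 * Assumption: GCC or Clang (uses __typeof__ and GNU inline asm). */
#define MY_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))
#define my_barrier()    __asm__ __volatile__("" ::: "memory")

/* Hypothetical flag that some other context (an interrupt or a
 * context switch) could change behind the compiler's back. */
int state_loaded;

void set_state_loaded(int v)
{
    state_loaded = v;
}

/* Plain access: an optimizing compiler is allowed to load
 * state_loaded once, keep it in a register, and reuse that value
 * for every iteration. */
int sum_flag_plain(int rounds)
{
    int hits = 0;
    for (int i = 0; i < rounds; i++)
        hits += state_loaded;
    return hits;
}

/* Compiler barrier: the "memory" clobber tells the compiler that
 * memory may have changed, so a value cached before the barrier
 * has to be reloaded after it. */
int sum_flag_barrier(int rounds)
{
    int hits = 0;
    for (int i = 0; i < rounds; i++) {
        my_barrier();
        hits += state_loaded;
    }
    return hits;
}

/* READ_ONCE-style access: the volatile cast forces a real load
 * from memory on every iteration. */
int sum_flag_read_once(int rounds)
{
    int hits = 0;
    for (int i = 0; i < rounds; i++)
        hits += MY_READ_ONCE(state_loaded);
    return hits;
}
```

In a single-threaded test all three functions return the same result. The difference only shows up when something outside the compiler's view modifies the variable between reads, which is the situation the posts above describe.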


  • HadrienG
    replied
    Originally posted by zxy_thf View Post
    According to https://bugzilla.kernel.org/show_bug.cgi?id=205663#c1 this looks like a compiler bug to me.
    Not sure if it's a compiler bug, from a look at the code it could also be a missing compiler optimization barrier in the kernel. Cohabitation of optimizing compilers that cache the contents of memory in registers, with low-level constructs like context switches that randomly change the contents of memory without warning the compiler about it, is fundamentally hard...


  • wizard69
    replied
    Originally posted by CommunityMember View Post
    So, does the problem occur if the kernel is compiled with clang/llvm? (ducks for cover to avoid the incoming...)
    Probably not. However there is nothing unusual in these sorts of bugs getting through. An error like this highlights why having alternatives is so important.


  • KrissN
    replied
    Originally posted by zxy_thf View Post
    According to https://bugzilla.kernel.org/show_bug.cgi?id=205663#c1 this looks like a compiler bug to me.
    Not necessarily. The existing code may have been relying on undefined behaviour. The compiler may be fully entitled to optimize this, and the fact that GCC 8 did not do so doesn't mean that GCC 9 is buggy.

    Compilers are under constant pressure to improve their optimizations, and the consequence is that they follow the C/C++ standard more strictly as to what they are and are not allowed to optimize. Buggy code that assumes the compiler will compile it in a certain way, when the standard gives no such guarantee, often becomes a victim of newer compilers.
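
A generic illustration of that point, unrelated to the kernel code in the bug report: the classic signed-overflow check below relies on undefined behaviour, so an optimizer may legally simplify it, while the second version stays within what the C standard defines. The function names are hypothetical and chosen only for this sketch.

```c
#include <limits.h>
#include <stdbool.h>

/* Relies on undefined behaviour: signed integer overflow. Because
 * the standard lets the compiler assume a + b never overflows, it
 * may rewrite this test as `b < 0`, so the check can silently stop
 * detecting overflow at higher optimization levels. */
bool add_overflows_ub(int a, int b)
{
    return a + b < a;
}

/* Well-defined version: compares against the limits before adding,
 * so no overflowing addition is ever evaluated. */
bool add_overflows_defined(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

Both versions may agree on an older compiler or at a lower optimization level, which is how code of the first kind can appear to work for years and then break after a compiler upgrade.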


  • muncrief
    replied
    Oh yikes! I just read the bug report and it's pretty ugly. Please let us know when a patch is released, Michael. I'm running 5.4 compiled with GCC 9 even as we speak.


  • CommunityMember
    replied
    Originally posted by zxy_thf View Post
    According to https://bugzilla.kernel.org/show_bug.cgi?id=205663#c1 this looks like a compiler bug to me.
    So, does the problem occur if the kernel is compiled with clang/llvm? (ducks for cover to avoid the incoming...)


  • heliosh
    replied
    Kudos to the golang-devs for tracking this down.
