The "EXT4 Corruption Issue" Has Been Fixed In Linux 4.20, Backport Pending To 4.19

  • The "EXT4 Corruption Issue" Has Been Fixed In Linux 4.20, Backport Pending To 4.19

    Phoronix: The "EXT4 Corruption Issue" Has Been Fixed In Linux 4.20, Backport Pending To 4.19

    The EXT4 file-system corruption issue on Linux 4.19, which also affected 4.20 development builds, is now case closed...

    http://www.phoronix.com/scan.php?pag....20-BLK-MQ-Fix

  • #2
    So, what happened there?


    • #3
      Originally posted by dungeon
      So, what happened there?

      That's the 960 EVO that died on me over a year ago. Whenever I try to RMA it to Samsung, the RMA just keeps getting rejected... So I got tired of it and am using it for interesting pictures for articles.
      Michael Larabel
      https://www.michaellarabel.com/


      • #4
        Michael, you linked the wrong patch.


        • #5
          Originally posted by leonmaxx
          Michael, you linked the wrong patch.
          Should be fixed now, thanks. Anzwix mangled the URL parsing.
          Michael Larabel
          https://www.michaellarabel.com/


          • #6
            Disks on Linux 4.19+ were only vulnerable if using BLK-MQ and using no I/O scheduler.

            [...] Axboe commented, "Under a combination of circumstance, the direct issue path in blk-mq could corrupt data. This wasn't easy to hit, but the ones that are affected by it, seem to hit it pretty easily."
            Is there any distro/hardware where this would affect the "default" configuration, or is this a configuration that usually requires manual setup?


            • #7
              Originally posted by dstaubsauger

              Is there any distro/hardware where this would affect the "default" configuration, or is this a configuration that usually requires manual setup?
              Most Linux distros where the user is relying upon NVMe SSDs would be affected.
              Michael Larabel
              https://www.michaellarabel.com/


              • #8
                Originally posted by dstaubsauger

                Is there any distro/hardware where this would affect the "default" configuration, or is this a configuration that usually requires manual setup?
                I can't speak for all distros, but it's not that easy to enable. You have to add a specific kernel parameter and modify a udev rule. So I'm going to say no. However, it is one of the most common performance tweaks, especially for users who experience I/O bottlenecks in certain situations.

                As someone who was affected by this bug, I'm not even mad. I think it's just a reminder to have good data audit practices (e.g. checksum tables, several backups).


                • #9
                  Is there a command to show which I/O scheduler is in use and whether BLK-MQ is being used?
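
For reference, a minimal sketch of one way to check this by reading sysfs. It relies on the kernel listing the available schedulers in /sys/block/<device>/queue/scheduler with the active one in square brackets. Treating the presence of an "mq" directory under the device as a sign of blk-mq is an assumption of this sketch, and the function names are made up for illustration.

```python
from pathlib import Path


def parse_active_scheduler(text):
    """Return the bracketed (active) scheduler name, or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def describe_block_devices(sys_block=Path("/sys/block")):
    """Map each block device name to (active scheduler, looks like blk-mq)."""
    report = {}
    if not sys_block.is_dir():
        return report
    for dev in sorted(sys_block.iterdir()):
        sched_file = dev / "queue" / "scheduler"
        try:
            active = parse_active_scheduler(sched_file.read_text())
        except OSError:
            continue  # this device exposes no scheduler attribute
        # Assumption: blk-mq devices expose an "mq" directory in sysfs.
        report[dev.name] = (active, (dev / "mq").is_dir())
    return report


if __name__ == "__main__":
    for name, (sched, uses_mq) in describe_block_devices().items():
        print(f"{name}: scheduler={sched} blk-mq={uses_mq}")
```

Going by the article's description, a device reported with scheduler "none" and blk-mq in use would be the combination to look out for on 4.19.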


                  • #10
                    Originally posted by Michael

                    Most Linux distros where the user is relying upon NVMe SSDs would be affected.
                    According to the bug, supposedly NVMe drives actually are NOT affected, which explains why my home desktop is fine, but my work laptop went crashing down in flames with both machines using the same exact kernel builds.

                    Originally posted by dstaubsauger
                    Is there any distro/hardware where this would affect the "default" configuration, or is this a configuration that usually requires manual setup?
                    BLK-MQ is now used by default for the SCSI subsystem on compatible hardware, which includes SATA SSDs. So it's entirely possible to be affected by this on 4.19.x. My work system is a ThinkPad T440p running Ubuntu 18.10 with default Ubuntu-built scheduler settings for the 480-500GB SATA SSD in it. I had upgraded to one of the Ubuntu mainline kernel PPA builds of 4.19.1, and several times over the next few days my root FS went into read-only mode, and I had to fsck it from the recovery tools in the initrd in order to be able to boot it again. Once I figured out that 4.19 was to blame, I went back to 4.18.x, and things have been solid since then.

                    So yes, it's possible to get hit by this in a fairly distro-stock configuration.

                    Also, if anyone here is affected, either downgrade to 4.18, or append the following to your kernel command line to disable SCSI's use of the blk-mq subsystem: "scsi_mod.use_blk_mq=0"
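
To confirm after a reboot that the parameter actually reached the kernel, something like the following rough sketch could be used. It reads /proc/cmdline and looks for the parameter. The helper names are hypothetical, and treating "0", "n" and "N" as "disabled" is an assumption about how the kernel parses boolean parameters.

```python
from pathlib import Path


def kernel_param(cmdline, name):
    """Return the last value given for `name` on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val  # later occurrences override earlier ones
    return value


def scsi_blk_mq_disabled(cmdline):
    """True if the command line explicitly turns off blk-mq for SCSI."""
    # Assumption: "0", "n" and "N" are the spellings that mean "off".
    return kernel_param(cmdline, "scsi_mod.use_blk_mq") in ("0", "n", "N")


if __name__ == "__main__":
    try:
        booted_with = Path("/proc/cmdline").read_text()
    except OSError:
        booted_with = ""  # not on Linux, or /proc is unavailable
    print("SCSI blk-mq disabled on command line:", scsi_blk_mq_disabled(booted_with))
```

This only inspects the command line the kernel was booted with; it does not tell you what the driver is doing if the parameter was never passed.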
                    Last edited by Veerappan; 05 December 2018, 10:57 PM.
