It's Looking Like The EXT4 Corruption Issue On Linux 4.19 Is Caused By BLK-MQ

    Phoronix: It's Looking Like The EXT4 Corruption Issue On Linux 4.19 Is Caused By BLK-MQ

    The saga of EXT4 file-system corruption on Linux 4.19 kernels, with reports increasing in recent weeks, might soon be drawing to a close... This data corruption bug, though, looks like it doesn't originate from within the EXT4 code at all...

  • #2

    Originally posted by phoronix
    just that EXT4 is the most common file-system and thus the most reports.


    • #3

      I don't have CONFIG_SCSI_MQ_DEFAULT enabled, which is likely the reason I couldn't reproduce anything.
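For anyone wanting to check the same thing, here is a minimal sketch that looks up an option in kernel config text. It assumes the usual config file format (`CONFIG_X=y` for set options, `# CONFIG_X is not set` for unset ones); the function name `config_option_state` and the sample text are made up for illustration. On many distros the real file is at /boot/config-<kernel release>.

```python
import re


def config_option_state(config_text, option):
    """Return the option's value ('y', 'm', ...), 'not set', or None if absent.

    Hypothetical helper: it only understands the two standard line shapes
    found in a kernel .config file.
    """
    set_re = re.compile(r"^" + re.escape(option) + r"=(.*)$")
    unset_re = re.compile(r"^# " + re.escape(option) + r" is not set$")
    for line in config_text.splitlines():
        line = line.strip()
        match = set_re.match(line)
        if match:
            return match.group(1)
        if unset_re.match(line):
            return "not set"
    return None


# Illustrative sample, not taken from any real system.
sample = """\
CONFIG_SCSI=y
# CONFIG_SCSI_MQ_DEFAULT is not set
CONFIG_EXT4_FS=y
"""

print(config_option_state(sample, "CONFIG_SCSI_MQ_DEFAULT"))  # not set
```

To use it on a real machine, read the config file into a string first and pass that in; an option that was overridden on the kernel command line will not show up this way.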


      • #4
        So mq-deadline is affected in 4.19 and later? I have it enabled for HDDs and disabled for NVMe on Debian testing with 4.19.5 (though it's available for both).

        I also see this on the Arch wiki:

        SSDs can handle many IOPS and tend to perform best with simple algorithm like noop or deadline while BFQ is well adapted to HDDs.
        So does it help NVMe SSDs or not?
        Last edited by shmerl; 12-04-2018, 09:34 PM.
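To see which scheduler a given disk is actually using, the kernel exposes a one-line file at /sys/block/<device>/queue/scheduler, where the active scheduler is the name in square brackets. The sketch below parses that line; the helper names and the device name `sda` are placeholders, not something from this thread.

```python
from pathlib import Path


def parse_scheduler(line):
    """Split a queue/scheduler line into (active, available).

    The active scheduler is the bracketed entry, e.g. '[mq-deadline]'.
    Returns None for active if no entry is bracketed.
    """
    active = None
    available = []
    for name in line.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def read_scheduler(device, sysfs_root="/sys/block"):
    """Read and parse the scheduler line for one block device (hypothetical helper)."""
    path = Path(sysfs_root) / device / "queue" / "scheduler"
    return parse_scheduler(path.read_text())


# Example lines in the shape the kernel prints them.
active, available = parse_scheduler("[mq-deadline] kyber bfq none")
print(active)     # mq-deadline
print(available)  # ['mq-deadline', 'kyber', 'bfq', 'none']
```

On a live system, `read_scheduler("sda")` would return the same kind of tuple for that disk, assuming the device exists. Seeing names such as mq-deadline, kyber or none in the list indicates the device is on the blk-mq path, while noop, deadline and cfq are the legacy schedulers.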


        • #5
          I've been following the report since I first ran into the issue myself (Ubuntu 4.19.1 kernel) on my work laptop. Glad to see that they think they've got a handle on it. I'll most likely wait until a fixed release is out before I re-upgrade from 4.18.20, but at least they seem to have figured it out.


          • #6
            Fix has been pushed:



            • #7
              Originally posted by ermo


              • #8
                Looks like this mq thing is still highly experimental. First there were severe problems with pluggable USB drives (see warning here), and now this.


                • #9
                  Experimental and still lacking.

                  From Documentation/admin-guide/cgroup-v2.rst:

                  "The "io" controller regulates the distribution of IO resources. This
                  controller implements both weight based and absolute bandwidth or IOPS
                  limit distribution; however, weight based distribution is available
                  only if cfq-iosched is in use and neither scheme is available for
                  blk-mq devices."


                  • #10
                    This problem has been happening to me... But only on one system, and only on its volume with bcache+btrfs... And checking all my systems, this is the only one with blk-mq enabled by default. Hopefully the issue that's been affecting people has been found, and hopefully I'm having the same issue which means it will be fixed for me too!
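A quick way to repeat that per-system check for SCSI/SATA disks is to read the scsi_mod module parameter `use_blk_mq`, which on 4.19-era kernels shows `Y` or `N`. The sketch below assumes that parameter file is present; kernels that later dropped the legacy block path no longer have it, in which case the function returns None. The function names are made up for illustration.

```python
from pathlib import Path

# Parameter path as exposed on 4.19-era kernels; an assumption for other versions.
USE_BLK_MQ = "/sys/module/scsi_mod/parameters/use_blk_mq"


def parse_bool_param(text):
    """Interpret a kernel boolean module parameter ('Y' or 'N')."""
    value = text.strip()
    if value == "Y":
        return True
    if value == "N":
        return False
    return None  # unexpected content, do not guess


def scsi_blk_mq_enabled(param_path=USE_BLK_MQ):
    """Return True/False for the SCSI blk-mq setting, or None if unavailable."""
    path = Path(param_path)
    if not path.exists():
        return None
    return parse_bool_param(path.read_text())


print(parse_bool_param("Y\n"))  # True
```

This only covers the SCSI layer. NVMe devices always use blk-mq, and device-mapper has its own separate setting, so a None or False here does not by itself rule out blk-mq elsewhere in the stack.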