Some Users Have Been Hitting EXT4 File-System Corruption On Linux 4.19


  • #21
    Hmmm... Happens to some but not others? Sounds an awful lot like a weird issue I'm having with btrfs+bcache on 4.19.

    I have a bunch of machines with btrfs and bcache, but only one of them is having corruption issues. It consistently happens on the one machine with 4.19 but none of the others, and only on that one box's btrfs+bcache filesystem. Its root filesystem, which is just plain btrfs, is fine. The controller on this machine is new, the spinning disks are new, the SSDs are new, and this box is completely okay on 4.18, 4.17, etc.

    Fortunately, because of the bcache corruption issues I had with 4.14 (where I was ultimately able to recover the filesystem itself), I had already found and fixed problems in my nightly backup script, which turned out not to be backing everything up. That was before my new issues with 4.19, so I have been able to recover from the more serious corruption on 4.19.

    With my corruption issues on 4.19, I get random files failing on scrub... I'll complete a whole scrub where some files fail. I'll restore them from backup, run a whole new scrub shortly after the previous scrub and recovery, and get a whole new set of files with issues that were fine in the previous scrub...

    Going back to 4.18 leaves some issues behind and ultimately I have to recreate the filesystem from scratch and restore from backups.

    Recreating the filesystem from scratch under 4.19 still results in corruption.



    • #22
      Originally posted by Brisse

      It's times like this that I'm glad Debian Sid is somewhat conservative with their kernel updates compared to, for example, Arch.
      Debian comes in three official flavors: Stable, Testing and Unstable.
      Debian Testing (buster) is on 4.18, but the good thing about Debian (as long as people are reporting bugs) is that the 4.19 kernel would never automatically be migrated to Testing in the first place, and it would definitely not migrate to Stable, which is a good thing. So for those who complain that Debian is ancient stuff: well, sometimes the stable stuff is a bit old, but would you trust a friend you had known for only a few weeks? Programs are very much the same: the longer you know them, the safer they are. If you want to meet new people, then go out a bit and meet sober people during the day (Debian Testing). If you don't care and just want to meet people, good or bad, get drunk and go to a party at night (Debian Unstable).

      http://www.dirtcellar.net



      • #23
        I've been having some crashes recently, and just this Sunday I got Ext4 corruption that prevented boot until it was fixed with fsck.

        Before that I had a crash followed by savegame corruption in Two Point Hospital, because it crashed while saving the game progress. The save was a tenth of the usual size, so it obviously never finished writing the file. Other times I had Firefox force-close (not always crashing the OS) and a few system freezes while browsing the internet and while using HandBrake to convert videos. The only similarity I can see between all the cases is probably software writing to disk at that moment...

        Looking at the system log after the last two crashes showed some errors with /dev/sdc1 (currently my root partition), which coincided with the prior crashes too... Then the issue at boot showed an "ext4-fs error ext4_lookup deleted inode referenced" kind of error.

        So I think the corruption and the prior crashes may both have been caused by kernel errors with the SSD where my system is installed, rather than the corruption being caused by an unrelated crash. Of course this still doesn't mean the Ext4 part of the kernel is necessarily to blame... it looks more like something is wrong with how the kernel handles the SSD?

        I'm not nearly an expert though... could be anything... I'm willing to help debug if anyone can guide me through the steps to produce useful logs and maybe even test some fix. Unfortunately, so far the issues have come up in random situations, with no pattern that I can think of.

        I'm running Linux Mint 19 + Padoka Stable PPA + Kernel 4.19.4 (issue started on 4.19.0... or before that, have to confirm) on an AMD Phenom II x4 965BE CPU + AMD HD7770 GPU + 2x KINGSTON SV300S3 120GB SSD + 1x WD10EARS 1TB HDD + 2x G-Skill Ripjaws X 4GB 1600MHz CL7 (F3-12800CL7D-8GBXM) RAM + ASUS M5A78L-M PLUS/USB3 MOBO



        • #24
          Originally posted by bitman
          I tend to believe the corruption really does come from outside of the ext4 driver. 4.19 is a total wreck of a release. People report all kinds of problems. I myself was getting random freezes every few hours. I do not recall such a disastrous release.
          I seem to remember saying in these forums at some point, "blame GKH", in regards to something with 4.19....

          No, I do not have a crystal ball.



          • #25
            Originally posted by dungeon
            Debian Sid is on 4.18.20, but 4.19.5 is still in the experimental repo of Debian.

            I can't reproduce this on Sid with 4.19.5 either.

            I even tried 4.19.5 on Debian 8 LTS and can't reproduce it there either, so no idea.

            It seems to happen when you have CONFIG_EXT4_ENCRYPTION enabled, whether or not you actually use encryption.

            https://lkml.org/lkml/2018/11/28/856
            I've been running 4.19.x on three Gentoo systems for several weeks now, with no problems. I just checked, and "# CONFIG_EXT4_ENCRYPTION is not set".
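
For anyone wanting to run the same check on their own kernel, here is a minimal Python sketch that reports the state of one option in kernel config text. The function name `kernel_option_state` and the sample text are made up for illustration. It assumes you already have the config as a string, for example read from /boot/config-<version> or decompressed from /proc/config.gz on kernels that expose it. It says nothing about whether the option actually causes the corruption.

```python
import re


def kernel_option_state(config_text, option):
    """Report how `option` appears in kernel config text.

    Returns the value after '=' (for example 'y' or 'm'),
    'not set' for a '# OPTION is not set' line,
    or 'absent' if the option is not mentioned at all.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for raw in config_text.splitlines():
        line = raw.strip()
        if line == unset_line:
            return "not set"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return "absent"


# Hypothetical sample, not taken from any poster's real config.
sample = "CONFIG_EXT4_FS=y\n# CONFIG_EXT4_ENCRYPTION is not set\n"
print(kernel_option_state(sample, "CONFIG_EXT4_ENCRYPTION"))
```

A result of "not set" matches the Gentoo report above; "y" would match the configuration described in the quoted post.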



            • #26
              This is separate from the ASRock motherboard issue?

              This is a forum powered by Web Wiz Forums. To find out about Web Wiz Forums, go to www.WebWizForums.com



              • #27
                I think my decision long ago to switch to btrfs was right then :-) But I'm sure that in the next regular news item about btrfs, people will complain about its quality regardless of the fact that issues exist in other file systems as well...



                • #28
                  It sounds like some other kernel subsystem, possibly a hardware driver, may be trashing the ext4 subsystem's memory. I wonder if there is a similarity in the hardware of the people who are affected: whether they are using SSDs or HDDs, what controllers are being used, what other hardware is in their systems, etc., to see if there is a common element.



                  • #29
                    In my experience, I've found that the following file systems are the most stable:
                    FAT32
                    XFS
                    ZFS
                    ext2

                    I've been playing around with bcachefs, and by playing around I mean keeping several terabytes of important data on it. It has been very good -so far-. The backups of that data are on several XFS hard drives.

                    I've had NTFS corruption, ext4 corruption, btrfs corruption.



                    • #30
                      Rolling release must be so awesome, forcing this kind of breakage on you, right?
