Btrfs For Linux 6.6 Brings Fixes, Partially Recovers From Scrub Performance Regression


  • #21
    Originally posted by Azpegath View Post

    Only btrfs. How is it able to fix it, if the same errors are present on both mirrors?
    how could the error be on both mirrors?

    when btrfs detects that the data on disk A differs from the data on disk B it is able to detect which copy is correct and fix the other.
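
To illustrate the point above, here is a minimal sketch of checksum-based mirror repair. It is a toy model, not btrfs code: the function names are made up for illustration, and `zlib.crc32` stands in for the checksum (btrfs defaults to crc32c). It also shows the case asked about: if every copy fails the checksum, there is nothing to repair from.

```python
import zlib
from typing import List


def checksum(block: bytes) -> int:
    # Assumption: plain CRC32 as a stand-in for the filesystem's real checksum.
    return zlib.crc32(block)


def read_and_repair(mirrors: List[bytearray], expected: int) -> bytes:
    """Return a copy that matches the stored checksum and fix the others."""
    good = None
    for copy in mirrors:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        # Both mirrors are bad: the error can be detected but not repaired.
        raise IOError("all copies fail the checksum")
    for copy in mirrors:
        if checksum(bytes(copy)) != expected:
            copy[:] = good  # overwrite the bad copy with the good one
    return good


data = b"hello btrfs"
stored = checksum(data)          # checksum recorded when the block was written
disk_a = bytearray(data)
disk_b = bytearray(data)
disk_b[0] ^= 0x01                # simulate a bit flip on one mirror
assert read_and_repair([disk_a, disk_b], stored) == data
assert bytes(disk_b) == data     # the damaged mirror has been rewritten
```

The key design point is that the checksum is stored separately from the data, so the two copies are never just compared with each other; each is judged against the recorded checksum.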

    Comment


    • #22
      Okay, a bit tangential, but I think somewhat relevant:

      With Red Hat bowing out from Btrfs stewardship, who's picking up the responsibility now? SuSE?

      Comment


      • #23
        Originally posted by pepoluan View Post
        Okay, a bit tangential, but I think somewhat relevant:

        With Red Hat bowing out from Btrfs stewardship, who's picking up the responsibility now? SuSE?
        what?
        that happened 6 years ago, and btrfs has made huge progress since.

        there are many companies interested in btrfs.
        SuSE, Oracle and Facebook just to name a few, but also various hardware manufacturers.

        Also, what does "picking responsibility" mean?

        Comment


        • #24
          Originally posted by user1 View Post

          Interesting. Do you remember having any issues with ext4 because of this? I'm currently on ext4 and I recently had at least 2 unclean restarts because Steam is very buggy after the big ui update. When the Steam client crashes, it sometimes freezes the whole graphical session, so I had to hard reboot as a result.

          I haven't experienced any issues or data loss as a result, but I just wonder if I should run fsck after the unclean restart. Interestingly, I did see some additional boot messages after the unclean restarts, but the boot is too fast to read them.
          I was always fortunate with Ext4 and FSCK saving my ass. So, no, I don't remember having any issues with Ext4. That's the one that always Just Worked.

          Can you not CTRL+ALT+F# into another terminal session, log in, and reboot when that happens? REISUB? Or CTRL+ALT+Backspace? I'd hate to think a crashed GUI requires a hard reboot. I miss the days when the key commands came enabled by default.

          Comment


          • #25
            Originally posted by cynic View Post

            Also, what does "picking responsibility" mean?
            I think they mean "who is going to set up and maintain it long-term" which we both know is currently SUSE, a bit of Red Hat, and a bit from others like Facebook and random users.

            Frankly, as far as long-term maintenance and responsibility are concerned, I'm more worried about Bcachefs with its Bus Factor 1. IMHO, that's a genuinely good reason not to include something. Reiser5 and Edward Shishkin are in that same position.

            Comment


            • #26
              Originally posted by cynic View Post
              how could the error be on both mirrors?
              when btrfs detects that the data on disk A differs from the data on disk B it is able to detect which copy is correct and fix the other.
              Not sure. I'm trying to google the hell out of my issue, but all I'm finding is references to "bugs" that can cause these issues, so that might be a reason. But I'm not certain about what to do or if I'm really finding the right source.
              btrfs scrub says that everything is clean, but btrfs check is finding several issues, both with actual nodes as well as the cache. I emptied the cache, and that made those issues go away, but the real errors are still there.
              It could be caused by me running BtrfsWin from Windows. I've never actually modified anything on the file systems from Windows, but I believe they are automatically mounted as read/write, so that could have caused it.

              Originally posted by user1 View Post
              By "faults" you mean file system corruption?
              Btw, have you ever experienced something like a power cut or an unclean shutdown / restart for whatever reason and as result it caused issues on btrfs?
              Just curious because from reading various user's experiences with btrfs, it seems like a fairly common issue. Much more common at least compared to ext4.
              Yes, I mean error messages when running btrfs check. I have certainly experienced power cuts a few times, but not sure that is the cause of it. There seems to have been a bug in one Linux kernel that might have caused it, can't remember which one. It wasn't the issue that Michael has written about because I deliberately avoided that kernel version thanks to his articles on it.

              Originally posted by F.Ultra View Post
              did you mirror the 2 disks and then apply btrfs over it, or did you run btrfs in raid 1c2 on them? Also, how did you determine/discover that there are file system faults on all of them?
              I created one pure btrfs disk with subvolumes and then ran the command to mirror it to the other disk. I followed the Arch or Gentoo guide on how to do it, combined with btrfs documentation. I think I did it correctly, both volumes worked correctly when using them.

              I found the issue on my boot disk via a warning message during boot/dmesg. I then ran btrfs check on the disks, and discovered that all 3 of them contain errors.

              Comment


              • #27
                Originally posted by Azpegath View Post

                Not sure. I'm trying to google the hell out of my issue, but all I'm finding is references to "bugs" that can cause these issues, so that might be a reason. But I'm not certain about what to do or if I'm really finding the right source.
                btrfs scrub says that everything is clean, but btrfs check is finding several issues, both with actual nodes as well as the cache. I emptied the cache, and that made those issues go away, but the real errors are still there.
                It could be caused by me running BtrfsWin from Windows. I've never actually modified anything on the file systems from Windows, but I believe they are automatically mounted as read/write, so that could have caused it.
                ah, ok, got it.

                so it wasn't a disk issue (that btrfs would have managed) but a problem with btrfs metadata itself.

                this could depend on several factors. I haven't seen any of these problems in a long time, but they might occur, of course.
                having mounted with BtrfsWin might be a cause, but who knows!

                usually btrfs developers are helpful in trying to solve issues like this and investigating the cause, so they can fix it.


                Comment


                • #28
                  Originally posted by Vorpal View Post

                  I had data corruption in ext4, because it doesn't checksum the data. It made it into backups without me noticing. I can only assume it was cosmic rays or something causing bit flips.

                  Never again using a file system that doesn't checksum data. Btrfs has been rock solid for me for a few years now.
                  Out of nowhere, several of my disks started experiencing corruption. Because I was using DM-RAID and DM-Integrity, most of them were automatically corrected, but some still made it up to BTRFS and were detected there.

                  It was very difficult to debug, because writing to the individual disks or even writing to multiple disks simultaneously didn’t have any issues. It turned out to be my PSU, and replacing it fixed the problem. There weren’t any other signs that the PSU was bad, and it was only on the suggestion of a mailing list that I tried it.

                  If I were an Anti-BTRFS-Bro, I’d blame BTRFS. If I had been using BTRFS for the RAID, I’d have blamed it even more. But the game was rigged from the start, and no software could have prevented data corruption. I wonder how many people are incorrectly blaming BTRFS for their difficult-to-diagnose hardware problems.

                  Comment


                  • #29
                    Originally posted by skeevy420 View Post

                    I was always fortunate with Ext4 and FSCK saving my ass. So, no, I don't remember having any issues with Ext4. That's the one that always Just Worked.

                    Can you not CTRL+ALT+F# into another terminal session, log in, and reboot when that happens? REISUB? Or CTRL+ALT+Backspace? I'd hate to think a crashed GUI requires a hard reboot. I miss the days when the key commands came enabled by default.
                    When that issue happens, I can't do anything because Steam took down the gui session with it, so everything just becomes frozen. So no keyboard shortcuts help.

                    Regarding fsck, I wouldn't call myself a Linux noob, but I'm no expert either, so while I really want to do it after those 2 hard reboots caused by Steam, I already gave up multiple times because I can't find a clear step by step tutorial on how to properly do it. (I just know you need to reboot into safe mode or something). I really miss that on Windows you could just go to disk properties and check for filesystem errors without even rebooting.

                    Comment


                    • #30
                      Originally posted by Azpegath View Post

                      Not sure. I'm trying to google the hell out of my issue, but all I'm finding is references to "bugs" that can cause these issues, so that might be a reason. But I'm not certain about what to do or if I'm really finding the right source.
                      btrfs scrub says that everything is clean, but btrfs check is finding several issues, both with actual nodes as well as the cache. I emptied the cache, and that made those issues go away, but the real errors are still there.
                      It could be caused by me running BtrfsWin from Windows. I've never actually modified anything on the file systems from Windows, but I believe they are automatically mounted as read/write, so that could have caused it.



                      Yes, I mean error messages when running btrfs check. I have certainly experienced power cuts a few times, but not sure that is the cause of it. There seems to have been a bug in one Linux kernel that might have caused it, can't remember which one. It wasn't the issue that Michael has written about because I deliberately avoided that kernel version thanks to his articles on it.



                      I created one pure btrfs disk with subvolumes and then ran the command to mirror it to the other disk. I followed the Arch or Gentoo guide on how to do it, combined with btrfs documentation. I think I did it correctly, both volumes worked correctly when using them.

                      I found the issue on my boot disk via a warning message during boot/dmesg. I then ran btrfs check on the disks, and discovered that all 3 of them contain errors.
                      Note that many of the errors shown/discovered via btrfs check do not lead to actual errors in the stored data; the fact that btrfs scrub shows no errors tells you that all your files are ok. Ext4, XFS and many other filesystems are full of these types of errors, but their structure makes them undetectable, which is why it often looks like btrfs has errors (from btrfs check) when the others don't.

                      I think that they can lead to loss of available space later on, since I would guess that many of them are simply wrong ref counts on no-longer-used COW data (aka old versions of files are still kept hidden on disk since the reference count is wrong). The least stressful fix is to add new drives and do a file copy from the old to the new; the riskier fix is to boot via livecd and do check+repair on the unmounted device.
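
The guess about wrong reference counts can be pictured with a small toy model. This is only a sketch under loud assumptions: `ExtentTable` and `cow_rewrite` are hypothetical names, and nothing here reflects the real btrfs extent tree. It just shows why a count that is one too high keeps an old copy-on-write extent allocated after nothing uses it any more.

```python
class ExtentTable:
    """Toy reference-counted extent table (hypothetical, not btrfs's format)."""

    def __init__(self):
        self.refs = {}  # extent id -> recorded reference count

    def add_ref(self, extent_id, count=1):
        self.refs[extent_id] = self.refs.get(extent_id, 0) + count

    def drop_ref(self, extent_id):
        self.refs[extent_id] -= 1
        if self.refs[extent_id] == 0:
            del self.refs[extent_id]  # count hit zero: space is reclaimed

    def allocated(self):
        return sorted(self.refs)


def cow_rewrite(table, old_extent, new_extent):
    # Copy-on-write: new data goes to a new extent, then the old one is released.
    table.add_ref(new_extent)
    table.drop_ref(old_extent)


healthy = ExtentTable()
healthy.add_ref("v1")
cow_rewrite(healthy, "v1", "v2")
assert healthy.allocated() == ["v2"]        # old version was freed

damaged = ExtentTable()
damaged.add_ref("v1", count=2)              # recorded count is one too high
cow_rewrite(damaged, "v1", "v2")
assert damaged.allocated() == ["v1", "v2"]  # old version still occupies space
```

In this model the file contents are fine in both cases, which matches the observation that a scrub can come back clean while a structural check still complains.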

                      Comment
