Bcachefs Reining In Bugs: Test Dashboard Failures Drop By 40% Over Last Month

  • Quackdoc
    Senior Member
    • Oct 2020
    • 5000

    #11
    Great work with bcachefs. I'm okay with downtime bugs, but I'm not okay with losing data, and on that count bcachefs has somehow already managed to be better than btrfs has been for me. I have had a few issues on my Artix install: for some reason bcachefs fails to remount until I can get to the "continue system startup anyways" stage, and then it works perfectly fine. No bloody clue what that is about, but it's no more than a nuisance.

    I did have a real downtime bug the other day, though, when I upgraded kernels and had an unclean power-off. For some reason fsck failed to kick in and would just hang. I booted an Arch Linux ISO I had hanging around (maybe 2-3 kernels old now), ran bcachefs fsck, and it ran fine; after rebooting the PC, the boot-time fsck ran fine too and I booted with no problems. A bit weird, but hey, no data loss.

    Comment

    • Raka555
      Junior Member
      • Nov 2018
      • 675

      #12
      Originally posted by Quackdoc View Post
      Great work with bcachefs. I'm okay with downtime bugs, but I'm not okay with losing data, and on that count bcachefs has somehow already managed to be better than btrfs has been for me. I have had a few issues on my Artix install: for some reason bcachefs fails to remount until I can get to the "continue system startup anyways" stage, and then it works perfectly fine. No bloody clue what that is about, but it's no more than a nuisance.

      I did have a real downtime bug the other day, though, when I upgraded kernels and had an unclean power-off. For some reason fsck failed to kick in and would just hang. I booted an Arch Linux ISO I had hanging around (maybe 2-3 kernels old now), ran bcachefs fsck, and it ran fine; after rebooting the PC, the boot-time fsck ran fine too and I booted with no problems. A bit weird, but hey, no data loss.
      I am not okay with downtime or eating data, but I am okay with waiting another year.

      Comment

      • curfew
        Senior Member
        • Aug 2010
        • 632

        #13
        Originally posted by Quackdoc View Post
        Great work with bcachefs. I'm okay with downtime bugs, but I'm not okay with losing data, and on that count bcachefs has somehow already managed to be better than btrfs has been for me. I have had a few issues on my Artix install: for some reason bcachefs fails to remount until I can get to the "continue system startup anyways" stage, and then it works perfectly fine. No bloody clue what that is about, but it's no more than a nuisance.

        I did have a real downtime bug the other day, though, when I upgraded kernels and had an unclean power-off. For some reason fsck failed to kick in and would just hang. I booted an Arch Linux ISO I had hanging around (maybe 2-3 kernels old now), ran bcachefs fsck, and it ran fine; after rebooting the PC, the boot-time fsck ran fine too and I booted with no problems. A bit weird, but hey, no data loss.
        Weird that you say btrfs is worse than that; I've never encountered these issues. The last time I had an issue with btrfs was a couple of years ago, when I had heavily undervolted my laptop and the system would crash during initrd rebuilds, which resulted in the total loss of all files modified during the system upgrade. I won't blame the fs, because it was essentially a hardware failure.

        Comment

        • varikonniemi
          Senior Member
          • Jan 2012
          • 1072

          #14
          Originally posted by curfew View Post
          Weird that you say btrfs is worse than that; I've never encountered these issues. The last time I had an issue with btrfs was a couple of years ago, when I had heavily undervolted my laptop and the system would crash during initrd rebuilds, which resulted in the total loss of all files modified during the system upgrade. I won't blame the fs, because it was essentially a hardware failure.
          Weird that you take your experience as some standard.

          The point here was that there is no documented case where bcachefs has eaten a user's data. Btrfs used to be notorious for eating data, even over trivial things like running out of disk space.

          And no, a hardware failure should not eat the filesystem; at most it should cost you the data that was being written when the failure happened. That's why critical data structures have multiple copies, like the superblock, so that you can always recover.
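          For what it's worth, you can see that redundancy in practice: btrfs, for instance, keeps superblock mirrors at fixed offsets (64 KiB, 64 MiB and 256 GiB, if I remember the on-disk format right), so trashing one copy still leaves the others. Here's a rough Python sketch that just looks for the superblock magic at each mirror offset; the device path is only a placeholder and it needs read access to the raw device:

          import sys

          # btrfs keeps superblock copies at these fixed byte offsets (64 KiB, 64 MiB, 256 GiB);
          # the magic string "_BHRfS_M" sits 0x40 bytes into each superblock.
          SUPERBLOCK_OFFSETS = [0x10000, 0x4000000, 0x4000000000]
          MAGIC = b"_BHRfS_M"
          MAGIC_OFFSET = 0x40

          def check_superblocks(device):
              with open(device, "rb") as dev:
                  for off in SUPERBLOCK_OFFSETS:
                      try:
                          dev.seek(off + MAGIC_OFFSET)
                          found = dev.read(len(MAGIC)) == MAGIC
                      except OSError:
                          found = False  # read failed, e.g. the device is smaller than this offset
                      print(f"superblock at {off:#x}: {'present' if found else 'missing'}")

          if __name__ == "__main__":
              # /dev/sdX is just a placeholder; pass your btrfs device as the first argument
              check_superblocks(sys.argv[1] if len(sys.argv) > 1 else "/dev/sdX")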
          Last edited by varikonniemi; 03 November 2024, 01:37 PM.

          Comment

          • lyamc
            Senior Member
            • Jun 2020
            • 520

            #15
            I like how every time someone tries to demonstrate some form of data eating, all they do is demonstrate how reliable/robust the filesystem is.

            Comment

            • Quackdoc
              Senior Member
              • Oct 2020
              • 5000

              #16
              Originally posted by curfew View Post
              Weird that you say btrfs is worse than that; I've never encountered these issues. The last time I had an issue with btrfs was a couple of years ago, when I had heavily undervolted my laptop and the system would crash during initrd rebuilds, which resulted in the total loss of all files modified during the system upgrade. I won't blame the fs, because it was essentially a hardware failure.
              The worst one I had was the unfortunate experience of a bad root btree issue that corrupted the entire drive minutes before a backup was scheduled, causing me to lose a day's worth of very productive work. But I have had many devices running btrfs, and all of them, without fail, ended up with data corruption; the last one was about 2 years ago now, I think. After that I was done: btrfs has marinated too long and I don't trust it.

              Comment

              • noigai
                Junior Member
                • Mar 2023
                • 3

                #17
                Originally posted by lyamc View Post
                I like how every time someone tries to demonstrate some form of data eating, all they do is demonstrate how reliable/robust the filesystem is.
                Yes, that's my opinion too; we shouldn't mix the two.

                I've had all my home storage on ZFS since 2008, and after reading about actual disk corruption rates, I thought that checksumming everything was overkill.

                However, over all these years I've had 3 cases of silent corruption on different computers, which ZFS detected and notified me about by email, and scrubs would always find bad data.

                As I was away for a few months, I was unable to troubleshoot the problem, but ZFS mirroring and checksumming kept everything working. When I finally got to troubleshoot the issue, it was fixed by replacing the SATA cables. There were no errors in the kernel logs.

                At work, we used to have a backup on our NTFS SAN for "corruption prevention". I always asked the tech team how we would detect corruption, and the answer was "if some client reports wrong data". At the end of the project, as I was archiving, I saw two XML files of 12 and 33 MB each, when they should have been a few kB. The files were corrupt and unreadable, and yet nothing in the SAN, NTFS or Windows had detected anything.

                For me the situation is clear: even if the disks work fine, there's always the possibility of a failure somewhere else, which is more common and is caught by checksumming. So *my* data will always be on a checksummed filesystem.
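                To illustrate the point (a toy Python sketch of the idea only, nothing like how ZFS or bcachefs actually implement it): if you keep a checksum per block stored apart from the data, a scrub that recomputes the checksums will flag any block that got silently altered anywhere along the path, whether the disk, the cable or the controller is at fault:

                import hashlib

                BLOCK_SIZE = 4096

                def checksum_blocks(data):
                    # One SHA-256 digest per BLOCK_SIZE block, kept separately from the data.
                    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
                            for i in range(0, len(data), BLOCK_SIZE)]

                def scrub(data, checksums):
                    # Recompute every block's checksum and report the blocks that no longer match.
                    return [i for i, digest in enumerate(checksum_blocks(data))
                            if digest != checksums[i]]

                if __name__ == "__main__":
                    data = bytearray(b"A" * BLOCK_SIZE * 4)
                    sums = checksum_blocks(bytes(data))
                    data[BLOCK_SIZE + 10] ^= 0x01          # a silent single-bit flip in block 1
                    print("corrupt blocks:", scrub(bytes(data), sums))  # -> [1]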

                I really look forward to bcachefs being production-ready and well tested in a few years' time.

                Comment
