Btrfs Gets Fixes For Linux 4.9, Linux 4.10 To Be More Exciting


  • #51
    Hello

    I have been using BTRFS for 5 years. I just received (literally) a shiny 10TB hard drive, and I'm contemplating going back to ext4 with it.

    I've used Linux for four times that long, and I have to say that in those 5 years I ran into more kernel bugs and trouble with BTRFS than with every other part of the kernel combined, on different computers. I'm probably a human BTRFS bug magnet, but luckily there has been no data corruption yet. The most recent issues were:

    - "Stable kernel version 3.19.1 to 3.19.4 can cause a deadlock at mount time" -> got it. From a default ubuntu 15.04 install, and afaik was never fixed in this ubuntu version. Understanding what was wrong was the first part of the fun, running around trying to mount disks of one computer on an other to fix the problem was the next one. Reproducible with almost every power-failure or "violent" reboot.
    - "btrfs-transacti taking up a lot of CPU time" -> also got it. It was completely horrible in ubuntu 16.04. tried most suggested workaround (autodefrag mount, balance...). On an half full 512Gb ssd, i had updatedb completely freeze the system for 4 minutes every time it ran.
    - I only learned a few months ago about "The parity RAID code has multiple serious data-loss bugs". The red warning was only added to the btrfs wiki in July 2016, and before June the only thing you could read was "From 3.19, the recovery and rebuild code was integrated. This brings the implementation to the point where it should be usable for most purposes. Since this is new code, you should expect it to stabilize over the next couple of kernel releases."
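
    For reference, the workarounds I mean look roughly like this (just a sketch; the mount point is an example, and how much they help depends heavily on the workload):

    Code:
    # enable online defragmentation via a mount option (can be made permanent in fstab)
    sudo mount -o remount,autodefrag /mnt/data
    # compact data block groups that are at most 50% full to reduce free-space fragmentation
    sudo btrfs balance start -dusage=50 /mnt/data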

    Although the first problem is fixed and, in my experience, Ubuntu 16.10 brings strong relief for the second problem under normal workloads, all of these were reported in the last 2 years, after btrfs had been marked stable.

    I don't really mind the annoyances I've been through with it (if I'm using the cutting edge, I can't really complain about being cut). But what annoys me is that in 2007 there was this promising filesystem that was super-safe because of checksumming and super-fast because of B-trees... yet almost 10 years later my user experience has made me lose faith in its ability to converge toward the state of "convenient robustness" you would expect from a filesystem rather than a toy research project. At least with reiserfs - also once a promising filesystem - its "time of death" was quite clear. With btrfs we don't get that (and Oracle execs are not even in jail, so we don't get that to comfort us either).

    Is BTRFS the future? I would be interested to hear the opinions of more knowledgeable people (kernel developers?).



    • #52
      Originally posted by GreatEmerald View Post

      Uh, what? I'm running Btrfs in RAID1 and all the partitions have their own UUIDs. I have no idea why yours don't. (I hope you didn't block-copy one to the other or something; that's a very bad idea.)
      Hmm, I did not do a block copy - I just assumed that this was the default behaviour since that's how mdraid works.

      But even if you change one partition's UUID via "btrfstune -u /dev/sdx[0-9]", does it change the UUID on both disks?



      • #53
        Uhm, I always thought btrfs has the same UUID per volume (same as mdraid), but with a different partition UUID.

        Here is my setup: sda2 and sdb2 are a btrfs raid1, sda3 and sdb3 are another btrfs raid1, sda4 and sdb4 are an mdraid swap partition, and sda1 and sdb1 are "raw partitions" (unformatted; they were a bootloader partition). The EFI partition is on a USB flash drive that also houses a Debian system I use for recovery purposes (I had the space there, so why not).

        Code:
        sudo blkid
        /dev/sda2: LABEL="openSUSE_xeon" UUID="cad1ea22-f9ad-4a15-8a2e-e352e1820fe8" UUID_SUB="4bc9e0d5-d3c9-4213-8556-72f7d79416b3" TYPE="btrfs" PARTUUID="1f5950ac-cb2c-4467-a0e7-fceb4692f732"
        /dev/sda3: LABEL="data_xeon_btrfs" UUID="98cf4153-e227-4fb7-95d6-9fcc19620c89" UUID_SUB="572793c2-41d2-47e6-941b-fa9a4a565e6e" TYPE="btrfs" PARTUUID="a9b7f7c5-1566-49ca-8715-1a93bb664dcf"
        /dev/sda4: UUID="063d7f3a-0905-3cc3-3c33-770b40a8643c" UUID_SUB="04a31e21-8ae0-f177-6415-18076906b6fc" LABEL="any:swap" TYPE="linux_raid_member" PARTLABEL="primary" PARTUUID="2274983c-4f8d-4a1a-8d8f-96f2e74b49f4"
        /dev/sdb2: LABEL="openSUSE_xeon" UUID="cad1ea22-f9ad-4a15-8a2e-e352e1820fe8" UUID_SUB="f681c637-2b03-4f03-85de-b0349b6268ee" TYPE="btrfs" PARTUUID="59aa8b68-6a45-48d8-b28d-e69d4c149428"
        /dev/sdb3: LABEL="data_xeon_btrfs" UUID="98cf4153-e227-4fb7-95d6-9fcc19620c89" UUID_SUB="bc713b17-3085-47b9-808a-8b4be20230bb" TYPE="btrfs" PARTLABEL="primary" PARTUUID="4198aff7-6622-4b06-a5a9-debb9f261930"
        /dev/sdb4: UUID="063d7f3a-0905-3cc3-3c33-770b40a8643c" UUID_SUB="c20d649d-13c7-9ea2-da09-637a7ae44f43" LABEL="any:swap" TYPE="linux_raid_member" PARTLABEL="primary" PARTUUID="96bbc76c-048a-4d0e-96a1-5265759ca18f"
        /dev/sdc1: UUID="e38a4fb8-d9b0-49f5-9672-009a6aec7e05" UUID_SUB="43a658fb-6826-49f2-874d-c85bbf1a8915" TYPE="btrfs" PARTUUID="73223ff1-f145-4aff-ba08-4bcfa860c368"
        /dev/md127: LABEL="swap_RAID" UUID="54cf0542-2142-471d-895b-76d6e42e2a33" TYPE="swap"
        /dev/sdd1: LABEL="USB-EFI" UUID="BFF6-E8DB" TYPE="vfat" PARTUUID="24e12e24-52ac-4dab-85c1-efa7e90e752e"
        /dev/sdd2: LABEL="debian-recovery" UUID="399bd837-302f-4d54-bdf4-78cdfa8e4a6e" TYPE="ext4" PARTLABEL="debian-recovery" PARTUUID="72c2aaec-fef8-4a8d-88a4-9115666c0373"
        /dev/sda1: PARTUUID="50004411-4e25-4248-928b-c1129d789ff9"
        /dev/sdb1: PARTUUID="05f8c765-3328-4275-8927-5a031d432dd9"
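
        By the way, if you want to see the member devices grouped under that shared filesystem UUID without eyeballing blkid, "btrfs filesystem show" lists them per filesystem. The output looks roughly like this (trimmed; sizes omitted):

        Code:
        sudo btrfs filesystem show /dev/sda2
        Label: 'openSUSE_xeon'  uuid: cad1ea22-f9ad-4a15-8a2e-e352e1820fe8
                Total devices 2 FS bytes used ...
                devid    1 size ... used ... path /dev/sda2
                devid    2 size ... used ... path /dev/sdb2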



        • #54
          Originally posted by starshipeleven View Post
          Uhm, I always thought btrfs has the same UUID per volume (same as mdraid), but with a different partition UUID.

          Here is my setup, sda2 and sdb2 are a btrfs raid1, sda3 and sdb3 are another btrfs raid1, sda4 and sdb4 are a mdraid swap partition, sda1 and sdb1 are "raw partitions" (unformatted, they were a bootloader partition). EFI partition is on a USB flash drive that also houses a debian system I use for recovery purposes (as I had the space in there so why not).

          Code:
          sudo blkid
          /dev/sda2: LABEL="openSUSE_xeon" UUID="cad1ea22-f9ad-4a15-8a2e-e352e1820fe8" UUID_SUB="4bc9e0d5-d3c9-4213-8556-72f7d79416b3" TYPE="btrfs" PARTUUID="1f5950ac-cb2c-4467-a0e7-fceb4692f732"
          /dev/sda3: LABEL="data_xeon_btrfs" UUID="98cf4153-e227-4fb7-95d6-9fcc19620c89" UUID_SUB="572793c2-41d2-47e6-941b-fa9a4a565e6e" TYPE="btrfs" PARTUUID="a9b7f7c5-1566-49ca-8715-1a93bb664dcf"
          /dev/sda4: UUID="063d7f3a-0905-3cc3-3c33-770b40a8643c" UUID_SUB="04a31e21-8ae0-f177-6415-18076906b6fc" LABEL="any:swap" TYPE="linux_raid_member" PARTLABEL="primary" PARTUUID="2274983c-4f8d-4a1a-8d8f-96f2e74b49f4"
          /dev/sdb2: LABEL="openSUSE_xeon" UUID="cad1ea22-f9ad-4a15-8a2e-e352e1820fe8" UUID_SUB="f681c637-2b03-4f03-85de-b0349b6268ee" TYPE="btrfs" PARTUUID="59aa8b68-6a45-48d8-b28d-e69d4c149428"
          /dev/sdb3: LABEL="data_xeon_btrfs" UUID="98cf4153-e227-4fb7-95d6-9fcc19620c89" UUID_SUB="bc713b17-3085-47b9-808a-8b4be20230bb" TYPE="btrfs" PARTLABEL="primary" PARTUUID="4198aff7-6622-4b06-a5a9-debb9f261930"
          /dev/sdb4: UUID="063d7f3a-0905-3cc3-3c33-770b40a8643c" UUID_SUB="c20d649d-13c7-9ea2-da09-637a7ae44f43" LABEL="any:swap" TYPE="linux_raid_member" PARTLABEL="primary" PARTUUID="96bbc76c-048a-4d0e-96a1-5265759ca18f"
          /dev/sdc1: UUID="e38a4fb8-d9b0-49f5-9672-009a6aec7e05" UUID_SUB="43a658fb-6826-49f2-874d-c85bbf1a8915" TYPE="btrfs" PARTUUID="73223ff1-f145-4aff-ba08-4bcfa860c368"
          /dev/md127: LABEL="swap_RAID" UUID="54cf0542-2142-471d-895b-76d6e42e2a33" TYPE="swap"
          /dev/sdd1: LABEL="USB-EFI" UUID="BFF6-E8DB" TYPE="vfat" PARTUUID="24e12e24-52ac-4dab-85c1-efa7e90e752e"
          /dev/sdd2: LABEL="debian-recovery" UUID="399bd837-302f-4d54-bdf4-78cdfa8e4a6e" TYPE="ext4" PARTLABEL="debian-recovery" PARTUUID="72c2aaec-fef8-4a8d-88a4-9115666c0373"
          /dev/sda1: PARTUUID="50004411-4e25-4248-928b-c1129d789ff9"
          /dev/sdb1: PARTUUID="05f8c765-3328-4275-8927-5a031d432dd9"
          Mine's the same - duplicate volume UUID, different partition UUID:

          Code:
          /dev/sdb1: LABEL="raid_hell01-serv01" UUID="b45edea3-1014-4284-976b-8221d3ffeb70" UUID_SUB="9d9a504f-1238-4e6e-b674-71a8a7c4e373" TYPE="btrfs" PARTLABEL="Linux filesystem" PARTUUID="e6999f76-0010-4626-8c60-d044f2159b64"
          /dev/sdc1: LABEL="raid_hell01-serv01" UUID="b45edea3-1014-4284-976b-8221d3ffeb70" UUID_SUB="1620cf1e-953c-4770-8b22-bb7e6f3d40f5" TYPE="btrfs" PARTLABEL="Linux filesystem" PARTUUID="1ed9b20f-d437-4a10-939a-5cfaab7e868d"
          But systemd is complaining about the volume UUID ("b45edea3-1014-4284-976b-8221d3ffeb70"):
          Code:
          Dev dev-disk-by\x2duuid-b45edea3\x2d1014\x2d4284\x2d976b\x2d8221d3ffeb70.device appeared twice with different sysfs paths.
          I don't really know where to go from here - it often causes boot to stall due to a timeout, and it actually makes the Anaconda installer hang when doing an OS reinstall; the only fix for the latter is to unplug the drives during the install. I'd really like to drum up some attention for the bug on Bugzilla, because it seems to me like a highly reproducible bug - at least on Fedora.
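
          In the meantime, the workaround I'm considering (just a sketch - the device path, mount point and timeout below are placeholders) is to point fstab at one member device instead of the shared filesystem UUID, with a bounded device timeout so a stall can't hang boot indefinitely:

          Code:
          # /etc/fstab - illustrative entry only; the second member device is picked up
          # automatically once udev has run "btrfs device scan"
          /dev/sdb1  /srv/raid_hell01  btrfs  defaults,x-systemd.device-timeout=10s  0  0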



          • #55
            Originally posted by dcrdev View Post
            But systemd is complaining about the volume UUID ("b45edea3-1014-4284-976b-8221d3ffeb70"):
            Code:
            Dev dev-disk-by\x2duuid-b45edea3\x2d1014\x2d4284\x2d976b\x2d8221d3ffeb70.device appeared twice with different sysfs paths.
            I don't really know where to go from here - it often causes boot to stall due to a timeout, and it actually makes the Anaconda installer hang when doing an OS reinstall; the only fix for the latter is to unplug the drives during the install. I'd really like to drum up some attention for the bug on Bugzilla, because it seems to me like a highly reproducible bug - at least on Fedora.
            Googling around for the error string, it seems that on the systemd GitHub they fixed (again) a similar bug in the first week of April; it was a copycat of an older bug: https://github.com/systemd/systemd/issues/2677
            Hello, I have seen this error message during system boot for about 2 days, and I couldn't find a solution. I tried to downgrade systemd to 228 and boot up Linux-lts (4.1.18-1), but it didn't help. ...


            What is your systemd version? It seems systemd 230 should fix that.

            I'm on systemd 210, btw, and I have no such issues.
            Code:
            root@openSUSE-xeon:~> systemd --version
            systemd 210
            +PAM +LIBWRAP +AUDIT +SELINUX -IMA +SYSVINIT +LIBCRYPTSETUP +GCRYPT +ACL +XZ +SECCOMP +APPARMOR
            Given the bug reports above, my systemd is too old to even have the first appearance of this bug (that was in 2015), since it was released back in 2014.



            • #56
              Originally posted by starshipeleven View Post
              Uhm, I always thought btrfs has the same UUID per volume (same as mdraid), but with a different partition UUID.

              Here is my setup, sda2 and sdb2 are a btrfs raid1, sda3 and sdb3 are another btrfs raid1, sda4 and sdb4 are a mdraid swap partition, sda1 and sdb1 are "raw partitions" (unformatted, they were a bootloader partition). EFI partition is on a USB flash drive that also houses a debian system I use for recovery purposes (as I had the space in there so why not).
              Hm, yes, that's true:
              Code:
              /dev/sda1: LABEL="Root" UUID="0e65a112-64b5-450a-896e-75ccf2c59eb0" UUID_SUB="1d684553-7c0e-44fe-b38c-d3ac1a89519a" TYPE="btrfs" PARTUUID="534bdda7-2653-4ea7-89dc-8e93ac7eb639"
              /dev/sda2: LABEL="Swap2" UUID="b70b5792-0dbf-49ec-9144-3e1cad2d64bc" TYPE="swap" PARTUUID="b39abcdd-0cea-40fd-b96b-207ae1e093b3"
              /dev/sdb1: LABEL="BOOT" UUID="2B90-3ECB" TYPE="vfat" PARTUUID="32a0722c-e09a-40b7-95bd-f34b60ab5ba2"
              /dev/sdb2: LABEL="Root" UUID="0e65a112-64b5-450a-896e-75ccf2c59eb0" UUID_SUB="ceda8bb3-6fa7-4525-a1bf-9790dccdbd58" TYPE="btrfs" PARTUUID="ea8e367c-ac1a-4801-b0b0-c265c357193e"
              /dev/sdb3: LABEL="Swap" UUID="7d138caa-440b-4f3b-8a83-cd0d4715babd" TYPE="swap" PARTUUID="625d4fd3-42ea-4f01-9ed1-38f2a675caf1"
              I haven't seen that error and I'm on systemd 226. I guess it didn't appear until 229?



              • #57
                Originally posted by GreatEmerald View Post
                I haven't seen that error and I'm on systemd 226. I guess it didn't appear until 229?
                It was there on 227 and 228 according to reports, but now that it is fixed (again) upstream it all depends on your distro, as they can easily backport the one-line fix to whatever systemd they ship.



                • #58
                  waaaah! unapproved post for GreatEmerald above



                  • #59
                    Not promising at all...
                    The folks at BTRFS apparently haven't heard of backward compatibility...

                    I recommend that everyone here NOT use btrfs in production. I manage over 200 servers, and this one filesystem caused more issues over the years than all the others put together. We kept adding drive after drive to a large BTRFS RAID10 array, but we had constant kernel panics and reboots, and the filesystem kept going read-only all the time.

                    We experimented with over 10 different kernel versions, but 3.13 remained the most stable.

                    In the end we added 2 bigger disks (6TB; all the others were 4TB), because BTRFS "supposedly" should handle mixed-size devices correctly, and that effectively screwed up the array to the point where we could only mount it degraded, and only under 3.13; trying any 4.x kernel was a no-go.

                    Now it's the same story going from 4.8.11 to 4.9.3: it cannot even mount a simple RAID0 stripe with no redundancy, failing with the following error:


                    Code:
                    [ 97.521440] BTRFS: device label backup-temporary devid 1 transid 46058 /dev/sdj
                    [ 97.522362] BTRFS info (device sdj): use zlib compression
                    [ 97.522365] BTRFS info (device sdj): disk space caching is enabled
                    [ 97.522367] BTRFS info (device sdj): has skinny extents
                    [ 97.524239] BTRFS error (device sdj): failed to read chunk tree: -5
                    [ 97.561279] BTRFS error (device sdj): open_ctree failed
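
                    For the record, the standard first-aid attempts for this kind of open_ctree / chunk-tree failure look roughly like this (illustrative only; the mount point is assumed, and there are no guarantees on an array in this state):

                    Code:
                    # try a read-only mount using the backup tree roots
                    # ("usebackuproot" replaced the older "recovery" mount option around kernel 4.6)
                    sudo mount -o ro,usebackuproot /dev/sdj /mnt/backup-temporary
                    # if that still fails, try rebuilding the chunk tree from the metadata copies
                    # (run on the unmounted filesystem; this can take a very long time)
                    sudo btrfs rescue chunk-recover /dev/sdj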

                    I could write a book by now from all the btrfs errors we've had. RAID5/6 support is still nowhere, so when you create a RAID10 array you basically lose half of your array's capacity to redundancy (e.g. eight 4TB drives give roughly 16TB usable), but I still believe this is the only way to achieve some sort of stability with btrfs, along with keeping your array relatively small.

                    +1 for:
                    rm -rf btrfs

