SUSE Reworking Btrfs File-System's Locking Code


  • #21
    Originally posted by duby229 View Post

    Where do I even start.....

    I know, let's start with dd.... go ahead, dd a btrfs disk and see if it works anywhere else..... Try it..... After you -actually- try it we can start a real conversation about lvm.
    This is where some basic thought and knowledge come in handy. Ask yourself, how does the btrfs system find all of the pieces of a multi-device filesystem? Answer: UUIDs.
    Ask yourself, what are the UUID values of an original btrfs partition and a bit-identical copy of a btrfs partition? Answer: they are identical.

    If you remember ReiserFS, any version of it, it had a similar problem. If you created a disk image copy of a ReiserFS inside of another ReiserFS it would get extremely confused.
    Even EXT2 had some serious problems if fsck had to run on a drive that contained EXT images. The fsck tool would find data structures that didn't appear to be linked anywhere and it could drop the entire contents of the disk image into /lost+found.

    More info: https://superuser.com/questions/6073...rfs-filesystem
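To make the UUID point concrete, here is a minimal sketch in Python. It assumes the commonly documented on-disk layout (primary superblock at 64 KiB, a 16-byte filesystem UUID after the 32-byte checksum field, then the magic string 32 bytes later). The helper names `read_fsid` and `make_fake_image` are made up for illustration, and the "image" is a zero-filled buffer, not a real filesystem. It only shows that a bit-identical copy necessarily carries the same filesystem UUID.

```python
import uuid

# Offsets below are assumptions based on the documented btrfs superblock layout.
SUPERBLOCK_OFFSET = 0x10000  # primary superblock, 64 KiB into the device
FSID_OFFSET = 0x20           # 16-byte filesystem UUID, after the 32-byte checksum
MAGIC_OFFSET = 0x40          # 8-byte magic string
BTRFS_MAGIC = b"_BHRfS_M"
SUPERBLOCK_SIZE = 4096


def read_fsid(image: bytes) -> uuid.UUID:
    """Return the filesystem UUID stored in the primary superblock of a raw image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    if sb[MAGIC_OFFSET:MAGIC_OFFSET + 8] != BTRFS_MAGIC:
        raise ValueError("no btrfs superblock magic found")
    return uuid.UUID(bytes=sb[FSID_OFFSET:FSID_OFFSET + 16])


def make_fake_image(fsid: uuid.UUID) -> bytes:
    """Build a zero-filled stand-in image that carries only the magic and the UUID."""
    image = bytearray(SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE)
    start = SUPERBLOCK_OFFSET + FSID_OFFSET
    image[start:start + 16] = fsid.bytes
    start = SUPERBLOCK_OFFSET + MAGIC_OFFSET
    image[start:start + 8] = BTRFS_MAGIC
    return bytes(image)


original = make_fake_image(uuid.uuid4())
clone = bytes(original)  # stands in for a raw dd copy

# Both images report the same filesystem UUID, so they cannot be told apart by it.
print(read_fsid(original) == read_fsid(clone))
```

On a real system you would read the UUID with the usual tools rather than parsing the superblock by hand; the sketch is only there to show why the clone and the original look like the same filesystem.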



    • #22
      Originally posted by Zan Lynx View Post

      This is where some basic thought and knowledge come in handy. Ask yourself, how does the btrfs system find all of the pieces of a multi-device filesystem? Answer: UUIDs.
      Ask yourself, what are the UUID values of an original btrfs partition and a bit-identical copy of a btrfs partition? Answer: they are identical.

      If you remember ReiserFS, any version of it, it had a similar problem. If you created a disk image copy of a ReiserFS inside of another ReiserFS it would get extremely confused.
      Even EXT2 had some serious problems if fsck had to run on a drive that contained EXT images. The fsck tool would find data structures that didn't appear to be linked anywhere and it could drop the entire contents of the disk image into /lost+found.

      More info: https://superuser.com/questions/6073...rfs-filesystem
      Yep, basic thought is damn sure right, too bad the btrfs devs never did....

      EDIT: Fsck bugs have existed and were adequately fixed long ago. The same cannot be said for the btrfs tools: yes fsck breaks btrfs, so does balancing btrfs, so does defragmenting btrfs, so does btrfs in Raid5/6, so does -any- attempt to use lvm on a system where a btrfs volume exists. The number of scenarios where btrfs breaks due to a corrupted filesystem is just astounding. Almost anything you might need to do on btrfs will break it.
      Last edited by duby229; 06-10-2019, 09:39 AM.



      • #23
        Originally posted by starshipeleven View Post
        I clone my btrfs drives all the time with dd. (technically I do it with pv, but that's just another tool that does raw bit-by-bit copy)
        There is a limitation when you do raw copies but it is not "it will not work anywhere else". Please be more specific so I know you know.
        Go ahead and try to mount that image, please do.... or better yet, write that image to another disk and mount it if you dare..

        So funny, lol...
        Last edited by duby229; 06-10-2019, 09:30 AM.



        • #24
          Originally posted by duby229 View Post
          Go ahead and try to mount that image, please do.... or better yet, write that image to another disk and mount it if you dare..
          I said I do that all the time.

          Try to guess what little detail allows me to do it.



          • #25
            Originally posted by duby229 View Post
            yes fsck breaks btrfs
            Not really. If btrfs can't fix the fs issues with a scrub (and in normal cases it can, since metadata is normally redundant), then there is serious fs damage caused by btrfs bugs. Fsck may or may not be able to deal with that until the developers add the functionality. For most bugs or issues that crop up they do add functionality to fsck.

            so does balancing btrfs,
            No.
            so does defragmenting btrfs,
            No.
            so does btrfs in Raid5/6,
            Still marked as unstable https://btrfs.wiki.kernel.org/index.php/Status
            so does -any- attempt to use lvm on a system where a btrfs volume exists
            What is this?



            • #26
              LVM snapshots of a btrfs filesystem have exactly the same problem as a dd copy which should be obvious if you think about it.

              Startup scripts even need to be extremely careful bringing up MD RAID-1 volumes that contain btrfs.

              LVM and MD are simply NOT intended for use with btrfs. It already includes all of their functions.



              • #27
                Originally posted by Zan Lynx View Post
                LVM snapshots of a btrfs filesystem have exactly the same problem as a dd copy which should be obvious if you think about it.
                Yeah I know that. With his phrasing, it seems that running LVM at all in the same PC (even in an unrelated drive/partition) will break btrfs for some reason.
                Last edited by starshipeleven; 06-10-2019, 11:15 AM.



                • #28
                  Originally posted by boxie View Post
                  I have a 5 disk raid5 in another machine, and it works flawlessly on the same type of drives - but - it has a much different access pattern.
                  Well, RAID56 isn't even finished.
                  Do not expect much from it.
                  (though the *parity computation* itself shares code with MDADM and DM, the way it's plugged (or half-plugged in the current state) into BTRFS is different, obviously).


                  Originally posted by boxie View Post
                  The Raid10 setup also exhibited the same bad access latency.
                  Again, as I've said: BTRFS is currently bad at load balancing between RAID1 copies.

                  (Among other things, because there is absolutely no direct correspondence between copy number and physical drive. Any 2 random drives in a set could be holding any copy as long as they are two different drives. That's completely different from how MDADM and DM handle it, where copy 1 goes on drive 1, and copy 2 goes on drive 2. By alternately selecting copy 1 or 2 you spread your load between the two drives.)

                  Currently, it's a very primitive PID-based scheme, so it might accidentally queue all requests to copies which all happen to be physically held on drive 0, while drive 1 is twiddling its thumbs, idle.
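A rough illustration of that failure mode, as a Python sketch. The `pid % number_of_copies` rule is a simplification of the PID-based selection described above, and the chunk layouts, the PID, and the names `pick_copy` and `drive_load` are all made up for the example.

```python
from collections import Counter


def pick_copy(pid: int, num_copies: int) -> int:
    """Simplified PID-based choice: the same process always gets the same copy index."""
    return pid % num_copies


def drive_load(pids, chunks):
    """Count how many reads land on each drive.

    `chunks` is a hypothetical layout: each entry lists which drive holds
    copy 0 and which drive holds copy 1 of that chunk.
    """
    load = Counter()
    for pid in pids:
        for copies in chunks:
            load[copies[pick_copy(pid, len(copies))]] += 1
    return load


# Copy 0 of every chunk happens to sit on drive 0; copy 1 is spread over drives 1 and 2.
unlucky_layout = [(0, 1), (0, 2), (0, 1), (0, 2)]

# One busy reader with an even PID always asks for copy 0, so every read hits drive 0.
print(drive_load([1000], unlucky_layout))
```

With a fixed "copy 1 lives on drive 1, copy 2 lives on drive 2" mapping, alternating the copy index alone would spread the reads. Here the copy index says nothing about the drive, which is the point being made above.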

                  Also, keep in mind that BTRFS will ALWAYS perform additional checks (like checksums) upon reading anything from the drive (and it is using CRC, not something batshit insanely fast like xxhash64). That's also going to add some processing time until a read block is released to be used by your app.


                  Originally posted by boxie View Post
                  The common denominator is the SMR drives. I am guessing btrfs has not had tuning done on it yet for these drives
                  The main problem with shingled drives is the same as with NAND flash media: you can't just overwrite data in place. You have read-modify-write cycles, where you need to overwrite a whole region to change a small bit of data somewhere (well, in theory. In practice you try to repack data somewhere else to diminish these cycles).

                  By design, log-structured file systems (F2FS, UDF, etc.) and CoW file systems (BTRFS, ZFS, BCacheFS, etc.) never overwrite data in place; they always write a new copy somewhere else. (UDF even has an alternate mode for media where it is physically impossible to overwrite in place: CD-R, DVD-R, etc.)

                  i.e.: by design they'll lead to a lot fewer read-modify-write cycles compared to classic (in-place overwriting) FS such as EXT4, XFS, etc.
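A back-of-the-envelope version of that argument, as a Python sketch. The zone size, the block size, and the single extra metadata block are assumptions picked for illustration, not measurements of any real drive or filesystem, and the function names are hypothetical.

```python
def blocks_written_in_place(zone_blocks: int, changed_blocks: int) -> int:
    """Theoretical worst case on media that cannot overwrite in place:
    changing anything inside a zone means rewriting the whole zone."""
    return zone_blocks if changed_blocks > 0 else 0


def blocks_written_cow(changed_blocks: int, metadata_blocks: int = 1) -> int:
    """CoW / log-structured case: the new data goes somewhere else,
    plus the (assumed) pointer or log metadata that records where it went."""
    return (changed_blocks + metadata_blocks) if changed_blocks > 0 else 0


# Assumed numbers: a 256 MiB zone made of 4 KiB blocks, and one changed block.
ZONE_BLOCKS = (256 * 1024 * 1024) // 4096

print(blocks_written_in_place(ZONE_BLOCKS, 1))  # whole zone rewritten
print(blocks_written_cow(1))                    # new block plus one metadata block
```

As noted above, real drives try to repack data elsewhere to avoid the full rewrite, so treat the first number as the theoretical ceiling rather than typical behaviour.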

                  Shingled drives and flash are the one situation where BTRFS needs *less* tuning.

                  The drawback is logical fragmentation: on CoW you'd need to traverse a giant labyrinth of pointers until you reach the extent that has the latest version of some updated data. On a log-structured filesystem the most often updated file might be spread over the whole filesystem log. (That's why F2FS uses RAM and is suitable for embedded use down to ARM CPUs in smartphones or routers, but not down to microcontrollers, and is not easy to implement on e.g. Arduino.)

                  Maybe *that* specific part is eating CPU cycles (e.g. if you're storing VMs or databases, or downloading torrents, with CoW enabled on a BTRFS filesystem).



                  • #29
                    Originally posted by duby229 View Post
                    I know, let's start with dd.... go ahead, dd a btrfs disk and see if it works anywhere else..... Try it.....
                    Been there, done that, it works.

                    Originally posted by starshipeleven View Post
                    (technically I do it with pv, but that's just another tool that does raw bit-by-bit copy)
                    hehe, nice to see like-minded people. Also enjoying nice ASCII-art progress bars?

                    Originally posted by starshipeleven View Post
                    There is a limitation when you do raw copies but it is not "it will not work anywhere else". Please be more specific so I know you know.
                    I suspect that duby has tried to mount two btrfs filesystems with the same UUID on the same computer (which is a no-go), instead of using the dd image on a different computer as everyone normally does.





                    • #30
                      Originally posted by DrYak View Post
                      hehe, seeing like-minded people. Also enjoying nice ascii-art progress bars ?
                      Yep. Extremely convenient, so I know what the thing is doing instead of seeing a hung terminal command for 5 hours.

                      I suspect that duby has tried to mount two btrfs systems with the same UUID on the same computer (which is a no-go), instead of using the DD image on a different computer as normally everyone does.
                      Yes, that's the only case where there is a problem.

                      But even normal ext4/xfs/whatever partitions should not be cloned and then left like that. Fstab usually mounts them by UUID, so if you leave 2 disk clones in a system, you are going to randomly mount one or the other on boot.
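A small Python sketch of the check you'd want before leaving a clone attached. The device names and UUID strings are invented for the example, and `find_duplicate_uuids` is a hypothetical helper; on a real machine the device-to-UUID mapping would come from whatever tool you normally use to list filesystem UUIDs.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_uuids(devices: Dict[str, str]) -> Dict[str, List[str]]:
    """Group devices by filesystem UUID and return only UUIDs seen more than once."""
    by_uuid: Dict[str, List[str]] = defaultdict(list)
    for device, fs_uuid in devices.items():
        by_uuid[fs_uuid].append(device)
    return {u: sorted(devs) for u, devs in by_uuid.items() if len(devs) > 1}


# Made-up example: /dev/sdb1 is a raw clone of /dev/sda1, so both share one UUID.
devices = {
    "/dev/sda1": "1111aaaa-0000-0000-0000-000000000001",
    "/dev/sdb1": "1111aaaa-0000-0000-0000-000000000001",
    "/dev/sdc1": "2222bbbb-0000-0000-0000-000000000002",
}

for fs_uuid, devs in find_duplicate_uuids(devices).items():
    print(f"UUID {fs_uuid} appears on: {', '.join(devs)}")
```

If this reports anything, a mount-by-UUID entry is ambiguous until one of the copies is detached or given a new UUID.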
