Btrfs Enjoys More Performance With Linux 6.3 - Including Some 3~10x Speedups

  • Originally posted by mdedetrich View Post
    I don't know what you mean by "mixes badly". CoW is typically represented as a tree-like structure where nodes are added onto the tree, which is how you avoid rewriting the same block of data. While it's true that RAID 5/6 with pure CoW is more difficult, because you can't really be atomic over multiple hard drives, this problem is solved for the same reasons that combining the block/file interfaces is beneficial.
    CoW simply means that you never write in place; this means that you need an external data structure that keeps track of the logical <-> physical map. One of the main benefits is that you get atomicity (and snapshots) "nearly" for free.
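    The logical <-> physical remap described above can be sketched in a toy model (illustrative Python only; the class and names are invented for the example, this is not Btrfs or ZFS code). The point is that a single pointer swap commits the whole update, which is also why snapshots come nearly for free:

```python
# Toy CoW block store: writes never touch old blocks; a single map swap
# makes the new state visible, so a crash leaves either the old map or
# the new map, never a half-updated one.  (Illustrative only.)

class CowStore:
    def __init__(self):
        self.disk = {}        # physical address -> data
        self.root = {}        # committed logical -> physical map
        self.next_phys = 0

    def write(self, updates):
        """Atomically apply {logical: data} updates."""
        new_map = dict(self.root)      # shadow copy of the map
        for logical, data in updates.items():
            phys = self.next_phys      # always a fresh physical block
            self.next_phys += 1
            self.disk[phys] = data
            new_map[logical] = phys
        self.root = new_map            # the one atomic step ("root swap")

    def read(self, logical):
        return self.disk[self.root[logical]]

    def snapshot(self):
        return dict(self.root)         # a snapshot is just a copy of the map

store = CowStore()
store.write({0: b"old"})
snap = store.snapshot()
store.write({0: b"new"})
print(store.read(0))                   # b'new'
print(store.disk[snap[0]])             # b'old' -- old block still intact
```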

    But this assumes that the data is always either "old" or "new".

    Unfortunately the parity is shared between the blocks of the stripe. This means that you lose the atomicity property (it is near impossible to guarantee an atomic write across different disks without an external structure like a journal).

    You can get the same property (atomicity) using a journal, but this is more expensive than a CoW filesystem.

    This is the reason why CoW and RAID 5/6 mix badly: you need something like a journal, when the whole purpose of CoW is to avoid a journal...

    ZFS instead uses another technique: it puts the parity inside the extent, so it avoids sharing the parity between different extents.
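    The shared-parity failure mode described above (the classic RAID-5 "write hole") can be shown in a few lines of toy Python -- a sketch of the failure, not of any real implementation:

```python
# RAID-5 write hole in miniature: parity is shared by the whole stripe, so
# updating one data block requires writing two devices (data + parity).
# A crash between the two writes leaves the stripe inconsistent -- exactly
# the atomicity CoW alone cannot give you across multiple disks.

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# 2 data disks + 1 parity disk, one stripe
d0, d1 = b"\x0f", b"\xf0"
parity = xor(d0, d1)
assert xor(xor(d0, d1), parity) == b"\x00"   # invariant: everything XORs to zero

# Partial-stripe update of d0: read-modify-write of the shared parity
new_d0 = b"\xff"
new_parity = xor(xor(parity, d0), new_d0)    # remove old d0, fold in new d0

# Simulated crash: d0 was rewritten, but the parity write never hit the disk
d0 = new_d0
assert xor(xor(d0, d1), parity) != b"\x00"   # stripe no longer self-consistent:
                                             # a later disk loss reconstructs garbage
```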

    Originally posted by mdedetrich View Post
    ZFS does write the stripe when using raidz (their variant of RAID 5/6), read https://pthree.org/2012/12/05/zfs-ad...part-ii-raidz/ and directly quoting
    However it doesn't solve all the issues. In particular, if you want to write in the middle of an extent, you still have the same problem; to handle this there are two possibilities:
    a) you rewrite all of the portion of the extent 'protected' by the parity, but this has nearly the same cost as CoW over a journal, where you have to write all the data to the journal before putting it in its final place; or
    b) you write the new data + parity in a dedicated new extent, leaving the old extent untouched; this of course implies that you can't release the unused space in the old extent until you do a balance (or defragment).

    ZFS mitigates these issues by its cache (ARC).
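    The space cost of option (b) above can be sketched with a toy allocator (hypothetical names, not real Btrfs/ZFS accounting): the overwritten bytes in the old extent stay allocated until a balance rewrites the survivors:

```python
# Option (b) in miniature: overwriting the middle of an extent writes a new
# extent and leaves the old one in place, so the overwritten bytes remain
# allocated ("dead" space) until a balance/defragment rewrites the live data.

allocated = []                          # list of (extent_id, live_bytes, total_bytes)
allocated.append(("e0", 128, 128))      # original 128-byte extent, fully live

# Overwrite 32 bytes in the middle: the new extent holds the new data...
allocated.append(("e1", 32, 32))
# ...while e0 keeps its full 128 bytes on disk, only 96 still referenced.
allocated[0] = ("e0", 128 - 32, 128)

dead = sum(total - live for _, live, total in allocated)
print(dead)                             # 32 bytes pinned until a balance rewrites e0
```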

    Comment


    • Originally posted by kreijack View Post

      Unfortunately the parity is shared between the blocks of the stripe. This means that you lose the atomicity property (it is near impossible to guarantee an atomic writing in different disks without external structures like journal).
      This again is not correct. As stated in what I referenced previously, ZFS guarantees that when using raidz either the entire state is consistent or nothing happens at all, which is the literal definition of atomic.

      So, in the event of a power failure, you either have the latest flush of data, or you don't. But, your disks will not be inconsistent.
      There's a catch however. With standardized parity-based RAID, the logic is as simple as "every disk XORs to zero". With dynamic variable stripe width, such as RAIDZ, this doesn't work. Instead, we must pull up the ZFS metadata to determine RAIDZ geometry on every read. If you're paying attention, you'll notice the impossibility of such if the filesystem and the RAID are separate products; your RAID card knows nothing of your filesystem, and vice-versa. This is what makes ZFS win.
      Further, because ZFS knows about the underlying RAID, performance isn't an issue unless the disks are full. Reading filesystem metadata to construct the RAID stripe means only reading live, running data. There is no worry about reading "dead" data, or unallocated space. So, metadata traversal of the filesystem can actually be faster in many respects. You don't need expensive NVRAM to buffer your write, nor do you need it for battery backup in the event of RAID write hole. So, ZFS comes back to the old promise of a "Redundant Array of Inexpensive Disks". In fact, it's highly recommended that you use cheap SATA disk, rather than expensive fiber channel or SAS disks for ZFS.
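    The contrast the quote draws -- a fixed "every disk XORs to zero" geometry versus RAID-Z's per-write stripes -- can be sketched roughly like this (toy Python with invented helper names, not OpenZFS code):

```python
# RAID-Z "dynamic stripe width" in miniature: every logical write becomes its
# own full stripe (data columns + its own parity), so there is never a
# read-modify-write of shared parity, and therefore no write hole.  The cost,
# as the quote notes, is that stripe geometry now lives in filesystem metadata
# instead of being a fixed property of the array.

from functools import reduce

def parity(cols):
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), cols)

stripes = []                       # "filesystem metadata": geometry per write

def raidz_write(data_cols):
    """Write one variable-width stripe: N data columns + 1 parity column."""
    stripes.append({"data": list(data_cols), "parity": parity(data_cols)})

def reconstruct(stripe_id, lost_col):
    """Rebuild one lost data column from the survivors + parity."""
    s = stripes[stripe_id]
    survivors = [c for i, c in enumerate(s["data"]) if i != lost_col]
    return parity(survivors + [s["parity"]])

raidz_write([b"\x01", b"\x02", b"\x04"])   # 3-wide stripe
raidz_write([b"\xaa"])                     # 1-wide stripe: width varies per write
print(reconstruct(0, 1))                   # b'\x02' recovered from parity
```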
      I would re-read how ZFS actually works, because it appears that you have some misconceptions. ZFS would not include raidz if the feature weren't atomic as described, and ZFS also happens to be the most tested open source filesystem, so this behaviour has actually been verified.

      Comment


      • Originally posted by mdedetrich View Post

        This again is not correct. As stated in what I referenced previously, ZFS guarantees that when using raidz either the entire state is consistent or nothing happens at all, which is the literal definition of atomic.
        [...]
        I would re-read how ZFS actually works, because it appears that you have some misconceptions. ZFS would not include raidz if the feature weren't atomic as described, and ZFS also happens to be the most tested open source filesystem, so this behaviour has actually been verified.
        I wrote that sharing a parity between blocks prevents atomicity.
        Of course there are mitigations that avoid that, like
        - a journal
        - variable stripe width (raidz)

        I never wrote that ZFS cannot guarantee the coherency of the data, or that raidz is buggy or doesn't work. I wrote that, to guarantee this coherency, ZFS has to make some compromises (like not freeing a partially unused stripe, or rewriting a full stripe when an extent changes).

        I wrote that RAID 5/6 doesn't mix well with CoW; by contrast, it was very natural to extend btrfs to RAID1/DUP/RAID10/.....

        Comment


        • Originally posted by kreijack View Post
          ZFS mitigates these issues by its cache (ARC).
          I wish btrfs had a similar approach: store any new data as raid1c3 (or such) and then convert it into raid5/6 when balancing (or in the background, or when reclaiming disk space).

          This would provide better write performance for fresh data, which usually has a shorter life-span (it is more likely to be overwritten right away), while still offering space optimization for old data.

          Comment


          • Originally posted by Berniyh View Post
            ext2, udf, thunderbolt, xfs. There's possibly a lot more on the driver's side.
            I'm sure you'll also find a lot of user space tools that were (re-)implemented independently.
            And I'm sure you'll find some drivers and tools where there was a re-implementation on the Linux side as well.

            Personally, I'm ok with that, I understand the reasoning behind that, even though I think the fight about BSD vs. GPL is cumbersome. But if you're really into that, sure, why not.
            Also, sometimes there might be more reasons to it than "just" the license.
            https://wiki.freebsd.org/Ext2fs The driver in the base OS originated from an ancestor BSD. The driver has seen quite a lot of development (read the link), but I can't see anywhere that it was copied from Linux. A bunch of gradual improvements and Google SoC projects. It originally used the UFS driver as a "base" with Linux-specific things bolted onto it.

            The UDF driver appeared in 2002 in FreeBSD 5, written by volunteers using the public file system specification. Nothing to do with Linux that I can see.

            Dunno about Thunderbolt 1/2 other than that they work, but I have read that development of newer Thunderbolt drivers is being done by Intel.

            You are somewhat correct about the XFS driver. It was a read-only driver written using the original IRIX-to-Linux port as a guide. BSD licensed. So it's not the same.

            All components (including tools) in the base OS have to be permissively licensed. Just taking something from Linux, modifying it and plugging it in won't work. That's one reason why the BSDs sat on GCC 4.2 for years while much newer GCC versions were out, and jumped at Clang/LLVM when the opportunity arose.
            GPL stuff can be and is used for user software, even tools - but in the tools' case they have slightly different names (gmake, not make, for example) in order to avoid conflicts with the OS's own tools.
            Last edited by aht0; 24 February 2023, 07:44 AM.

            Comment


            • Originally posted by Rallos Zek View Post
              Using ZFS on Linux is a no go until it's a real Linux filesystem. Too many bugs and corner cases that will never be fixed until its codebase is in the linux kernel tree. ZFS is for Solaris neckbreads and FreeBSD cucks.
              Being in-tree has not prevented btrfs from having an impressive number of bugs. Every filesystem has bugs and other in-tree filesystems are no exception.

              You are generally better off using ZFS than in-tree filesystems. ZFS is developed more rigorously than in-tree filesystems.

              Comment


              • Originally posted by Berniyh View Post
                ext2, udf, thunderbolt, xfs. There's possibly a lot more on the driver's side.
                I'm sure you'll also find a lot of user space tools that were (re-)implemented independently.
                And I'm sure you'll find some drivers and tools where there was a re-implementation on the Linux side as well.

                Personally, I'm ok with that, I understand the reasoning behind that, even though I think the fight about BSD vs. GPL is cumbersome. But if you're really into that, sure, why not.
                Also, sometimes there might be more reasons to it than "just" the license.
                Interestingly, I recall hearing that FreeBSD is able to support ext2fs as a variant of UFS. More specifically, the UFS support in FreeBSD has two layers and supporting ext2fs just requires replacing one of them.

                That being said, Linux has reinvented things many times itself. ALSA to replace OSS, Linux's network stack to avoid the BSD network stack (that basically everyone else in the industry adopted), iptables instead of pf, etcetera. As for the filesystems you mentioned, FreeBSD likely could use FUSE ports of those drivers to avoid reimplementing them. As for thunderbolt, Linux and FreeBSD are equally forced to reimplement it by Apple.

                Plenty of userspace tools in FreeBSD were made before the GNU tools even existed, since BSD UNIX predates GNU, so you would need to be more specific; although I think both Linux-based and BSD-based OS developers reimplement things when they deem it necessary. OpenSSH, for example, was an SSH reimplementation by the OpenBSD developers. Basically all of the GNU userland tools were reimplementations of UNIX tools that already existed.
                Last edited by ryao; 27 February 2023, 02:44 PM.

                Comment


                • Originally posted by S.Pam View Post

                  I seem to remember that ext4 isn't always faster. But it was a while back. I use reflink copies (cp --reflink) extensively and that is way faster than copying on ext4.
                  Symlinks and hardlinks are also faster. In the case of SteamOS, Proton will use symlinks when reflinks are not available, so reflink support is not a performance advantage there.
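    For reference, the reflink behaviour mentioned above is easy to try from the shell (assuming GNU coreutils; `--reflink=auto` silently falls back to a plain copy on filesystems such as ext4 that lack clone support, while `--reflink=always` would fail there):

```shell
# Make a 4 MiB test file, then clone it: on Btrfs/XFS the clone shares
# extents with the source and completes instantly regardless of size.
dd if=/dev/zero of=big.img bs=1M count=4 status=none
cp --reflink=auto big.img big-clone.img

# Proton-style fallback when reflinks are unavailable: a symlink.
ln -sf big.img big-link.img

cmp -s big.img big-clone.img && echo "contents identical"
```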

                  Comment


                  • Originally posted by ryao View Post

                    Interestingly, I recall hearing that FreeBSD is able to support ext2fs as a variant of UFS. More specifically, the UFS support in FreeBSD has two layers and supporting ext2fs just requires replacing one of them.

                    That being said, Linux has reinvented things many times itself. ALSA to replace OSS, Linux's network stack to avoid the BSD network stack (that basically everyone else in the industry adopted), iptables instead of pf, etcetera. As for the filesystems you mentioned, FreeBSD likely could use FUSE ports of those drivers to avoid reimplementing them. As for thunderbolt, Linux and FreeBSD are equally forced to reimplement it by Apple.
                    Yes, of course. If there are two (or more) projects that try to achieve the same thing, it is bound to happen that they reinvent stuff, because they can't reuse code from another party. Be it due to licensing or for technical reasons (e.g. doesn't work with the present subsystems).

                    That's why I was pretty confused at first that aht0 wanted examples of this at all.
                    The main thing I was on about with my original statement is something different, though.
                    If BSD and Linux both try to achieve the same goal, here creating an operating system, but BSD, for licensing reasons, cannot take code from Linux as easily as Linux can take code from BSD, then reinvention will happen much more often on the BSD side, for pretty obvious reasons. Anything else would be a huge surprise.
                    Add to that the fact that many more people are working on Linux, and the imbalance will be skewed even further.
                    Thankfully for the BSD side, at least most of the KMS/DRM/X11 stack is compatible with the BSDs' licensing policies, so there they can focus on solving the technical issues of integrating that stuff.

                    Comment


                    • Originally posted by Berniyh View Post
                      Yes, of course. If there are two (or more) projects that try to achieve the same thing, it is bound to happen that they reinvent stuff, because they can't reuse code from another party. Be it due to licensing or for technical reasons (e.g. doesn't work with the present subsystems).
                      Having variety can be healthy for the overall ecosystem, so it is not necessarily a bad thing.

                      Comment
