Linus Torvalds Doesn't Recommend Using ZFS On Linux

  • Originally posted by k1e0x View Post
    3. You don't know what you're talking about. COW alone has nothing to do with bit-rot or uncorrectable errors. You're thinking of block checksums, and yes, they are good. COW provides other features such as snapshots, cloning and boot environments. Boot environments are pretty cool.. maybe Linux should get on that... oh wait.. ZFS is the only Linux file system that does it and we can't have *that*.

    Check this out.. FreeBSD 12 has a new command for boot environments..
    https://www.freebsd.org/cgi/man.cgi?query=bectl

    bectl create [email protected]
    bectl jail [email protected]

    You just cloned your running OS into a writable, bootable, virtual environment and spawned a shell in it and it was instant.
    You really need to stop making that claim. ostree supports XFS reflink. ostree used in combination with XFS reflink gives this exact same feature of cloning boot environments quickly, with disk-space efficiency, and with checksums to locate damage. Technically any file system that supports reflink could be used.

    So there are three file system options under Linux that can do boot environments: ZFS, Btrfs/LVM and ostree/XFS. Ostree/XFS is what you see in the Fedora project's Silverblue.

    Really, ZFS does not have as many advantages as one would think, other than its per-block protection. XFS being a partly-CoW file system thanks to reflink means that with software like ostree on top it can pull off a large number of the ZFS tricks, but with higher overall performance.
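    For reference, the reflink clone that makes this cheap on XFS is a one-liner; a minimal sketch, assuming a filesystem created with reflink enabled (file names here are invented):

    ```shell
    # Sketch only: assumes an XFS filesystem made with reflink support,
    # i.e. mkfs.xfs -m reflink=1. File names are placeholders.
    truncate -s 1G os-image.img                    # stand-in for a deployment tree
    cp --reflink=always os-image.img os-clone.img  # shares extents: near-instant, no extra space until writes diverge
    ```

    With `--reflink=auto` instead, `cp` falls back to a normal copy on filesystems without reflink support, which is the disk-space-ineffective mode ostree also supports.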



    • Originally posted by oiaohm View Post

      You really need to stop making that claim. ostree supports XFS reflink. ostree used in combination with XFS reflink gives this exact same feature of cloning boot environments quickly, with disk-space efficiency, and with checksums to locate damage. Technically any file system that supports reflink could be used.

      So there are three file system options under Linux that can do boot environments: ZFS, Btrfs/LVM and ostree/XFS. Ostree/XFS is what you see in the Fedora project's Silverblue.

      Really, ZFS does not have as many advantages as one would think, other than its per-block protection. XFS being a partly-CoW file system thanks to reflink means that with software like ostree on top it can pull off a large number of the ZFS tricks, but with higher overall performance.
      OpenSUSE has pretty much the cutting edge of btrfs integration. From their manual:
      3.3.3 Limitations
      A complete system rollback, restoring the complete system to the identical state as it was in when a snapshot was taken, is not possible.
      There is a whole list of things that it can't restore (including the kernel). This isn't true on FreeBSD; it can restore everything to the way it was before.

      I did a search for XFS writable snapshots... Google said "Did you mean ZFS writable snapshots?" I think that's telling..

      Linux does not really have this feature in anything I've seen. I don't see it in any distros, and even if there is a way to do it, it's definitely not as clean or well implemented. It probably also does a copy, so it's slow as hell.

      On FreeBSD it takes one command to add a boot environment you can select from the boot loader. No packages, no config changes, no waiting for anything to copy. It's writable, cloneable and transportable to another system, and it works even on an encrypted disk. The jail integration is just the cherry on top.. that's *two* commands.

      A use case: maybe your system upgrade went sideways and you need to get the system back up. Restore the old environment and boot production, then send the failed-upgrade system to a dev box and boot the environment in a jail. After you fix and test it you can send it back.
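      Sketched out, that recovery flow is only a handful of commands; BE and dataset names below are hypothetical:

      ```shell
      # Production box: roll back to the known-good boot environment and reboot
      bectl activate pre-upgrade
      shutdown -r now

      # Ship the broken environment to a dev box for a post-mortem
      # (zroot/ROOT/failed-upgrade is a made-up BE dataset path)
      zfs snapshot zroot/ROOT/failed-upgrade@debug
      zfs send zroot/ROOT/failed-upgrade@debug | ssh devbox zfs recv zroot/ROOT/failed-upgrade

      # Dev box: boot the broken environment in a jail, fix and test it there
      bectl jail failed-upgrade
      ```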
      Last edited by k1e0x; 29 January 2020, 02:01 AM.



      • Originally posted by k1e0x View Post
        OpenSUSE has pretty much the cutting edge of btrfs integration. From their manual.
        Do note I wrote Btrfs/LVM, not pure Btrfs. LVM can do perfect rollback snapshots under the file system. That quote from the manual was about the pure-Btrfs solution.
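        For the record, the LVM rollback path looks roughly like this (VG/LV names are placeholders; it needs free extents in the volume group):

        ```shell
        # Take a CoW snapshot underneath the filesystem before the risky change
        lvcreate -s -n root-pre-upgrade -L 10G vg0/root

        # ...upgrade goes sideways...

        # Merge the snapshot back into the origin: a complete rollback,
        # applied on the next activation/reboot if the LV is in use
        lvconvert --merge vg0/root-pre-upgrade
        ```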

        Originally posted by k1e0x View Post
        I did a search for XFS writable snapshots... Google said "Did you mean ZFS writable snapshots?" I think that's telling..

        On FreeBSD it takes one command to add a boot environment you can select from the boot loader. No packages, no config changes. It's writable, cloneable and transportable to another system.
        Basically it's telling that you are tunnel-visioned.

        https://www.projectatomic.io/docs/os-updates/ Yes, ostree from Project Atomic also registers itself in the boot loader after one command.

        That's because the XFS/ostree equivalent is not called snapshots; it's atomic upgrades: https://ostree.readthedocs.io/en/lat...omic-upgrades/. "Cloning boot environments" does not have to equal snapshots.

        Please note these atomic upgrades can function without reflink, just not as disk-efficiently.

        https://ostree.readthedocs.io/en/lat...ting-existing/

        What is different here is that you are creating the snapshots above the file system, using the VFS layer to do the bending. Really, do you want your boot environment fully writable?

        ostree is slower without reflink support but will work anyhow.

        Ostree/Project Atomic is cloneable and transportable to other systems that are not using the same file system type.


        There are three places you can do snapshotting:

        1) Block layer: LVM
        2) VFS layer (mount namespaces under Linux)
        3) File system

        Snapshotting, to be disk-space efficient, just needs a CoW of some form. LVM contains a CoW. XFS contains a CoW via reflink. Of course ZFS and Btrfs contain a CoW.

        k1e0x I guess the concept of VFS-layer snapshotting never crossed your mind.
        Last edited by oiaohm; 29 January 2020, 02:09 AM.



        • Originally posted by oiaohm View Post

          Do note I write Btrfs/LVM not pure Btrfs. LVM can do the perfect roll back snapshots under the file system. [...]

          k1e0x I guess the concept of a VFS layer snapshotting never crossed your mind.
          Wow that sounds like a lot of limitations... well don't worry.. Linux will catch up someday. The technology to do this is only a decade old.. and still not in RHEL.. shame..

          Those github setup guides look fun.. but you know I think I'd just like to do:
          Step 1. Install FreeBSD
          Step 2. You already have atomic upgrades enabled, there is no step two.
          Last edited by k1e0x; 29 January 2020, 02:17 AM.



          • Originally posted by k1e0x View Post
            Those github setup guides look fun.. but you know I think I'd just like to do:
            Step 1. Install FreeBSD
            Step 2. You already have atomic upgrades enabled, there is no step two.
            Swap FreeBSD for RHEL/CentOS Atomic Host or Silverblue. Yes, step 2 stays the same, and you have atomic upgrades enabled out of the box no matter which file system you choose.

            I don't think FreeBSD has atomic upgrades if you choose some file system other than ZFS at install time. Yet with the ones I listed you can choose many different file systems and still have atomic upgrades.

            I don't think ZFS will ever catch up on raw performance.
            Last edited by oiaohm; 29 January 2020, 03:03 AM.



            • Originally posted by oiaohm View Post
              I don't think ZFS will ever catch up on raw performance.
              Hard to do when people cripple it in benchmarks by defeating the ARC. Want to drag race a 4x mirror with an L2ARC cached on an NVMe? Nahh.. I don't trust you after the ageist post. haha



              • Originally posted by k1e0x View Post
                Hard to do when people cripple it in benchmarks by defeating the ARC. Want to drag race a 4x mirror with an L2ARC cached on an NVMe? Nahh.. I don't trust you after the ageist post. haha
                LOL. It's not like it's impossible for XFS to have a 4x mirror with a block cache under it in the next Linux kernel releases. Sorry, you don't have the speed.

                Yes, a block cache on NVMe is also possible, so that does not give you a speed advantage. It's sad to watch these ZFS fans be out of date on benchmarks because they don't want to admit their ass is kicked in raw performance. The feature advantage is not as big as they want to make out either.



                • Originally posted by oiaohm View Post

                  LOL. It's not like it's impossible for XFS to have a 4x mirror with a block cache under it in the next Linux kernel releases. Sorry, you don't have the speed.

                  Yes, a block cache on NVMe is also possible, so that does not give you a speed advantage. It's sad to watch these ZFS fans be out of date on benchmarks because they don't want to admit their ass is kicked in raw performance. The feature advantage is not as big as they want to make out either.
                  ZFS only really loses in single-disk SSD-like scenarios. Single-disk HDDs all behave around the same unless a file system like ZFS or Btrfs+LUKS is using gzip-9, very high encryption levels, or other crazy intensive stuff.

                  Used in mirrors or better, with the ZIL and L2ARC on fast storage, which is what ZFS is primarily designed for, it usually does win in regard to enabled features and speed.



                  • Originally posted by skeevy420 View Post

                    ZFS only really loses in single-disk SSD-like scenarios. Single-disk HDDs all behave around the same unless a file system like ZFS or Btrfs+LUKS is using gzip-9, very high encryption levels, or other crazy intensive stuff.

                    Used in mirrors or better, with the ZIL and L2ARC on fast storage, which is what ZFS is primarily designed for, it usually does win in regard to enabled features and speed.
                    ZFS is really about integrity and ease of storage management. Those are its AAA features and that is why you use it. You use it to minimize the downtime of doing backups, whether taking them or having to restore them in the first place. ZFS send has big advantages over rsync, as it doesn't need to spend two hours or more calculating the delta between two storage pools; it already knows which blocks have changed and what to send. I've heard of cases where people using Bacula had their backup times exceed 24 hours, making daily backups impossible. They solved that with ZFS send.
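                    The incremental-send flow being described is roughly this (pool, dataset and snapshot names are invented):

                    ```shell
                    zfs snapshot tank/data@monday
                    # ...a day of changes...
                    zfs snapshot tank/data@tuesday

                    # Only the blocks changed between the two snapshots cross the wire;
                    # no rsync-style delta scan is ever needed
                    zfs send -i tank/data@monday tank/data@tuesday | ssh backuphost zfs recv tank/backup/data
                    ```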

                    That being said, it's flexible enough to design a storage layout for IOPS and speed depending on your workflow. You can get very competitive performance against any other like product if you design the layout correctly. You can put a ZIL on Optane in FreeBSD, too.



                    • Originally posted by skeevy420 View Post
                      ZFS only really loses in single-disk SSD-like scenarios. Single-disk HDDs all behave around the same unless a file system like ZFS or Btrfs+LUKS is using gzip-9, very high encryption levels, or other crazy intensive stuff.
                      That is not exactly true. ZFS loses to XFS in single-HDD setups as well.

                      Originally posted by skeevy420 View Post
                      Used in mirrors or better, with the ZIL and L2ARC on fast storage, which is what ZFS is primarily designed for, it usually does win in regard to enabled features and speed.
                      This is what I call biased benchmarking: you see ZFS with ZIL and L2ARC, but then they don't give XFS any cache options either, and then claim a win.

                      https://www.redhat.com/en/blog/impro...mance-dm-cache

                      Yes, dm-cache and bcache and other solutions like them really do speed up XFS a lot. Mirrors plus cache options with XFS do normally beat ZFS with ZIL and L2ARC in performance, at least now. There has been a recent change.
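                      The setup in that Red Hat post boils down to something like this (device paths, names and sizes are placeholders; `--cachevol` needs a newer LVM, older releases use a cache pool instead):

                      ```shell
                      # One volume group spanning the slow HDD and the fast NVMe
                      vgcreate vg0 /dev/sdb /dev/nvme0n1
                      lvcreate -n data -L 900G vg0 /dev/sdb        # origin LV on the HDD
                      lvcreate -n fast -L 90G  vg0 /dev/nvme0n1    # cache volume on the NVMe
                      lvconvert --type cache --cachevol fast vg0/data
                      mkfs.xfs /dev/vg0/data                       # XFS on top; the cache is transparent to it
                      ```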

                      Do notice a warm cache under XFS works out to be ~4x faster, which is roughly the same boost you get by enabling L2ARC on ZFS, except you are starting off slower. So ZFS with ZIL and L2ARC does not catch up to XFS with a cache on; in fact the performance gap gets wider, not narrower, in XFS's favour. The only reason ZFS with L2ARC wins over XFS in benchmarks is that those doing the benchmarking are basically not giving XFS a cache.

                      So yes, the thing you complain about with normal file system benchmarks being unfair because they don't allow L2ARC: those attempting to sell ZFS do the reverse, where they don't give XFS or any other file system any of the other caching options.

                      Really, ZFS without cache is not unfair; in fact it gives ZFS a better chance than having to face off against XFS with cache. Basically, for fair competition here, if you have a solid-state drive for cache you should compare all file systems set up to use it that way, and a file system does not need a built-in cache feature to have a block-level cache under it.

                      It's surprising to a lot of people how poorly the L2ARC and ZIL actually perform when you compare them to other cache options. Having data-integrity features does not come free.

                      Originally posted by k1e0x View Post
                      ZFS is really about integrity and ease of storage management. Those are its AAA features and that is why you use it. You use it to minimize the downtime of doing backups, whether taking them or having to restore them in the first place. ZFS send has big advantages over rsync, as it doesn't need to spend two hours or more calculating the delta between two storage pools; it already knows which blocks have changed and what to send. I've heard of cases where people using Bacula had their backup times exceed 24 hours, making daily backups impossible. They solved that with ZFS send.
                      ZFS send is a good feature; there is currently no good replacement for it. But not all workloads need this integrity and replication. PostgreSQL databases with their WAL, for example, don't need ZFS send or file system integrity, so IO performance is more important, and their own backup system provides that stuff.

                      Basically, the ZFS features that affect IO make it not the most suitable option for particular workloads.

                      Originally posted by k1e0x View Post
                      You can get very competitive performance against any other like product if you design the layout correctly. You can put a ZIL on Optane in FreeBSD, too.
                      It's about time you stopped this lie. You can put caching under XFS on Optane as well and see insane performance boosts. If your objective is IOPS, ZFS never wins.

                      Something to wake up to: the iomap change in the Linux kernel is a major one, as it allows the VFS layer to send requests straight to the block layer if the block-mapping information from the file system has already been fetched into the iomap.

                      Why does XFS not have data-block checksumming or compression? Simple: does it make any sense when you are planning to allow the VFS layer to bypass the file system layer? Basically, this model change means the file system driver is only there to process the file system metadata, so in this model compression and checksums belong either in the VFS or the block layer.


