
Fast Dedup Coming To OpenZFS For Overhauling Deduplication Capability


  • Fast Dedup Coming To OpenZFS For Overhauling Deduplication Capability

    Phoronix: Fast Dedup Coming To OpenZFS For Overhauling Deduplication Capability

    The folks at iXsystems and Klara are contributing Fast Dedup support to upstream OpenZFS and beginning to roll out this improved deduplication support within TrueNAS SCALE starting next month...
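ZFS dedup works inline: each written block is keyed by its checksum in a dedup table (DDT), and a duplicate write just bumps a reference count instead of storing the data again; Fast Dedup is an overhaul of how that table is stored and managed. As a rough illustration only (a toy sketch, not OpenZFS code or its actual on-disk format):

```python
import hashlib

# Toy sketch of the dedup-table idea (NOT OpenZFS code): map each
# block's checksum to one stored copy plus a reference count.
class DedupTable:
    def __init__(self):
        self.blocks = {}  # checksum -> (data, refcount)

    def write(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            stored, refs = self.blocks[key]
            self.blocks[key] = (stored, refs + 1)  # duplicate: bump refcount
        else:
            self.blocks[key] = (data, 1)           # new block: store once
        return key

    def unique_bytes(self) -> int:
        # Total bytes actually stored, after deduplication.
        return sum(len(d) for d, _ in self.blocks.values())

ddt = DedupTable()
for block in [b"A" * 128, b"B" * 128, b"A" * 128]:
    ddt.write(block)
# Three 128-byte writes, but only two unique blocks end up stored.
```

The cost of the real thing is that the DDT must be consulted on every write, which is why its size and memory footprint are the classic pain point Fast Dedup targets.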


  • #2
    I'm sure this won't at all lead to possible data loss... I mean, the best way to dedupe is just to remove the files altogether.

    I'm sure they will test this to death, but any rewrite of this kind of code still terrifies me.

    Comment


    • #3
      Having permanently corrupted two ZFS pools (even with a separate SLOG) using the current deduplication implementation, I wouldn't dare use this.

      Comment


      • #4
        I am genuinely interested in the use case for post-hoc deduplication: backups and general containers (immutable trees) can be deduplicated a priori or in software, and media files are rarely bit-identical even for perceptually identical content. Multi-tenant storage systems seem to make sense, but certainly not with encryption, which is probably (hopefully) standard by now...

        Comment


        • #5
          I find the background of their slide template layout distracting.

          Comment


          • #6
            I'd rather see a dedup method based on block cloning than some big dedicated dedup table loaded into memory. Just occasionally run a task that scans for duplicate blocks, similar to some of the methods available for Btrfs.

            Comment


            • #7
              Originally posted by Chugworth View Post
              I'd rather see a dedup method based on block cloning than some big dedicated dedup table loaded into memory. Just occasionally run a task that scans for duplicate blocks, similar to some of the methods available for Btrfs.
              Wait for https://github.com/openzfs/zfs/pull/15393 to get merged; duperemove should then work, with some alterations to make it work without FIEMAP.
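The offline approach described above boils down to hashing fixed-size blocks across files and collecting offsets where the same block appears more than once; tools like duperemove then hand such candidate ranges to the kernel to share the extents. A minimal sketch of the scanning half (hypothetical helper, not duperemove's actual code; block size and hashing choices are assumptions):

```python
import hashlib
from collections import defaultdict

BLOCK = 4096  # assumed scan granularity; real tools make this configurable

def find_duplicate_blocks(paths):
    """Hash each fixed-size block of each file and return only the
    digests that occur more than once, i.e. the dedup candidates."""
    seen = defaultdict(list)  # digest -> [(path, offset), ...]
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while chunk := f.read(BLOCK):
                digest = hashlib.sha256(chunk).hexdigest()
                seen[digest].append((path, offset))
                offset += len(chunk)
    return {d: locs for d, locs in seen.items() if len(locs) > 1}
```

A real tool would pass each candidate group to the filesystem's dedup interface (on Linux, the FIDEDUPERANGE ioctl), which re-verifies the ranges byte-for-byte before sharing them, so hash collisions can't corrupt data.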

              Comment


              • #8
                During the recent data corruption problems with ZFS I discovered that neither Btrfs nor ZFS has overall architectural documents, and "verification" is accomplished via individual developers running ad-hoc scripts over and over until other developers agree the code is okay to release. So I'd be very, very careful incorporating such a massive change into the codebase.

                Don't get me wrong: I greatly appreciate the work of the developers of both systems, and I still run ZFS myself. However, my recommendations that they step back, appoint a lead architect, develop architectural documents, and build real verification systems incorporating targeted and fuzz testing were rebuffed, at times with baffling vitriol. So it appears that all advanced Linux filesystems are pretty much a toss-up as far as reliability is concerned. I hate to say it, but until someone gets serious about organization and testing, that's the unfortunate truth.

                Comment


                • #9
                  Originally posted by muncrief View Post
                  so it appears that all advanced Linux filesystems are pretty much a toss-up as far as reliability is concerned. I hate to say it, but until someone gets serious about organization and testing, that's the unfortunate truth.
                  I hate to say this, but it appears you have no clue. What does ZFS data corruption have in common with Linux and Btrfs? ZFS isn't a Linux filesystem; it comes from unreliable Slowlaris and is used in FreeBSD. There have been data corruption bugs in NTFS and in whatever crap macOS was using, so what's your point? I get it! There are no bugs in the unreleased super zeta filesystem. You can use that instead.
                  Last edited by Volta; 15 February 2024, 06:41 PM.

                  Comment


                  • #10
                    Originally posted by Volta View Post

                    I hate to say this, but it appears you have no clue. What does ZFS data corruption have in common with Linux and Btrfs? ZFS isn't a Linux filesystem; it comes from unreliable Slowlaris and is used in FreeBSD. There have been data corruption bugs in NTFS and in whatever crap macOS was using, so what's your point? I get it! There are no bugs in the unreleased super zeta filesystem. You can use that instead.
                    Of course there will always be bugs in software, Volta. But without organization, architectural documents, and targeted and fuzz testing there will be more of them. I understand that suggesting such things makes some people angry, so I'm happy to simply throw in my two cents for people to accept or discard, and leave it at that.

                    Comment
