
Fast Dedup Coming To OpenZFS For Overhauling Deduplication Capability


  • Fast Dedup Coming To OpenZFS For Overhauling Deduplication Capability

    Phoronix: Fast Dedup Coming To OpenZFS For Overhauling Deduplication Capability

    The folks at iXsystems and Klara are contributing Fast Dedup support to upstream OpenZFS and beginning to roll out this improved deduplication support within TrueNAS SCALE starting next month...
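ZFS dedup works inline: each written block is keyed by its checksum in a dedup table (DDT), and a duplicate write just bumps a reference count instead of storing the data again; Fast Dedup is an overhaul of how that table is stored and managed. As a rough illustration only (a toy sketch, not OpenZFS code or its actual on-disk format):

```python
import hashlib

# Toy sketch of the dedup-table idea (NOT OpenZFS code): map each
# block's checksum to one stored copy plus a reference count.
class DedupTable:
    def __init__(self):
        self.blocks = {}  # checksum -> (data, refcount)

    def write(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        if key in self.blocks:
            stored, refs = self.blocks[key]
            self.blocks[key] = (stored, refs + 1)  # duplicate: bump refcount
        else:
            self.blocks[key] = (data, 1)           # new block: store once
        return key

    def unique_bytes(self) -> int:
        # Total bytes actually stored, after deduplication.
        return sum(len(d) for d, _ in self.blocks.values())

ddt = DedupTable()
for block in [b"A" * 128, b"B" * 128, b"A" * 128]:
    ddt.write(block)
# Three 128-byte writes, but only two unique blocks end up stored.
```

The cost of the real thing is that the DDT must be consulted on every write, which is why its size and memory footprint are the classic pain point Fast Dedup targets.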


  • #2
    I'm sure this won't at all lead to possible data loss... I mean, the best way to dedupe is just to remove the files altogether.

    I'm sure they will test this to death, but any rewrite of this kind of code still terrifies me.

    Comment


    • #3
      Having permanently corrupted two ZFS pools (even with a separate SLOG) using the current deduplication implementation, I wouldn't dare use this.

      Comment


      • #4
        I am genuinely interested in the use case for post-hoc deduplication: backups and general containers (immutable trees) can be deduplicated a priori or in software, and media files are rarely bit-identical even for perceptually identical content. Multi-tenant storage systems seem to make sense, but certainly not with encryption, which is probably (hopefully) standard by now...

        Comment


        • #5
          I find the background of their slide template layout distracting.

          Comment


          • #6
            I'd rather see a dedup method based on block cloning than some big dedicated dedup table loaded into memory. Just occasionally run a task that scans for duplicate blocks, similar to some of the methods available for Btrfs.

            Comment


            • #7
              Originally posted by Chugworth View Post
              I'd rather see a dedup method based on block cloning than some big dedicated dedup table loaded into memory. Just occasionally run a task that scans for duplicate blocks, similar to some of the methods available for Btrfs.
              Wait for https://github.com/openzfs/zfs/pull/15393 to get merged; duperemove should then work, with some alterations to make it work without FIEMAP.
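The offline approach described above boils down to hashing fixed-size blocks across files and collecting offsets where the same block appears more than once; tools like duperemove then hand such candidate ranges to the kernel to share the extents. A minimal sketch of the scanning half (hypothetical helper, not duperemove's actual code; block size and hashing choices are assumptions):

```python
import hashlib
from collections import defaultdict

BLOCK = 4096  # assumed scan granularity; real tools make this configurable

def find_duplicate_blocks(paths):
    """Hash each fixed-size block of each file and return only the
    digests that occur more than once, i.e. the dedup candidates."""
    seen = defaultdict(list)  # digest -> [(path, offset), ...]
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while chunk := f.read(BLOCK):
                digest = hashlib.sha256(chunk).hexdigest()
                seen[digest].append((path, offset))
                offset += len(chunk)
    return {d: locs for d, locs in seen.items() if len(locs) > 1}
```

A real tool would pass each candidate group to the filesystem's dedup interface (on Linux, the FIDEDUPERANGE ioctl), which re-verifies the ranges byte-for-byte before sharing them, so hash collisions can't corrupt data.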

              Comment


              • #8
                During the recent data corruption problems with ZFS I discovered that neither Btrfs nor ZFS has overall architectural documents, and "verification" is accomplished via individual developers running ad-hoc scripts over and over until other developers agree the code is okay to release. So I'd be very, very careful incorporating such a massive change into the codebase.

                Don't get me wrong: I greatly appreciate the work of the developers of both systems, and I still run ZFS myself. However, my recommendations that they step back, appoint a lead architect, develop architectural documents, and build real verification systems incorporating targeted and fuzz testing were rebuffed, at times with baffling vitriol. So it appears that all advanced Linux filesystems are pretty much a toss-up as far as reliability is concerned. I hate to say it, but until someone gets serious about organization and testing, that's the unfortunate truth.

                Comment


                • #9
                  Originally posted by muncrief View Post
                  so it appears that all advanced Linux filesystems are pretty much a toss-up as far as reliability is concerned. I hate to say it, but until someone gets serious about organization and testing, that's the unfortunate truth.
                  I hate to say this, but it appears you have no clue. What does ZFS data corruption have in common with Linux and Btrfs? ZFS isn't a Linux filesystem; it comes from unreliable Slowlaris and is used in FreeBSD. There have been data corruption bugs in NTFS and in whatever crap macOS was using, so what's your point? I get it! There are no bugs in the unreleased super zeta filesystem. You can use that instead.
                  Last edited by Volta; 15 February 2024, 06:41 PM.

                  Comment


                  • #10
                    Originally posted by Volta View Post

                    I hate to say this, but it appears you have no clue. What does ZFS data corruption have in common with Linux and Btrfs? ZFS isn't a Linux filesystem; it comes from unreliable Slowlaris and is used in FreeBSD. There have been data corruption bugs in NTFS and in whatever crap macOS was using, so what's your point? I get it! There are no bugs in the unreleased super zeta filesystem. You can use that instead.
                    Of course there will always be bugs in software, Volta. But without organization, architectural documents, and targeted and fuzz testing there will be more of them. I understand that suggesting such things makes some people angry, so I'm happy to simply throw in my two cents for people to accept or discard, and leave it at that.

                    Comment
