Linux DM-VDO "Virtual Data Optimizer" Preparing To Land In The Upstream Kernel


  • #11
    Originally posted by bezirg View Post

    That kind of sucks. I was hoping that I can get some feature-parity with zfs by using some-journaling-filesystem+dm-vdo+dm-crypt+mdraid but it doesn't look like it's safe enough.
    Use btrfs then. Safety and ZFS in one sentence don't go well together.



    • #12
      Originally posted by Volta View Post

      Use btrfs then. Safety and ZFS in one sentence don't go well together.
      No inline deduplication for btrfs, unfortunately.



      • #13
        That’s really exciting! LVM has supported VDO for a while now, so it should be pretty easy to start using.
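
        If you want to poke at it through LVM, a minimal sketch looks something like this (the volume group name and sizes are just placeholders for illustration):

          # carve a VDO pool out of 1TB of physical space and expose it as a 10TB logical volume
          lvcreate --type vdo --name vdo_lv --size 1T --virtualsize 10T vg_data
          mkfs.xfs -K /dev/vg_data/vdo_lv   # -K skips the slow discard pass at mkfs time on a VDO-backed volume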



        • #14
          Originally posted by Volta View Post

          Use btrfs then. Safety and ZFS in one sentence don't go well together.
          Since when?



          • #15
            Originally posted by bezirg View Post

            No inline deduplication for btrfs, unfortunately.
            Honestly, I prefer "offline" dedup like what you have with btrfs (or even xfs).
            It works very well, and you avoid an entire army of issues.
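
            For reference, a minimal sketch of the offline way on btrfs/xfs (the path and hashfile location are just examples):

              # find duplicate extents under /srv/data and deduplicate them via the kernel's dedupe ioctl, on demand
              duperemove -rdh --hashfile=/var/tmp/dedupe.hash /srv/data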



            • #16
              A dm-vdo target can be backed by up to 256TB of storage
              Today this can be seen as quite tight. Someone with enough money can easily put 40 SSDs in an enclosure and present this to a server as a JBOD and make a software RAID5. Let's say they use 8TB SSDs (yes I know bigger SSDs exist, yet I have not seen any in real use in datacenters), that's already 320 TB of raw capacity.
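
              To put that in command form, a hypothetical layout could look like this (device names are invented, and single-parity across 40 SSDs is questionable, but it shows the scale):

                # 40 x 8TB members in one md RAID5 set: ~312TB usable, already past dm-vdo's 256TB backing limit
                mdadm --create /dev/md0 --level=5 --raid-devices=40 /dev/sd[b-z] /dev/sda[a-o]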
              Last edited by patrakov; 06 March 2024, 08:04 AM.



              • #17
                Originally posted by Anux View Post
                Of course there is much more work to be done before the data actually reaches your block dev. This is only intended for raids with batteries and servers/PCs with UPSs.

                I'm not sure if it respects write barriers and if they could even help here. For normal home use a filesystem that includes all those features might be a more robust solution.
                Write barriers are incredibly important even on a server with a UPS. Without barriers you have no idea whether the data ends up on disk in the correct order when something goes wrong: an unexpected shutdown, a write lost to signalling/PCI/SAS/SATA issues, or whatever else. If barriers aren't supported, I would not use it no matter what hardware I have.
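
                If you want to check what the kernel thinks a stacked device does with flushes, the queue sysfs attributes are a quick sanity check (dm-0 below is just an example device name):

                  cat /sys/block/dm-0/queue/write_cache   # "write back" means a volatile cache, so flushes/barriers actually matter
                  cat /sys/block/dm-0/queue/fua           # 1 means the device advertises FUA (forced unit access) writes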



                • #18
                  Originally posted by bezirg View Post

                  No inline deduplication for btrfs, unfortunately.
                  Inline deduplication isn't as great as you might believe. Compare it with user-space/online deduplication tools, where you can focus the deduplication on data that actually benefits from it.



                  • #19
                    Originally posted by patrakov View Post
                    Today this can be seen as quite tight. Someone with enough money can easily put 40 SSDs in an enclosure and present this to a server as a JBOD and make a software RAID5. Let's say they use 8TB SSDs (yes I know bigger SSDs exist, yet I have not seen any in real use in datacenters), that's already 320 TB of raw capacity.
                    https://access.redhat.com/documentat...l_volumes_on_rhel/lvm-vdo-requirements_deduplicating-and-compressing-logical-volumes-on-rhel#examples-of-vdo-requirements-by-physical-size_lvm-vdo-requirements

                    The 256TB limit on backing storage is there for a reason. It turns out that live de-duplication has quite a bit of CPU overhead, and it gets worse as the size increases.

                    Please note that Red Hat's 69G of RAM (yes, that's the real number, minds out of the gutter) for 256TB of physical storage behind VDO is still better than ZFS's 1 to 20G of RAM per 1TB of storage. The 20G per TB figure is what it takes to keep the ZFS de-duplication tables RAM-backed for high performance.

                    Yes, it would be nice to support more.

                    You start seeing something when you do the RAM/storage maths for VDO.
                    69G/256TB = ~0.27G per TB
                    27G/100TB = ~0.27G per TB
                    14G/50TB = ~0.28G per TB
                    3G/10TB = ~0.30G per TB
                    0.472G/1TB = ~0.47G per TB
                    This is way better than ZFS, and you might think it just keeps getting progressively better the larger the volume gets, up to 256TB. But those were all best-case values.

                    69G/101TB = ~0.68G per TB
                    27G/51TB = ~0.53G per TB
                    14G/11TB = ~1.27G per TB
                    3G/2TB = ~1.5G per TB
                    0.472G/0.01TB = ~47.2G per TB

                    The worst-case values are where you see the problem. Notice that 69G/101TB is worse than 27G/51TB. The next tier above 256TB would likewise show reduced efficiency with VDO until you get far enough past the boundary.

                    Splitting 320TB of capacity in two will need less RAM and CPU per TB of storage to run VDO than extending VDO to cover all 320TB would. The 256TB point where Red Hat stopped with VDO is roughly where the gains from making a de-duplicated, compressed volume larger run out. It is something to take into account: if Red Hat or other developers alter VDO to allow larger volumes, and you are only just into the next tier, there will be zero advantage compared to splitting the array in two.

                    The worst cases at the bottom also say that a stack of small VDO partitions is a really bad idea. You want partition sizes close to the maximum of each tier so VDO gets the best efficiency out of its RAM and CPU usage.

                    The VDO design has some interesting limitations.

                    There is something else to consider with BTRFS/XFS out-of-band de-duplication like duperemove. Say you have a spike in load and need the RAM: a tool like duperemove can simply be killed for the duration of the high load, freeing that memory. With something like VDO or ZFS online de-duplication, once it's on you are stuck with the memory usage come hell or high water.

                    Really, I would love to see a better middle ground between online de-duplication and out-of-band de-duplication: an online file-system de-duplication that you can start up and shut down on demand.
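
                    The closest you can get today is probably running the out-of-band tool as something you can start and kill at will. A rough sketch (the unit name, weights and paths are made-up examples):

                      # run duperemove in a throttled transient unit; the hashfile lets a later run resume cheaply
                      systemd-run --unit=dedupe -p CPUWeight=10 -p IOWeight=10 \
                          duperemove -rdh --hashfile=/var/tmp/dedupe.hash /srv/data
                      # load spike hits: kill it outright and get the RAM back straight away
                      systemctl stop dedupe
                      # quiet again: repeat the systemd-run line and it carries on from the hashfile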
                    Last edited by oiaohm; 06 March 2024, 11:35 AM.



                    • #20
                      Originally posted by S.Pam View Post

                      Inline deduplication isn't as great as you might believe. Compare it with user-space/online deduplication tools, where you can focus the deduplication on data that actually benefits from it.
                      Inline deduplication doesn't waste write cycles on your SSD's flash. Otherwise you first write the duplicate data to flash and then remove it again afterwards (wasted SSD lifespan).
