DM VDO "Virtual Data Optimizer" Merged For Linux 6.9

  • DM VDO "Virtual Data Optimizer" Merged For Linux 6.9

    Phoronix: DM VDO "Virtual Data Optimizer" Merged For Linux 6.9

    As a follow-up to the article earlier this month around DeviceMapper's Virtual Data Optimizer (VDO) preparing to be upstreamed, it was successfully merged today by Linus Torvalds as the newest shiny feature of Linux 6.9...


  • #2
    What the hell is this?



    So let me get this straight. You could point VDO at your NVMe's block device, it spits out a virtual VDO block device which you format with anything, and then any data you shove into that device gets deduped and compressed regardless of the filesystem you use? Neat. The compression used is HIOPS (c) (TM) something something marketing nonsense. Whatever the compression algorithm actually is, it packs 4KiB blocks into 4KiB chunks: "Up to 14 blocks can be packed into a single 4KB segment".

    Permabit Technology Corporation has announced the release of HIOPS compression, the first inline data…


    Would be interesting to see how the dedupe and compression compare against btrfs and bcachefs.
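
    For anyone who wants to kick the tires, here's a minimal sketch of the LVM route; the volume group name vg0 and the sizes are placeholder assumptions, see man lvmvdo for the real details:

    Code:
    # Create a VDO pool and a logical volume on top of it
    # (10T of physical space presented as a 30T virtual device)
    lvcreate --type vdo --name vdo0 --size 10T --virtualsize 30T vg0

    # Format and mount it like any other block device
    mkfs.ext4 /dev/vg0/vdo0
    mount /dev/vg0/vdo0 /mnt/vdo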



    • #3
      NICE! Nice nice nice.

      Looking forward to trying this without tainting my kernel.



      • #4
        Code:
        man lvmvdo
        It already has great user-space management.
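
        A few of the knobs it exposes, as a hedged sketch (the VG/pool names below are placeholders; the exact commands are spelled out in that man page):

        Code:
        # Toggle compression and deduplication on an existing VDO pool
        lvchange --compression n vg0/vdopool0
        lvchange --deduplication y vg0/vdopool0

        # Show space savings reported by the VDO layer
        vdostats --human-readable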



        • #5
          The more I read, the less I understand. Can someone explain for a layman, in two sentences (you can make them longer with commas), what this is? I mean, what application does it serve? A new compression algorithm supported by the kernel that can be applied to archives or filesystems, for example? I have no idea what this is.



          • #6
            Originally posted by byteabit View Post
            The more I read, the less I understand. Can someone explain for a layman, in two sentences (you can make them longer with commas), what this is? I mean, what application does it serve? A new compression algorithm supported by the kernel that can be applied to archives or filesystems, for example? I have no idea what this is.
            It does mainly two things:

            * compresses your data on the fly before it goes to the disk (saving space, and increasing speed if your disk is slow and your CPU is fast)
            * deduplicates data on the fly: if you have many files sharing the same content (or part of the same content), that content is written just once to your disk (again saving space and time if the disk is slow and the CPU is fast)

            All of this happens completely transparently, i.e. independently of the filesystem and the applications you use; a quick illustration of the dedup effect is sketched below.
            Last edited by cynic; 13 March 2024, 04:44 PM.
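
            A rough illustration of the dedup side, assuming a VDO-backed volume is already mounted at /mnt/vdo (the path and sizes are made up for the example):

            Code:
            # Write the same 1 GiB of data twice under different names
            dd if=/dev/urandom of=/mnt/vdo/copy1.bin bs=1M count=1024
            cp /mnt/vdo/copy1.bin /mnt/vdo/copy2.bin
            sync

            # The filesystem sees ~2 GiB used, but the VDO layer should report
            # roughly 1 GiB of physical space consumed (plus metadata)
            df -h /mnt/vdo
            vdostats --human-readable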



            • #7
              Originally posted by byteabit View Post
              The more I read, the less I understand. Can someone explain for a layman, in two sentences (you can make them longer with commas), what this is? I mean, what application does it serve? A new compression algorithm supported by the kernel that can be applied to archives or filesystems, for example? I have no idea what this is.
              Device Mapper sits between the block devices and the file systems and can offer advanced features to all file systems instead of implementing them per file system. In this instance it's LZ4 compression and deduplication.

              As an OpenZFS user I have mixed feelings about this. On the one hand, deduplicating and LZ4ing all the data is usually a great thing, and deduplication is a known ZFS weakness; on the other hand, I've never been a fan of using DM for one thing, LVM for another, and different file systems for other things when ZFS can do all of that in one package with a hell of a lot more flexibility.

              Does anyone know how this handles double compression, i.e. the cases where VDO is doing LZ4 and then Btrfs does zstd-3 on top? It knows not to compress twice, right? Will it use LZ4 the way OpenZFS uses it for early abort in those cases?
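
              One workaround, assuming the two layers don't coordinate (the pool and device names below are placeholders, not from this thread), is to disable compression at one of them:

              Code:
              # Either let the filesystem compress and switch it off in the VDO pool...
              lvchange --compression n vg0/vdopool0

              # ...or keep VDO's LZ4 and mount Btrfs without its own compression
              mount -o compress=no /dev/vg0/vdo0 /mnt/data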



              • #8
                So I was reading the man pages for this, and I stand corrected on the deduplication part I got excited about:

                The deduplication index requires additional memory which scales with the size of the deduplication window. For dense indexes, the index requires 1GB of RAM per 1 TB of window.
                That's not any better than ZFS:

                Deduplicating data is a very resource-intensive operation. It is generally recommended that you have at least 1.25 GiB of RAM per 1 TiB of storage when you enable deduplication. Calculating the exact requirement depends heavily on the type of data stored in the pool.
                That's from the ZFS man page.
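
                Back-of-the-envelope numbers for a 16 TB dedup window versus a 16 TiB pool (glossing over the TB/TiB difference), just to put the two figures side by side:

                Code:
                # VDO dense index: 1 GB of RAM per 1 TB of window
                echo "16 * 1.00" | bc   # prints 16.00 (GB)
                # ZFS guideline: 1.25 GiB of RAM per 1 TiB of storage
                echo "16 * 1.25" | bc   # prints 20.00 (GiB)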



                • #9
                  Originally posted by cynic View Post
                  * compress your data on the fly [...]
                  * deduplicate data on the fly [...]
                  all of this completely transparently. [...]
                  Ah, that's much clearer! So this is similar to what Windows offered with filesystem-level compression for NTFS, going back to Windows XP (it's been a while...). So I know the concept at a high level. That's neat. If you say it's filesystem independent, do you mean it's independent of whether it's ext4 or ZFS or any other filesystem? How would this even work?

                  One concern I would have with this is whether the data ends up dependent on the kernel itself, maybe even the kernel version.



                  • #10
                    Originally posted by byteabit View Post

                    Ah, that's much clearer! So this is similar to what Windows offered with filesystem-level compression for NTFS, going back to Windows XP (it's been a while...). So I know the concept at a high level. That's neat. If you say it's filesystem independent, do you mean it's independent of whether it's ext4 or ZFS or any other filesystem? How would this even work?

                    One concern I would have with this is whether the data ends up dependent on the kernel itself, maybe even the kernel version.
                    The Device Mapper layer sits between the disk layer and the file system layer. You create a DM volume and then put a regular file system on it. That regular file system can be ZFS, ext4, or anything else. It's like using an LVM or LUKS volume: the file system does its thing, but before data goes to the disk it passes through DM, which does encryption, compression, deduplication, etc. before finally writing to the disk.

                    A picture is worth a thousand words:
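
                    A rough command-line sketch of that kind of stack (the device names, sizes, and mount points below are placeholder assumptions):

                    Code:
                    # Raw disk -> LUKS (encryption) -> LVM VDO (compression + dedup) -> ext4
                    cryptsetup luksFormat /dev/nvme0n1
                    cryptsetup open /dev/nvme0n1 secure

                    vgcreate vg0 /dev/mapper/secure
                    lvcreate --type vdo --name vdo0 --size 800G --virtualsize 2T vg0

                    mkfs.ext4 /dev/vg0/vdo0
                    mount /dev/vg0/vdo0 /mnt/data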
