DM VDO "Virtual Data Optimizer" Merged For Linux 6.9


  • #21
    Originally posted by ATLief View Post

    I've thought about making a guide like that a few times, but the "right" order or combination entirely depends on the situation. Even things that are almost always a mistake (like putting DM-Cache above DM-Crypt) have situations where they make sense. Projects like ZFS and Stratis have to make the same choices, but they do it by (presumably) picking values that work well in most situations and only exposing a subset of those choices to the end-user. That of course makes it easier to use and less prone to error, but also creates situations where those technologies don't work as well as they could.

    I'm definitely not an expert, but I'd be happy to give some recommendations about a particular use-case if you're interested. I'm also going to reply to your previous comment in a bit.
    That would be super cool if you could. Honestly I think just 2 or 3 examples would cover 80%+ of typical cases.

    RAID + Integrity + Encryption + Snapshots would be a good baseline to start with.

    Compression, volume management, and ideally tiering would round things out nicely. Or, in more concrete terms, say someone wanted to replicate the RAID 10 setup they had with ZFS that used encryption and compression (with ZFS they basically get the integrity / volume management / snapshot capabilities by default).

    The other important thing would be what voodoo you need to perform when a disk dies and you need to replace it, i.e. example commands to work through the layers, replace the disk, and get things back out of a degraded state and into good shape.



    • #22

      Originally posted by pWe00Iri3e7Z9lHOX2Qx View Post

      Can you share what good tools exist for managing the soup of potential layers besides Stratis (which AFAIK doesn't actually support a bunch of them)?
      Just to make sure everyone's on the same page: Device-Mapper "layers" need to be activated after each reboot in the same way that filesystems need to be mounted. The management tools just modify the persistent metadata which is later used by the software that activates the layers. DM layers can be nested by formatting the block device corresponding to a previously-activated layer as the backing device for a not-yet-activated layer. Most backing device types are activated automatically, but DM-Crypt backing devices (with or without DM-Integrity functionality) need to be added to /etc/crypttab in the same way that filesystems need to be added to /etc/fstab.
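      For example, this is roughly what those two files end up looking like (a minimal illustrative sketch; the mapping name, UUID placeholder, and mount point are all made up):

        # /etc/crypttab - activate the DM-Crypt layer as /dev/mapper/cryptdata at boot
        cryptdata   UUID=<uuid-of-the-LUKS-device>   none   luks
        # /etc/fstab - mount the filesystem that lives on top of the activated layer
        /dev/mapper/cryptdata   /data   ext4   defaults   0   2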

      The CLI utility for LVM (`lvm`) is great and can handle most of the backing device management; basically everything except for encryption and non-RAID integrity. There are a few obscure features that can only be set in its config files (like DM-Zero and DM-Error segments), but otherwise everything has a nice command with clear documentation. It supports all of the VDO stuff, by the way.
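      For instance, creating a deduplicated/compressed VDO LV is just another `lvcreate` call (a rough sketch with made-up VG/LV names; `man lvmvdo` has the real details):

        # 1 TiB of physical pool space exposing a 3 TiB virtual LV on volume group vg0
        lvcreate --type vdo -n vdo_lv -L 1T -V 3T vg0/vdopool
        # the resulting LV is a normal block device and can be formatted as usual
        mkfs.ext4 /dev/vg0/vdo_lv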

      Although LVM has native support for DM-Cache, there's a bug in the kernel that clears the read cache every time you reboot if your root filesystem is on the cached backing device. In that case, use `bcache` (different than BCacheFS) for tiered storage. The interface is different than LVM, but you can modify the same parameters.
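      A rough sketch of the `bcache` route (device paths are placeholders; `make-bcache` comes from bcache-tools):

        # format the slow device as the backing device and the SSD partition as the cache
        make-bcache -B /dev/sdb -C /dev/nvme0n1p2
        # if the two weren't created in one go, attach the cache set to the backing device
        echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach
        # the tiered device then shows up as /dev/bcache0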

      The CLI utility `cryptsetup` is great for managing DM-Crypt backing devices and can optionally include DM-Integrity functionality. There's a separate CLI utility called `integritysetup` that can manage DM-Integrity backing devices without DM-Crypt (and can configure a few more options than `cryptsetup`), but it's currently a bit broken.
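      For example (device paths are made up), LUKS2 can stack DM-Integrity underneath DM-Crypt in a single step, and `integritysetup` handles the standalone case:

        # authenticated encryption: DM-Crypt with a DM-Integrity layer underneath
        cryptsetup luksFormat --type luks2 --integrity hmac-sha256 /dev/vg0/secure_lv
        cryptsetup open /dev/vg0/secure_lv secure
        # standalone DM-Integrity, no encryption
        integritysetup format /dev/vg0/plain_lv
        integritysetup open /dev/vg0/plain_lv plain_integrity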

      TL;DR: Use LVM for everything except encryption and integrity, which you should instead manage with `cryptsetup`. Due to a bug in the kernel, you should use `bcache` (different than BCacheFS) instead of DM-Cache if your root filesystem will be within the cached backing device.
      Last edited by ATLief; 14 March 2024, 11:38 PM. Reason: Added note about DM-Cache bug and BCache alternative



      • #23
        Unless I'm missing something, the situation is a little more complex. If you go into this expecting to get more space out of your device, as you would with a filesystem that supports compression natively, I really doubt that's the case. Most filesystems simply cannot handle a variable-sized block device, which is what DM would have to give you in conjunction with compression/dedup *if* you're thinking about saving space in that manner.

        Instead, this seems a bit like thin LV provisioning, where some space *may* be saved in the underlying pool of blocks, yet not actually for ordinary files. You're still creating a fixed-size filesystem, and you can't really go overboard with it, because the pool underneath could fill up before the filesystem does. I'm not sure how gracefully such errors can be handled, but I'm thinking compression/dedup may make it all too easy to run into them. A more sensible use case seems to be along the same lines as thin provisioning.
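        As far as keeping an eye on that goes, about the only thing you can do is watch how full the physical pool really is, independent of what the filesystem on top believes (a sketch with a made-up VG name):

          # data_percent reports how full the VDO/thin pool actually is
          lvs -o lv_name,lv_size,data_percent vg0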

        Or, in other words, we still have uses for filesystem-level compression. On the other hand, compression/dedup at this level *may* improve throughput regardless of whether it saves space, which is another sensible reason to use it.



        • #24

          Originally posted by pWe00Iri3e7Z9lHOX2Qx View Post

          That would be super cool if you could. Honestly I think just 2 or 3 examples would cover 80%+ of typical cases.

          RAID + Integrity + Encryption + Snapshots would be a good baseline to start with.

           Compression, volume management, and ideally tiering would round things out nicely. Or, in more concrete terms, say someone wanted to replicate the RAID 10 setup they had with ZFS that used encryption and compression (with ZFS they basically get the integrity / volume management / snapshot capabilities by default).

           The other important thing would be what voodoo you need to perform when a disk dies and you need to replace it, i.e. example commands to work through the layers, replace the disk, and get things back out of a degraded state and into good shape.
          Sorry for the delay. I just want to reiterate that there are tons of ways to accomplish each goal, but this setup will probably work well for most people:

          DM_Layers_Diagram.jpg
          Please ask me before posting this image or advice anywhere else.
          • Each block represents a block device, and each block depends on the blocks under it
          • The lines coming out of the top are what you need to format the block devices as; if a block doesn't have such a line, it doesn't need to be formatted
          • The brackets on the left show which LVM volume group the devices should belong to (at least 2)
          • The brackets on the right show where you need to list those devices in order to automatically activate/mount them; if a block doesn't have such a bracket, it's activated automatically
          • You can of course remove any layer if you don't want that feature (for example, tiered storage)
          • If your root filesystem is going to be on layer 6/7, you'll need to boot with initramfs and use BCache instead of DM-Cache
          • If you want LVM to handle per-member integrity for you (layer 1) you'll need at least 1 redundant RAID member
          • If you want a self-healing RAID, use layer 1; note that you'll need additional integrity-checking elsewhere to protect yourself from an Evil Maid Attack
          • Instead of using layer 4, I'd highly recommend using filesystem-level checksumming; note that you'll need layer 4 for true authenticated encryption
          • Instead of creating snapshots in layer 6 (with LVM) I'd highly recommend using filesystem-level snapshots; LVM snapshots can still be useful for other things, but are extremely inefficient
          • Use filesystem-level compression if you want that
          Regarding the commands, I'd recommend looking at `man lvmraid` and `man lvmcache`. I'm definitely not saying "rtfm", but their documentation is really good.
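          To make that a little more concrete, here's a minimal sketch of the core layers (RAID with per-member integrity, LUKS2 on top, then a second VG for volume management); all disk, VG, and LV names are made up, and the cache/VDO layers are left out:

            # RAID 10 LV with per-member DM-Integrity (layer 1)
            pvcreate /dev/sda /dev/sdb /dev/sdc /dev/sdd
            vgcreate vg_raid /dev/sda /dev/sdb /dev/sdc /dev/sdd
            lvcreate --type raid10 -i 2 -m 1 --raidintegrity y -L 1T -n raid_lv vg_raid
            # LUKS2 on top of the RAID LV (add --integrity hmac-sha256 if you want layer 4)
            cryptsetup luksFormat --type luks2 /dev/vg_raid/raid_lv
            cryptsetup open /dev/vg_raid/raid_lv crypt_lv      # and list it in /etc/crypttab
            # second VG on the decrypted device so LVM can manage volumes above the encryption
            pvcreate /dev/mapper/crypt_lv
            vgcreate vg_crypt /dev/mapper/crypt_lv
            lvcreate -L 500G -n data_lv vg_crypt
            mkfs.btrfs /dev/vg_crypt/data_lv                   # and list it in /etc/fstab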
          Last edited by ATLief; 21 March 2024, 08:57 PM.



          • #25
            Originally posted by ATLief View Post


            Sorry for the delay. I just want to reiterate that there are tons of ways to accomplish each goal, but this setup will probably work well for most people:

            DM_Layers_Diagram.jpg
            Please ask me before posting this image or advice anywhere else.
            • Each block represents a block device, and each block depends on the blocks under it
            • The lines coming out of the top are what you need to format the block devices as; if a block doesn't have such a line, it doesn't need to be formatted
            • The brackets on the left show which LVM volume group the devices should belong to (at least 2)
            • The brackets on the right show where you need to list those devices in order to automatically activate/mount them; if a block doesn't have such a bracket, it's activated automatically
            • You can of course remove any layer if you don't want that feature (for example, tiered storage)
            • If your root filesystem is going to be on layer 6/7, you'll need to boot with initramfs and use BCache instead of DM-Cache
            • If you want LVM to handle per-member integrity for you (layer 1) you'll need at least 1 redundant RAID member
            • If you want a self-healing RAID, use layer 1; note that you'll need additional integrity-checking elsewhere to protect yourself from an Evil Maid Attack
            • Instead of using layer 4, I'd highly recommend using filesystem-level checksumming; note that you'll need layer 4 for true authenticated encryption
            • Instead of creating snapshots in layer 6 (with LVM) I'd highly recommend using filesystem-level snapshots; LVM snapshots can still be useful for other things, but are extremely inefficient
            • Use filesystem-level compression if you want that
            Regarding the commands, I'd recommend looking at `man lvmraid` and `man lvmcache`. I'm definitely not saying "rtfm", but their documentation is really good.
            Super cool, thanks for taking the time to document that!

