OpenZFS Could Soon See Much Better Deduplication Support

  • #11
    Originally posted by dfyt
    I just want persistent cache.
    What's that? And why do you need it?



    • #12
      Originally posted by discordian
      I am always getting mixed answers about how much RAM OpenZFS needs (from at least 8 GB to not much unless you use specific features). Have you run it on some 1-2 GB systems?
      My zfs systems are very peculiar, because they run on Optane drives and a 512/4K recordsize/sectorsize: http://www.linuxsystems.it/2018/05/o...t4-benchmarks/

      With those settings ram usage with deduplication would be WAAAAAAY higher and even compression is less effective.
      ## VGA ##
      AMD: X1950XTX, HD3870, HD5870
      Intel: GMA45, HD3000 (Core i5 2500K)



      • #13
        Originally posted by darkbasic

        My zfs systems are very peculiar, because they run on Optane drives and a 512/4K recordsize/sectorsize: http://www.linuxsystems.it/2018/05/o...t4-benchmarks/

        With those settings ram usage with deduplication would be WAAAAAAY higher and even compression is less effective.
        In the general case, a 32k record size is good because it gives the compression codec some context to get a good ratio.

        For dedupe, larger record size = lower RAM usage, until the record size approaches the median file size.

        For massive multi-hosted VPN, 4k record size is the best. If you're expanding 32k and using 4k of it, that's pretty hard on both the pagecache and CPU cache. Throwing away a bit of storage capacity so that you don't tank when things get difficult is a good tradeoff.
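
To put rough numbers on the record size tradeoff, here is a minimal Python sketch. It assumes one dedup table entry per record (worst case, nothing deduplicated) and that reading 4k out of a compressed record means expanding the whole record. The function names are made up for illustration and nothing here talks to ZFS itself.

```python
def ddt_entries(pool_bytes: int, recordsize: int) -> int:
    """Worst-case dedup table entry count: one entry per record in the pool."""
    return pool_bytes // recordsize


def read_amplification(recordsize: int, read_size: int) -> float:
    """How many bytes get expanded per byte actually requested.

    Assumes a read smaller than the record still expands the whole record.
    """
    return max(recordsize, read_size) / read_size


if __name__ == "__main__":
    one_tib = 2**40
    for rs in (4 * 1024, 32 * 1024, 128 * 1024):
        print(
            f"recordsize={rs // 1024}k "
            f"entries={ddt_entries(one_tib, rs)} "
            f"amplification for 4k reads={read_amplification(rs, 4096)}"
        )
```

Going from 4k to 32k records cuts the entry count (and so the table's RAM footprint) by 8x, but every 4k read then drags 8x the data through the caches, which is the tradeoff described above.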



        • #14
          Originally posted by linuxgeex

          In the general case, a 32k record size is good because it gives the compression codec some context to get a good ratio.

          For dedupe, larger record size = lower RAM usage, until the record size approaches the median file size.

          For massive multi-hosted VPN, 4k record size is the best. If you're expanding 32k and using 4k of it, that's pretty hard on both the pagecache and CPU cache. Throwing away a bit of storage capacity so that you don't tank when things get difficult is a good tradeoff.
          Those are exactly my findings. In fact I ended up using 4K recordsize for VMs and 32K for the rest of the system.



          • #15
            Originally posted by timofonic

            What's that? And why do you need it?
            With ZFS you can allocate an SSD as a cache device for your storage, like a "hybrid" drive. Sadly, on Linux thus far you lose that cache when you reboot. On the BSDs it persists between reboots.



            • #16
              Originally posted by darkbasic

              Those are exactly my findings. In fact I ended up using 4K recordsize for VMs and 32K for the rest of the system.
              Ah I read your article now and I see you're using a 480G Optane drive.

              480G / 4k blocks * 320 bytes per block ≈ 38G of content-addressable hash table entries, so you're losing about 8% of the storage volume space to the dedupe metadata, worst case. But if you have 10 instances and 30% of those 10 instances is duplicated content, then you're probably winning, because you'll get back 27% of the instance storage and save 27% of the pagecache memory per instance too, KSM notwithstanding. And although you can't keep the dedupe table in memory, Optane is fast, so the lookups won't kill you.
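
The arithmetic here can be checked with a short Python sketch. It takes the 320 bytes per block figure from this post at face value, treats "480G" as decimal gigabytes and "4k" as 4096 bytes, and assumes the duplicated content is identical across all instances so exactly one copy is kept. The function names are hypothetical helpers, not anything from ZFS.

```python
def ddt_overhead_bytes(volume_bytes: int, block_size: int, entry_bytes: int = 320) -> int:
    """Worst-case dedup table size: one entry per block on the volume."""
    return (volume_bytes // block_size) * entry_bytes


def dedup_saving_fraction(instances: int, duplicated_fraction: float) -> float:
    """Fraction of total instance storage reclaimed by dedup.

    One copy of the shared content stays; the other (instances - 1) copies go away.
    """
    return duplicated_fraction * (instances - 1) / instances


if __name__ == "__main__":
    volume = 480 * 10**9  # assumed: decimal gigabytes
    overhead = ddt_overhead_bytes(volume, 4096)
    print(f"DDT worst case: {overhead / 10**9:.1f} GB ({overhead / volume:.1%} of the volume)")
    print(f"Storage reclaimed: {dedup_saving_fraction(10, 0.30):.0%}")
```

Under those assumptions the table comes out at 37.5 GB, a bit under 8% of the volume, and the 10 instance case reclaims 30% * 9/10 = 27%, which matches the rounded figures in the post.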
              Last edited by linuxgeex; 09-20-2019, 09:37 AM.



              • #17
                Well that's interesting. I like ZFS, but the way it currently handles deduplication is one of my biggest complaints about it. That, and no reflink support, but those are nearly the same issue.



                • #18
                  Originally posted by discordian
                  I am always getting mixed answers about how much RAM OpenZFS needs (from at least 8 GB to not much unless you use specific features). Have you run it on some 1-2 GB systems?
                  For nearly a year I ran FreeBSD with a ZFS root on a virtual cloud instance that had 512MB of memory (yes, megabytes). I even added a block storage device into the pool to provide additional space. Sure the system wasn't the best performer, but there were no reliability issues. I eventually stopped using that system, but it would have kept running fine if I had left it.



                  • #19
                    For kicks and giggles I compiled zfs 0.8.1 to run on my Netgear R7000 router with 256MB of RAM. It actually runs. I was able to read off my 4TB USB backup drive through samba, also running on the same device. I could look at raws and stream video (with the occasional hiccup, depending on bitrate). It could work in a pinch if I needed some backup data.



                    • #20
                      Originally posted by discordian
                      I am always getting mixed answers about how much RAM OpenZFS needs (from at least 8 GB to not much unless you use specific features). Have you run it on some 1-2 GB systems?
                      Simple answer (excluding dedup):
                      No more than any other filesystem.

                      The reason this is confusing is that ZFS relies heavily on its ARC cache to do a lot of its magic. That cache is only used for performance, so on systems with limited RAM, ZFS's performance will suck. Slower than most other filesystems, at least. Another reason for the confusion is that Sun listed large memory requirements in its original documentation (presumably for performance and dedup reasons, and because it was targeted at enterprise). That documentation is still out there, but it relates to Oracle's closed-source fork, not the open one, so please avoid it. Search for "OpenZFS" or "FreeBSD ZFS" instead and the results will be more in line with what you are using.
                      Last edited by k1e0x; 09-20-2019, 10:25 PM.
