OpenZFS Could Soon See Much Better Deduplication Support

  • Bubbles_by_day
    replied
    "Performant" is not an adjective. It's a noun meaning a performer in a play.

    Yes, language constantly evolves, new words are added and others fade away every day. And all words are just "made up". Usually when a word is "officially recognized", it has already been in use for quite a while. One dictionary has even added "performant". I get it. I'm usually the first to point all that out when someone says "that's not a word".

    But this word in particular drives me absolutely ape-sh*t like no other word in spoken or written language. Mainly because it became popular among wanna-be-nerd corporate brownnosing ladder-climber types with no actual human talent other than thinking they sound sophisticated. Then it trickled down into an epidemic of stupidity. (Yes, I get it, nowadays not all uses are stupid.)

    For the love of all that is holy, stop using that stupid f***ing word.


  • rleigh
    replied
    Originally posted by timofonic:
    Can ZFS be used as Swap too?
    Yes. Create a ZVOL and then run mkswap on it. Works perfectly.
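
    A minimal sketch of that setup (the pool name "rpool" and the 4G size are just examples; the property choices follow common OpenZFS guidance for swap zvols and may need tuning for your system):

    # Create a 4G zvol for swap; block size matches the system page size.
    zfs create -V 4G -b $(getconf PAGESIZE) \
        -o compression=zle \
        -o sync=always \
        -o primarycache=metadata \
        rpool/swap

    # Format and enable it as swap.
    mkswap -f /dev/zvol/rpool/swap
    swapon /dev/zvol/rpool/swap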


  • timofonic
    replied
    Originally posted by dfyt:

    With ZFS you can allocate an SSD as a cache device for your storage, like a "hybrid" drive. Sadly, on Linux thus far you lose that cache when you reboot; on BSDs it persists between reboots.
    I see.

    Can ZFS be used as Swap too? What about persistent computing, never needing to close apps unless they misbehave?


  • k1e0x
    replied
    Originally posted by discordian:
    I am always getting mixed answers about how much RAM OpenZFS needs (from at least 8 GB to not much unless you use specific features). Have you run it on some 1-2 GB systems?
    Simple answer (excluding dedup):
    No more than any other filesystem.

    The reason this is confusing is that ZFS relies heavily on its ARC cache to do a lot of its magic. That cache is only used for performance, so on systems with limited RAM, ZFS's performance will suck. Slower than most other filesystems, at least. Another source of confusion is that Sun listed large memory requirements in its original documentation (presumably for performance and dedup reasons, and because it was targeted at the enterprise). That documentation is still out there, but it relates to Oracle's closed-source forked version, not the open one, so please avoid it. Search for "OpenZFS" or "FreeBSD ZFS" instead and the guidance is more in line with what you are actually using.
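
    If RAM is really tight, you can also just cap the ARC. A minimal sketch for ZFS on Linux (the 512 MiB figure below is only an example value, not a recommendation):

    # Cap the ARC at 512 MiB until the next reboot (value is in bytes).
    echo 536870912 > /sys/module/zfs/parameters/zfs_arc_max

    # Make the cap persistent across reboots via a module option.
    echo "options zfs zfs_arc_max=536870912" > /etc/modprobe.d/zfs.conf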
    Last edited by k1e0x; 20 September 2019, 10:25 PM.


  • lancethepants
    replied
    For kicks and giggles I compiled ZFS 0.8.1 to run on my Netgear R7000 router with 256 MB of RAM. It actually runs. I was able to read off my 4 TB USB backup drive through Samba, also running on the same device. I could look at RAWs and stream video (occasional hiccup, depending on bitrate). It could work in a pinch if I needed some backup data.


  • Chugworth
    replied
    Originally posted by discordian:
    I am always getting mixed answers about how much RAM OpenZFS needs (from at least 8 GB to not much unless you use specific features). Have you run it on some 1-2 GB systems?
    For nearly a year I ran FreeBSD with a ZFS root on a virtual cloud instance that had 512 MB of memory (yes, megabytes). I even added a block storage device into the pool to provide additional space. Sure, the system wasn't the best performer, but there were no reliability issues. I eventually stopped using that system, but it would have kept running fine if I had left it.


  • Chugworth
    replied
    Well that's interesting. I like ZFS, but the way it currently handles deduplication is one of my biggest complaints about it. That, and no reflink support, but those are nearly the same issue.


  • linuxgeex
    replied
    Originally posted by darkbasic:

    Those are exactly my findings. In fact, I ended up using a 4K recordsize for VMs and 32K for the rest of the system.
    Ah, I read your article now, and I see you're using a 480 GB Optane drive.

    480 GB / 4 KiB blocks × 320 bytes per block ≈ 38 GB of content-addressable hash table entries, so you're losing about 8% of the storage volume's space to the dedupe metadata, worst case. But if you have 10 instances and 30% of those 10 instances is duplicated content, then you're probably winning, because you'll get back 27% of the instance storage and save 27% of the pagecache memory per instance too, KSM notwithstanding. And although you can't keep the dedupe table in memory, Optane is fast, so the lookups won't kill you.
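
    If anyone wants to check the real numbers on their own pool rather than estimating, zdb can dump the dedup table statistics (the pool name "tank" is just a placeholder):

    # Print DDT statistics: entry counts plus average on-disk and in-core
    # bytes per entry, which is where estimates like ~320 bytes come from.
    zdb -DD tank

    # Back-of-envelope version of the math above:
    # 480 GB / 4 KiB records ≈ 120M entries; 120M * 320 B ≈ 38 GB of DDT.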
    Last edited by linuxgeex; 20 September 2019, 09:37 AM.


  • dfyt
    replied
    Originally posted by timofonic:

    What's that? And why do you need it?
    With ZFS you can allocate an SSD as a cache device for your storage, like a "hybrid" drive. Sadly, on Linux thus far you lose that cache when you reboot; on BSDs it persists between reboots.
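
    For reference, attaching an SSD as an L2ARC read cache is a one-liner (the pool name "tank" and the device path are placeholders):

    # Add an SSD (or a partition of one) to the pool as a cache device.
    zpool add tank cache /dev/disk/by-id/nvme-EXAMPLE-part1

    # Cache devices can be removed again at any time.
    zpool remove tank /dev/disk/by-id/nvme-EXAMPLE-part1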


  • darkbasic
    replied
    Originally posted by linuxgeex:

    In the general case, a 32k record size is good because it gives the codec some context to get a good ratio.

    For dedupe, a larger record size means lower RAM usage, until the record size approaches the median file size.

    For massive multi-hosted VPN use, a 4k record size is best. If you're expanding 32k and using only 4k of it, that's pretty hard on both the pagecache and the CPU cache. Throwing away a bit of storage capacity so that you don't tank when things get difficult is a good tradeoff.
    Those are exactly my findings. In fact, I ended up using a 4K recordsize for VMs and 32K for the rest of the system.
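
    For anyone replicating this, recordsize is just a per-dataset property (the dataset names here are examples):

    # Smaller records for VM images, larger for general data.
    zfs set recordsize=4K tank/vms
    zfs set recordsize=32K tank/data

    # Or set it at dataset creation time.
    zfs create -o recordsize=4K tank/vms2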
