A Quick Look At EXT4 vs. ZFS Performance On Ubuntu 19.10 With An NVMe SSD


  • bug77
    replied
    Originally posted by Vistaus

    On SSDs too? 'Cause I'm still using ext4, but I'm considering switching to XFS on my next reinstall (unless there are conversion tools to do it right now?) if it's also fit for SSDs.
    I've used XFS before and while it did the job nicely, when I messed up the partition table I found out the hard way that there are far fewer utilities out there that will rescue an XFS partition compared to EXT4. XFS's main selling point seems to be support for huge files, and you don't have many of those at home. Still, the filesystem worked fine (the messed-up partition table was entirely my doing).



  • DrYak
    replied
    Originally posted by cjcox
    But is periodic "scrubbing" a "good" answer? Maybe it's "good enough"? Most RAID subsystems do periodic scrubbing.
    Yes and no.
    Yes, periodic scrubbing on a new-gen filesystem (BTRFS, ZFS, and BcacheFS once it hits mainline) is "good enough".
    And no, it's not good enough on most RAID subsystems.

    Bit rot isn't a drive simply dying (the redundancy in a RAID system, hence the initial R, is good enough for that). Bit rot is the data getting silently corrupted, a few bits being accidentally flipped, beyond what the built-in ECC inside the hardware can recover.
    In that case the redundancy doesn't matter: both drives are still up (in a RAID1 configuration, or more drives in RAID5), you just end up with two copies of the data.
    Two different copies of the data, with no real way of telling which one is correct.

    And in these cases the critical difference between RAID and new-gen filesystems is the checksumming. On BTRFS stored as RAID1 or DUP (or with multiple copies in BcacheFS, or whatever the equivalent is called in ZFS), if one copy fails its checksum, you know it's bad and you know you need to get the other one.
    A scrub on these systems will check everything against its checksums and will attempt to rebuild from another copy if one is available
    (or, failing that, immediately signal that something is wrong, so you can attempt some mitigation right away, like reaching for a backup).

    That's something which is impossible with most RAID (the sole exception would be a non-degraded RAID-6: you have 2 parities for each data stripe, so you could theoretically take a majority vote out of the three to guess which version is correct in case of disagreement; frankly, I haven't checked whether Linux's mdadm actually does that).
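
    For reference, kicking off such a scrub is a one-liner on both filesystems. A minimal sketch, assuming a pool called "tank" and a filesystem mounted at /mnt/data (both placeholder names):

    # BTRFS: verify every block against its checksum, repairing from a second copy where one exists
    btrfs scrub start /mnt/data
    btrfs scrub status /mnt/data

    # ZFS: same idea, per pool
    zpool scrub tank
    zpool status tank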

    Originally posted by jrch2k8
    ZFS is meant to be manually optimized for each volume workload and its also dependent on the amount of drives, bus, scheduler, RAM, etc.
    Sorry, but wasn't one of the big advantages ZFS touted over BTRFS that it can auto-detect and auto-handle some corner cases? (Like a large amount of random writes, e.g. databases, VM images, torrents, and auto-tune the CoW?)
    I'm more of a BTRFS guy, so my impression might be wrong.

    Originally posted by jrch2k8
    but for whatever reason Michael keeps including ZFS in these tests again and again under these conditions
    Because that's what is available in his tools? The benchmarking tools are open source, by the way: feel free to help carve out decent tests for ZFS...

    Originally posted by jrch2k8
    For most readers here all they get is "Ext4 is faster hence ZFS is broken or buggy"
    I think by now people have more or less learned that CoW and log-structured filesystems are a different beast.
    You either go for EXT4 if you only care about raw speed,
    or go for BTRFS and co. if you want the extra features.

    Originally posted by jrch2k8
    (spoiler ZoL is among the fastest ZFS implementations and is very very enterprise ready as well)
    And, with very few exceptions, it isn't built in by default in most Linux distributions.
    Meanwhile, BTRFS is available even on smartphones (Sailfish OS).

    Originally posted by jacob
    Quite frankly am I the only one who doesn't share this fascination with ZFS?
    Sysadmins who were used to big commercial Unix (e.g. with a background in Solaris).
    Sysadmins who wanted a new-gen filesystem back when ZFS was the only game in town, with lots of deployments and plenty of stress-testing behind it, BTRFS wasn't stable yet, and BcacheFS hadn't even been invented yet.

    Originally posted by carewolf
    ZFS is among the fastest filesystems
    No, it's not.
    New-gen filesystems (CoW or log-structured, with checksums) sit at a different point on the features vs. performance trade-off than in-place-writing filesystems.

    Originally posted by carewolf
    BTRFS will just eat all your data instead of a single file if it hits bitrot in its metadata. Not really an improvement.
    *Any* filesystem that has extensive damage to its structure will lose files, and you'll need to whip out the file-carving tools.
    New-gen filesystems, thanks to checksumming, are just better at gracefully erroring out.
    Compare BTRFS (gives you a detailed error in the log and denies you access) vs. EXT4 (directories full of corrupted mojibake instead of actual files).
    You can lose files no matter whether it's BTRFS/ZFS or EXT4/XFS.

    New-gen filesystems at least have an advantage.
    On spinning-rust media BTRFS can use "dup"(*), and on multiple devices (no matter the type) it can use "RAID1", so if the metadata is corrupted you can at least recover it from a different copy.

    Compared to classic RAID1, modern filesystems have a few other advantages:
    - RAID1 is whole-device: you either copy everything twice or not at all. BTRFS and ZFS distinguish metadata from data, so it's possible to keep only the metadata redundant ("dup" is the default BTRFS behaviour on HDDs). BTRFS is working on per-subvolume settings (new data written in home could be RAID1 while new data written in root could be RAID0), so you can spend the extra disk space only on the (more critical) structure if you wish.
    - As mentioned above: checksums. If a drive isn't dead but merely had its content corrupted, it's easier to know which copy is the correct one.
    - It's CoW, so there might be an older, not-yet-garbage-collected version that's still good. That simply doesn't exist on in-place-writing filesystems.

    ---

    (*) In theory there's nothing preventing you from specifying "dup" on flash media, but as others have pointed out, the flash translation layer in an SSD will coalesce the writes together and they'll end up in the same flash group, so if one copy goes bust, you actually lose both copies at the same time.
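
    As a rough illustration of the profiles mentioned above (device and mount point names are placeholders):

    # single spinning disk: duplicated metadata, single data copy (the HDD default)
    mkfs.btrfs -m dup -d single /dev/sdX

    # existing multi-device filesystem: convert metadata to RAID1
    btrfs balance start -mconvert=raid1 /mnt/data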




  • system32
    replied
    ZFS can't be directly compared with EXT4.

    ZFS has everything EXT4 has, and also includes:
    • Combined filesystem & volume manager (with RAIDZ)
    • Manages shares (NFS & SMB) (on Solaris; not sure about ZoL)
    • Encryption
    • Compression
    • De-duplication
    • Snapshots
    • Checksumming
    • Streaming send/receive (for backup)
    • Zettabyte-scale capacity (vs. EXT4's 16 TiB maximum file size)
    Due to CoW and checksumming, ZFS will be slower than simpler filesystems in some benchmarks.

    ZFS's feature set is extensive and production-ready (it was first released in 2005).
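
    A rough sketch of what a few of those features look like in practice on ZoL 0.8+ (pool, dataset, and host names here are made up):

    zpool create tank raidz2 sda sdb sdc sdd                              # filesystem + volume manager in one
    zfs create -o encryption=on -o keyformat=passphrase tank/secure       # native encryption
    zfs set compression=lz4 tank/secure                                   # transparent compression
    zfs snapshot tank/secure@nightly                                      # instant snapshot
    zfs send tank/secure@nightly | ssh backuphost zfs recv backup/secure  # streaming backup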



  • DanglingPointer
    replied
    Meh...

    I've been using BTRFS since 2013 on several servers of 10 TB+ and haven't had issues! In fact it saved my bacon several times!
    Two of them are on the "dreaded" RAID5. As long as you know what you're doing, and, if you're going RAID5, you have a UPS solution, you're fine.

    Now... to the haters and their inevitable response to my statement...



  • Dedobot
    replied
    Actually I'm more interested in Solaris 11.4 ZFS with the latest SRU vs. ZoL.



  • Dedobot
    replied
    atime=off and sync=disabled, or an Optane SSD as a SLOG device. The outcome would be completely different.
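
    Presumably something along these lines ("tank" and the NVMe device are placeholders; sync=disabled trades away sync-write safety, whereas a dedicated SLOG keeps it):

    zfs set atime=off tank
    zfs set sync=disabled tank        # fast, but unsafe for synchronous writes
    # ...or keep sync semantics and put the intent log on a fast Optane/NVMe device instead:
    zpool add tank log /dev/nvme0n1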



  • cthart
    replied
    We had huge problems with ZFS and rsync with millions of small files. Many kernel threads, many of them seemingly dealing with the in-memory cache, would be spawned, and the load on the server would go through the roof. Zero problems with EXT4. We reverted all of our Proxmox hosts to using LVM thin pools for VM storage and our VMs to using EXT4.



  • smitty3268
    replied
    Originally posted by jacob
    Quite frankly am I the only one who doesn't share this fascination with ZFS?
    It's massively overrated by the community as one of those things that makes you superior because the masses don't use it and you can feel smarter than them by being one of the elites who do.

    That said, this is precisely the type of thing that I think really makes a lot of sense for Canonical to work on. The feature itself is largely done, and just needs a ton of polish and integration work to make it useful to regular users. They can do that and provide some real value, and most importantly, a selling point for Ubuntu to stand out in the crowd of distros and cause people to actively choose them rather than just hoping to stay as the standard default distro. It's a smart move.
    Last edited by smitty3268; 17 October 2019, 01:34 AM.



  • polarathene
    replied
    Originally posted by cjcox
    Just me, but IMHO, for ZFS, you really need the multiple-disk aspect of it.
    That's less flexible compared to BTRFS, right? Besides flexibility in capacity, you can't as easily use it straight away or something? I don't know much about ZFS; I just recall that the more disks you have, the longer it takes to expand a pool with new storage?
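
    (For comparison, the BTRFS flexibility in question is roughly this, with placeholder device and mount names: grow with any spare disk, then rebalance.)

    btrfs device add /dev/sdX /mnt/data      # grow the filesystem with an arbitrary-sized device
    btrfs balance start /mnt/data            # spread existing data/metadata across the new device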

    Originally posted by stormcrow
    The other is that for some strange reason some of my GOG games, and I don't remember which ones, would inexplicably crash when I was using XFS for the drive they were installed on. No effin clue why, but changing it to ext4 and all was fine.
    I recall similar issues with Steam a few years back (though they've fixed it since, IIRC). It was something like XFS using inode64 by default, which somehow caused problems with data access for certain software. There may have been some other issue like that where EXT4 works but XFS doesn't; I can't recall.

    Originally posted by jacob
    Quite frankly am I the only one who doesn't share this fascination with ZFS? Its performance is absolutely dreadful
    It's not dreadful in a proper setup, and the additional features over other filesystems also make it worthwhile. It can do tiered storage, which BTRFS presently cannot: hot data is cached on faster storage layers, so you can have HDDs for bulk cold storage and redundancy, along with a SATA/NVMe SSD, or something faster/lower-latency like an Optane 905P or a RAM disk, for your hot/frequent data.

    If Michael did a proper benchmark showcasing that, you'd see other filesystems as absolutely dreadful in performance.
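
    A minimal sketch of that tiering on ZoL (pool and device names are placeholders): the fast devices sit in front of the slow HDD vdevs.

    zpool add tank cache /dev/nvme0n1      # L2ARC: frequently read data is cached on the fast device
    zpool add tank log /dev/nvme1n1        # SLOG: synchronous writes land on the fast device first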

    Originally posted by carewolf

    BTRFS will just eat all your data instead of a single file if it hits bitrot in its metadata. Not really an improvement.
    If you have multiple copies of metadata, like RAID1 (a single HDD is fine AFAIK, but with SSDs it needs to be more than one disk, since SSDs can optimize storage under the hood so that copies share the same physical blocks/pages), that shouldn't happen?

    When was the last case you can link to of this happening on a properly configured and maintained system? The only issues I've seen in the past year or so are due to users enabling non-default features that usually aren't stable and are cautioned against by the BTRFS wiki/docs. There are plenty of reports of users saying BTRFS saved data that would otherwise have been lost on other filesystems. More often than not, the data is recoverable on BTRFS.



  • adanisch
    replied
    I agree with the comments about ZFS not being able to shine in a one-disk setup. Here is some test data on ZFS with multiple-disk setups; next time, maybe test against ext4 using RAID10 or something along those lines: https://calomel.org/zfs_raid_speed_capacity.html

