A Quick Look At EXT4 vs. ZFS Performance On Ubuntu 19.10 With An NVMe SSD
Originally posted by cjcox View Post
But is periodic "scrubbing" a "good" answer? Maybe it's "good enough"? Most RAID subsystems do periodic scrubbing.
Yes, periodic scrubbing on a new-gen filesystem (BTRFS, ZFS, and BcacheFS once it hits mainline) is "good enough".
And no, it's not "good enough" on most RAID subsystems.
Bit rot isn't a drive simply dying (a case where the redundancy in a RAID system, hence the initial R, is good enough). Bit rot is data getting silently corrupted: a few bits accidentally flipped, beyond what the built-in ECC inside the hardware can recover.
In that case the redundancy doesn't matter. Both drives are still up (in a RAID1 configuration, or more drives in RAID5); you just end up with two copies of the data.
Two different copies of the data, with no real way of telling which one is correct.
And in these cases, the key critical difference between RAID and new-gen filesystems is the checksumming. With BTRFS in RAID1 or "dup" mode (or multiple copies via the "copies" property on ZFS, or the equivalent in BcacheFS), if one copy fails the checksum, you know it's bad and you know you need to use the other one.
A scrub on these systems will check everything against its checksums and will attempt to rebuild from another copy if one is available.
(Or, failing that, immediately signal that something is wrong, so you can immediately attempt some mitigation, like restoring from a backup.)
That's something which is impossible with most RAID. The sole exception would be a non-degraded RAID-6: you have 2 parities for each data stripe, so you could theoretically take a majority vote among the three to guess which version is correct in case of disagreement. Frankly, I haven't checked whether Linux's mdadm actually does that.
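For illustration, a minimal sketch of what a scrub looks like on both systems (the mount point and pool name are hypothetical):

    # BTRFS: scrub the filesystem mounted at /mnt/data, then check progress/results
    btrfs scrub start /mnt/data
    btrfs scrub status /mnt/data

    # ZFS: scrub the pool named "tank", then review what it found/repaired
    zpool scrub tank
    zpool status tank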
Originally posted by jrch2k8 View Post
ZFS is meant to be manually optimized for each volume workload and it's also dependent on the amount of drives, bus, scheduler, RAM, etc.
I'm more of a BTRFS guy, so my impressions might be wrong.
Originally posted by jrch2k8 View Post
but for whatever reason Michael keeps including ZFS in these tests again and again under these conditions
Originally posted by jrch2k8 View Post
For most readers here all they get is "Ext4 is faster hence ZFS is broken or buggy"
You either go for EXT4 if you only care about raw speed, or you go with BTRFS and co. if you want the extra features.
Originally posted by jrch2k8 View Post
(spoiler: ZoL is among the fastest ZFS implementations and is very very enterprise ready as well)
Meanwhile BTRFS is available even on smartphones (Sailfish OS).
Originally posted by jacob View Post
Quite frankly am I the only one who doesn't share this fascination with ZFS?
No. It's popular with sysadmins who wanted a new-gen filesystem back when ZFS was the only game in town, with lots of deployments and plenty of real-world stress-testing already behind it: BTRFS wasn't stable yet, and BcacheFS hadn't even been invented.
Originally posted by carewolf View Post
ZFS is among the fastest filesystems
New-gen filesystems (CoW or log-structured, with checksums) sit at a different point on the features-vs-performance trade-off than in-place-writing filesystems.
Originally posted by carewolf View Post
BTRFS will just eat all your data instead of a single file if it hits bitrot in its metadata. Not really an improvement.
New-gen filesystems, by using checksumming, are just better at gracefully erroring out.
Compare BTRFS (gives you a detailed error in the log and denies you access) with EXT4 (directories full of corrupted mojibake instead of actual files).
You could lose files no matter whether it's BTRFS/ZFS or EXT4/XFS; the new-gen filesystems at least have an advantage.
On spinning-rust media, BTRFS can use "dup"(*), or on multiple devices (no matter the type) it can use "RAID1", so if the metadata is corrupted, it can at least be recovered from a different copy.
Compared to classic RAID1, modern filesystems have three other advantages (a sketch of the relevant commands follows the footnote below):
- RAID1 is whole-device: you either copy everything twice or not at all. BTRFS and ZFS distinguish metadata from data, so it's possible to keep only the metadata redundant ("dup" is the default BTRFS behaviour for metadata on HDDs). BTRFS is working on per-subvolume settings (new data written in home could be RAID1 while new data written in root could be RAID0), so you can spend the extra disk space only on the more critical structures if you so wish.
- As mentioned above: checksums. If a drive isn't dead but just got its content corrupted, it's easier to know which copy is the correct one.
- It's CoW, so there might be an older, non-garbage-collected version that's still good. That simply doesn't exist on an in-place-writing filesystem.
---
(*) In theory, there's nothing preventing you from specifying "dup" on flash media but, as others have pointed out, the flash translation layer in an SSD will coalesce the writes together and they'll end up in the same flash group, so if one copy goes bust, you actually lose both copies at the same time.
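For the curious, a minimal sketch of how those profiles are set on BTRFS (device and mount-point names are hypothetical):

    # At creation time: duplicated metadata, single-copy data (the HDD default)
    mkfs.btrfs -m dup -d single /dev/sdb

    # On an existing filesystem: convert metadata to dup with a balance
    btrfs balance start -mconvert=dup /mnt/data

    # Two devices: mirror both metadata and data
    mkfs.btrfs -m raid1 -d raid1 /dev/sdb /dev/sdc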
---
ZFS can't be directly compared with EXT4.
ZFS has everything EXT4 has, and also includes:
- Combined file system & volume manager (with RAIDZ)
- Share management (NFS & SMB) (on Solaris; not sure about ZoL)
- Encryption
- Compression
- De-duplication
- Snapshots
- Checksumming
- Streaming (for backup)
- Zettabyte-scale capacity vs. EXT4's 16 TiB maximum file size
ZFS's feature set is extensive and production-ready (first released in 2005). A quick sketch of some of these features is below.
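For illustration, a hedged sketch of a few of those features, using a hypothetical dataset named tank/data:

    zfs set sharenfs=on tank/data          # built-in NFS share management (works on ZoL too)
    zfs set compression=lz4 tank/data      # transparent compression
    zfs set dedup=on tank/data             # de-duplication (very RAM-hungry, use with care)
    zfs snapshot tank/data@before-upgrade  # instant snapshot
    # stream the snapshot to another host for backup
    zfs send tank/data@before-upgrade | ssh backuphost zfs recv backup/data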
---
Meh...
I've been using btrfs since 2013 on several 10 TB+ servers and haven't had issues! In fact, it saved my bacon several times!
Two of them are on the "dreaded" RAID5. As long as you know what you're doing, and you have a UPS solution if you're going RAID5, you're fine.
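If anyone wants to reproduce that kind of layout, a hedged sketch with hypothetical devices; the usual advice is to keep metadata on RAID1 even when data is on RAID5, given the known RAID5/6 caveats:

    # Data striped with parity across three disks, metadata mirrored
    mkfs.btrfs -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd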
Now... to the haters and their inevitable response to my statement...
---
Actually, I'm more interested in Solaris 11.4 ZFS with the latest SRU vs. ZoL.
---
Atime=off and sync=disabled, or an Optane SSD as the SLOG device. The outcome will be completely different.
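In concrete terms, a minimal sketch (the pool name tank is hypothetical):

    zfs set atime=off tank        # stop updating access times on every read
    zfs set sync=disabled tank    # fast, but unsafe for data integrity on power loss
    # ...or keep sync semantics and give the ZIL a fast dedicated SLOG device instead:
    zpool add tank log /dev/nvme1n1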
---
We had huge problems with ZFS and rsync with millions of small files. Many kernel threads, many of them seemingly dealing with the in-memory cache, would be spawned and the load on the server would go through the roof. Zero problems with EXT4. We reverted all of our Proxmox hosts to using LVM thin pools for VM storage, and our VMs to using EXT4.
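For anyone curious, a minimal sketch of that kind of LVM thin-pool setup (volume group, names and sizes are hypothetical):

    # Create a thin pool inside volume group vg0, then a thin volume for one VM
    lvcreate --type thin-pool -L 500G -n tpool vg0
    lvcreate --thin -V 100G -n vm-disk1 vg0/tpool
    mkfs.ext4 /dev/vg0/vm-disk1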
---
Originally posted by jacob View Post
Quite frankly am I the only one who doesn't share this fascination with ZFS?
That said, this is precisely the type of thing that I think really makes a lot of sense for Canonical to work on. The feature itself is largely done, and just needs a ton of polish and integration work to make it useful to regular users. They can do that and provide some real value, and most importantly, a selling point for Ubuntu to stand out in the crowd of distros and cause people to actively choose it rather than just hoping it stays the standard default distro. It's a smart move.
---
Originally posted by cjcox View Post
Just me, but IMHO, for ZFS, you really need the multiple-disk aspect of it.
Originally posted by stormcrow View Post
The other is that for some strange reason some of my GOG games, and I don't remember which ones, would inexplicably crash when I was using XFS for the drive they were installed on. No effin clue why, but changing it to ext4 and all was fine.
Originally posted by jacob View Post
Quite frankly am I the only one who doesn't share this fascination with ZFS? Its performance is absolutely dreadful
If Michael did a proper benchmark showcasing what ZFS is actually built for, you'd see the other filesystems looking absolutely dreadful in performance instead.
Originally posted by carewolf View Post
BTRFS will just eat all your data instead of a single file if it hits bitrot in its metadata. Not really an improvement.
When was the last case you can link to of this happening on a properly configured and maintained system? The only issues I've seen in the past year or so are due to users enabling non-default features that usually aren't stable and that the BTRFS wiki/docs caution against. There are plenty of reports from users saying BTRFS saved data that would otherwise have been lost on other filesystems. More often than not, the data is recoverable on BTRFS.
---
I agree with the comments about ZFS not being able to shine in a one-disk setup. Here is some test data for ZFS with multiple-disk setups; next time, maybe test against ext4 using RAID10 or something along those lines: https://calomel.org/zfs_raid_speed_capacity.html
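For reference, a RAID10-style pool in ZFS is just striped mirrors; a minimal sketch with hypothetical devices:

    # Two mirrored pairs, striped together (ZFS's RAID10 equivalent)
    zpool create tank mirror /dev/sdb /dev/sdc mirror /dev/sdd /dev/sde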