A Quick Look At EXT4 vs. ZFS Performance On Ubuntu 19.10 With An NVMe SSD


  • #31
    Originally posted by jacob View Post
    Quite frankly, am I the only one who doesn't share this fascination with ZFS?
    It's massively overrated by the community as one of those things that makes you feel superior: the masses don't use it, so you can feel smarter than them by being one of the elites who do.

    That said, this is precisely the type of thing that I think really makes a lot of sense for Canonical to work on. The feature itself is largely done, and just needs a ton of polish and integration work to make it useful to regular users. They can do that and provide some real value, and most importantly, a selling point for Ubuntu to stand out in the crowd of distros and cause people to actively choose them rather than just hoping to stay as the standard default distro. It's a smart move.
    Last edited by smitty3268; 17 October 2019, 01:34 AM.



    • #32
      We had huge problems with ZFS and rsync with millions of small files. Many kernel threads, seemingly dealing with the in-memory cache, would be spawned, and the load on the server would go through the roof. Zero problems with EXT4. We reverted all of our Proxmox hosts to using LVM thin pools for VM storage and our VMs to using EXT4.
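
      For the record, the usual first knob for reining in that cache behaviour is capping the ARC; a minimal sketch, assuming a 4 GiB cap (the value is purely illustrative):

        # /etc/modprobe.d/zfs.conf: cap the ZFS ARC at 4 GiB (value in bytes)
        options zfs zfs_arc_max=4294967296

        # Or apply at runtime, without reloading the module:
        echo 4294967296 > /sys/module/zfs/parameters/zfs_arc_max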



      • #33
        Set atime=off and sync=disabled, or use an Optane SSD as a SLOG device. The outcome will be completely different.
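
        Roughly like this; pool/dataset names are placeholders, and note that sync=disabled trades away crash-safety for the last few seconds of writes:

          # Stop updating access times on every read:
          zfs set atime=off tank/data
          # Drop synchronous write guarantees entirely (fast, but risky):
          zfs set sync=disabled tank/data
          # Or keep sync semantics and put the ZIL on a fast Optane device instead:
          zpool add tank log /dev/nvme0n1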



        • #34
          Actually I'm more interested in Solaris 11.4 ZFS with the latest SRU vs. ZoL.



          • #35
            Meh...

            I've been using btrfs since 2013 on several servers of 10 TB+ and haven't had issues! In fact it saved my bacon several times!
            Two of them run the "dreaded" RAID5. As long as you know what you're doing, and you have a UPS solution if you're going RAID5, you're fine.
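
            For the record, the usual way to hedge a btrfs RAID5 setup is to keep metadata at a higher redundancy; a sketch, device names hypothetical:

              # Stripe data as RAID5 but mirror metadata as RAID1,
              # the commonly recommended combination:
              mkfs.btrfs -d raid5 -m raid1 /dev/sdb /dev/sdc /dev/sdd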

            Now... to the haters and their inevitable response to my statement...



            • #36
              ZFS can't be directly compared with EXT4.

              ZFS has everything EXT4 has, and also includes:
              • Combined file system & volume manager (with RAIDZ)
              • Managed shares (NFS & SMB) (on Solaris; not sure about ZoL)
              • Encryption
              • Compression
              • De-duplication
              • Snapshots
              • Checksumming
              • Streaming send/receive (for backup)
              • Zettabyte-scale capacity (vs. EXT4's 16 TiB maximum file size)
              Due to CoW & checksumming, ZFS will be slower than simple file systems in some benchmarks.

              ZFS's feature set is extensive and production-ready (first released in 2005).
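
              Several of those are one-liners to try out; a sketch, with a hypothetical pool named "tank":

                zfs set compression=lz4 tank/home       # transparent compression
                zfs snapshot tank/home@pre-upgrade      # instant snapshot
                zfs send tank/home@pre-upgrade | ssh backuphost zfs recv backup/home   # streaming backup
                zpool scrub tank                        # verify every block against its checksum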



              • #37
                Originally posted by cjcox View Post
                But is periodic "scrubbing" a "good" answer? Maybe it's "good enough"? Most RAID subsystems do periodic scrubbing.
                Yes and no.
                Yes, periodic scrubbing on a new-gen filesystem (BTRFS, ZFS, and BcacheFS once it hits mainline) is "good enough".
                And no, it's not on most RAID subsystems.

                Bit rot isn't a drive simply dying (a case where the redundancy in a RAID system (hence the initial R) is good enough). Bit rot is the data getting silently corrupted, a few bits being accidentally flipped, beyond what the built-in ECC inside the hardware can recover.
                Because then the redundancy doesn't matter: both drives are still up (in a RAID1 configuration, or more drives in RAID5), and you just end up with two copies of the data.
                Two different copies of the data, with no real way of telling which is which.

                And in these cases the key critical difference between RAID and new-gen filesystems is the checksumming. On BTRFS with the RAID1 or DUP profiles (or multiple copies in BcacheFS, or whatever it's called on ZFS), if one copy fails the checksum, you know it's bad and you know you need to fetch the other one.
                A scrub on these systems will check everything against its checksums and will attempt to rebuild from another copy if one is available
                (or, failing that, immediately signal that something is wrong, so you can attempt some mitigation right away, like reaching for a backup).

                That's something which is impossible with most RAID (the sole exception would be a non-degraded RAID-6: you get 2 parities for each data stripe, so you could theoretically take a majority vote among the 3 copies to guess which is the correct version in case of disagreement; frankly, I haven't checked whether Linux's mdadm actually does this).
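
                In practice a scrub is a single command; a sketch, mountpoint hypothetical:

                  # Walk every block, verify it against its checksum,
                  # and repair from a mirror copy where possible:
                  btrfs scrub start /mnt/data
                  btrfs scrub status /mnt/data    # progress and count of corrected errors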

                Originally posted by jrch2k8 View Post
                ZFS is meant to be manually optimized for each volume workload and its also dependent on the amount of drives, bus, scheduler, RAM, etc.
                Sorry, but wasn't one of the big advantages that ZFS touted over BTRFS that it can auto-detect and auto-handle some corner cases? (Like a large quantity of random writes, e.g. databases, VM images, torrents, and auto-tuning the CoW?)
                I'm more of a BTRFS guy, so my impressions might be wrong.
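
                (On the BTRFS side the equivalent is manual: you mark the heavy random-write locations as no-CoW yourself. A sketch, path hypothetical; note this also disables checksumming for those files:)

                  # +C must be set on the directory while it's still empty;
                  # new files created inside inherit the no-CoW attribute:
                  mkdir /var/lib/libvirt/images
                  chattr +C /var/lib/libvirt/images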

                Originally posted by jrch2k8 View Post
                but for whatever reason Michael keeps including ZFS in these tests again and again under these conditions
                Because that's what's available in his tools? The benchmarking tools are open source, by the way: feel free to help carve out decent tests for ZFS...

                Originally posted by jrch2k8 View Post
                For most readers here all they get is "Ext4 is faster hence ZFS is broken or buggy"
                I think by now people have more or less learned that CoW and log-structured filesystems are a different beast.
                You go for EXT4 if you care only about raw speed,
                or for BTRFS and co. if you want the extra features.

                Originally posted by jrch2k8 View Post
                (spoiler ZoL is among the fastest ZFS implementations and is very very enterprise ready as well)
                And, with very few exceptions, it isn't built in by default in most Linux distributions.
                Meanwhile BTRFS is available even on smartphones (Sailfish OS).

                Originally posted by jacob View Post
                Quite frankly am I the only one who doesn't share this fascination with ZFS?
                Sysadmins who were used to big commercial Unix (e.g. with a background in Solaris).
                Sysadmins who wanted a new-gen filesystem back when ZFS was the only game in town with lots of deployments and plenty of stress-testing behind it, BTRFS wasn't stable yet, and BcacheFS hadn't even been invented.

                Originally posted by carewolf View Post
                ZFS is among the fastest filesystems
                No, it's not.
                New-gen filesystems (CoW or log-structured, with checksums) sit at a different point on the features-vs-performance scale of compromise than in-place-writing filesystems.

                Originally posted by carewolf View Post
                BTRFS will just eat all your data instead of a single file if it hits bitrot in its metadata. Not really an improvement.
                *Any* file system that has extensive damage to its structure will lose files, and you'll need to whip out the file-carving tools.
                New-gen filesystems, thanks to checksumming, are just better at gracefully erroring out.
                Compare BTRFS (gives you a detailed error in the log and denies you access) with EXT4 (directories full of corrupted mojibake instead of actual files).
                You can lose files no matter whether it's BTRFS/ZFS or EXT4/XFS.

                The new-gen filesystems at least have an advantage.
                On spinning rust media, BTRFS can use "dup"(*), or on multiple devices (no matter the type) it can use "RAID1", so if the metadata is corrupted you can at least recover it from a different copy (see the sketch after this list).

                Compared to classic RAID1, modern filesystems have three other advantages:
                - RAID1 is whole-device: you either copy everything twice or not at all. BTRFS and ZFS distinguish metadata from data, so it's possible to keep only the metadata redundant ("dup" is the default BTRFS behaviour for metadata on HDDs). BTRFS is working on per-subvolume settings (new data written in home could be RAID1 while new data written in root could be RAID0), so you can spend the extra disk space only on the (more critical) structure if you so wish.
                - As mentioned above: checksums. If a drive isn't dead but just got its content corrupted, it's easier to know which copy is the correct one.
                - It's CoW: there might be an older, non-garbage-collected version that's still good. That simply doesn't exist on in-place-writing filesystems.
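
                Something like this; device and mountpoint names are hypothetical:

                  # Single HDD: keep two copies of metadata ("dup"), one copy of data:
                  mkfs.btrfs -m dup -d single /dev/sdb

                  # Existing multi-device filesystem: convert metadata to RAID1 on the fly:
                  btrfs balance start -mconvert=raid1 /mnt/data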

                ---

                (*) In theory there's nothing preventing you from specifying "dup" on flash media, but as others have pointed out, the flash translation layer of an SSD will coalesce the writes together and they'll end up in the same flash group, so if one copy goes bust you actually lose both copies at the same time.




                • #38
                  Originally posted by Vistaus View Post

                  On SSDs too? 'Cause I'm still using ext4, but I'm considering switching to XFS on my next reinstall (unless there are conversion tools to do it right now?) if it's also fit for SSDs.
                  I've used XFS before, and while it did the job nicely, when I messed up the partition table I found out the hard way that there are far fewer utilities out there that will rescue an XFS partition compared to EXT4. XFS's main selling point seems to be its support for huge files, and you don't have many of those at home. Still, the file system worked fine (the messed-up partition table was entirely my doing).
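
                  (The bundled tools do exist, they're just about all you get; a sketch, device name hypothetical:)

                    # Dry run: check an unmounted XFS filesystem, report what would be fixed:
                    xfs_repair -n /dev/sdb1
                    # Actually repair it:
                    xfs_repair /dev/sdb1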



                  • #39
                    Originally posted by DrYak View Post
                    Sorry, but wasn't one of the big advantages that ZFS touted over BTRFS that it can auto-detect and auto-handle some corner cases? (Like a large quantity of random writes, e.g. databases, VM images, torrents, and auto-tuning the CoW?)
                    I'm more of a BTRFS guy, so my impressions might be wrong.


                    Because that's what's available in his tools? The benchmarking tools are open source, by the way: feel free to help carve out decent tests for ZFS...


                    I think by now people have more or less learned that CoW and log-structured filesystems are a different beast.
                    You go for EXT4 if you care only about raw speed,
                    or for BTRFS and co. if you want the extra features.


                    And, with very few exceptions, it isn't built in by default in most Linux distributions.
                    Meanwhile BTRFS is available even on smartphones (Sailfish OS).
                    Well, I have used ZFS since the Solaris 10 days and I have never heard of auto-tuning. I know you can tune ZFS to any scenario in a ridiculously fine-grained fashion (see the sketch below), but never automatically, so I think you may be wrong about the AUTO part.
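
                    For example, here is the kind of hand-tuning a PostgreSQL dataset typically gets; a sketch, pool/dataset names hypothetical:

                      zfs create tank/pgdata
                      zfs set recordsize=8K tank/pgdata          # match PostgreSQL's 8 KiB pages
                      zfs set logbias=throughput tank/pgdata     # bias the ZIL for streaming throughput
                      zfs set primarycache=metadata tank/pgdata  # let the database do its own data caching
                      zfs set atime=off tank/pgdata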

                    Automating reproducible tests for ZFS is near impossible (unless a top-notch dev offers some insights that I don't have), and my issue is not with his tools but with using them on ZFS regardless (this is not my first post about this and I've gone to great lengths before; I'm just tired of keeping at it).

                    Well, phoronix is the land of people who are quick to run their mouths about things they don't understand but glacially slow to actually try to understand, or to accept why they are wrong. So I wouldn't bet money on it, but OK.

                    This is a fair point, but I did give BTRFS a few tries over the years (and I still sometimes do) and it always fails me one way or another for my use cases, for example:
                    • RAID 5/50/6/60 is still very nuclear on BTRFS: it may work, or it may eat your data and kill your kitten
                    • It is very slow on big LUNs, especially with PostgreSQL (last tried with PG10/Linux 5.1), at least compared to ZFS, but that may be related to the first issue, since I didn't test with RAID1 (which I think is the strongest one on BTRFS atm)
                    • I don't think it works well on NVMe (as in, with several drives). I can't prove it since the logs say nothing, but on NVMe I just noticed some services getting random huge latency spikes, and the only difference was ZFS vs. BTRFS. Then again, I may be wrong (I also didn't bother to go the extra mile; I just nuked the server and went back to ZFS to test other stuff. It was a test server, of course).
                    • In general I believe BTRFS also lacks flexibility in the volume/snapshot department, but this may be subjective depending on what you do.
                    • BTRFS and virtualization are not very good friends.
                    Don't get me wrong: I do believe BTRFS is sufficient and stable enough for desktops/small servers, especially considering it is a lot simpler than ZFS and gives similar features. But for enterprise stuff, or really important stuff, I do believe ZFS is without peers, and I simply trust it with my life (which does not imply I skip having proper backups). Since the Solaris 10 era I have never had a failure with data/hardware loss that ZFS couldn't recover from completely, or a workload I couldn't find a way to optimize the living bejesus out of. Damn, even today I have a client with an old server that has lost 23 of its original 24 hard drives over the last 10 years (it's implied I've been replacing the damaged drives with new ones and resilvering, of course) but has never lost a single bit of data.



                    • #40
                      Originally posted by jrch2k8 View Post
                      Please Michael, again: stop using ZFS in these benchmarks if you don't have the time to set it up properly, or at least bearably. You are only hurting ZFS, because the average phoronix user doesn't have enough context to understand why those results are so horrible or why this setup is so hilariously wrong and will never show any real-world performance or benefit of using ZFS in the first place.
                      I'm going to support these benchmarks because I think they provide an excellent service that is needed right now. The word "default" is under-appreciated throughout your response.

                      This is ZFS going into a shipping, standard, desktop distribution. By the numbers, FAR more people are going to be using this default "hilariously wrong setup" than a properly tuned version, and I think that needs exploration.

                      It is perfectly true that no one should pick single-disk ZFS for a performance benefit over Ext4, and that is exactly why it's very important to measure that difference, so people are informed.

                      There could be many examples (let's pick photographers as one, per the example in this thread) of people who run Linux out of appreciation for the open-source tools, and who now see there's a supported filesystem option which we all tend to agree is pretty darn good against bitrot, and want to try it. However, if Canonical isn't providing them tools to tune it the way you say it must be done, then their installation won't be tuned unless they really feel like digging into it.

