Optane SSD RAID Performance With ZFS On Linux, EXT4, XFS, Btrfs, F2FS


  • #21
    Originally posted by pomac View Post

    Is it? On my Ryzen, even the more complex RAID 6 is still I/O bound, because:
    [ 0.180194] raid6: avx2x4 gen() 30855 MB/s

    So, give me a medium that gives me more than 30.8 GB/s.

    Note that PCIe and memory will be an issue WAY before there is an issue with RAID.

    Especially with RAID 1 - ANY read test on a RAID 1 array will be faster than no RAID; the CPU will be idling waiting for I/O anyway.
    Still "more complex" than not running that code at all. That said, the numbers you see in dmesg are not real performance numbers; they come from synthetic benchmarks run on hot, cached data (and on a small subset of the code path of the entire md device), purely to determine which algorithm to choose for each RAID level. They say absolutely nothing about the expected performance of the md device in the real world.
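
    For anyone who wants to see where those numbers come from on their own box, a minimal sketch (the grep pattern and the /dev/md0 device are just illustrative assumptions):

    # Re-read the boot-time RAID-6/XOR algorithm benchmarks from the kernel log
    dmesg | grep -E 'raid6:|xor:'
    # Compare against what a real md array is actually doing (assumes an existing /dev/md0)
    cat /proc/mdstat
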
    Last edited by F.Ultra; 20 June 2019, 03:29 PM.



    • #22
      Did I miss ZFS mirror or ZFS stripe? I understand RAIDZ is common, but not with just two disks. I'd think the mirror or stripe configuration, or even just a single 900P, would be a better point of comparison within the broad mix of filesystem tests.



      • #23
        Originally posted by Drizzt321 View Post
        Did I miss ZFS mirror or ZFS stripe? I understand RAIDZ is common, but not with just two disks. I'd think the mirror or stripe configuration, or even just a single 900P, would be a better point of comparison within the broad mix of filesystem tests.
        There is the single ZFS 900p in the results.
        Michael Larabel
        https://www.michaellarabel.com/



        • #24
          Originally posted by Michael View Post

          There is the single ZFS 900p in the results.
          Oh there it is, thanks. Somehow missed that.

          It would still be nice to see mirror (RAID1) and stripe (RAID0) comparisons. That was done for the rest of them, so why not for ZFS?



          • #25
            Originally posted by Drizzt321 View Post

            Oh there it is, thanks. Somehow missed that.

            It would still be nice to see mirror (RAID1) and stripe (RAID0) comparisons. That was done for the rest of them, so why not for ZFS?
            There is only so much time in a day, and when I saw ZFS being hit or miss, I cut my losses at that point. There will be more ZFS tests in a separate article, probably in July.
            Michael Larabel
            https://www.michaellarabel.com/



            • #26
              Originally posted by F.Ultra View Post

              Still "more complex" than not running that code at all. That said, the numbers you see in dmesg are not real performance numbers; they come from synthetic benchmarks run on hot, cached data (and on a small subset of the code path of the entire md device), purely to determine which algorithm to choose for each RAID level. They say absolutely nothing about the expected performance of the md device in the real world.
              Which was my point: the "overhead" is what you see there; the rest is I/O and buses.

              For RAID 1 you DO get better read performance, and for RAID 0 you DO get better write performance, and so on.

              The added latency in the higher RAID levels comes from the extra writes.

              The actual algorithm itself is nothing compared to the cost of just leaving the CPU; the point is that the "overhead" is small.
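
              As a rough way to check that on your own hardware, a minimal sketch with mdadm and fio (the device names and the /dev/md0 array are placeholders, and mdadm --create will overwrite whatever is on the member devices):

              # Mirror (RAID 1): reads can be served from either member
              mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
              # Stripe (RAID 0): writes are split across both members
              # mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1
              # Sequential read test against the array
              fio --name=seqread --filename=/dev/md0 --rw=read --bs=1M --direct=1 --runtime=30 --time_based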



              • #27
                Originally posted by Michael View Post

                Only so much time in a day and when seeing ZFS being hit or miss, cut my losses at that point. There will be more ZFS in a separate article probably in July.
                Excuses excuses :P

                Fair enough. Although to be honest, I'd be more interested in seeing the benchmarks on FreeBSD, as that's what I run at home...

                Actually, OT, but I don't think I've seen recent FreeBSD bhyve benchmarks; that would be something interesting for me, at least, to see. Maybe as part of a broader home lab/single-server virtualization overview? I'm sure there are plenty of folks wanting to do single-server home VM lab stuff, so a roundup of what's free, F/OSS, or low cost could be very interesting, with a Linux or FreeBSD/*BSD host.



                • #28
                  phoronix any idea how much this ZFS performance would change if the two Optane drives were mirrored using ZFS? I will definitely be using that configuration to reduce the chance of data loss... and I hope it won't reduce performance.
                  EDIT: oops, I see you've used raidz to mirror the 2 Optanes.
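
                  For reference, a two-device ZFS mirror would be created roughly like this (the pool name and device paths are placeholders, not what the article used):

                  # Two-way mirror: either device can fail without losing the pool
                  zpool create tank mirror /dev/nvme0n1 /dev/nvme1n1
                  zpool status tank
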
                  Last edited by deusexmachina; 20 June 2019, 08:01 PM.



                  • #29
                    Originally posted by jrch2k8 View Post

                    Hi Michael, could you publish your ZFS configuration? Those results are way too atrocious for my liking, considering I can reach some of them with spinning disks instead of SSDs, so I assume you are using a single default pool with no volumes, with whatever Ubuntu includes as "defaults", which are in no way right for benchmarking.

                    ZFS should never be used on the bare pool with default values.

                    Some helpful commands to debug that performance:

                    zpool status -v
                    zfs list
                    zfs get all

                    This one can also help you see whether multi-queue is active on all disks:

                    cat /sys/block/your_drive_here/queue/scheduler

                    Also, did you create a RAID0 with ZFS? I mean something akin to zpool create -f [new pool name] /dev/sdx /dev/sdy? Because that is the worst possible scenario for ZFS, and honestly the one scenario where no one should use ZFS at all: you get ZERO data protection but 100% of the overhead, since each disk has to write metadata and checksums while waiting for the other disk to do the same, which translates into ZERO scaling. You can add 100 drives in a stripe and your top speed will never be more than +/-10% of the fastest single disk in the best case; in the real world, the more drives you add to the stripe, the worse the performance gets.

                    Caveat:
                    I do understand that you are benchmarking the out-of-the-box settings in scenarios a regular user would be familiar with, I do, but ZFS is not and never was meant for desktops or OOB settings. ZFS is/was designed specifically to be optimized per volume for whatever you need, as is often the case in the enterprise, so the defaults are the worst-case OOB settings for 99% of the tasks a regular user will need, and especially for benchmarking.

                    If you post some of that relevant data, I have no problem giving you a hand getting some basics right to improve your ZFS numbers. There are also several gems on the Internet, like the Arch Wiki and Percona sites.

                    https://wiki.archlinux.org/index.php/ZFS (the basics done right)
                    http://open-zfs.org/wiki/Performance_tuning (the medium-level optimizations)
                    https://www.percona.com/blog/2018/05...fs-performance (some high-level Percona magic)

                    Also, you need a kernel patch to bring back hardware acceleration in ZFS if you don't have it.

                    Thank you very much for your hard work
                    +1
                    Do you think the above would make ZFS competitive in the throughput benchmarks? I'm planning to use two SSDs in raidz to prevent data loss in case one dies. Nice to see the latency is low. I plan on running multiple VMs on a ZFS root system.
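
                    For what it's worth, a minimal sketch of the kind of setup jrch2k8 describes, a two-device pool with a dedicated, tuned dataset instead of writing to the bare pool (the pool name, device paths, and property values are illustrative assumptions, not the article's settings):

                    # Two-way mirror rather than a bare two-disk stripe
                    zpool create -f tank mirror /dev/nvme0n1 /dev/nvme1n1
                    # Dedicated dataset tuned for VM images instead of using the bare pool
                    zfs create -o recordsize=64K -o compression=lz4 -o atime=off -o xattr=sa tank/vms
                    # Verify the layout and the properties actually in effect
                    zpool status -v tank
                    zfs get recordsize,compression,atime tank/vms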



                    • #30
                      Originally posted by pomac View Post

                      So, give me a medium that gives me more than 30.8 GB/s.
                      How many disks are providing that read performance? And is there any practical use for reading at such a rate? (I guess there is for very enterprise-like setups; I'm just curious what workloads that use so much bandwidth are actually doing.)

