Linux RAID Benchmarks With EXT4 + XFS Across Four Samsung NVMe SSDs


  • Linux RAID Benchmarks With EXT4 + XFS Across Four Samsung NVMe SSDs

    Phoronix: Linux RAID Benchmarks With EXT4 + XFS Across Four Samsung NVMe SSDs

    Last week I offered a look at the Btrfs RAID performance on 4 x Samsung 970 EVO NVMe SSDs housed within the interesting MSI XPANDER-AERO. In this article are some EXT4 and XFS file-system benchmark results on the four-drive SSD RAID array by making use of the Linux MD RAID infrastructure compared to the previous Btrfs native-RAID benchmarks. Tests were done on the Linux 4.18 kernel to provide the latest stable look at the XFS/EXT4 MD RAID performance with these four powerful Samsung 970 EVO 250GB NVMe solid-state drives.

    http://www.phoronix.com/vr.php?view=26783

  • #2
    Are there really no drawbacks to running 4 x4 devices over a single x16 link? I know the data/cmd traffic runs separately on each PCIe serdes physet... But what about the host side? Interrupt handling, I/O mapping/translation, DMA entry points? All at one host point? Or does each PCIe channel have a full set of handling features, regardless of whether it is x1, x4, x8, or x16?

    Implementation dependent?

    • #3
      Originally posted by milkylainen View Post
      Are there really no drawbacks to running 4 x4 devices over a single x16 link? I know the data/cmd traffic runs separately on each PCIe serdes physet... But what about the host side? Interrupt handling, I/O mapping/translation, DMA entry points? All at one host point? Or does each PCIe channel have a full set of handling features, regardless of whether it is x1, x4, x8, or x16?

      Implementation dependent?
      Since each disk has a read speed of 3400 MB/s, the four disks will top out at 13600 MB/s.
      Taking into account that PCIe 3.0 at x16 tops out at 16128 MB/s, I think there's no saturation of the bus.
      I'm assuming that the disks can't read and write at the same time, and I used the read speeds because those are higher...
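
The arithmetic in this post can be checked with a short sketch. It assumes the standard PCIe 3.0 signalling figures (8 GT/s per lane, 128b/130b encoding) and ignores packet and protocol overhead, so the link number is an upper bound rather than a measured value. The names below are illustrative only. In decimal units the result comes out near 15754 MB/s per direction; the 16128 quoted in the post appears to correspond to 15.75 multiplied by 1024.

```python
# Rough headroom check: four NVMe drives behind one PCIe 3.0 x16 link.
# Assumptions: 8 GT/s per lane, 128b/130b encoding, no protocol overhead.

PCIE3_GT_PER_S = 8.0      # raw transfer rate per lane, in GT/s
ENCODING = 128 / 130      # 128b/130b line-encoding efficiency


def pcie3_bandwidth_mb_s(lanes: int) -> float:
    """Theoretical one-direction bandwidth in MB/s (decimal megabytes)."""
    bits_per_s = PCIE3_GT_PER_S * 1e9 * ENCODING
    return bits_per_s / 8 / 1e6 * lanes


drive_read_mb_s = 3400    # per-drive sequential read figure quoted in the post
drive_count = 4

aggregate_read_mb_s = drive_read_mb_s * drive_count
link_mb_s = pcie3_bandwidth_mb_s(16)

print(f"aggregate drive reads: {aggregate_read_mb_s} MB/s")
print(f"PCIe 3.0 x16 (one direction): {link_mb_s:.0f} MB/s")
print(f"link saturated: {aggregate_read_mb_s >= link_mb_s}")
```

Either way the conclusion in the post holds: the combined sequential read rate of the four drives stays below what the x16 link can carry in one direction.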

      • #4
        I remember reading somewhere that putting a heatsink and cooling over the memory chips on an SSD can have a negative impact on performance. The controller, on the other hand, will benefit from it.

        • #5
          Originally posted by nomadewolf View Post
          I'm assuming that the disks can't read and write at the same time,
          The ATA interfaces like SATA and PATA are all half-duplex. But I thought NVMe was full-duplex, just like SAS? Is this not true?

          • #6
            Originally posted by nomadewolf View Post

            Since each disk has a read speed of 3400 MB/s, the four disks will top out at 13600 MB/s.
            Taking into account that PCIe 3.0 at x16 tops out at 16128 MB/s, I think there's no saturation of the bus.
            I'm assuming that the disks can't read and write at the same time, and I used the read speeds because those are higher...
            I was not talking about bus saturation and header overhead. You know... There is more to bus topologies than raw transfer speed...
            PCIe speed is usually quoted as the one-way figure for a duplex bus. PCIe 3.0 x16 does approx. 16 GB/s in each direction. It's a full-duplex point-to-point bus.
            And the disk controllers do transfer reads and writes at the same time. Or, more precisely, inbound vs. outbound data + cmd.
            It really does not matter much to the bus whether the disk is reading or writing.

            My question was in regard to the resources needed to transfer the data. IO-translation, DMA, IRQ handling (Inbound memory writes or not)... etc.

            • #7
              Originally posted by torsionbar28 View Post
              The ATA interfaces like SATA and PATA are all half-duplex. But I thought NVMe was full-duplex, just like SAS? Is this not true?
              Since PCIe is full duplex, it would be rather stupid for NVMe not to utilize it. (I'm pretty sure NVMe is full duplex.)
              SATA is a stupid spec. It has bidirectional differential pairs, but it is half duplex for backwards compatibility,
              so it remains half duplex in spite of the resources available.
              SAS uses approximately the same technology, but it is full duplex, as you mentioned.

              • #8
                It would be interesting to have a set of control tests to compare against. Run the same tests of the various filesystems on a RAM disk.

                • #9
                  I set up an array with a balanced X,Y,X,Y configuration to the connected CCX units (two slots on expansion card, two on motherboard) and had some interesting results.
                  I'll test X,X,Y,Y as a configuration as well, I think it will be faster.
                  Not all tests ran for me, likely because I was in a chrooted environment.
                  https://openbenchmarking.org/result/...RA-1808205RA08
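
For anyone wanting to check a similar drive placement, here is a minimal sketch that lists which NUMA node each NVMe controller is attached to. It assumes a Linux system that exposes the PCI `numa_node` attribute under `/sys/class/nvme/<controller>/device/`. The helper name `nvme_numa_nodes` is made up for this example, and the NUMA node is only a rough proxy for the CCX placement described above. A value of -1 means the kernel reports no NUMA affinity for that device.

```python
from pathlib import Path


def nvme_numa_nodes(sysfs_root: str = "/sys/class/nvme") -> dict:
    """Map NVMe controller name (e.g. 'nvme0') to its reported NUMA node."""
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes  # no NVMe class directory on this system
    for ctrl in sorted(root.iterdir()):
        numa_file = ctrl / "device" / "numa_node"
        try:
            nodes[ctrl.name] = int(numa_file.read_text().strip())
        except (OSError, ValueError):
            continue  # attribute missing or unreadable; skip this controller
    return nodes


if __name__ == "__main__":
    for name, node in nvme_numa_nodes().items():
        print(f"{name}: NUMA node {node}")
```

Controllers reporting the same node share a host-side attachment point, which is one way to tell an X,Y,X,Y arrangement from an X,X,Y,Y one before benchmarking.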

                  • #10
                    @Michael:

                    I've mentioned it before, but please consider adding a RAID-10 config where you do a "far" layout with "--layout=f2", which results in the MD code being able to stripe reads from 4 disks (instead of just 2) with no penalty to writes because flash storage has no seek penalty.

                    With the available bandwidth from NVMe drives, this option should be *beastly*. I use it for my 2-disk RAID1E mirrors and it delivers 1GB/s+ read bandwidth from older SSD drives. RAID-0 read speed with mirrored redundancy is a very nice feature.
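
To illustrate why a far layout can stripe reads across every disk, here is a simplified model of where the two copies of each chunk land with `--layout=f2`. It is a sketch of the idea only, not the MD driver's actual code: it assumes the first copy is striped RAID-0 style across the first half of every device and the second copy is placed in the second half, shifted by one device. The function name and the `chunks_per_half` parameter are hypothetical.

```python
def far2_chunk_locations(chunk: int, n_devices: int, chunks_per_half: int):
    """Return [(device, chunk_offset), ...] for both copies of a logical chunk.

    Simplified RAID-10 'far 2' model: copy 1 lives in the first half of the
    devices, copy 2 in the second half on the next device over.
    """
    row = chunk // n_devices
    first = (chunk % n_devices, row)
    second = ((chunk + 1) % n_devices, chunks_per_half + row)
    return [first, second]


if __name__ == "__main__":
    n = 4
    for c in range(8):
        copies = far2_chunk_locations(c, n, chunks_per_half=1000)
        print(f"chunk {c}: " + ", ".join(f"dev{d}@{off}" for d, off in copies))
```

In this model, consecutive chunks have their first copy on devices 0, 1, 2, 3 in turn, so a sequential read can pull from all four drives at once, while every chunk still has a second copy on a different drive.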
