4-Disk Btrfs Native RAID Performance On Linux 4.10


  • #21
    I was wondering if there's a way to determine if the SATA controller on this board is holding things back at all. I looked up the board at MSI's site but while they like to brag about its M.2 bandwidth and the like, there's nothing about how many PCI-e lanes are assigned to the SATA controller.

    If you have time, just out of curiosity, what speeds do you get if you use "dd" or whatever other tool to read from all four SSD block devices (no filesystem) at the same time? Do they each get the same speed as reading from a single one?
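    If it helps, here is a rough sketch of that test in Python (the /dev/sd[a-d] names are placeholders for your four SSDs; run as root and drop the page cache first with "echo 3 > /proc/sys/vm/drop_caches" so you measure the drives rather than RAM):

      # Read the first few GiB of each device simultaneously and report
      # per-device throughput. Device names are assumptions - edit them.
      import os, threading, time

      DEVICES = ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
      CHUNK = 1 << 20          # 1 MiB per read()
      TOTAL = 4 << 30          # read 4 GiB from each device

      def read_device(dev, results):
          fd = os.open(dev, os.O_RDONLY)
          done, start = 0, time.monotonic()
          while done < TOTAL:
              buf = os.read(fd, CHUNK)
              if not buf:      # hit the end of the device
                  break
              done += len(buf)
          os.close(fd)
          results[dev] = done / (time.monotonic() - start) / (1 << 20)

      results = {}
      threads = [threading.Thread(target=read_device, args=(d, results)) for d in DEVICES]
      for t in threads:
          t.start()
      for t in threads:
          t.join()
      for dev in DEVICES:
          print(f"{dev}: {results[dev]:.0f} MiB/s")

    If each drive still reports roughly its single-drive speed, the controller/link isn't the bottleneck; if they all drop together, something upstream of the drives is saturating.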



    • #22
      Originally posted by ldo17 View Post

      That’s an interesting comment, considering that underlying SSD implementations most likely behave like a log-structured filesystem (i.e. nothing but journal).
      That's a different thing entirely. With a traditional journaling filesystem (not log-structured), each update operation (file write, creation, deletion, etc.) involves writing metadata into the journal, which is at a fixed location on the disk. That means there are a couple of blocks which are constantly written over and over, which reduces the lifetime of an SSD quite a bit.



      • #23
        SSDs use a Flash Translation Layer to spread writes out over erase blocks. So writing the same block isn't a problem.

        It was a problem on old CF cards, but it has been better than that for a while.
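        Just to make that concrete, here is a toy model of the remapping idea (purely illustrative, nothing like a real controller): the FTL keeps a logical-to-physical map, so rewriting the same logical block lands in a fresh physical page each time and the wear gets spread out.

          # Toy FTL sketch: every write to a logical block address (LBA) is
          # redirected to the next free physical page, so rewriting LBA 0
          # over and over never re-burns the same flash cells.
          class ToyFTL:
              def __init__(self, physical_pages):
                  self.free = list(range(physical_pages))  # unused physical pages
                  self.map = {}                            # LBA -> physical page

              def write(self, lba):
                  self.map[lba] = self.free.pop(0)         # old page becomes garbage
                  return self.map[lba]

          ftl = ToyFTL(physical_pages=8)
          for _ in range(4):
              print("LBA 0 now lives in physical page", ftl.write(0))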



        • #24
          ZFS could be a nice comparison too, as it offers most of the features btrfs offers without being that slow!
          And let's face it, btrfs is slow compared to almost any other fs.



          • #25
            Originally posted by Zan Lynx View Post
            SSDs use a Flash Translation Layer to spread writes out over erase blocks.
            That’s what I meant by “underlying SSD implementation”.



            • #26
              Originally posted by waxhead View Post
              RAID5:
              Will spread the data over ALL the disks in the pool; however, 1/4th of the total space (4 disks, remember) is used for special data called parity that can be used to reconstruct any of the data on the other disks. If the parity is lost it can simply be recalculated from the working disks. And since BTRFS does checksumming, it can know if the data is reliable before reconstructing parity / lost data. You can lose one disk and still reconstruct data.

              RAID6:
              Same as RAID5, but will use 2/4ths of the total space (again, 4 disks, remember) for this special data called parity. This means you can lose two disks and still reconstruct data.

              Word of advice: RAID5/6 in BTRFS is a brilliant way to test your backups. In other words it is NOT production ready and is marked as UNSTABLE right now.
              BTRFS allows different profiles on data and metadata (data about data).

              In BTRFS terms the usage of the RAID name is a bit misleading, since the redundancy is basically a mix of copies, stripes and parities.
              For completeness' sake:
              RAID5 will spread data over all the disks in the pool, and you get to use (TOTAL_CAPACITY - SIZE_OF_LARGEST_DISK).
              With RAID6 it's (TOTAL_CAPACITY - 2*SIZE_OF_LARGEST_DISK).
              In reality I have found it best to have two (or three) of the largest disks.

              Agreed on the advice part.
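              A quick back-of-the-envelope version of those formulas, in case it helps (this is the rule of thumb above, not the exact btrfs allocator math, and the disk sizes are made up):

                # Rough usable-space estimate for btrfs raid5/raid6 following the
                # rule of thumb above. Example sizes in TB, purely hypothetical.
                disks = [4.0, 4.0, 2.0, 2.0]           # a mixed-size 4-disk pool

                def usable(disks, parity_disks):       # raid5 -> 1, raid6 -> 2
                    return sum(disks) - parity_disks * max(disks)

                print("raid5:", usable(disks, 1), "TB")   # 12 - 4 = 8 TB usable
                print("raid6:", usable(disks, 2), "TB")   # 12 - 8 = 4 TB usable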



              • #27
                Originally posted by waxhead View Post
                There seems to be some confusion as to what BTRFS' *native* RAID actually is... and just for the record, BTRFS RAID is not the same as MD RAID or any other hardware RAID for that matter. So allow me to explain (a bit simplified) what modes BTRFS supports and how they work. Let's pretend you have 4 disks in a pool, all part of a single filesystem.

                SINGLE:
                Will keep ONE copy of the data on ANY disk, regardless of how many disks exist in the pool.

                DUP:
                Will keep TWO copies of the data on THE SAME disk, regardless of how many disks exist in the pool.

                RAID0:
                Will spread the data over ALL disks in the pool. So for simplicity's sake: think of this as having a 4 MB file; 1 MB will be stored per disk.

                RAID1:
                Will keep TWO copies of the data on DIFFERENT disks (you can lose one disk), regardless of how many disks exist in the pool.

                RAID10:
                Will spread the data over HALF the number of disks in the pool, and duplicate it, i.e. keep a copy on the other half of the disks that exist in the pool.

                RAID5:
                Will spread the data over ALL the disks in the pool; however, 1/4th of the total space (4 disks, remember) is used for special data called parity that can be used to reconstruct any of the data on the other disks. If the parity is lost it can simply be recalculated from the working disks. And since BTRFS does checksumming, it can know if the data is reliable before reconstructing parity / lost data. You can lose one disk and still reconstruct data.

                RAID6:
                Same as RAID5, but will use 2/4ths of the total space (again, 4 disks, remember) for this special data called parity. This means you can lose two disks and still reconstruct data.
                Something like this should be on the btrfs wiki. Simple and well explained.
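                If it helps make the list above concrete, here is a rough sketch of how much raw space each profile burns per byte of data on an n-disk pool (a simplification that ignores metadata, chunk granularity and mixed disk sizes):

                  # Approximate raw-space cost per byte of data for each btrfs
                  # profile, ignoring metadata and chunk allocation details.
                  def raw_cost(profile, n=4):          # n = disks in the pool
                      return {
                          "single": 1.0,               # one copy somewhere
                          "dup":    2.0,               # two copies, same disk
                          "raid0":  1.0,               # striped, no redundancy
                          "raid1":  2.0,               # two copies, different disks
                          "raid10": 2.0,               # striped and mirrored
                          "raid5":  n / (n - 1),       # one disk's worth of parity
                          "raid6":  n / (n - 2),       # two disks' worth of parity
                      }[profile]

                  for p in ("single", "dup", "raid0", "raid1", "raid10", "raid5", "raid6"):
                      print(f"{p:7s} stores 1 GB of data in {raw_cost(p):.2f} GB of raw space")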



                • #28
                  Originally posted by Geopirate View Post
                  Wait, did they already fix the raid 5/6 issues?
                  My thoughts exactly. Last I heard, there were still some serious (data corruption?) issues and the developers strongly advised against using it.



                  • #29
                    Is it just me, or do these benchmarks only serve to illustrate how poorly RAID has been implemented?



                    • #30
                      Originally posted by Spacefish View Post
                      ZFS could be a nice comparison too, as it offers most of the features btrfs offers without being that slow!
                      And let's face it, btrfs is slow compared to almost any other fs.
                      What?!? ZFS is a LOT slower than btrfs, on every metric.

