Btrfs Adds Degenerate RAID Support, Performance Improvements With Linux 5.15

  • #21
    Originally posted by mdedetrich View Post

    RAID5/6 is fine for ZFS, which fixes the RAID 5/6 write hole. Also, the main issue with RAID 5/6 (i.e. rebuilds potentially causing a cascading failure of the RAID system) is almost a non-issue for modern enterprise NAS drives, so it's pretty much FUD at this point.
    Says who?
    Are all the new RAID solutions meant to prevent simultaneous SSD failures pointless?
    And what about all the papers that evaluate the risk of a second drive failure as a function of HDD size?
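    For context, the back-of-envelope calculation those papers (and drive datasheets) usually start from looks roughly like this. A minimal sketch, assuming independent bit errors at the quoted URE rate; the drive sizes and rates are illustrative, not measurements:

```python
# Rough chance of hitting an unrecoverable read error (URE) while reading an
# entire surviving drive during a rebuild, as a function of drive size.
# Simplistic model: independent bit errors at the datasheet rate.

def p_ure_during_full_read(drive_tb, ure_one_in_bits):
    bits_read = drive_tb * 1e12 * 8              # capacity of the drive in bits
    p_no_error = (1.0 - 1.0 / ure_one_in_bits) ** bits_read
    return 1.0 - p_no_error                      # chance of at least one URE

for ure in (1e14, 1e15):                         # common consumer vs. enterprise specs
    for size_tb in (2, 4, 8, 16):
        p = p_ure_during_full_read(size_tb, ure)
        print(f"URE 1 in {ure:.0e}, {size_tb:>2} TB drive: {p:6.1%} chance of a URE")
```

    The datasheet rates are worst-case specs and real arrays scrub, so this is the shape of the argument rather than a prediction; it also shows why the risk grows with drive size.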

    Comment


    • #22
      Originally posted by billyswong View Post
      Would someone ever create a RAID5/6-like filesystem/array for SSDs? In my understanding, the concept of RAID5/6 is to maximize the available storage volume while keeping things transparent and operational when there is a one- or two-drive loss. If SSDs are put into an old-style RAID1/10/5/6 array, all drives will likely wear out at the same rate and fail within a time interval too close together for even a hot spare to save the game. I don't understand why people talk as if RAID6 is outdated for SSDs but RAID1/10 are still okay. From what I've learnt so far, they are all the same in the face of the new properties of SSDs.
      SSDs are not *that* predictable in their failures. If they were, SMART would tell you the date and time when they will fail.
      I use RAID 1 or 10 with SSDs because it provides uptime if something goes wrong with a drive, but I replace them once they drop below a certain "health" status or TBW.
      RAID 5/6 is problematic because you put a huge load on the degraded array after a disk failure. It has been shown that this can act as a catalyst for a second drive failure. This is true for SSDs and large HDDs.

      Comment


      • #23
        Originally posted by mppix View Post

        SSDs are not *that* predictable in their failures. If they were, SMART would tell you the date and time when they will fail.
        I use RAID 1 or 10 with SSDs because it provides uptime if something goes wrong with a drive, but I replace them once they drop below a certain "health" status or TBW.
        RAID 5/6 is problematic because you put a huge load on the degraded array after a disk failure. It has been shown that this can act as a catalyst for a second drive failure. This is true for SSDs and large HDDs.
        But don't SSDs wear out from writing while being fine with reading? At least that's what I thought. The "huge load" you mentioned seems the same between RAID10 and RAID5/6 if my math is correct. While RAID10 only asks for a total read-through of the one corresponding drive when rebuilding, and RAID5/6 asks for a total read-through of all the other drives, the actual stress is the same for any particular drive involved. I haven't completed the calculation, but it is hard for me to believe RAID6 is significantly more dangerous than RAID10 for SSDs. If the chance of a pure read triggering a failure is close to 0 for SSDs, RAID6 should be safer than RAID10 for SSDs. (For high-drive-count HDD arrays, one may go for RAID60 + hot spare.)
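        As a minimal sketch of that rebuild-load comparison, assuming a rebuild simply reads every involved surviving drive in full with no other traffic (drive count and capacity are made-up examples):

```python
# Per-drive vs. total read load to rebuild one failed drive. Simplified model:
# each involved surviving drive is read once in full, nothing else is running.

def rebuild_read_load(layout, n_drives, capacity_tb):
    if layout == "raid10":
        involved = 1                   # only the failed drive's mirror partner is read
    elif layout in ("raid5", "raid6"):
        involved = n_drives - 1        # every remaining drive in the set is read
    else:
        raise ValueError(layout)
    return involved, capacity_tb, involved * capacity_tb

for layout in ("raid10", "raid6"):
    involved, per_drive_tb, total_tb = rebuild_read_load(layout, n_drives=8, capacity_tb=8)
    print(f"{layout}: {involved} drive(s) stressed, {per_drive_tb} TB read each, "
          f"{total_tb} TB read in total")
```

        So in this model the per-drive read volume is indeed identical; what differs is how many drives are under load at once and how much total data has to be read back without error.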

        Some Googling gave me a paper on Differential RAID https://www.microsoft.com/en-us/rese...d-reliability/ but there doesn't seem to be any mainstream implementation in production.

        Comment


        • #24
          Originally posted by xfcemint View Post
          I think that the "degenerate RAID" feature is very useful for home users of RAID1 arrays (small and medium desktops), where it assists in replacing failed disk drives (where it is assumed that a user needs to go buy a new drive on drive failure). Unfortunately, the btrfs developers seem to have concentrated their efforts on RAID0 arrays first.

          AFAIK, on RAID1 btrfs arrays, the "degenerate" mode still allows only read-only access. This is unlike most hardware RAID controllers and unlike mdadm software RAID. So when a drive in btrfs raid1 fails, a user needs to run to the nearest hardware store to buy a replacement (an OS cannot function well in read-only mode).
          That's degraded mode, not degenerate. Different thing.

          Comment


          • #25
            Originally posted by billyswong View Post

            But don't SSDs wear out from writing while being fine with reading? At least that's what I thought. The "huge load" you mentioned seems the same between RAID10 and RAID5/6 if my math is correct. While RAID10 only asks for a total read-through of the one corresponding drive when rebuilding, and RAID5/6 asks for a total read-through of all the other drives, the actual stress is the same for any particular drive involved. I haven't completed the calculation, but it is hard for me to believe RAID6 is significantly more dangerous than RAID10 for SSDs. If the chance of a pure read triggering a failure is close to 0 for SSDs, RAID6 should be safer than RAID10 for SSDs. (For high-drive-count HDD arrays, one may go for RAID60 + hot spare.)

            Some Googling gave me a paper on Differential RAID https://www.microsoft.com/en-us/rese...d-reliability/ but there doesn't seem to be any mainstream implementation in production.
            We don't have to rehash the entire "is RAID5 really bad?" conversation. There is enough information out there. You can use it, but I don't know of any company that still recommends RAID5 (more often than not they issue warnings):
            https://www.reddit.com/r/sysadmin/co...ended_for_any/

            Also, for SSDs specifically: with PCIe 4 (and future PCIe 5) arrays you can get reasonably close to memory bandwidth. At that point software RAID, especially the RAID 5/6 variety, can become a bottleneck (and there isn't really such a thing as PCIe hardware RAID).
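            As a rough illustration of that bottleneck argument, here is a back-of-envelope sketch; the target write rate, DRAM bandwidth, and amplification factors are illustrative textbook values for small writes, not measurements of any particular RAID implementation:

```python
# Back-of-envelope: device and memory traffic implied by a target user write
# rate under mirroring vs. parity RAID done in software. Illustrative numbers.

user_write_gbs = 20.0     # hypothetical sustained user write rate
dram_gbs = 100.0          # assumed usable host memory bandwidth

# Textbook small-write amplification: mirroring writes every byte twice;
# RAID5 reads old data + old parity, then writes new data + new parity.
device_amplification = {"raid1/10": 2, "raid5/6 ": 4}

for name, amp in device_amplification.items():
    device_traffic = user_write_gbs * amp
    # In software RAID every device byte is also staged through RAM at least
    # once, and parity adds extra XOR passes over the data on top of that.
    memory_traffic = device_traffic
    print(f"{name}: {device_traffic:.0f} GB/s device I/O, "
          f">= {memory_traffic:.0f} GB/s memory traffic "
          f"(>= {memory_traffic / dram_gbs:.0%} of the assumed DRAM bandwidth)")
```

            With spinning disks the drives themselves were always the bottleneck, so none of this mattered; with a fast NVMe array the extra passes over the data start to compete with everything else the host is doing.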
            Last edited by mppix; 01 September 2021, 03:31 PM.

            Comment


            • #26
              Originally posted by mppix View Post

              We don't have to rehash the entire "is RAID5 really bad?" conversation. There is enough information out there. You can use it, but I don't know of any company that still recommends RAID5 (more often than not they issue warnings):
              https://www.reddit.com/r/sysadmin/co...ended_for_any/
              That's because the premise is misleading. Since hard drives (at least until the whole Chia thing) were getting so cheap, you were typically better off doing RAID10, since it also has better performance.

              However, if you are optimizing for hard drive space at the cost of some performance (i.e. you are doing some sort of cold-storage solution), then raidz1/2/3 is still superior (depending on vdev size, of course).

              You can go to the TrueNAS/iXsystems forums and find people who have been running these setups for over a decade without any problems.

              Also, a lot of the advice about RAID5/6 is quite outdated, and hard drives today are different from what they were back then. Furthermore, if you are using caches like ARC/L2ARC, this reduces the wear and tear on the hard drives (sometimes quite significantly if properly tuned).

              Evidently you didn't read the link you posted earlier, because it's not a clear-cut no. As was stated there, if you are using SAS (rather than SATA) and your drives have a good URE rating, it's a very different story.

              Originally posted by mppix View Post
              Also, for SSDs specifically: with PCIe 4 (and future PCIe 5) arrays you can get reasonably close to memory bandwidth. At that point software RAID, especially the RAID 5/6 variety, can become a bottleneck (and there isn't really such a thing as PCIe hardware RAID).
              RAID5/6 is not for SSDs because you are getting the worst of both worlds. The point of RAID 5/6 is to optimize for storage capacity while still having some redundancy. If you are already using SSDs, the implication is that you are not optimizing for storage space but for something else.

              P.S. If you have issues with cascading SSD (or hard drive) failures, then mix your batches, i.e. don't put all disks from the same batch into the same system. Ideally you would mix and match batches from different manufacturers (assuming they fulfil your specifications) to prevent these issues.
              Last edited by mdedetrich; 01 September 2021, 07:12 PM.

              Comment


              • #27
                Originally posted by mdedetrich View Post
                That's because the premise is misleading. Since hard drives (at least until the whole Chia thing) were getting so cheap, you were typically better off doing RAID10, since it also has better performance.

                However, if you are optimizing for hard drive space at the cost of some performance (i.e. you are doing some sort of cold-storage solution), then raidz1/2/3 is still superior (depending on vdev size, of course).

                You can go to the TrueNAS/iXsystems forums and find people who have been running these setups for over a decade without any problems.

                Also, a lot of the advice about RAID5/6 is quite outdated, and hard drives today are different from what they were back then. Furthermore, if you are using caches like ARC/L2ARC, this reduces the wear and tear on the hard drives (sometimes quite significantly if properly tuned).

                Evidently you didn't read the link you posted earlier, because it's not a clear-cut no. As was stated there, if you are using SAS (rather than SATA) and your drives have a good URE rating, it's a very different story.
                I think we are saying similar things with different arguments. I am looking at it from the angle that RAID is for high(er) availability, i.e. RAID is not a backup.
                I read the link, of course, and I liked it _because_ it includes the discussion (you can find many others).
                The point is that RAID 5 (and increasingly 6) is slowly falling out of favor in the server domain, and I don't know of a storage vendor that would describe it as a technology for the future (or even recommend it for a new installation).
                For home users, I am also not sure if RAID 5/6 brings much to the table nowadays. You need a backup anyway. If you have a backup, say in a cloud, it can take less time to download the backup than to resilver an array (though it's no access to data vs. slow access to data).
                I think everyone will need to make up their own mind whether RAID 5/6 is the correct solution for them.
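                A rough back-of-envelope for that resilver-versus-restore point; the drive size, rebuild throughput, dataset size, and link speed are all illustrative assumptions:

```python
# Compare the time to resilver a replaced drive against re-downloading the data
# you actually need from an off-site backup. Purely illustrative numbers.

def hours(bytes_to_move, mb_per_s):
    return bytes_to_move / (mb_per_s * 1e6) / 3600

drive_tb = 12          # size of the replaced drive
resilver_mb_s = 150    # assumed sustained rebuild throughput
data_tb = 2            # amount of data you actually need back
link_mbit_s = 1000     # assumed internet downlink (1 Gbit/s)

print(f"resilver {drive_tb} TB at {resilver_mb_s} MB/s: ~{hours(drive_tb * 1e12, resilver_mb_s):.0f} h")
print(f"restore {data_tb} TB over {link_mbit_s} Mbit/s:  ~{hours(data_tb * 1e12, link_mbit_s / 8):.0f} h")
```

                The trade-off stays as stated above: during a resilver you still have (slow) access to the data, during a restore you don't.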

                Originally posted by mdedetrich View Post
                RAID5/6 is not for SSDs because you are getting the worst of both worlds. The point of RAID 5/6 is to optimize for storage capacity while still having some redundancy. If you are already using SSDs, the implication is that you are not optimizing for storage space but for something else.

                P.S. If you have issues with cascading SSD (or hard drive) failures, then mix your batches, i.e. don't put all disks from the same batch into the same system. Ideally you would mix and match batches from different manufacturers (assuming they fulfil your specifications) to prevent these issues.
                I agree with this.

                Comment


                • #28
                  Originally posted by mppix View Post
                  I think we are saying similar things with different arguments. I am looking at it from the angle that RAID is for high(er) availability, i.e. RAID is not a backup.
                  I read the link, of course, and I liked it _because_ it includes the discussion (you can find many others).
                  The point is that RAID 5 (and increasingly 6) is slowly falling out of favor in the server domain, and I don't know of a storage vendor that would describe it as a technology for the future (or even recommend it for a new installation).
                  For home users, I am also not sure if RAID 5/6 brings much to the table nowadays. You need a backup anyway. If you have a backup, say in a cloud, it can take less time to download the backup than to resilver an array (though it's no access to data vs. slow access to data).
                  I think everyone will need to make up their own mind whether RAID 5/6 is the correct solution for them.
                  Well, I will just finish off with these points:
                  • If RAID 5/6 were pointless in enterprise server storage, then ZFS wouldn't have even bothered with it (and by far the biggest demographic for ZFS is high-end enterprise storage).
                  • I would only ever advocate RAID 5/6 when using ZFS (there it is called raidz1/2/3). It's the only RAID 5/6 implementation that solves the write hole (a toy illustration follows this list). With raidz1/2/3 you will not have these problems, especially if you use an LSI HBA (which you should have in general for software RAID solutions, including BTRFS).
                  • Given how complicated RAID 5/6 is to implement (in addition to the previous point about the write hole), it's not surprising that certain vendors such as Dell advise against it. To be honest, most of these machines use hardware RAID HBAs, which historically have had terrible RAID 5/6 implementations (so I can understand where all of the pain comes from). This is compounded by the fact that hardware RAID ties the hard drives to the HBA controller.
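                  To make the write-hole point above concrete, here is a toy sketch with single-byte "blocks" and a hand-injected crash; it illustrates the generic failure mode of non-atomic data+parity updates, not any particular filesystem's on-disk behaviour:

```python
# Toy RAID5 stripe: two data blocks and one parity block (XOR of the data).
# A small write updates d0 via read-modify-write: write the new data, then
# write the new parity. If power is lost in between, parity no longer matches
# the data (the "write hole"), and a later rebuild reconstructs garbage.

d0, d1 = 0b10101010, 0b01011111
parity = d0 ^ d1                       # consistent stripe: parity == d0 ^ d1

new_d0 = 0b11110000
new_parity = parity ^ d0 ^ new_d0      # what parity should become

d0 = new_d0                            # step 1: new data reaches the disk
# -- simulated power loss: step 2 (the parity write) never happens --

# Later the disk holding d1 dies, and we reconstruct it from d0 and parity:
rebuilt_d1 = d0 ^ parity
print(f"original d1: {d1:08b}")
print(f"rebuilt  d1: {rebuilt_d1:08b}")  # differs -> silent corruption
```

                  raidz closes this window by combining copy-on-write with variable-width full-stripe writes, which is what "solves the write hole" above refers to; btrfs raid5/6 still has the window, which is what the warning mentioned further down is about.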
                  As a home user I would actually argue that ZFS raidz1/2/3 is a better fit than in enterprise deployments, which tend to hyper-specialize. Home users are typically more budget-conscious, and the typical home NAS setup gives you limited bay space. If you buy/build a 6-bay NAS, losing half of that to redundancy is massive overkill for home users; raidz1 is perfect for this use case.
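                  For the 6-bay example, the usable-capacity arithmetic comes out roughly like this (8 TB drives assumed purely for illustration, ignoring filesystem overhead and raidz allocation padding):

```python
# Usable capacity of a hypothetical 6-bay NAS under different layouts.

bays, drive_tb = 6, 8

layouts = {
    "raid10 / mirrors":    bays // 2 * drive_tb,   # half the drives hold copies
    "raidz1 (RAID5-like)": (bays - 1) * drive_tb,  # one drive's worth of parity
    "raidz2 (RAID6-like)": (bays - 2) * drive_tb,  # two drives' worth of parity
}

for name, usable_tb in layouts.items():
    print(f"{name:<20} ~{usable_tb} TB usable of {bays * drive_tb} TB raw")
```

                  That 40 TB vs. 24 TB gap is the "losing half of that to redundancy" point.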

                  It's really a shame that BTRFS didn't solve this issue, and it's unlikely they will, because it requires a change to the on-disk format. In all honesty it's not surprising though: contrary to ZFS, which was meticulously planned and designed to be correct before it was released (and it shows; at the time ZFS was released it was like future technology from aliens), BTRFS was rushed because of "reasons" (Linux needed CoW really badly? Losing market share/competition with ZFS/OpenZFS?).

                  I would even say that if BTRFS doesn't end up solving the RAID 5/6 write hole, they should just remove it as a RAID option, although from what I have heard it now displays a big red warning, which is better than nothing.

                  Comment


                  • #29
                    Originally posted by mdedetrich View Post
                    Well, I will just finish off with these points:
                    • If RAID 5/6 were pointless in enterprise server storage, then ZFS wouldn't have even bothered with it (and by far the biggest demographic for ZFS is high-end enterprise storage).
                    • I would only ever advocate RAID 5/6 when using ZFS (there it is called raidz1/2/3). It's the only RAID 5/6 implementation that solves the write hole. With raidz1/2/3 you will not have these problems, especially if you use an LSI HBA (which you should have in general for software RAID solutions, including BTRFS).
                    • Given how complicated RAID 5/6 is to implement (in addition to the previous point about the write hole), it's not surprising that certain vendors such as Dell advise against it. To be honest, most of these machines use hardware RAID HBAs, which historically have had terrible RAID 5/6 implementations (so I can understand where all of the pain comes from). This is compounded by the fact that hardware RAID ties the hard drives to the HBA controller.
                    I would argue that this thread is not about ZFS.
                    However, for context: ZFS is old and was designed in the HDD era, when RAID 5/6 was more relevant. Today, if you load up an AMD EPYC with 24 NVMe drives, ZFS itself is the bottleneck.
                    ZFS is also by far not the only enterprise storage solution, especially for bulk storage that goes beyond one server, once we start talking about scale-out network filesystems.
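                    For the 24-drive EPYC example, the raw ceilings the argument rests on look roughly like this; the per-lane and per-drive throughputs are nominal/illustrative, and how much of it a single filesystem actually delivers is exactly the point being debated:

```python
# Raw aggregate bandwidth of a hypothetical 24 x NVMe (PCIe 4.0 x4) array on a
# single-socket EPYC. Nominal/illustrative numbers, not benchmarks.

drives = 24
lanes_per_drive = 4
gb_s_per_pcie4_lane = 2.0   # roughly ~2 GB/s per PCIe 4.0 lane
per_drive_gb_s = 7.0        # illustrative datasheet sequential read
epyc_pcie_lanes = 128       # single-socket EPYC lane budget

print(f"PCIe lanes needed:  {drives * lanes_per_drive} of {epyc_pcie_lanes}")
print(f"raw link ceiling:   ~{drives * lanes_per_drive * gb_s_per_pcie4_lane:.0f} GB/s")
print(f"raw drive ceiling:  ~{drives * per_drive_gb_s:.0f} GB/s")
# Whether the filesystem / RAID layer can feed anything close to this is the
# bottleneck question raised above.
```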

                    Originally posted by mdedetrich View Post
                    As a home user I would actually argue that ZFS raidz1/2/3 is a better fit than in enterprise deployments, which tend to hyper-specialize. Home users are typically more budget-conscious, and the typical home NAS setup gives you limited bay space. If you buy/build a 6-bay NAS, losing half of that to redundancy is massive overkill for home users; raidz1 is perfect for this use case.
                    I believe the most common home NAS units are 2- or 4-bay, where RAID 5/6 does not really make sense.
                    For 6- and 8-bay home NAS units, it is a bit of a different story. You may prefer capacity. However, my take would be that with the disk sizes of 2021, you may still prefer RAID 10, especially with SATA HDDs. Then you at least have a chance of saturating a 1GbE line (considering also the mediocre computational power of today's home NAS boxes).
                    Also, I don't know if ZFS is that common for home NAS, with Synology (primarily) using BTRFS and QNAP using ext4.

                    Originally posted by mdedetrich View Post
                    It's really a shame that BTRFS didn't solve this issue, and it's unlikely they will, because it requires a change to the on-disk format. In all honesty it's not surprising though: contrary to ZFS, which was meticulously planned and designed to be correct before it was released (and it shows; at the time ZFS was released it was like future technology from aliens), BTRFS was rushed because of "reasons" (Linux needed CoW really badly? Losing market share/competition with ZFS/OpenZFS?).
                    ZFS started as a product of a large company.
                    BTRFS is an open-source project with volunteer contributions. Crowd-sourcing a project implies less "direction", and development is done publicly for everyone to see.
                    I don't think Linux "desperately needs" either, because you can get largely the same functionality with an mdadm+LVM+ext4/xfs stack that tends to outperform both of them.

                    Originally posted by mdedetrich View Post
                    I would even say that if BTRFS doesn't end up solving the RAID 5/6 write hole, they should just remove it as a RAID option, although from what I have heard it now displays a big red warning, which is better than nothing.
                    I agree with this. I assume there is some hope that someone steps up to solve it, but it does not look like there is enough interest.
                    Last edited by mppix; 02 September 2021, 04:56 PM.

                    Comment


                    • #30
                      I heard from the developers only yesterday that they're going to work on the raid56 issue after the current set of zoned storage patches. I really hope it's true.

                      Comment
