Btrfs RAID 5/6 Code Found To Be Very Unsafe & Will Likely Require A Rewrite
It turns out the RAID5 and RAID6 code for the Btrfs file-system's built-in RAID support is faulty, and users who care about their data should not be using it.
A mailing list thread has been running since the end of July about Btrfs scrub recalculating the wrong parity in RAID5. The wrong parity and unrecoverable errors have been confirmed by multiple parties. The Btrfs RAID 5/6 code has been described as "more or less fatally flawed, and a full scrap and rewrite to an entirely different raid56 mode on-disk format may be necessary to fix it. And what's even clearer is that people /really/ shouldn't be using raid56 mode for anything but testing with throw-away data, at this point. Anything else is simply irresponsible."
So hopefully you aren't using Btrfs RAID 5/6, as it turns out to be in very bad shape and may even be ifdef'ed out of the mkfs code. Unfortunately, a fix could take some time, especially if an on-disk format change proves necessary to address the problem. The RAID56 wiki page has already been updated so users don't accidentally try one of these Btrfs RAID levels.
Coincidentally, I'm in the middle of some Btrfs RAID tests right now, but they will now be limited to RAID 0/1/10 on the four SSDs.