Btrfs RAID 5/6 Support Is "Mostly OK" With Linux 4.12


  • #21
    Originally posted by starshipeleven View Post
    No but it shows that work is being done and that it will become usable sooner rather than later.
    No it does not. What it shows is another broken promise. How many more years will it take? When will people finally realize that BTRFS is fundamentally too bloated? Features that don't work and take years to develop and then never do work, ever, have hurt btrfs' reputation permanently. It's a dead horse. It has been for the last year. Time to start over and develop a new COW filesystem with minimal feature sets as the primary feature.



    • #22
      Alrighty then ....

      Many on this forum seem to have forgotten that the very purpose of BTRFS is that, unlike other filesystems, it will regurgitate your data if something ate it.
      Yes, there are horror stories about people losing their filesystem, and for something that WAS experimental that was bound to happen anyway.
      After kernel 4.4 or so, things really started to get into good shape. Certain features of BTRFS now work quite well, and most of the focus is on fixing things instead of implementing shiny stuff.

      I have used BTRFS for years now as a simple replacement for ext4. I do not use any of the fancy features of BTRFS like subvolumes, snapshots etc. What I care about are the redundancy features in BTRFS that make sure the data you put in is the data you get out. If you educate yourself a bit (and are a bit paranoid) you will soon find out which configurations are safe to use and which are not.

      Basic BTRFS usage is in many ways better than a "regular" filesystem like ext4, for example. Why? Simply because if you corrupt something on ext4 or xfs then yes, the filesystem structure may be repairable, but the data will be corrupt. BTRFS, thanks to checksumming all data, will report back which file was chewed on, and if you are lucky enough to run a redundant configuration like DUP, RAID1 or RAID10 then BTRFS will fix your broken data.

      Now, md raid (mdadm) is rock solid and very reliable. I have not read any horror stories about people losing an entire array unless they seriously do not know what they are doing. md raid simply works and is fantastic for what it is designed for. There are some drawbacks, but they are not the fault of md raid:

      You need disks of the same size, and RAID is designed to protect against disk failure (not bit rot).
      RAID5 will detect a bit failure, but it will not be able to correct it, since it does not know which drive the corruption originated from.
      If the entire disk where the corruption originated is lost, then ironically you are able to restore your lost data, since you know which data (disk) is bad.

      By the way ... raid5 works by XOR'ing data, for those who do not know (see here for a great little explanation of how it works: http://blog.open-e.com/how-does-raid-5-work/ )
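
      A minimal sketch of that XOR idea, in Python with made-up block contents (just the principle, not how md actually lays data out on disk):

          # RAID5-style parity: the parity block is the XOR of the data blocks.
          # If any single block is lost, XOR'ing the parity with the surviving
          # blocks reconstructs it.
          from functools import reduce

          def xor_blocks(blocks):
              return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

          d1, d2, d3 = b"hello wo", b"rld!, ho", b"w are yo"   # hypothetical 8-byte blocks
          parity = xor_blocks([d1, d2, d3])

          # Pretend the drive holding d2 died; rebuild it from the other blocks.
          assert xor_blocks([d1, d3, parity]) == d2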

      What btrfs does is checksum every piece of data it writes. So if it needs to write "hello world!, how are you today" it will add a little bit more information so it will know whether the data is correct or not. This means that btrfs will always know which data has been chewed on and can therefore correct it, either by using a good copy or by using good parity for reconstruction. BTRFS does NOT yet compute checksums for its parity; however, md raid does (to my knowledge) not do that either, so in principle it is just as safe. (I am not saying that you should use BTRFS RAID5/6 for now, it still has a long way to go.)
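
      To make that concrete, here is a toy Python sketch (not btrfs' actual on-disk format or checksum algorithm): every block is stored together with a checksum, so a mangled copy can be identified and repaired from a good redundant copy.

          # Toy data checksumming: keep a checksum next to each block so we can
          # tell WHICH copy got chewed on, then repair it from the good copy.
          import zlib

          def store(block):
              return {"data": block, "csum": zlib.crc32(block)}

          def is_good(blk):
              return zlib.crc32(blk["data"]) == blk["csum"]

          copy_a = store(b"hello world!, how are you today")
          copy_b = store(b"hello world!, how are you today")

          # Silent corruption hits copy A (a bit flip the hardware never reported).
          copy_a["data"] = b"hello w0rld!, how are you today"

          if not is_good(copy_a) and is_good(copy_b):
              copy_a = store(copy_b["data"])      # repair from the intact copy

          assert is_good(copy_a) and copy_a["data"] == copy_b["data"]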

      Both btrfs and md raid are deterministic, i.e. they will only read data once, from the drive(s) they need to read from. This means that neither md raid nor btrfs will verify that redundant data is correct on read (unless the read fails, in which case the other copy will be tried). Now let's pretend that copy B is broken: if data is always read from copy A, you would not know that copy B is broken UNLESS copy A has gone toast. Therefore you have to periodically run a scrub, regardless of whether it is btrfs or md raid, to make sure that your data is in good shape.
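
      A rough illustration of why the scrub matters (again a toy Python sketch, not real md or btrfs code): normal reads only ever touch one copy, so rot on the other copy stays invisible until something deliberately reads and verifies every copy.

          # Normal reads always hit copy A, so corruption on copy B stays hidden
          # until a scrub reads and verifies every copy.
          import zlib

          GOOD = b"important data"
          copies = {"A": GOOD, "B": b"important dat\x00"}   # copy B silently rotted
          expected = zlib.crc32(GOOD)

          def normal_read():
              return copies["A"]                  # deterministic: only copy A is read

          def scrub():
              return [name for name, data in copies.items()
                      if zlib.crc32(data) != expected]

          normal_read()         # succeeds and tells us nothing about copy B
          print(scrub())        # ['B'] -- only the scrub finds the bad copy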

      The difference here is that btrfs will detect data corruption while md raid will not.

      That being said, my most important data is on md raid 6 with ext4 as the filesystem. The only reason I have not migrated to BTRFS yet is that BTRFS' raid 5/6 is unstable and not well tested. When BTRFS raid5/6 gains more traction I will be happy to migrate one of our systems to btrfs with metadata on raid1 or raid10 and data on raid 5/6. Why? Well, metadata is data about the filesystem, and raid1 is stable. I am willing to risk losing a file or two as long as the filesystem structure itself is intact.

      A last word of advice: follow the BTRFS mailing list. There you will see that those who run into problems either run the latest shiny stuff, have weird configurations or bad hardware, or do not understand how things work. The short version is: BTRFS is in fact better than you think it is.

      http://www.dirtcellar.net



      • #23
        Originally posted by starshipeleven View Post
        md raid and any other RAID responds to errors reported by the hardware itself, if the hardware does not report errors then it's all fine for it.
        True, however if an attempt to read a block fails, it will be reported by the hardware as an error. If a bad block exists that never gets read, it has not yet caused any problem. A nightly background scrub of the array will perform a read on all blocks, so that they all get exercised (i.e. read, remapped if necessary, and corrected) regularly. Yes, btrfs and zfs add checksums to the mix, which is a good enhancement, but the one thing I don't understand about this is... (see below).
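
        For reference, md exposes that scrub through sysfs; a minimal sketch (assuming an array named md0 and root privileges, normally run from cron rather than by hand):

            # Kick off an md "check" pass (read and verify every block) by writing
            # to the array's sysfs sync_action file. Assumes /dev/md0 exists and
            # the script runs as root.
            from pathlib import Path

            sync_action = Path("/sys/block/md0/md/sync_action")
            sync_action.write_text("check\n")
            print(sync_action.read_text().strip())   # e.g. "check" while the pass runs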

        Originally posted by starshipeleven View Post
        btrfs move all the responsibility for data integrity from the hardware's firmware ECC features to the OS instead, so it catches silent errors that the hardware let slip through, and is overall more consistent than firmwares.
        if btrfs and zfs add checksums to the data, ensuring integrity via software rather than via hardware, why do many of the prominent projects using zfs and btrfs recommend SAS controllers and enterprise grade disks? Shouldn't they be recommending the cheapest no-name SATA controllers, and the crappiest white-label generic sata drives? Why do they recommend more reliable SAS controllers and more reliable URE 1 in 10^15 drives if there is really no longer a correlation between hardware reliability and data integrity?
        Last edited by torsionbar28; 09 July 2017, 10:51 AM.



        • #24
          Originally posted by torsionbar28 View Post
          if btrfs and zfs add checksums to the data, ensuring integrity via software rather than via hardware, why do many of the prominent projects using zfs and btrfs recommend SAS controllers and enterprise grade disks? Shouldn't they be recommending the cheapest no-name SATA controllers, and the crappiest white-label generic sata drives? Why do they recommend more reliable SAS controllers and more reliable URE 1 in 10^15 drives if there is really no longer a correlation between hardware reliability and data integrity?
          It's because of the reliability of the hardware itself. SAS disks are, supposedly, made out of more reliable components (which is not true most of the time, but shh, it's marketing voodoo).

          On a protocol level, SAS drives report errors immediately to the controller, while most consumer-grade SATA drives have a variable report time or don't report them at all (!). This is called SCT ERC - SMART Command Transport Error Recovery Control. You can check if your drive supports it with smartctl -a | grep SCT.
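
          A small sketch of that check from a script (assuming smartmontools is installed, the drive is /dev/sda, and it runs as root):

              # Query the SCT ERC (error recovery control) timeouts via smartctl.
              import subprocess

              result = subprocess.run(
                  ["smartctl", "-l", "scterc", "/dev/sda"],
                  capture_output=True, text=True,
              )
              # Prints either the configured read/write recovery timeouts or a note
              # that the drive does not support the SCT ERC command.
              print(result.stdout)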

          SAS drives are also able to be used with redundant controllers, while SATA drives require special interposer boards.

          Another problem with consumer drives is the firmware. For example, the WD Green series tends to automatically park the heads on a short timer. While this saves energy in a desktop environment, it's very bad for constant use because of mechanical wear. The power management features in such drives are usually convoluted and not user-controllable.

          Lastly, there aren't many pure SATA controllers on the market, since virtually all SAS HBAs work with SATA drives anyway.

          All in all, using SAS HBAs with SATA "raid" or "NAS" drives (WD Red, RE, etc.) is good enough for ZFS to be as reliable as a hardware RAID solution (if not more so, because of other ZFS properties).



          • #25
            Originally posted by duby229 View Post
            No it does not. What it shows is another broken promise. How many more years will it take? When will people finally realize that BTRFS is fundamentally too bloated? Features that don't work and take years to develop and then never do work, ever, have hurt btrfs' reputation permanently. It's a dead horse. It has been for the last year. Time to start over and develop a new COW filesystem with minimal feature sets as the primary feature.
            What is this BTRFS you are talking about? On btrfs the situation is improving steadily.



            • #26
              Originally posted by starshipeleven View Post
              fixed

              Note that in the wiki the one that is "mostly ok" is just one of the features of RAID5/6, the other feature is still marked as "unstable".
              So what exactly does "mostly ok" mean to a person who is using that "feature" and then they have an issue? Saying they are making progress in fixing bugs, fine... but "mostly ok" is a ridiculous thing to say for feature status. It either works or it doesn't. Another way to look at it is it's been a year, and basically it's still not fixed and who knows when it will be deemed "production ready". At this rate, BcacheFS will be completed before BTRFS.



              • #27
                Originally posted by duby229 View Post

                No it does not. What it shows is another broken promise. How many more years will it take? When will people finally realize that BTRFS is fundamentally too bloated? Features that don't work and take years to develop and then never do work, ever, have hurt btrfs' reputation permanently. It's a dead horse. It has been for the last year. Time to start over and develop a new COW filesystem with minimal feature sets as the primary feature.
                Agreed. Many people have already moved on to XFS. BcacheFS has the best chance of being the next big thing.



                • #28
                  Originally posted by torsionbar28 View Post
                  True, however if an attempt to read a block fails, it will be reported by the hardware as an error.
                  That's just one of the possible errors. There are also silent errors, caused by disk controller mangling stuff, or RAID controller mangling stuff.

                  if btrfs and zfs add checksums to the data, ensuring integrity via software rather than via hardware, why do many of the prominent projects using zfs and btrfs recommend SAS controllers and enterprise grade disks?
                  Dunno, the most prominent project I know is FreeNAS and last time I checked they were recommending used SAS RAID controllers crossflashed to dumb SAS controllers and had no specific recommendation on drives.

                  Really it's not a matter of causing more corruption, it's a matter of wasting your time troubleshooting lockups. ZFS and btrfs can't fix a crap card that drops disks from the array at random (triggering disk failure alarms and wasting your time investigating what's up).

                  That's mainly due to the general crappiness of Sata controller cards (Sata hardware raid is somewhat better but heh, not worth it), and due to the fact that SAS cards allow easy long-distance connection to SAS expanders/multiplexers while the same thing on Sata is a clusterfuck, has lower performance, requires specific controllers and adapters, and whatever.

                  Sata ports from the chipset are 100% OK; stuff on Sata controller cards or secondary controllers bolted onto the mobo is usually cheap crap that has a tendency to fail or lock up if put under serious workloads like decent software RAID arrays in a half-serious setup.

                  For 80-ish bucks you can get a used SAS RAID card with 8 ports that is reliable, has a heatsink and all. You cross-flash it to become a dumb SAS controller, and it's pretty much certain it won't blow up at the worst possible moment. Really, you are making a multi-TB array on a server-grade mobo with ECC RAM that cost you 500 bucks or (much) more; it's reasonable to want a smooth experience.



                  • #29
                    Originally posted by gbcox View Post
                    So what exactly does "mostly ok" mean to a person who is using that "feature" and then they have an issue? Saying they are making progress in fixing bugs, fine...
                    RAID 5/6 is not declared stable and the wiki also states that (a few lines below, RAID5/6 still shows "unstable"), so people using it know the risks.
                    but "mostly ok" is a ridiculous thing to say for feature status. It either works or it doesn't.
                    It's a way to say that it should be in far better shape, although something might still be missing. Really, it's a feature in fucking development, what's wrong with giving a heads-up every once in a while?

                    People on the mailing list were complaining about not getting updates in the wiki; this is an answer to that.

                    Another way to look at it is it's been a year, and basically it's still not fixed and who knows when it will be deemed "production ready". At this rate, BcacheFS will be completed before BTRFS.
                    Yeah, it's also been a year and BcacheFS is still not fixed and who knows when it will be deemed "production ready". At this rate, btrfs will be completed before BcacheFS.
                    Last edited by starshipeleven; 09 July 2017, 02:37 PM.



                    • #30
                      Originally posted by gbcox View Post
                      Agreed. Many people have already moved on to XFS. BcacheFS has the best chance of being the next big thing.
                      Yeah, everyone has moved to XFS because it's clearly an equivalent of btrfs. Can you please stop blindly believing that post by the BcacheFS author?

