There's A Proposal To Switch Fedora 33 On The Desktop To Using Btrfs


  • For the desktop, I chose a file system that was efficient, both space-wise and CPU-wise. My decision led me to xfs.
    For the past few years, I have consistently installed xfs and have not had issues with its use.
    For example: set aside a 20 GB partition.
    Format it with btrfs, then look at the available space.
    Format the same space with ext4.
    Format the same space with xfs.

    Do a 15 GB write/read test with each format (rough sketch below).

    Test crash recovery.

    For SSDs, choose the one you feel most comfortable using.
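
    Something along these lines is what I mean; a rough sketch only, run as root, where /dev/sdX1 and the mount point are placeholders for a scratch partition you can safely wipe (the read pass and the crash test are left out):

        #!/usr/bin/env python3
        # Rough benchmark sketch: free space after mkfs plus a 15 GiB streaming
        # write, repeated for btrfs, ext4 and xfs on the same throwaway partition.
        import os, subprocess, time

        DEV = "/dev/sdX1"          # placeholder: a ~20 GB partition you can wipe
        MNT = "/mnt/fstest"
        SIZE = 15 * 1024**3        # 15 GiB

        for mkfs in (["mkfs.btrfs", "-f", DEV],
                     ["mkfs.ext4", "-F", DEV],
                     ["mkfs.xfs", "-f", DEV]):
            subprocess.run(mkfs, check=True)
            os.makedirs(MNT, exist_ok=True)
            subprocess.run(["mount", DEV, MNT], check=True)

            st = os.statvfs(MNT)                      # free space right after mkfs
            print(mkfs[0], "free:", st.f_bavail * st.f_frsize // 1024**2, "MiB")

            start = time.time()
            with open(os.path.join(MNT, "testfile"), "wb") as f:
                chunk = os.urandom(1 << 20)           # 1 MiB random chunk, written repeatedly
                for _ in range(SIZE // len(chunk)):
                    f.write(chunk)
                os.fsync(f.fileno())
            print(mkfs[0], "15 GiB write:", round(time.time() - start, 1), "s")

            subprocess.run(["umount", MNT], check=True)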

    Comment


    • A bit slow to respond, but what the heck. As valid now as it was back then.

      Originally posted by Space Heater View Post
      You seem to forget that I clearly stated the common case is a single disk system. In this case, btrfs will only be able to tell the user corruption occurred, the warnings are not actionable by the user. Most users do not have good backups, so the ability to detect corruption is strongly countered by btrfs being more likely to lose your data in the event of disk failure.
      BTRFS does by default create a DUP(licate) storage profile for metadata on single-disk systems. That is, you should in theory be less likely to lose your filesystem and/or get screwed-up metadata resulting in garbage data with BTRFS compared to other filesystems. You can even choose DUP for your data, so if you are prepared to waste half your disk space you increase the likelihood of recovering from a corruption.

      Originally posted by Space Heater View Post
      File system durability is the foundation, it doesn't matter how many extra features another file system has if in the end you are sacrificing some of its ability to not corrupt data. Not to mention that end users will not be exposed to the advanced features of btrfs, and therefore few if any will take advantage of its features.
      So compare BTRFS with Ext4, for example. Ext4 may return garbage data without you even noticing. I think that most people will expect that if they save a file containing 1,2,3,4 they should be able to get back 1,2,3,4 when they read that file back. With non-checksumming filesystems you may get 1,2,255,4 back - that simply does not happen with checksumming filesystems (see the toy sketch at the end of this post).

      Originally posted by Space Heater View Post
      I'm glad you're not denying that you don't have empathy for users, and so there's not much more to discuss about btrfs being ready as a default. You've openly said that anyone not using an LTS kernel should expect/deserve data loss, that's not how kernel development works at all and that's certainly not how Fedora delivers kernels to their users.
      Kernel development and Fedora are the same thing: bleeding edge, and by their nature not as well tested as, for example, LTS kernels. It is the same as purchasing a brand-new, redesigned car: you should not be surprised if there are a few recalls, since despite the testing a few things were simply not good enough.

      Originally posted by Space Heater View Post
      As for your dubious claim about me, saying a file system handles failure worse than ext4, and has worse data recovery abilities than ext4 is not some personal insult to the developers, it's reality. You're unable to respond to what I'm saying and citing other than to dismiss it and then say I'm being mean to btrfs developers for citing an academic paper and pointing out common shortcomings users run into. Further, btrfs developers have received multiple reports about its unfriendly behavior and the response from them has been radio silence, I'd say that's in line with a lack of empathy for end users.
      Compared to BTRFS and/or ZFS, for example, Ext4 is not even designed to handle data corruption. I have been following the BTRFS mailing list for years and I have never seen any unfriendly behavior, unless people themselves are unfriendly on the list. Your claims are simply untrue.

      Originally posted by Space Heater View Post
      Why do you have zero empathy for the ext4 developers that worked hard to make their file system more durable? How do other file system developers manage to avoid dropping their users to a recovery shell as much as btrfs?
      I have nothing against the ext4 developers. I use ext4 on an mdraid6 setup simply because it is the best combo for that job. Ext4 is an excellent filesystem, but it is simply not designed to handle certain tasks that newer filesystems do. If I recall correctly, ext4 recently got checksums for metadata, which is very nice. The reason you are dropped to a recovery shell with BTRFS may simply be that it warns you about something other filesystems ignore / can't detect.

      Originally posted by Space Heater View Post
      Yes I have written code all by myself, have you?
      Plenty. Started with C64 BASIC back in the late 80's; since then I have programmed in BASIC, AMOS, AmigaE, MC68k assembly, 6502 assembly, Pascal, PHP, C and probably a couple of other languages as well. I have coded at least one disk-monitoring utility which is quite popular (Windows), a few database engines, monitoring/testing software with a web interface and even a music editor (ProTracker clone) - mostly tools for my own personal use.

      Originally posted by Space Heater View Post
      Yeah I'm sure that any sane user should blame themselves when a file system loses their data. You're divorced from reality if you think most users have backups, regardless of whether or not they should have backups. Once you wrap your head around that you will realize how non-trivial it is for a file system to lose data, especially a file system that would be used by default. You are not the average user.
      Well, BTRFS does NOT lose data easily - that is the entire point. If you do not back up your valuable data, it is simply not valuable to you. Non-checksumming filesystems can return garbage without you even noticing. Besides, Fedora aims to use BTRFS by default now anyway....
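
      To illustrate the checksumming point above with a toy example (this is not btrfs code, just the principle):

          # Toy illustration of why a checksumming filesystem can refuse to return
          # silently corrupted data instead of handing back garbage.
          import zlib

          data = bytes([1, 2, 3, 4])
          stored_crc = zlib.crc32(data)        # checksum stored alongside the data

          read_back = bytes([1, 2, 255, 4])    # one byte flipped by a failing device
          if zlib.crc32(read_back) != stored_crc:
              raise IOError("checksum mismatch - report an error, do not return garbage")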


      http://www.dirtcellar.net

      Comment


      • Originally posted by waxhead View Post
        A bit slow to respond, but what the heck. As valid now as it was back then.

        BTRFS does by default create a DUP(licate) storage profile for metadata on single-disk systems. That is, you should in theory be less likely to lose your filesystem and/or get screwed-up metadata resulting in garbage data with BTRFS compared to other filesystems. You can even choose DUP for your data, so if you are prepared to waste half your disk space you increase the likelihood of recovering from a corruption.
        No, btrfs does *not* create duplicate metadata by default on SSDs (which is clearly the common case now). If you had bothered to read the paper I linked or actually looked at the man page for mkfs.btrfs where it clearly states "Default on a single device filesystem is DUP, unless an SSD is detected, in which case it will default to single" you wouldn't be saying silly stuff like this.

        A relevant quote from the original paper I linked: "Btrfs maintains two independent data structures for each directory entry for enhanced performance, but upon failure of one, does not use the other for recovery."

        Originally posted by waxhead View Post
        So compare BTRFS with Ext4, for example. Ext4 may return garbage data without you even noticing. I think that most people will expect that if they save a file containing 1,2,3,4 they should be able to get back 1,2,3,4 when they read that file back. With non-checksumming filesystems you may get 1,2,255,4 back - that simply does not happen with checksumming filesystems.
        Checksumming doesn't help for the typical case of one drive beyond letting the user know that there are problems with their data, as there are no redundant sources to correct the issue. We've been over this already; the problem is that btrfs is the worst when it comes to resiliency and data recovery in the single-drive use case when compared to other standard Linux filesystems.

        Originally posted by waxhead View Post
        Kernel development and Fedora are the same thing: bleeding edge, and by their nature not as well tested as, for example, LTS kernels. It is the same as purchasing a brand-new, redesigned car: you should not be surprised if there are a few recalls, since despite the testing a few things were simply not good enough.
        This is simply making excuses for shortcomings of btrfs, and confusing bugs due to being in development with systemic design issues. Btrfs has been around for at least a decade now, let's stop pretending that it is at all ok for it to be a bleeding edge filesystem. Also, for the record Fedora releases are actually pretty stable and their goal is to not be bleeding edge unless you're running rawhide.

        Originally posted by waxhead View Post
        Compared to BTRFS and/or ZFS for example Ext4 is not even designed to handle data corruption.
        No one is saying that ext4 is designed to checksum and correct data corruption, the point is that it is more resilient in the face of corruption that has already occurred. I'll even quote the paper for you "We notice potentially fatal omissions in error detection and recovery for all file systems except for ext4."

        Originally posted by waxhead View Post
        I have been following the BTRFS mailing list for years and I have never seen any unfriendly behavior, unless people themselves are unfriendly on the list. Your claims are simply untrue.
        You clearly don't read the btrfs mailing list: Neal Gompa is still trying to get a corrupted btrfs fs to mount with help from Josef Bacik. It also came out that btrfs is not suitable for 32-bit systems because page->index can overflow as metadata grows, even when the fs is less than 16TB (rough arithmetic at the end of this post).

        Originally posted by waxhead View Post
        I have nothing against the ext4 developers. I use ext4 on an mdraid6 setup simply because it is the best combo for that job. Ext4 is an excellent filesystem, but it is simply not designed to handle certain tasks that newer filesystems do. If I recall correctly, ext4 recently got checksums for metadata, which is very nice. The reason you are dropped to a recovery shell with BTRFS may simply be that it warns you about something other filesystems ignore / can't detect.
        You are dropped to a recovery shell because systemd (and gnome etc.) cannot currently handle the rootfs being read-only. Compound this with the fact that btrfs is *more* likely to run into problems with flaky hardware due to dynamic metadata.

        Originally posted by waxhead View Post
        Well, BTRFS does NOT lose data easily - that is the entire point.
        Its use of dynamic metadata means that when hardware failure occurs it is more likely to become unmountable and for data to not be recoverable. Please actually look at my citation that compared ext4, btrfs, and xfs and found that btrfs was the worst of the bunch.

        Another relevant quote from the paper: "Btrfs, which is a production grade file system with advanced features like snapshot and cloning, has good failure detection mechanisms, but is unable to recover from errors that affect its key data structures, partially due to disabling metadata replication when deployed on SSDs."

        Originally posted by waxhead View Post
        If you do not back up your valuable data, it is simply not valuable to you. Non-checksumming filesystems can return garbage without you even noticing. Besides, Fedora aims to use BTRFS by default now anyway....
        You're just repeating yourself because after all this time you still do not understand what I am saying. Maybe take another few months to actually read Evaluating File System Reliability on Solid State Drives.
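
        For the 32-bit point above, the back-of-the-envelope arithmetic (as I understand the mailing list discussion) is roughly this:

            # page->index is an unsigned long, i.e. 32 bits on 32-bit kernels; with
            # 4 KiB pages a single address space can therefore only index 16 TiB of
            # offsets, and btrfs keeps all of its metadata in one internal address
            # space, so metadata offsets can pass that limit before the fs size does.
            PAGE_SIZE = 4096
            print((2 ** 32) * PAGE_SIZE // 2 ** 40, "TiB")   # -> 16 TiB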

        Comment


        • Originally posted by Space Heater View Post
          No, btrfs does *not* create duplicate metadata by default on SSDs (which is clearly the common case now). If you had bothered to read the paper I linked or actually looked at the man page for mkfs.btrfs where it clearly states "Default on a single device filesystem is DUP, unless an SSD is detected, in which case it will default to single" you wouldn't be saying silly stuff like this.

          A relevant quote from the original paper I linked: "Btrfs maintains two independent data structures for each directory entry for enhanced performance, but upon failure of one, does not use the other for recovery."


          Checksumming doesn't help for the typical case of one drive beyond letting the user know that there are problems with their data, as there are no redundant sources to correct the issue. We've been over this already; the problem is that btrfs is the worst when it comes to resiliency and data recovery in the single-drive use case when compared to other standard Linux filesystems.


          This is simply making excuses for shortcomings of btrfs, and confusing bugs due to being in development with systemic design issues. Btrfs has been around for at least a decade now, let's stop pretending that it is at all ok for it to be a bleeding edge filesystem. Also, for the record Fedora releases are actually pretty stable and their goal is to not be bleeding edge unless you're running rawhide.


          No one is saying that ext4 is designed to checksum and correct data corruption, the point is that it is more resilient in the face of corruption that has already occurred. I'll even quote the paper for you "We notice potentially fatal omissions in error detection and recovery for all file systems except for ext4."


          You clearly don't read the btrfs mailing list: Neal Gompa is still trying to get a corrupted btrfs fs to mount with help from Josef Bacik. It also came out that btrfs is not suitable for 32-bit systems because page->index can overflow as metadata grows, even when the fs is less than 16TB.


          You are dropped to a recovery shell because systemd (and gnome etc.) cannot currently handle the rootfs being read-only. Compound this with the fact that btrfs is *more* likely to run into problems with flaky hardware due to dynamic metadata.


          Its use of dynamic metadata means that when hardware failure occurs it is more likely to become unmountable and for data to not be recoverable. Please actually look at my citation that compared ext4, btrfs, and xfs and found that btrfs was the worst of the bunch.

          Another relevant quote from the paper: "Btrfs, which is a production grade file system with advanced features like snapshot and cloning, has good failure detection mechanisms, but is unable to recover from errors that affect its key data structures, partially due to disabling metadata replication when deployed on SSDs."


          You're just repeating yourself because after all this time you still do not understand what I am saying. Maybe take another few months to actually read Evaluating File System Reliability on Solid State Drives.
          Well, if you are going to be pedantic, so be it - I said that BTRFS does by default create DUP profiles on single DISK systems.

          The paper you refer to runs its tests on BTRFS on kernel 4.17, which is NOT an LTS kernel, and the btrfs-progs used are version v4.4. I realize that you are not too focused on LTS kernels, but it is still interesting to see how the study was done.

          So, to your example of one device: yes, I am perfectly aware that BTRFS can't correct corruption on a single DISK system unless there is a redundant copy available. This is possible on HDDs, and also on certain SSDs if you choose to rebalance to DUP (see the sketch at the end of this post). But that is not within the scope of your discussion.

          More importantly - the study says that BTRFS is the ONLY filesystem that detects all I/O and corruption events, i.e. it does not return garbage if something goes wrong.

          Regarding your quote from the paper, "we notice potentially fatal omissions in error detection...": do notice that the ext4 section clearly says that "ext4 may incur silent errors, and not notify the user about the errors". Read errors may completely remove items, and write errors can run into an infinite loop. While ext4 may be more resilient, i.e. the filesystem will "work" almost no matter what bad things happen, you still have no idea if you can trust the data on the filesystem, as it may return garbage to you.

          And yes, I read the mailing list. Mr. Gompa has gotten lots of help - I would say that Mr. Bacik has been quite friendly to him. And since when is it BTRFS' fault that other software does not handle a read-only fs? BTRFS goes read-only to prevent further data loss/damage and to allow you to rescue your data.

          And yes again, I have to repeat myself, because you still pretend that I do not understand what you are saying. The paper you refer to uses a rather old kernel, in BTRFS terms specifically. I am not saying that other filesystems are bad and need to go away, I am merely stating that BTRFS is NOT as bad as it may seem. Lots of sanity checks have been added AFTER kernel 4.17.
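
          For completeness, a minimal sketch of what I mean by rebalancing an existing single-device filesystem to DUP metadata (assuming it is mounted at /mnt - adjust for your own system):

              # Convert the metadata block groups of a mounted btrfs filesystem to
              # the DUP profile; data can be converted too, at the cost of capacity.
              import subprocess

              subprocess.run(["btrfs", "balance", "start", "-mconvert=dup", "/mnt"], check=True)
              # Optional, duplicates file data as well (halves usable space):
              # subprocess.run(["btrfs", "balance", "start", "-dconvert=dup", "/mnt"], check=True)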

          http://www.dirtcellar.net

          Comment


          • Originally posted by waxhead View Post

            Well, if you are going to be pedantic, so be it - I said that BTRFS does by default create DUP profiles on single DISK systems.
            That's not being pedantic at all, the vast majority of users are running single SSD systems and they do not have duplicated metadata as you were clearly trying to imply.

            Originally posted by waxhead View Post
            The paper you refer to runs its tests on BTRFS on kernel 4.17, which is NOT an LTS kernel, and the btrfs-progs used are version v4.4.
            No one said it was an LTS kernel, it was likely the current kernel at the time the paper was written.

            Originally posted by waxhead View Post
            I realize that you are not too focused on LTS kernels, but it is still interesting to see how the study was done.

            So, to your example of one device: yes, I am perfectly aware that BTRFS can't correct corruption on a single DISK system unless there is a redundant copy available. This is possible on HDDs, and also on certain SSDs if you choose to rebalance to DUP. But that is not within the scope of your discussion.

            More importantly - the study says that BTRFS is the ONLY filesystem that detects all I/O and corruption events, i.e. it does not return garbage if something goes wrong.
            Yes, and as stated before, the error detection is nice but ultimately not very actionable. Couple this with btrfs being the worst at actually remaining available to the user (e.g. being mountable without intervention) after said data corruption occurs, and it's generally not a good trade-off for single-disk users. Not to mention that btrfs was found to be the worst in terms of data recovery by the authors. So in the end we have that btrfs can tell the user when problems happen, but then it's worse at dealing with the problems and worse at salvaging data from those problems compared to ext4 and xfs.

            Originally posted by waxhead View Post
            Regarding your quote from the paper, "we notice potentially fatal omissions in error detection...": do notice that the ext4 section clearly says that "ext4 may incur silent errors, and not notify the user about the errors". Read errors may completely remove items, and write errors can run into an infinite loop. While ext4 may be more resilient, i.e. the filesystem will "work" almost no matter what bad things happen, you still have no idea if you can trust the data on the filesystem, as it may return garbage to you.
            Right ext4 is not perfect by any means, the point is that a filesystem remaining in a working state is absolutely critical for average users who are not going to be able to deal with performing full system backups and restores, and do not know how to deal with a sudden drop to a recovery shell. This is absolutely the common case, and for them it is better to gain access to their system along with *some* corrupted data rather than having an unmountable fs and therefore unbootable system that they end up blowing away with a fresh install (resulting in zero data recovered). The entire point is that ext4 and xfs can keep chugging which at least gives the average user a chance to easily boot and recover data, btrfs dropping them to a recovery shell on boot is effectively wiping the data for most non-technical users.

            Originally posted by waxhead View Post
            And yes, I read the mailing list. Mr. Gompa has gotten lots of help - I would say that Mr. Bacik has been quite friendly to him.
            Then you see that he needs to get kernel developers involved to write special btrfs-progs and kernel patches just to get the filesystem to mount (and last I checked he still hasn't been able to mount it). I do not see issues like this posted frequently on the ext4 or xfs mailing lists, nor on the OpenZFS issue tracker, and I don't think that's entirely a coincidence or confirmation bias.

            Originally posted by waxhead View Post
            And since when is it BTRFS' fault that other software does not handle a read-only fs? BTRFS goes read-only to prevent further data loss/damage and to allow you to rescue your data.
            Btrfs going read-only whenever it encounters a problem is pretty unique among filesystems. Compare it to ZFS which has much better behavior in terms of allowing users to gracefully boot into a degraded mode (without special kernel arguments) and attempt to salvage their data. The fact that btrfs' default behavior is not currently compatible with the rest of the standard Linux desktop ecosystem is a problem for users, shifting blame elsewhere doesn't change the effective result.

            Originally posted by waxhead View Post
            And yes again, I have to repeat myself, because you still pretend that I do not understand what you are saying. The paper you refer to uses a rather old kernel, in BTRFS terms specifically. I am not saying that other filesystems are bad and need to go away, I am merely stating that BTRFS is NOT as bad as it may seem. Lots of sanity checks have been added AFTER kernel 4.17.
            When you ignore arguments about filesystem availability being important and instead blame non-technical users for not having backups you are either not understanding the issue or you are being intellectually dishonest. I prefer to think you just don't understand.

            Comment


            • Originally posted by Space Heater View Post
              1. That's not being pedantic at all, the vast majority of users are running single SSD systems and they do not have duplicated metadata as you were clearly trying to imply.

              2. No one said it was an LTS kernel, it was likely the current kernel at the time the paper was written.

              3. Yes, and as stated before, the error detection is nice but ultimately not very actionable. Couple this with btrfs being the worst at actually remaining available to the user (e.g. being mountable without intervention) after said data corruption occurs, and it's generally not a good trade-off for single-disk users. Not to mention that btrfs was found to be the worst in terms of data recovery by the authors. So in the end we have that btrfs can tell the user when problems happen, but then it's worse at dealing with the problems and worse at salvaging data from those problems compared to ext4 and xfs.

              4. Right ext4 is not perfect by any means, the point is that a filesystem remaining in a working state is absolutely critical for average users who are not going to be able to deal with performing full system backups and restores, and do not know how to deal with a sudden drop to a recovery shell. This is absolutely the common case, and for them it is better to gain access to their system along with *some* corrupted data rather than having an unmountable fs and therefore unbootable system that they end up blowing away with a fresh install (resulting in zero data recovered). The entire point is that ext4 and xfs can keep chugging which at least gives the average user a chance to easily boot and recover data, btrfs dropping them to a recovery shell on boot is effectively wiping the data for most non-technical users.

              5. Then you see that he needs to get kernel developers involved to write special btrfs-progs and kernel patches just to get the filesystem to mount (and last I checked he still hasn't been able to mount it). I do not see issues like this posted frequently on the ext4 or xfs mailing lists, nor on the OpenZFS issue tracker, and I don't think that's entirely a coincidence or confirmation bias.

              6. Btrfs going read-only whenever it encounters a problem is pretty unique among filesystems. Compare it to ZFS which has much better behavior in terms of allowing users to gracefully boot into a degraded mode (without special kernel arguments) and attempt to salvage their data. The fact that btrfs' default behavior is not currently compatible with the rest of the standard Linux desktop ecosystem is a problem for users, shifting blame elsewhere doesn't change the effective result.

              7. When you ignore arguments about filesystem availability being important and instead blame non-technical users for not having backups you are either not understanding the issue or you are being intellectually dishonest. I prefer to think you just don't understand.
              1-3: Well, for those running HDDs it is a different matter, but if you are on an SSD that does some deduplication / stores the duplicate data in the same section, then yes, you are absolutely at risk of metadata corruption. Personally I would much rather have the filesystem stop before corrupt metadata wreaks havoc elsewhere. And I am not saying that the study is bad - it is just interesting to see that they decided to do this study on a non-LTS kernel; I did not see if it was a .1 or .10 update, so it is hard to tell how "well tested" that particular kernel is. Anyway, there have been lots of improvements and fixes to BTRFS since kernel 4.17. And yes, I am aware that many do not have backups, but if you value at least some of your data (I do) then a backup disk is not THAT costly anymore, and I would much rather have a filesystem that stops when something is broken. That way I can restore my backups and be happy - I guess that is a matter of taste.

              4. Yes, I hear you - my point is simply that you are at risk. Copying a file may succeed, but you have no idea if the data came back good or not. Sure, bad RAM for example can give the same symptom on checksumming filesystems as well, but that is a different problem.

              5. Well, I don't interpret this as unfriendly behavior. If you instead mean user-friendly, you have a point. I have been following the mailing list for quite a while and, from my experience, people there are very friendly and helpful. It is not often you see horror stories on the mailing list anymore, and those that do appear seem to come from rather interesting setups, like BTRFS on top of some other layer, or setups that are not 64-bit x86. The vast majority seems to use 64-bit x86 these days.

              6. Yes, BTRFS is sensitive to metadata corruption, but going read-only is sane. It prevents the further filesystem corruption that you can get with other filesystems. What if some metadata on a non-checksumming filesystem tells you to read the middle of a file from some completely nonsensical location? Using a video file as an example, you may end up with adult content in the middle of your documentary about supercomputers. I see your point in recovering most of your documentary, but if this were something else, like noise gathered from a SETI project, you might accidentally think you found the next WOW signal without even knowing that the file and/or filesystem is bad in the first place.

              7. No, I am not ignoring them. I simply try to get across that if you have data you value, you should have tested backups. And that it is good that a filesystem stops in order to signal to you that something bad happened, hopefully making you check your hardware before restoring from backups. Having non-technical users believe that the files they managed to copy are OK just because the copy succeeded is not particularly brilliant, the way I see it. Having good uptime is not necessarily the same as having a reliable system.

              http://www.dirtcellar.net

              Comment


              • Originally posted by waxhead View Post
                6. Yes, BTRFS is sensitive to metadata corruption, but going read-only is sane. It prevents the further filesystem corruption that you can get with other filesystems.
                Do you have any evidence that ZFS (OpenZFS of course) is more likely to lose data than btrfs, despite both being CoW and ZFS not going read-only as easily/frequently as btrfs? If anything I see more issues being posted about btrfs problems and not being able to recover data than I do with ZFS. Being more prone to unrecoverable metadata corruption (by design) is a serious issue for any file system.

                Comment


                • Originally posted by Space Heater View Post
                  Do you have any evidence that ZFS (OpenZFS of course) is more likely to lose data than btrfs, despite both being CoW and ZFS not going read-only as easily/frequently as btrfs? If anything I see more issues being posted about btrfs problems and not being able to recover data than I do with ZFS. Being more prone to unrecoverable metadata corruption (by design) is a serious issue for any file system.
                  I don't have much experience with ZFS other than reading about it and trying to set it up once on FreeBSD.

                  http://www.dirtcellar.net

                  Comment
