Btrfs In Linux 6.8 Transitions Metadata Processing To Using Folios
Originally posted by CommunityMember
With almost any change, there are almost always edge cases that degrade performance. Sometimes those can be addressed later, and sometimes the edge cases are acknowledged to be just edge cases (and rare in the real world) and accepted in order to gain the improvements in other (more common) cases.
I'm just curious whether my understanding was incorrect, because I had been expecting folios to be a significant net win for performance.
Comment
-
Originally posted by varikonniemi
Why was this merged with a few percent performance degradation? The old model was not even deprecated!
Somehow I get the feeling that this change is needed to push forward their work of finally fixing up the RAID hole in the future. But if so, it should not be merged beforehand just to hide the nastiness that the RAID hole fix necessitates. Or alternatively, if the upcoming "large folio" work will fix the performance issue, it should have been merged all at once.
I doubt folios would be much help in fixing the "RAID"5/6 write hole. Folios are (as far as I understand) "just" fancy pages: they are at least PAGE_SIZE large, their size is a power of two, they are aligned to a power-of-two boundary, and all their data is contiguous, plus some other magic witchcraft that makes them useful, like LRU, refcount and usecount.
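To make the size and alignment rules concrete, here is a minimal sketch of the constraints as I described them. It is not kernel code: PAGE_SIZE = 4096 is an assumed value, looks_like_folio() is a made-up helper name, and I am assuming "aligned" means aligned to the folio's own size.

```python
PAGE_SIZE = 4096  # assumed value; the real one depends on the architecture


def is_power_of_two(n: int) -> bool:
    """True for 1, 2, 4, 8, ... and False for everything else."""
    return n > 0 and (n & (n - 1)) == 0


def looks_like_folio(start: int, size: int) -> bool:
    """Check a byte range against the rules from the post (hypothetical helper).

    The range must be at least one page, a power of two in size,
    and (assumption) aligned to its own size.
    """
    return size >= PAGE_SIZE and is_power_of_two(size) and start % size == 0
```

For example, a 16 KiB range starting at offset 16384 passes, while an 8 KiB range starting at offset 4096 fails the alignment check.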
As for fixing the write hole, the RAID_STRIPE_TREE is the key to that. Last time I checked, "RAID"5/6 in BTRFS is actually not (contrary to popular belief) copy-on-write. It is read-modify-write. This was apparently done for performance reasons, and I believe something was improved in kernel 6.2 to make the read-modify-write non-destructive, or at least less destructive.
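For anyone wondering why read-modify-write and the write hole go together, here is a toy sketch with plain XOR parity over two data blocks. This is generic single-parity arithmetic, not BTRFS code, and all the names and byte values are made up for illustration.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equally sized blocks together byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            out[i] ^= value
    return bytes(out)


def rmw_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """Read-modify-write: update parity in place from the old and new data."""
    return xor_blocks(old_parity, old_data, new_data)


# A stripe with two data blocks and one parity block (toy values).
d0_old, d1 = b"\x01\x02", b"\x10\x20"
parity_old = xor_blocks(d0_old, d1)

# Overwrite d0 in place and compute the matching parity.
d0_new = b"\xff\x00"
parity_new = rmw_parity(parity_old, d0_old, d0_new)

# Both writes landed: losing d1 is fine, it can be rebuilt.
rebuilt_ok = xor_blocks(d0_new, parity_new)

# Crash after the data write but before the parity write (the write hole):
# rebuilding d1 from the new data and the stale parity gives garbage.
rebuilt_bad = xor_blocks(d0_new, parity_old)
```

Here rebuilt_ok equals d1 while rebuilt_bad does not, even though d1 itself was never written to. That collateral damage is what makes in-place stripe updates dangerous.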
The ultimate fix for "RAID"5/6 may actually require implementing something horrible, something so vile that even the most dedicated BTRFS fanboys (like me) should have nightmares about it even in full daylight. Yes, it is nothing less disturbing than a GARBAGE COLLECTOR. In fact, Josef Bacik has created a garbage collection tree in his extent-tree v2 plans. His blog is from 2021, so I sincerely hope he has had bad dreams about this too, and luckily he dislikes garbage collection as much as any other sane being. So in any case it will hopefully not be implemented like your average slow, latency-hungry garbage collector, but as a much smarter concept that traverses the garbage collection tree and collects little by little unless the filesystem is nearly full, in which case it would have to empty out the garbage in larger batches.
I also agree with you that all the folio patches should ideally have been merged at once, but I can also understand that they want to tread carefully with such a change.
http://www.dirtcellar.net
Comment
-
Originally posted by waxhead
... a GARBAGE COLLECTOR. In fact Josef Bacik has in his extent-tree v2 plans created a garbage collection tree.
Comment
-
Originally posted by waxhead
The ultimate fix for "RAID"5/6 may actually require implementing something horrible, something so vile that even the most dedicated BTRFS fanboys (like me) should have nightmares about it even in full daylight. Yes, it is nothing less disturbing than a GARBAGE COLLECTOR. In fact, Josef Bacik has created a garbage collection tree in his extent-tree v2 plans. His blog is from 2021, so I sincerely hope he has had bad dreams about this too, and luckily he dislikes garbage collection as much as any other sane being. So in any case it will hopefully not be implemented like your average slow, latency-hungry garbage collector, but as a much smarter concept that traverses the garbage collection tree and collects little by little unless the filesystem is nearly full, in which case it would have to empty out the garbage in larger batches.
edit:
Before, copygc had to periodically walk the entire extents + reflink btrees; now it just picks the next-most-empty bucket and moves all the extents it contains.
Last edited by varikonniemi; 16 January 2024, 07:29 AM.
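The "next-most-empty bucket" idea from that quote can be sketched with a min-heap keyed on how much live data each bucket still holds. This is only a toy model of the selection step, not the actual copygc implementation; evacuation_order() and the sector counts are hypothetical.

```python
import heapq


def evacuation_order(live_sectors: dict):
    """Yield bucket ids from emptiest to fullest (hypothetical helper).

    live_sectors maps a bucket id to the number of live sectors it holds.
    The emptiest bucket is the cheapest to evacuate, so it comes first.
    """
    heap = [(live, bucket) for bucket, live in live_sectors.items()]
    heapq.heapify(heap)
    while heap:
        _live, bucket = heapq.heappop(heap)
        yield bucket
```

With buckets {1: 50, 2: 3, 3: 20} the order is 2, 3, 1. No full walk of the extents is needed to choose the next victim, only the per-bucket counts.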
Comment
-
Originally posted by fguerraz
I know it's often quoted as a BTRFS problem, but it's just a software RAID problem; it's not fixable.
This is a good read.
Incidentally, the write hole does in fact also exist on "RAID"1/"RAID"10 *if* the NOCOW attribute is set.
And it is often quoted as a BTRFS problem because it is one. BTRFS does not handle the write hole very well due to RMW (or at least it used not to: some fixes were added for "RAID"5, but not yet for "RAID"6). Remember that, unlike other implementations that may ignore the write hole (and thereby may introduce corrupted data), BTRFS catches the problem and should attempt to fix it.
Regardless of what filesystem and hardware solution is being used, our good friend Murphy usually ruins it all, so tested, working backups are essential for data you really care about. And when you think about it, it all comes down to minimizing risk, as avoiding it is usually not possible.
PS! Speaking of Murphy's law: if it is true that "Anything that can go wrong will go wrong", it actually means that the law itself will be wrong at some point, so the law is self-contradictory, which pleases me!
http://www.dirtcellar.net
Comment