Bcachefs Sees More Fixes For Linux 6.9-rc4, Reiterates Its Experimental Nature

  • yump
    replied
    Originally posted by Mitch View Post

    Just to add something here: Facebook has an insane number of machines in their fleets. Those machines have BTRFS as a standard filesystem and have done so for many years. I think it's the boot drive. They routinely make use of its compression, send, and receive from what I remember. Probably other features, too.

    Fedora also ships it by default. I think the BTRFS-isn't-stable arguments should have died ages ago. BTRFS is one of many super-reliable filesystems and has been for years regardless of how rough it may have been at its beginning, many, many years ago.
    Running the same 10 workloads on 100,000 machines is a lot more like 10 tests than 1,000,000. Does Facebook use compress-force? Do they use autodefrag? Quotas? Atime? Package managers that fsync a lot, and those that don't? Do they run the same filesystem for years without reformatting, with lots of churn, sometimes near full? Do they store VM images?


  • Old Grouch
    replied
    Originally posted by DrYak View Post

    Somebody has been designing space probes as of late, I see.
    Hardly. It's standard practice in many safety-critical areas, including embedded systems. Avionics is one such area. There are many others.


  • DrYak
    replied
    Originally posted by Old Grouch View Post
    As for failing RAM, triplicate the hardware, run the programs in lock-step, and vote on the results. That will catch a lot.
    Somebody has been designing space probes as of late, I see.


  • DrYak
    replied
    Originally posted by Quackdoc View Post
    I've hit annoyances where I will run out of space and deleting files doesn't recover space.
    Originally posted by cynic View Post
    Is there a public discussion for this, or did it just happen on your machine?
    It's an inherent property of CoW filesystems: they never overwrite anything in place, but always write a new, updated version first.
    So when deleting a file, a CoW filesystem first has to write a new version of the metadata with the file no longer present, then mark that as the latest version, and only then purge the data belonging to the previous version.
    Counterintuitively, this means that on a CoW filesystem the first step of deleting files consumes extra space.
    Different CoW filesystems have different strategies to deal with that.
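
    To make the "deleting needs space first" step concrete, here is a deliberately over-simplified toy model; it is not how any real CoW filesystem lays out its metadata, just an illustration of the accounting principle described above:

        # Toy model of why deleting on a CoW filesystem first *consumes* space:
        # metadata is never edited in place, so a new version has to be written
        # before the old version (and the file's data) can be released.
        class ToyCowFs:
            def __init__(self, capacity_blocks: int, used_blocks: int):
                self.capacity = capacity_blocks
                self.used = used_blocks

            def delete_file(self, metadata_blocks: int, data_blocks: int) -> None:
                # Step 1: write the *new* metadata version (old one still on disk).
                if self.used + metadata_blocks > self.capacity:
                    raise OSError("ENOSPC: no room to write the updated metadata")
                self.used += metadata_blocks
                # Step 2: only now can the old metadata version and the data go.
                self.used -= metadata_blocks
                self.used -= data_blocks

        fs = ToyCowFs(capacity_blocks=100, used_blocks=100)   # completely full
        try:
            fs.delete_file(metadata_blocks=1, data_blocks=10)
        except OSError as err:
            print(err)   # ENOSPC: no room to write the updated metadata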

    On BTRFS, with older versions it used to be possible to paint yourself into such a corner (e.g. that was a failure mode on the original Jolla 1 phone).
    More recent versions of the filesystem allocate a bit of extra reserved space to get themselves out of this corner case (e.g. filling the whole partition has happened to me a couple of times while doing snapshotted full-system upgrades on openSUSE Tumbleweed, and I was able to delete stuff to free space without much trouble).

    Another problem is how BTRFS allocates chunks: you might need to allocate a new metadata chunk, but all of the device space has already been allocated to chunks, and even if there's free space inside them (e.g. the data chunks aren't 100% full), you can't allocate more chunks of the type you need.
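
    One hedged way to spot that situation is the "btrfs filesystem usage" report, which shows unallocated device space alongside per-type chunk usage. Below is a minimal sketch of reading it programmatically; the mount point and the 1 GiB threshold are arbitrary placeholders, only the btrfs-progs command itself is standard.

        #!/usr/bin/env python3
        # Sketch: flag the "free space inside data chunks, but nothing left to
        # allocate new metadata chunks from" situation by checking the
        # unallocated space reported by `btrfs filesystem usage --raw`.
        import subprocess

        MOUNT = "/mnt/pool"  # placeholder mount point

        report = subprocess.run(
            ["btrfs", "filesystem", "usage", "--raw", MOUNT],
            capture_output=True, text=True, check=True,
        ).stdout

        def overall(field: str) -> int:
            # Pull a "<field>: <bytes>" line out of the "Overall:" summary.
            for line in report.splitlines():
                if line.strip().startswith(field + ":"):
                    return int(line.split(":", 1)[1].split()[0])
            raise KeyError(field)

        if overall("Device unallocated") < 1 << 30:   # arbitrary ~1 GiB threshold
            print("Hardly any unallocated space left: new metadata chunks")
            print("cannot be created; consider running a filtered balance.")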

    Originally posted by Quackdoc View Post
    Public discussion is all over the place; if you search "deleted files but still no free space" you will find a lot of results. I believe you needed to manually rebalance the disk, but if you were unlucky this would still fail and you would need to add something else? Can't remember; I wound up just wiping the drive, since downtime was more important than the actual data itself.
    So, balancing:

    - used to be the approach needed to solve the ENOSPC corner case, trying to free just enough space by reorganizing the content of the chunks so that the deletion goes through. Nowadays this isn't needed anymore, thanks to the reserved space dedicated to getting out of this situation.

    - is also the way to get out of "we have free space, but we actually need chunks of the other type, because those are all full and there's no chunk space left to allocate" (e.g. you have free space in data chunks but ran out of metadata and out of unallocated space for new metadata chunks, or vice versa). Balancing releases chunk space, which can subsequently be reallocated for the needed chunk type. This is usually taken care of by the OS through periodic maintenance (running scrubs and balances), so it shouldn't happen nowadays on distros that are designed to provide BTRFS as a feature, but it could happen on a homegrown custom setup if you didn't think of handling balancing as part of regular maintenance.

    - before wiping, the last-ditch effort is to temporarily add a bit of extra space to the pool (add another disk or partition to the filesystem), run the balance again to free chunks, and then remove the extra space afterwards (see the sketch just below). This recovers nearly all "not enough space" troubles; I have never seen an allocation problem that survived it.
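
    A minimal sketch of that sequence, purely illustrative and run at your own risk: the mount point and spare partition below are placeholders, while the btrfs-progs subcommands (balance with usage filters, device add, device remove) are the standard tools.

        #!/usr/bin/env python3
        # Rough sketch of the recovery sequence described above (run as root).
        # MOUNT and SPARE are placeholders for your own filesystem and a
        # temporary extra partition; adjust before trying anything.
        import subprocess

        MOUNT = "/mnt/pool"      # placeholder: the full btrfs filesystem
        SPARE = "/dev/sdX1"      # placeholder: temporary extra device

        def run(*args: str) -> None:
            print("+", " ".join(args))
            subprocess.run(args, check=True)

        try:
            # 1. A filtered balance only rewrites mostly-empty chunks,
            #    which is the cheap way to release chunk space.
            run("btrfs", "balance", "start", "-dusage=25", "-musage=25", MOUNT)
        except subprocess.CalledProcessError:
            # 2. If even that fails for lack of room, temporarily grow the
            #    pool, balance again, then shrink back to the original device.
            run("btrfs", "device", "add", SPARE, MOUNT)
            run("btrfs", "balance", "start", "-dusage=25", "-musage=25", MOUNT)
            run("btrfs", "device", "remove", SPARE, MOUNT)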

    Originally posted by Mitch View Post
    Just to add something here: Facebook has an insane number of machines in their fleets. Those machines have BTRFS as a standard filesystem and have done so for many years. I think it's the boot drive. They routinely make use of its compression, send, and receive from what I remember. Probably other features, too.
    And the problem lies in the "probably other features" part.

    Facebook isn't testing RAID5/6. Absolutely all of their BTRFS-running hardware uses dual-drive configurations, i.e. exclusively the raid0 and raid1 modes.
    They have no interest in supporting RAID5/6 (it's not in their use case) and are not spending many resources on it, hence those modes are still marked as unstable in the official documentation.
    (OTOH, they are clearly marked as "here be dragons", so anyone complaining about data loss while using them had it coming.)

    Originally posted by Mitch View Post
    BTRFS is one of many super-reliable filesystems and has been for years regardless of how rough it may have been at its beginning, many, many years ago.
    ...except for the parts that are still marked "unstable" or "experimental" nowadays.

    Also, speaking of wide deployment: the couple of million Steam Decks running BTRFS on their root partitions are another good example.

    The catch is: by default the Steam Deck's SteamOS mounts root as read-only, which drastically reduces the risk that somebody who has no idea how to use BTRFS safely breaks something by accident. The root partition is either written to only during system upgrades (and the people at Valve are competent enough not to break BTRFS there), or it requires the end user to unlock it first - at which point that's a great filter to make sure only people who have some idea of what they are doing touch it.
    (And surprisingly, most of the unofficial software that users add to their Steam Deck (flatpaks, EmuDeck, etc.) plays nicely with that locked root and stays put within the ext4 home partition. There aren't many "copy-paste these instructions into the command line" guides around that unlock root and break it.)


  • varikonniemi
    replied
    Originally posted by npwx View Post
    "Worst case scenario you're not going to lose data, as long as you can be patient [...]"

    Everything he does seems to be special. It is still a highly experimental filesystem. Worst case is complete data loss, as for any experimental (and non-experimental) filesystem.
    In the history of filesystems, I don't think any has been released with a bug that systematically goes and overwrites all of your data, resulting in data loss. So that leaves one possibility: the data is there, but not accessible because metadata was lost to a bug. Bcachefs does not suffer from this failure mode, as its data structures can be described as self-documenting and can be recreated by scanning the device. This is why it has never suffered a bug where data loss has happened, just temporary unavailability.
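
    As a purely conceptual illustration of that "recreate by scanning" idea (a toy format invented for the example, not bcachefs's actual on-disk layout): if every record carries a magic marker, its own key and a checksum, a lost index can be rebuilt by scanning the raw device and keeping whatever still verifies.

        # Toy sketch: rebuild an index by scanning for self-describing records.
        # Invented record layout: magic | key | length | crc32 | payload.
        import struct, zlib

        MAGIC = b"TOY1"

        def record(key: int, payload: bytes) -> bytes:
            return MAGIC + struct.pack("<III", key, len(payload), zlib.crc32(payload)) + payload

        def scan_and_rebuild(raw: bytes) -> dict[int, bytes]:
            index: dict[int, bytes] = {}
            pos = raw.find(MAGIC)
            while pos != -1:
                header = raw[pos + 4 : pos + 16]
                if len(header) == 12:
                    key, length, crc = struct.unpack("<III", header)
                    payload = raw[pos + 16 : pos + 16 + length]
                    if len(payload) == length and zlib.crc32(payload) == crc:
                        index[key] = payload   # record still verifies, re-index it
                pos = raw.find(MAGIC, pos + 1)
            return index

        # Two valid records surrounded by garbage still come back intact.
        device = b"\x00" * 32 + record(1, b"hello") + b"garbage" + record(2, b"world")
        assert scan_and_rebuild(device) == {1: b"hello", 2: b"world"}
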
    Last edited by varikonniemi; 13 April 2024, 06:48 AM.


  • cynic
    replied
    Originally posted by Quackdoc View Post

    Public discussion is all over the place; if you search "deleted files but still no free space" you will find a lot of results. I believe you needed to manually rebalance the disk, but if you were unlucky this would still fail and you would need to add something else? Can't remember; I wound up just wiping the drive, since downtime was more important than the actual data itself.
    Oh right, I see what you're talking about.

    This issue is actively being worked on, and I think they got good results lately (I never experienced that situation first-hand, because I try to avoid it).


  • cynic
    replied
    Originally posted by timofonic
    I previously specified it was a joke.

    Sorry, I missed the joke clarification!

    Originally posted by timofonic
    I'm neuroatypical, but your Aspie traits are over 9000


  • Quackdoc
    replied
    Originally posted by cynic View Post
    Is there a public discussion for this, or did it just happen on your machine?
    Public discussion is all over the place; if you search "deleted files but still no free space" you will find a lot of results. I believe you needed to manually rebalance the disk, but if you were unlucky this would still fail and you would need to add something else? Can't remember; I wound up just wiping the drive, since downtime was more important than the actual data itself.


  • woddy
    replied
    Originally posted by Quackdoc View Post

    It failed irrecoverably across multiple devices - I think a total of 7 times within a span of 6-9 months for me, all single-drive devices.
    With all due respect, if we took everything that users write on social media, forums, etc. as valid, the world would have ended long ago.


  • Old Grouch
    replied
    Originally posted by cynic View Post

    A cable failing or a disk failing is something that software can detect and manage.
    Defective RAM is a completely different situation: with failing RAM, your software cannot run as intended and cannot give any guarantee of correctness.
    I fear we are talking past each other.

    The problem is the assumption that RAM is perfect. Flash memory is not perfect, and neither are HDDs. Flash uses some pretty sophisticated algorithms to massage unreliable storage into something with a well-determined error rate. HDDs do the same, as do CDs. However, the same is not done for RAM (except in some specialised areas), which leads to the insanity of bit errors either not being detectable, being detectable but not reported, or the reports being available yet ignored by the operating system. The reason is that adding the appropriate checks would slow down RAM markedly. As an industry, especially in consumer devices, we use performance as an excuse to ignore data integrity.

    As for failing RAM: triplicate the hardware, run the programs in lock-step, and vote on the results. That will catch a lot, and you can guarantee correctness up to the chance of the same error occurring in two hardware instances simultaneously, leading to an incorrect vote result - non-zero, but low. You will need to do this even with formally proven software, because hardware exists in the real world and has an error rate independent of your logical proofs. Proving your software 'correct' (according to its specification) tells you nothing about the hardware it runs on. A stray cosmic ray or alpha particle can ruin your day.
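
    A minimal sketch of that 2-out-of-3 vote, purely illustrative and not any particular avionics or space-grade implementation:

        # Triple modular redundancy in miniature: run three independent lanes
        # and accept whichever result at least two of them agree on.
        from collections import Counter

        def vote(lanes):
            value, count = Counter(lanes).most_common(1)[0]
            if count < 2:
                raise RuntimeError("no majority: all three lanes disagree")
            return value

        # One lane suffers a bit flip; the other two outvote it.
        assert vote([42, 42, 42 ^ (1 << 7)]) == 42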

    Expecting a single instance of anything to run perfectly is 'brave'.
