XFS Working Towards Online Repair, Many Underlying Improvements


  • #11
    Originally posted by finalzone View Post

    Someone pointed out that XFS had more capabilities and features in IRIX which were lost when it was ported to the Linux kernel, such as Guaranteed-rate I/O. Additionally, the fact that XFS had B-tree functions pre-dating BTRFS may explain why Red Hat switched to that file system.
    Yes, XFS on IRIX had a "realtime" feature. It's still mentioned in some of the XFS man pages (e.g. mkfs.xfs). It appears you had to set aside some disk space just for this feature, though. And nowadays disks are much faster, so it's not needed.
    As for b-trees, that's just for directory indexes. The filesystem structure is (now) pretty ordinary, just broken up into "allocation groups" to allow for parallelism on drive arrays. In BTRFS the whole filesystem is apparently one giant b-tree.



    • #12
      Originally posted by Imroy View Post
      In BTRFS ....
      BTRFS tries (as many new implementations logically would) to take some of the best ideas (those not blocked by IP constraints) from the filesystems that came before it (standing on the shoulders of giants), within the limits of the target OS's capabilities, and to provide a new way. BTRFS might some day be the one, but it has not yet achieved the complete promise of a better fs for all use cases and all users. And it may never be the one. Which is also OK (it apparently addresses specific issues that FB encounters, and for 98% of the entire user base, FB, Google, and Amazon are the entire Internet, so 1 out of 3 ain't bad?).



      • #13
        Originally posted by finalzone View Post

        Someone pointed out that XFS had more capabilities and features in IRIX which were lost when it was ported to the Linux kernel, such as Guaranteed-rate I/O. Additionally, the fact that XFS had B-tree functions pre-dating BTRFS may explain why Red Hat switched to that file system.
        Many filesystems used B-trees long before BTRFS; the thing with BTRFS is that it implements CoW-friendly B-trees.



        • #14
          Meanwhile BTRFS...
          1) Already has all that, and much more. For instance, it has defrag.
          2) It even works - and works great. Over time I gave btrfs quite a run on all kinds of odd and flaky HW, just to see how it performs and how it could dodge the bullets. Hell, it worked like a charm. It manages to correct data under nearly any circumstances (as long as a sane copy remains, ofc). Damn, it can even wing it on single storage (if you used DUP).
          3) Not to mention that checksums are great, as they highlight HW issues well before they hit you in the back and cause major damage.
          4) It is much more fun to manage compared to XFS + other stuff. Ain't it cool to just plug in a drive, show it to btrfs - and voila, some extra space is added? Even if you use RAID. Yes, no RAID rebuild as a mandatory step, etc.
          5) It can convert RAID levels on the fly. I gave it a try and managed to achieve quite funny feats that are difficult or impractical otherwise.
          6) Well, I've used XFS, and compared to btrfs it tends to damage files under imperfect conditions. Even with a mere power-loss crash you can end up with files in a real crap state, like some blocks filled with zeros, which drives plenty of programs nuts. And then XFS can be goddamn slow on some metadata operations. Btrfs isn't the fastest of its kind either, BUT I never managed to stumble on pathological slowdown cases under more or less sane real-world use, unlike what happened with XFS.

          So, yep, RH, nice try, but, honestly, this looks stupid to say the least. Pretty nice riding of dead horse though, mr Zombie.



          • #15
            Originally posted by finalzone View Post
            Additionally, the fact that XFS had B-tree functions pre-dating BTRFS may explain why Red Hat switched to that file system.
            it can't explain that. everyone had btrees before btrfs, nobody had cow btrees. and 13 years later still only btrfs has cow btrees. redhat had switched to python from btrfs because redhat employs python devs while all btrfs devs have left redhat



            • #16
              Originally posted by SystemCrasher View Post
              Meanwhile BTRFS...
              it got defrag.
              Damn, it can even wing it on single storage (if you used DUP).
              Is much more fun to manage compared to XFS + other stuff.
              Defrag doesn't play well with other features iirc, such as snapshots? I think it causes some issues in combination with some other features too. Not that you can't/shouldn't use it, you just need to be aware of what can happen and to do any extra effort to fix that after the defrag if you care about avoiding the issues.

              DUP on a single storage device is OK on an HDD iirc, but not on an SSD. The BTRFS wiki advises against it for SSDs, and iirc DUP for metadata is the default on HDDs but not on SSDs for this reason: SSD products can optimize internal storage by de-duping blocks, despite duplicates existing on the file/block layer presented to the OS/user, which gives an illusion of redundancy.

              XFS is less maintenance in general imo. BTRFS requires you to do extra maintenance unless you're using the default setup from a distro like openSUSE, where this is handled for you; otherwise you need to be aware of it if you want to reduce risks. But yes, BTRFS has some nice features. I've experienced problems with BTRFS setups (to be fair, it's been years since my last try) and those were not pleasant; they were never an issue on XFS (not that BTRFS doesn't prevent issues that you could experience on XFS, it's just that in my experience I had more trouble with BTRFS than with XFS).

              Many of the issues that I ran into back then have been resolved since. I just haven't adopted BTRFS yet, as it requires more effort on my part than EXT4/XFS and the like to have it properly set up and managed long-term without dealing with BTRFS-specific troubles (which are easy to run into on small storage and with naive usage if you don't have those maintenance scripts set up and aren't aware of CoW gotchas with certain workloads).



              • #17
                Originally posted by SystemCrasher View Post
                Meanwhile BTRFS...
                1) Already has all that, and much more. For instance, it has defrag.
                2) It even works - and works great. Over time I gave btrfs quite a run on all kinds of odd and flaky HW, just to see how it performs and how it could dodge the bullets. Hell, it worked like a charm. It manages to correct data under nearly any circumstances (as long as a sane copy remains, ofc). Damn, it can even wing it on single storage (if you used DUP).
                3) Not to mention that checksums are great, as they highlight HW issues well before they hit you in the back and cause major damage.
                4) It is much more fun to manage compared to XFS + other stuff. Ain't it cool to just plug in a drive, show it to btrfs - and voila, some extra space is added? Even if you use RAID. Yes, no RAID rebuild as a mandatory step, etc.
                5) It can convert RAID levels on the fly. I gave it a try and managed to achieve quite funny feats that are difficult or impractical otherwise.
                6) Well, I've used XFS, and compared to btrfs it tends to damage files under imperfect conditions. Even with a mere power-loss crash you can end up with files in a real crap state, like some blocks filled with zeros, which drives plenty of programs nuts. And then XFS can be goddamn slow on some metadata operations. Btrfs isn't the fastest of its kind either, BUT I never managed to stumble on pathological slowdown cases under more or less sane real-world use, unlike what happened with XFS.

                So, yep, RH, nice try, but, honestly, this looks stupid to say the least. Pretty nice riding of dead horse though, mr Zombie.
                10 years without proper RAID5/RAIDZ support already shows btrfs is a dead end.
                Nice try, but, honestly, this looks stupid to say the least. I'm not gonna jump to a dying horse, Mr. BTRFS.



                • #18
                  Originally posted by polarathene View Post
                  Defrag doesn't play well with other features iirc, such as snapshots?
                  It would un-share extents, so each "instance" of the data would occupy its own space and become fully independent. There was a CoW-aware defrag, but it had its own set of problems, so it has now reverted to the unshare behavior. And if the space saving from multi-referenced extents is important, there are ways to share extents again. Tools like jdupes can "invert" this process by finding duplicates and sharing their extents. So running something like that on the "full" tree (i.e. one that contains all subvolumes/snapshots) would do the trick. Actually, this is what's called "offline dedup", and it can be applied beyond snapshots. There is nothing wrong with having 5 VMs from the same template sharing most blocks of their "disks". Not a snapshot, but cool enough anyway.
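To make the "find duplicates" half of offline dedup concrete, here is a minimal Python sketch. It is not how jdupes is implemented, and it does not share any extents; it only groups files with identical contents (by size, then by SHA-256) so a reflink-capable tool could act on the groups. The function names `file_digest` and `find_duplicate_groups` are made up for this illustration.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(root):
    """Group regular files under `root` that have identical contents."""
    # Cheap first pass: only files of equal size can be duplicates.
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    groups = []
    for size, paths in by_size.items():
        if size == 0 or len(paths) < 2:
            continue  # skip empty files and sizes that occur only once
        # Expensive second pass: hash only the candidates.
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return sorted(groups)
```

Running `find_duplicate_groups` on a directory holding several VM images cloned from one template would list the images that are still byte-identical; images that have diverged would need block-level comparison, which this sketch does not attempt.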

                  I think it causes some issues in combination with some other features too. Not that you can't/shouldn't use it, you just need to be aware of what can happen and to do any extra effort to fix that after the defrag if you care about avoiding the issues.
                  Well, using advanced features implies some use of the brain and some understanding of the tools in use. But I dare to think it pays for itself in the case of btrfs. I had a very pleasant experience this way. And unlike ZFS, it wouldn't fall apart on a kernel update.

                  DUP on single storage device is ok on HDD iirc, but not on SSD.
                  I've tried that on a scary flash stick, and it fixed some bad CSUMs. An FS that doesn't CSUM data would just face silent corruption. Not exactly cool, eh? On SSDs it would increase wear, as it causes more writes. It's something to consider if you need to write a lot of data. It also uses twice the space, like RAID-1, but on a single device, and you can choose to use it for metadata only or for both data and metadata. Rather funny idea, by any means. Well, some systems can't have more storage added - like my laptop. So those who advise me to "buy another laptop" and "create a RAID" can go do it themselves, as I've got a much better option for such a use case...

                  The BTRFS wiki advises against it for SSD and iirc there's something about DUP for metadata as default on HDD but not SSD for this reason, as SSD products can optimize internal storage by de-duping blocks despite having duplicates on the file/block layer presented to the OS/user, which gives illusion of redundancy.
                  Storage devices have different failure modes. One of them could even be "catastrophic failure", where your device totally dies - at which point DUP is useless as well. However, it wings some failure modes of some devices. I've actually tested that, and it saved me several times, preventing quite nasty FS failures (well, say, if a metadata block fails to read for whatever reason, you've got a problem).

                  XFS is less maintenance in general imo,
                  XFS gives no crap about user data and its integrity, IMO. That's my experience with XFS. Even with a recent kernel it still managed to put a block of zeros at the end of a file, screwing it. Historically, this kind of attitude has plagued XFS for an eternity. It got mostly fixed, but alas, XFS still does strange things with user data. And a mere fsck isn't that huge a deal. In the case of btrfs, scrub does check checksums, so you get a real idea of whether you can actually read all used blocks correctly.

                  BTRFS requires you to do extra maintenance unless you're using default setup from a distro like openSUSE where this is handled for you, otherwise you need to be aware of it if you want to reduce risks.
                  Well, advanced features may need some learning. It's pretty much possible to use btrfs like ext4 - it would do as well. Though it's much less fun, as it's inevitably slower due to all the features, and if you don't use the benefits it sounds like a weird tradeoff...

                  But yes, BTRFS has some nice features, I've experienced problems with BTRFS setups(to be fair it's been years since my last try) and those were not pleasant, they were never an issue on XFS(not that BTRFS doesn't prevent issues that you'd be able to experience on XFS, just from experience I had more with BTRFS than XFS).
                  Well, if you convert a RAID level with btrfs and it's interrupted in the middle, it merely resumes the conversion upon reboot. I'm yet to see what a feat like this would take with XFS and RH-proposed storage management techniques... oh wait, RH ppl are good at mumbling "unsupported feature/configuration".

                  those maintenance scripts setup and awareness of CoW gotchas with certain workloads).
                  CoW is a considerable departure from the classic approach, and it takes some reconsideration of things and pitfalls. Yet many of these gotchas aren't even unique to btrfs. Even e.g. qcow qemu VM disks could give you similar surprises if you don't understand what a "snapshot" really is and why the disk grows in size as the VM and the snapshot diverge. Look, nobody would do this trickery if there were no benefits to justify all that :P.
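The "disk grows as VM and snapshot diverge" effect can be shown with a toy model. This is only a sketch in Python, not how btrfs or qcow actually lay out data; `CowImage` and `unique_blocks` are invented names, and a "block" here is just a Python object that two images may both reference.

```python
class CowImage:
    """Toy copy-on-write image: a list of references to shared blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def snapshot(self):
        """A snapshot copies only the references, not the block objects."""
        return CowImage(self.blocks)

    def write(self, index, data):
        """Writing points this image at a new block; others keep the old one."""
        self.blocks[index] = bytearray(data)


def unique_blocks(*images):
    """Count distinct block objects referenced by any of the images."""
    return len({id(b) for img in images for b in img.blocks})


base = CowImage(bytearray(b"blk%d" % i) for i in range(4))
snap = base.snapshot()
shared = unique_blocks(base, snap)      # snapshot costs no extra blocks yet
base.write(0, b"new0")
base.write(1, b"new1")
diverged = unique_blocks(base, snap)    # each overwritten block now exists twice
```

In this model the snapshot is free at the moment it is taken, and space use grows by one block for every block the live image overwrites afterwards, because the snapshot still pins the old version.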



                  • #19
                    Originally posted by zxy_thf View Post
                    10 years without proper RAID5/RAIDZ support already shows btrfs is a deadend.
                    Well, there is RAID5. It is imperfect, but it does work.

                    Nice try, but, honestly, this looks stupid to say the least. I'm not gonna jump to a dying horse, Mr. BTRFS.
                    And you look like yet another ZFS troll/zealot... yet,
                    1) ZFS isn't mainline. And that's quite a bullshit when it comes to a filesystem, especially a root filesystem, to put it straight. Sure, I've seen some ZFS fans suddenly "unable to mount root" after a kernel update. Somehow I'm not terribly excited about this option.
                    2) Twice so if I want to run a recent kernel or even give a -RC a try, etc.
                    3) Well, mainline devs wouldn't support such a configuration either. Recently Mr. Torvalds posted a crystal clear message about all this, btw.
                    4) Has ZFS got sane memory management, after all? Is it integrated with the Linux kernel in any way? Look, btrfs works even on single-board computers and routers (lol), and when it comes to resource consumption it's more like EXT4, not some enterprise-grade heavy truck.
                    5) Eh, still no reflinks? A very cool technique to my taste. I use it and like it.
                    6) I have some useful use cases for DUP. ZFS lacks that. It's an enterprise-grade heavy truck. If I need something else ... well, it wouldn't do. Btrfs, on the other hand, is quite versatile and scalable. And it isn't fixated on enterprise heavy lifting as the one and only use case (honestly, screw it).



                    • #20
                      Originally posted by SystemCrasher View Post
                      I've tried that on a scary flash stick, and it fixed some bad CSUMs. An FS that doesn't CSUM data would just face silent corruption. Not exactly cool, eh? On SSDs it would increase wear, as it causes more writes. It's something to consider if you need to write a lot of data. It also uses twice the space, like RAID-1, but on a single device, and you can choose to use it for metadata only or for both data and metadata.

                      XFS gives no crap about user data and its integrity, IMO. That's my experience with XFS. Even with a recent kernel it still managed to put a block of zeros at the end of a file, screwing it.
                      Just to make sure it's clear: the internal de-dupe is done by the flash controller. Not all SSDs (and probably no USB sticks, afaik) are going to do this optimization, but afaik there isn't a way of knowing, as it's not something you see in marketing or in product documentation. So if you want to be on the safe side, I would be cautious about relying on DUP on a single-disk SSD with BTRFS, like the wiki advises.

                      That's unrelated to the BTRFS de-dupe feature or any third-party solution. DUP will let you keep two copies and use twice the space on storage, but internally they can be de-duped by the flash controller to reduce wear. That works against your interests when you want the extra copy of the data as a fallback if something goes wrong. Your checksums can then only tell you that something went wrong; they can't help you recover, since the DUP copy is corrupted as well.
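A toy model of that failure mode, as a hedged Python sketch: two copies plus a checksum, where a hypothetical `controller_dedups` flag makes both "copies" the same physical bytes. The class `DupStore` is invented for illustration, and `zlib.crc32` is only a stand-in for whatever checksum the filesystem really uses.

```python
import zlib


class DupStore:
    """Toy model of one logical block kept as two copies plus a checksum."""

    def __init__(self, data, controller_dedups=False):
        self.checksum = zlib.crc32(data)
        first = bytearray(data)
        # Assumption for this sketch: a de-duping controller stores the two
        # "copies" as the same physical bytes; otherwise they are independent.
        second = first if controller_dedups else bytearray(data)
        self.copies = [first, second]

    def corrupt_copy(self, index):
        """Flip one bit in the chosen copy to simulate a media error."""
        self.copies[index][0] ^= 0x01

    def read(self):
        """Return the first copy whose checksum matches, else raise."""
        for copy in self.copies:
            if zlib.crc32(copy) == self.checksum:
                return bytes(copy)
        raise IOError("checksum mismatch on every copy: detected, not recoverable")
```

With independent copies, corrupting copy 0 still lets `read()` return the good data from copy 1. With `controller_dedups=True`, the same corruption hits both references at once, so `read()` can only report the error.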

                      BTW, SSDs these days have decent enough endurance that you don't need to worry about wear from writes, unless you're writing excessively to the storage and burning through that endurance rating. In that case the disk probably isn't intended for any valuable data but is more of a scratch/cache disk, which makes it replaceable. Such use cases may not benefit much from BTRFS, though, if they're for temporary data and performance is more important. I generally see BTRFS as more useful for archive/backup or system data (personal data should be getting backups anyway if it's valuable).

                      XFS has been fine for me; it's experienced multiple power cuts and kernel panics without data corruption. BTRFS is appealing for multi-volume storage, de-dupe, snapshots and rollback, transparent compression, etc. I'm just slow to adopt it, as I'm not a casual user and my research suggests I'll run into issues from CoW/BTRFS if I don't plan properly, especially since I want to use those advanced features, which don't always play well with each other (defrag with de-dupe and compression is one example, I think).

