
XFS With Linux 6.9 Brings Online Repair Improvements


  • #31
    Originally posted by muncrief View Post
    And OpenZFS has worked magnificently, with one exception. Linus will simply never allow it to be incorporated into Linux so it always falls behind the latest kernel version. The claim is that its CDDL license is unacceptable, though other exceptions have been made. But there's no point in arguing about it as Linus will never change his mind.
    About time you spent some effort researching this claim that there have been other exceptions.

    When was the last time something with a GPLv2-incompatible license was allowed into the mainline kernel? The last case you will find is firmware, back when firmware was still in the mainline tree, about six years ago.

    Go back to something that is not firmware (ZFS is, of course, not firmware) and you end up before the year 2000, with an AIX file system that has since been removed from the current Linux kernel.

    muncrief, for file systems in the Linux kernel, no exception to the GPLv2 license has been made in over two decades. Yes, that is before ZFS even existed as code.

    So the "other exceptions" claim really does not hold up. Since firmware and the main kernel were split into separate git repositories, no new exceptions have been made in the mainline Linux kernel at all.

    At this point we can safely say that Linus, and whatever future replacement he has, will not accept the CDDL license of OpenZFS, and there is a key reason for this.

    Aside from that, individual files can be provided under a dual license, e.g. one of the compatible GPL variants and alternatively under a permissive license like BSD, MIT etc.
    Those rules are set, documented, and agreed on by the Linux Foundation members. Yes, the license rules are a lot stricter than they used to be. You used to be able to dual-license, e.g. GPL plus a closed-source commercial license, but since that line was added to the Linux kernel process documentation, that has been ruled out as well.

    It's not just OpenZFS's CDDL that is on the wrong side of the Linux kernel rules. Please note that Linus did not write these license rules; they come straight from the Linux Foundation legal department, which is why changing the kernel's lead maintainer would change nothing.

    By the way, you will see the Andrew File System mentioned in these arguments. It came into the Linux kernel after 2000, but its source code in the kernel has always been licensed GPL-2.0-or-later, so it is compatible with the kernel's license.

    This is the problem: the Linux kernel has progressively cleaned house of the exceptions. Now that the mainline kernel has no exception code left, you want to add one... See the awkward position this leaves OpenZFS in if it keeps the CDDL for its kernel parts and attempts a mainline merge.

    OpenZFS has had its own data-destroying issues as well. The recent bug affected version 2.2.2 and also 2.1.14, showing that it wasn't new in the latest release.

    The thing with file system issues is that not all users will hit the bugs that cause data to vaporize. Those who don't hit them will claim the file system is good; those who do will claim it is bad.

    The hard reality is that we don't have the tools to validate that a file system is storing data safely. That means backups are critical, and doing backups becomes even more of a pain and more time-consuming once you start doing everything you should to reduce the risk of file system data destruction.

    Comment


    • #32
      Originally posted by muncrief View Post
      My goal was simply to discover silent data corruption as quickly as possible. Repairing it isn't an issue for me because of the multiple backups, just detecting it.
      Look into a tool called par2:


      It uses Reed-Solomon coding to detect and repair corruption in files. First you run it to create a .par2 recovery file. Then you periodically run it to test integrity using the par2 file. It works best if the files change rarely, because you need to re-generate the recovery file.
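
      If you only need the detection half and want it scripted, here's a minimal sketch in Python using a SHA-256 manifest (file layout hypothetical; unlike par2 this can only detect corruption, not repair it):

```python
import hashlib
import pathlib

def make_manifest(root):
    """Map each file under root to its SHA-256 digest."""
    root = pathlib.Path(root)
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }

def find_corrupted(root, manifest):
    """Return files whose digest changed (or vanished) since the manifest was made."""
    current = make_manifest(root)
    return sorted(f for f, digest in manifest.items() if current.get(f) != digest)
```

      Same caveat as the .par2 recovery file: you have to regenerate the manifest whenever files legitimately change.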

      Comment


      • #33
        I understand it's a complex subject oiaohm , which is why I said there was no point in arguing about it in my OP. Though I must add that Linus has long expressed disdain for OpenZFS, and actually made claims that are simply not true, for example that it was no longer actively being developed, etc.

        However I agree that there are no error-free filesystems, and the more complex the system, as COW filesystems certainly are, the more opportunity for error. So one must personally weigh the benefits against the risks. But no matter the choice, multiple backups are essential, and the more the better. I have continuous local and cloud backups, as well as tapes, optical discs, hard drives, and SSDs in a commercial storage compartment that I try to update once a year. Though I must admit that over the decades I've sometimes let the storage backups go for longer than that as my archive has grown from the initial 100 GB to 11 TB. Of course all the tapes are most certainly unreadable now, but they are labeled with the years and content and so mainly serve as mementos of days long past.

        In any case I've simply found OpenZFS to be the best solution for detecting silent data corruption so I can restore it from backups before even the backups are lost. But I have no problem with others disagreeing, and choosing different solutions.

        Originally posted by oiaohm View Post

        About time you spent some effort researching this claim that there have been other exceptions. [...] For file systems in the Linux kernel, no exception to the GPLv2 license has been made in over two decades.

        Comment


        • #34
          Originally posted by foobaz View Post

          Look into a tool called par2:


          It uses Reed-Solomon coding to detect and repair corruption in files. First you run it to create a .par2 recovery file. Then you periodically run it to test integrity using the par2 file. It works best if the files change rarely, because you need to re-generate the recovery file.
          Thank you for the information foobaz, I'd never heard of par2 before. However my data is quite diverse, and while most of it rarely changes, some of it changes quite often, so I don't think it would be a viable solution for me. And now that I've figured out how to use CachyOS correctly I feel comfortable sticking with OpenZFS.

          Comment


          • #35
            What would be nice is a filesystem with pluggable/modular built-in erasure coding, so you can choose your level of paranoia. I'm not talking about RAID spreading data across several block devices, but the actual data stream having built-in redundancy. In principle it could be implemented as a block device layer (like dm-crypt, dm-raid, and dm-integrity) - call it dm-erasure or something. There is a Python library, pyEClib, that implements erasure codes, and I think someone implemented a FUSE driver with it, but I can't find it now.

            Most block devices have some form of erasure coding, often Reed-Solomon implemented close to the hardware, with the parameters chosen to meet the designed bit-error rate (BER) for the device. While adding a separate, new erasure-coded layer could be regarded as inefficient, it would allow an improvement to the BER. Also, an erasure-coded block device could be added to the output of a backup command, allowing you to add data redundancy/recoverability to backups.

            It would be operationally helpful if data could easily be encoded as a Fountain Code as well, which allows data recovery from any sufficiently large subset of the encoded data - in other words, you can have a large hole in your backup, or have random blocks missing up to a certain number/size, and the original data is completely recoverable.
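
            A toy sketch of the basic erasure idea - a single XOR parity block, so it only survives the loss of one block, nothing like a real Reed-Solomon or fountain code:

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

# Encode: store parity = xor_blocks(*data_blocks) alongside the data.
# Decode: XOR the parity with every surviving block; since a ^ a = 0,
# the survivors cancel out and the one missing block is what remains.
```

            Real erasure codes generalize this so that any m of n blocks can be lost and recovered, at the cost of m parity blocks.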

            I'm aware I'm describing a lot of what Ceph, OpenIO, and Tahoe-LAFS do - but they are somewhat 'large scale', whereas I'd like to be able to do erasure coding on a single file or filesystem on a single (local) block device to improve BER.

            Comment


            • #36
              On Linux, you can look into dm-integrity. That is detection only, though. To be able to recover from corruption, I'd look into tools like par2 et al. on the userland side. On the hardware side, you want ECC memory. Then you want backups (that are tested).

              I believe in defense in depth and don't trust one solution à la ZFS. Don't put all your eggs in one basket. Remember when a BitTorrent client detected silent data corruption in Linux? Pepperidge Farm remembers.

              Comment


              • #37
                Originally posted by muncrief View Post
                I understand it's a complex subject oiaohm , which is why I said there was no point in arguing about it in my OP. Though I must add that Linus has long expressed disdain for OpenZFS, and actually made claims that are simply not true, for example that it was no longer actively being developed, etc.
                To be fair, you need to check Linus's claims more carefully, because you have taken one bit out of context. Oracle/Sun, the party that holds the possible patent licenses, is the one no longer actively developing ZFS.

                “Don’t use ZFS. It’s that simple. It was always more of a buzzword than anything else, I feel, and the licensing issues just make it a non-starter for me.” This is what Linus Torvalds said on a mailing list to once again express his dislike of the ZFS filesystem.


                That gives a light overview. But when you take Linus's comments in their correct context, I have not found a case where he made an invalid claim against OpenZFS.

                There is also the legal trouble that, because he does GPLv2-licensed work, he cannot risk messing with CDDL-licensed OpenZFS.

                muncrief, you need to look a little closer: a lot of Linus's so-called disdain is just Linus repeating what the Linux Foundation legal department is telling him.

                Yes, you will note there that the person quoted something from Linus that, out of context, appears to say ZFS is not maintained, but the author is careful to include the line giving the context of the debate that passage came from.

                Real maintenance, from Linus's point of view, means proper company funding with proper legal backing on patents/licenses, which, like it or not, is something OpenZFS currently does not have.

                Like it or not, there is quite a long list of issues coming from the Linux Foundation legal department that we hear second-hand through Linus Torvalds, with a few errors added in the middle because it is second-hand. There are about 20 issues Linus has stated that block OpenZFS from merging into mainline. About half would be addressed by a license change to something GPL-compatible, and the other half would mostly be solved by Oracle providing a letter saying it will not sue over this, like Microsoft provided for the NTFS merge.

                To make a purely GPL-compatible kernel module of OpenZFS you would pretty much need Oracle to sign that legal clearance letter first. So why focus on Linus Torvalds when OpenZFS's real problem is with the Oracle legal department, which will not issue the letter OpenZFS needs to be sure it is legally free and clear?

                Comment


                • #38
                  Why use a “filesystem” at all with that much data to store?
                  Object storage, with error correction and bit-rot detection and 3x replicas, is the way to go.
                  MinIO, Swarm, Riak2 and others.. maybe Amazon Glacier for archiving if you wish..
                  $1/month per TB.
                  Google Archive.. $12.30/month for 11 TB.
                  Or have and manage your own local backups as it seems you do now.
                  Bad disk? Pull and replace.. simple.
                  Upgrade? Retire and replace with larger storage.
                  Easy to use and share if you want.

                  Comment


                  • #39
                    Originally posted by Radtraveller View Post
                    Why use a “filesystem” at all with that much data to store.
                    object storage, with error correction and bit rot detection, with 3x replicas, is the way to go.
                    minio, swarm, riak2 and others.. maybe amazon glacier for archiving if you wish..
                    $1/month per tb.
                    Google archive.. $12.30/month for 11tb
                    Or have and manage your own local backups as it seems you do now.
                    bad disk? Pull and replace.. simple.
                    upgrade? Retire and replace with larger storage.
                    easy to use and share if you want.
                    Google and Amazon still use file systems under their solutions.

                    For the Google archive I can even name the file system: https://en.wikipedia.org/wiki/Google_File_System. And yes, there is a file system under S3 as well. Which raises the question of how well these file systems are in fact designed.

                    What do you do if you are being tax audited and so are unable to pay your Google/Amazon bill? See the problem yet? Local backups of important data are kind of important at times.

                    There is a reason why businesses use Ceph and other local cluster file system options.

                    Yes, Ceph, which you can set up locally, operates with 3x replication by default.

                    A little catch here: Ceph users, since it is an open-source file system, have documented that there can be combinations of sod's law where 3 replicas still equal your data vaporized.

                    Yes, Google came up with the idea that 3 copies of data equals safe. Sorry to say, a lot of Ceph users in fact run with 4 or more copies, plus snapshots (which can be important after a ransomware attack).

                    Radtraveller, the reality is these problems are not simple, and data storage done right is not cheap. Yes, a lot of businesses end up with major data loss problems because they don't put the money into doing data storage right.
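
                    For illustration, higher replica counts are just pool defaults in ceph.conf - example values only, not a recommendation for any particular cluster:

```ini
[global]
# Keep 4 copies of every object instead of the usual 3.
osd_pool_default_size = 4
# Refuse client I/O when fewer than 2 copies are online.
osd_pool_default_min_size = 2
```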

                    Comment


                    • #40
                      Originally posted by oleid View Post

                      Some time ago I had issues with some files being zeroed after a crash. That's when I switched to ext4. I guess it could be 15 years ago. But I've been told that this bug is long fixed.
                      Same here, but I'm not sure this isn't "by design". Please correct me if I'm wrong, but as far as I remember, XFS works like EXT4's data=writeback mode, which exhibits the same behavior.
                      By default, EXT4 works in the slower but safer data=ordered mode. EXT4 can be tweaked for performance, e.g. by using writeback mode and enabling async journal commits.
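
                      As an illustration, a hypothetical /etc/fstab entry for the tweaked (faster, less safe) EXT4 setup - device and mount point made up:

```
/dev/sdb1  /data  ext4  noatime,data=writeback,journal_async_commit,commit=60  0 2
```

                      data=ordered is the default, so the safe mode needs no extra options.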

                      I used XFS briefly years ago but, apart from this issue, it almost froze my machine once dirty memory filled up (for example when copying from a faster drive to a slower one).
                      Last edited by sobrus; 20 March 2024, 03:41 AM.

                      Comment
