Btrfs Seems To Finally Have Failed Me On A Production System


  • #31
    The problem is, if the bug gets fixed in a timely fashion, then Michael can't write more sensationalist pieces about btrfs not working.

    Comment


    • #32
      Originally posted by profoundWHALE View Post

      Their stupid fast boot causes so many headaches for me. If anything happens to the system while trying to shut down, it's essentially, "screw you!" and you have to reinstall. Of course, you might think that they've either ironed that out or that it doesn't happen often. It's happened to me around 3 times with Windows 8-8.1 and 2 times already with Windows 10.
      Well, I'm lucky in this regard: I've gotten rid of all the Windows machines around, so it can't bother me, for purely technical reasons.

      Originally posted by AndyChow View Post
      I'm not surprised. BTRFS gets corrupt all the time. Especially if you balance or defrag using compression, on a raid system. You can be almost sure it's not going to finish. I opened two different questions on StackOverflow (or a sister site) that got me high rep, but no solutions.

      You can tell how stable BTRFS is based on how many times it appears in each changelog of a kernel release. "Fixes corruption problems that ..., Fixes issue from...".

      BTRFS claims that it's so much more flexible than ZFS, you can add devices and remove others on the fly, and so on. But try it! Try to remove a drive on a 5 drive raid-1 that has lzo compression! I dare you! Four days later, it still won't be finished. And if a disk breaks, try to mount in degraded mode and rebuild the array. I dare you again! It will eventually start spitting unrecoverable errors left and right. Wait, don't I have redundancy? Isn't that the point? Oh, the metadata was redundant, and the data was supposed to be redundant, but you didn't balance recently enough, and even if you mounted with data redundancy, it doesn't always work so well. Sorry. We told you to have another backup.

      Yeah, BTRFS is kind of a joke right now.
      Well, every filesystem has its strong and weak points, and any reasonably big program inevitably contains bugs. What is the point of yelling?
      1) Got a bug? File a bug report! If some case you need is not working, either take care of it or GTFO; that's how software development works. It makes little sense to just yell. And the bug tracker works better than forums, which the devs may or may not read; the bug tracker is more likely to be read.

      2) Every filesystem comes with its own corner cases where it performs well below average. There are no silver bullets, and everyone who claims otherwise is a fucking liar or a moron. That's why Linux comes with a dozen and a half various filesystems for all occasions, btw.

      3) ZFS has its own quirks and bugs, and everyone who claims otherwise is either a liar or a moron as well. Very simple googling proves it, turning up dozens of people who used ZFS and faced issues. So maybe you should also tell us ZFS is a joke, too, to keep the comparison fair? Say, the claim that CoW "does not need" defrag is blatant marketing bullshit. Try heavy CoW workloads, and when it fragments to hell... do... um, well, Sun's marketing bullshit does not tell you what to do. They give some lame hints on how to partially avoid it. But if you use CoW, it will inherently fragment, due to the nature of CoW. And in ZFS you can't disable CoW for a particular file where it causes a lot of trouble (DBs, and VMs with CoW-based disks, to name a few).

      4) ZFS has no future on Linux. It's out-of-tree crap which will never work out of the box, thanks to Sun's silly licensing. So it can easily fall apart upon a kernel upgrade, etc. If a filesystem is not in mainline, it sounds like trouble on the way. There is no point in yelling about ZFS; most Linux people would not use it for this very reason. Not to mention crappy integration with the rest of the OS, like its cache memory not being integrated with the rest of the kernel's memory management.

      5) Actually, I've got some machines running btrfs, and they have been running flawlessly for a while. Sure, it's better to use a recent kernel to get the fixes. But if we take a look at the commits to any other filesystem, and to kernel code in general, btrfs isn't fundamentally different from other parts. If you feel like a scaredy cat, do not read commit logs; they always contain a lot of scary stuff, unless the software has been abandoned, in which case there are no commits and no scary changelogs as a result. That does not mean the software no longer contains bugs, btw. It usually means the software is dead and people have stopped using it to the degree that nobody steps on the rare bugs. As simple as that.

      Comment


      • #33
        Originally posted by SystemCrasher View Post
        3) ZFS has its own quirks and bugs, and everyone who claims otherwise is either a liar or a moron as well. Very simple googling proves it, turning up dozens of people who used ZFS and faced issues. So maybe you should also tell us ZFS is a joke, too, to keep the comparison fair? Say, the claim that CoW "does not need" defrag is blatant marketing bullshit. Try heavy CoW workloads, and when it fragments to hell... do... um, well, Sun's marketing bullshit does not tell you what to do. They give some lame hints on how to partially avoid it. But if you use CoW, it will inherently fragment, due to the nature of CoW. And in ZFS you can't disable CoW for a particular file where it causes a lot of trouble (DBs, and VMs with CoW-based disks, to name a few).
        The concept of needing defragmentation is ambiguous, in that it means different things to different people. So far, I have not met anyone whose heavily fragmented ZFS filesystem suffered more than a factor-of-2 decrease in performance. However, filesystems on platforms like VMS and Windows can become so heavily fragmented that the system is unusable. No, you don't need defragmentation on ZFS, so long as needing defragmentation is defined as having an unusable system. If your definition is noticing a difference in some benchmark, then yes, you do.

        The one time I did meet someone whose performance had dropped by a factor of 2, he had been using BitTorrent, and he could have avoided the performance degradation entirely had the torrent files been saved to their own dataset and copied elsewhere afterward.
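        As a sketch, that workaround looks like this (pool and path names are hypothetical):

```shell
# Keep in-progress torrents on their own ZFS dataset...
zfs create tank/torrents
# ...and copy finished files out; the sequential copy rewrites the
# file contiguously on the destination dataset, leaving the
# fragmentation behind on the scratch dataset.
cp /tank/torrents/big.iso /tank/media/big.iso
```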

        As for disabling CoW, that is a very bad idea, because doing checksums without CoW requires in-place checksums. This means that any misdirected write that happens to carry a valid checksum would be considered valid data. NetApp's WAFL does this, and it has data integrity issues that ZFS lacks.

        That being said, it is possible to improve database performance by setting the recordsize on the dataset to match the record size used by the database. Failing to do this creates a boomerang effect in database benchmarks, where loads on larger datasets decrease IOPS because the disk is throughput-bound and larger dataset sizes reduce the cache hit rate needed for read-modify-write of the filesystem records. The performance impact of failing to match the recordsize on database workloads is a factor of 32 in the worst case for 8KB records.
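        That worst case is easy to sanity-check with arithmetic: with ZFS's default 128K recordsize and an 8K database page, a cold-cache page write becomes a 128K record read plus a 128K record write, i.e. 256K of I/O for 8K of useful data. A rough sketch (the helper function is mine, not a ZFS API):

```python
# Back-of-the-envelope model of worst-case write amplification when a
# database's page size does not match the ZFS recordsize. Assumes a
# cold cache, so every page write forces a read-modify-write of the
# full filesystem record.

def write_amplification(recordsize_kb, page_kb):
    # Read the whole record, modify the page inside it, write it back.
    io_per_page_kb = recordsize_kb + recordsize_kb  # read + write
    return io_per_page_kb / page_kb

print(write_amplification(128, 8))  # 32.0 -- the worst-case factor of 32
print(write_amplification(8, 8))    # 2.0 here; matching sizes also lets
                                    # the read be skipped entirely, since
                                    # the whole record is replaced
```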

        Originally posted by SystemCrasher View Post
        4) ZFS has no future on Linux. It's out-of-tree crap which will never work out of the box, thanks to Sun's silly licensing. So it can easily fall apart upon a kernel upgrade, etc. If a filesystem is not in mainline, it sounds like trouble on the way. There is no point in yelling about ZFS; most Linux people would not use it for this very reason. Not to mention crappy integration with the rest of the OS, like its cache memory not being integrated with the rest of the kernel's memory management.
        I would argue that being out of tree is irrelevant when userland is almost entirely out of tree, and yet people still use those out-of-tree components. The VFS is a pluggable interface, much like the syscall table that userland uses. Given that ZFS consumes that interface, being out of tree makes it little worse than glibc or any other component that you need for a usable Linux system.

        That said, the concept of "no future on Linux" means different things to different people. In your case, it seems to mean that you would not use it. Being at risk for silent corruption is your prerogative. However, a growing number of people and large organizations are using it, and that is unlikely to change. Among its users are The Weather Channel, Netflix, Johns Hopkins University, the US Government, etcetera. They do not care which components are in Linus' tree so much as they care about things actually working, which ZFS does rather well.
        Last edited by ryao; 07-23-2015, 12:57 PM.

        Comment


        • #34
          Originally posted by ryao View Post
          The concept of needing defragmentation is ambiguous, in that it means different things to different people. So far, I have not met anyone whose heavily fragmented ZFS filesystem suffered more than a factor-of-2 decrease in performance. However, filesystems on platforms like VMS and Windows can become so heavily fragmented that the system is unusable. No, you don't need defragmentation on ZFS, so long as needing defragmentation is defined as having an unusable system. If your definition is noticing a difference in some benchmark, then yes, you do.
          This looks like an attempt to spread Sun's marketing bullshit rather than a technical discussion. I do not really care what you've seen: that is a small subsample of all possible cases, after all. And apart from everything else, I've seen a couple of ZFS pools with laughable performance, only comparable to heavily used NTFS on a full drive, and for a very similar reason: the pool got full, the allocator got screwed, and fragmentation exploded. ZFS can't do anything magical about it. Sometimes the allocator has to make "unoptimal" decisions to complete a write request. The less space one has, and the more fragmented the free space is, the more often the allocator resorts to in-situ decisions instead of an optimal allocation matching the request. Over time it inevitably degrades, and the situation gets worse. This affects more or less ALL filesystems.

          So let's stop this marketing bullshit. Fragments are here, in each and every realistic filesystem working in real-world conditions. When a file gets rewritten or appended, it cannot be taken for granted that the allocator can fulfill the request in the "ideal" way, exactly matching the request. There may be other data in the way, and so on, so less optimal decisions have to be made to fulfill a particular write request. Over time it degrades, since the chance of finding free space allowing an "exact" allocation shrinks; in the long run, files which are frequently modified also accumulate more metadata to describe their allocation across all the fragments scattered here and there. So a performance decrease is to be expected in the long run.

          Sure, some filesystems resist it better, some worse. Yet if one has almost no free space, fragmentation explodes on virtually any FS, because the allocator has to resort to suboptimal decisions most of the time, eventually making things even worse.

          Then we should take a look at what CoW is. CoW is a great invention, in some sense. Used properly, CoW gives the ability to have "full" journalling without the write-speed penalty common to "classic" designs, and "instant" snapshots, due to the fact that all the data and metadata required to create a snapshot are already there, so it is basically a matter of a formal mark.

          Yet there is no free lunch. It is a great technique, but how are non-destructive writes and snapshots achieved? When one changes some data in a file, the old data is left as-is, and the changed data is written somewhere else. So one can potentially access both the old version and the new version, depending on how you parse the metadata. That's what is called CoW: changed data is copied on a write attempt. Yet the CoWed data inevitably becomes a "fragment".
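          This fragmentation-by-design is easy to see in a toy model (my own sketch, not btrfs or ZFS code): track where each logical block of a file lives, relocate a block on every rewrite the way CoW does, and count the runs of consecutive disk addresses that remain:

```python
# Toy model of why CoW fragments files: every rewrite of an existing
# block lands at a fresh location, so the file's extent list grows,
# whereas an in-place filesystem would keep the original single extent.
import random

def cow_extents(file_blocks, rewrites, seed=0):
    rng = random.Random(seed)
    # One contiguous extent to start with: logical block i at address i.
    location = list(range(file_blocks))
    next_free = file_blocks
    for _ in range(rewrites):
        blk = rng.randrange(file_blocks)
        location[blk] = next_free  # CoW: new data goes somewhere else
        next_free += 1
    # An extent boundary appears wherever addresses stop being consecutive.
    return 1 + sum(1 for a, b in zip(location, location[1:]) if b != a + 1)

print(cow_extents(1024, 0))    # 1: an untouched file stays contiguous
print(cow_extents(1024, 500))  # hundreds of extents after random rewrites
```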

          So what?
          1) CoW-based designs are more prone to fragmentation "on their own", because CoW is all about making fragments on write, lol. Every time one writes to an existing file, the write becomes a fragment.
          2) There are workloads which are "inherently bad" for CoW. Say, VMs, where the virtualization software does CoW on its own inside the VM disk file, perform very poorly, because CoWing CoW operations means doing the same work several times, with really suboptimal fragmentation. The same goes for most database designs, which were created on the assumption that the filesystem can patch a file in place and cannot provide adequate journalling for the DB's needs. These assumptions do not hold for CoW, and most DB engines were created way before CoW designs appeared. That's why btrfs got the NODATACOW per-file tunable. It allows a few exceptions, omitting CoW for particular files, and basically lets btrfs behave the way such DBs and VMs expect: a thin layer over storage which can do in-place rewrites. Sure, there is no journalling and there are no snapshots of that data anymore. However, DBs and VMs have their very own notions of snapshotting and journalling, after all.
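          For reference, the usual way to apply that tunable is the 'C' file attribute; a sketch with a hypothetical path (the flag only takes effect for files created after it is set, so it is normally applied to a fresh directory):

```shell
mkdir -p /var/lib/vm-images
chattr +C /var/lib/vm-images   # 'C' = NODATACOW on btrfs
lsattr -d /var/lib/vm-images   # should now show the C attribute
# Files created in here are rewritten in place from now on: no CoW,
# and therefore no data checksums or snapshots of their contents either.
```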

          Btw, Oracle seems to be well aware of this; NODATACOW is a very ancient option, from the times when Chris Mason worked at Oracle. OTOH, ZFS lacks similar tunables, and I guess that was not part of the design: there is nothing to deal with troublesome loads. I guess that's why Oracle no longer cares about ZFS but commits to btrfs; after all, they are all about DBs.

          Then, when it comes to fragments, the only thing Sun had to offer was lame mumbling from their marketing people, saying one has to add new drives to the pool in a timely manner and avoid running ZFS more than about 70-80% full. To make it funnier, that has also been a one-way ticket. I'm not really sure if ZFS has finally gained the ability to remove a drive from a pool, but historically it lacked one thing btrfs had: back references. Backrefs allow btrfs to walk a drive and quickly clean it up: the FS knows what the data is and how to deal with it, so it can move it away quickly, emptying the drive. That's what allows a drive to be removed easily and quickly.

          And this is what I like about btrfs: there is a real defragger. So instead of listening to marketing bullshit about adding drives (which you'll be unable to remove later, lol), here we have a technical solution. Should it become slow, the defragger will fix that, because it is implemented, unlike in ZFS. Nobody forces you to use it, and it wouldn't really be needed on storage which is 30% full all the time. But if things have gone bad, it's a much better option than re-assembling a multi-disk pool from scratch, which is the only way to "defragment" ZFS.
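          For reference, the operations in question map to btrfs-progs commands roughly like this (mount point and devices are hypothetical):

```shell
btrfs filesystem defragment -r /mnt/data   # the "real defragger"
btrfs device add /dev/sdd /mnt/data        # grow the pool on the fly...
btrfs device remove /dev/sdc /mnt/data     # ...or shrink it again; backrefs
                                           # let btrfs migrate the drive's
                                           # extents elsewhere first
btrfs balance start /mnt/data              # re-spread data afterwards
```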

          The one time I did meet someone whose performance had dropped by a factor of 2, he had been using BitTorrent, and he could have avoided the performance degradation entirely had the torrent files been saved to their own dataset and copied elsewhere afterward.
          Torrents are one of the known ways to piss off a filesystem. They're somewhat similar to DBs and VMs in some regard: when a torrent downloads a block, it is basically a "random" write into the middle of a file. This logic also assumes the filesystem is okay with in-place patching of the block. That's where CoW strikes back, because it is not in-place file patching anymore. This is another example of a "CoW-unfriendly" workload.

          As for disabling CoW, that is a very bad idea, because doing checksums without CoW requires in-place checksums. This means that any misdirected write that happens to carry a valid checksum would be considered valid data. NetApp's WAFL does this, and it has data integrity issues that ZFS lacks.
          NODATACOW basically turns the filesystem into a thin layer over the storage space. It lacks most features. Think of it as becoming something like EXT4 for that particular file.

          What is the point? Ideally, from a pure performance point of view, DBs and VMs would fit best on dedicated RAW partitions, being essentially "filesystem-like" and "journalling" structures themselves, optimized for particular tasks. But there is a catch: admins tend to hate dealing with RAW partitions. That's why DBs and VMs prefer "thin" filesystems similar to EXT4, etc.; it is a compromise between easy administration and performance loss. A full-blown CoW FS is the exact opposite of what such DBs and VMs want. That's the real reason for NODATACOW.

          That being said, it is possible to improve database performance by setting recordsize on the dataset to match the recordsize used by the database.
          On the other hand, in the case of btrfs, one can just tell the filesystem to take all its cool features elsewhere and turn into a simple, thin layer above the storage. That's what DBs and VMs really want to see. Actually they want a RAW partition, but RAW partitions really suck in terms of management, unlike files (compare file resize and partition resize, for example).

          Failing to do this creates a boomerang effect in database benchmarks, where loads on larger datasets decrease IOPS because the disk is throughput-bound and larger dataset sizes reduce the cache hit rate needed for read-modify-write of the filesystem records. The performance impact of failing to match the recordsize on database workloads is a factor of 32 in the worst case for 8KB records.
          Actually, the root cause is that VMs and DBs are similar to filesystems in their own right and tend to do their very own custom journalling, which was created with very particular assumptions about how the underlying layer behaves. A CoW FS happens to behave in totally different ways, creating massive interference between the algorithms in use. NODATACOW simply removes the whole source of this interference and exposes exactly what DBs/VMs would like to see. Once there is no source of interference, there is no need to scratch your head over how to work around it and fit two poorly compatible algorithms together. CoWing DBs and VMs is just a bad idea, end of story. For VMs it's possible to use a RAW file and then do CoW and snapshotting on the filesystem side, avoiding double journalling; yet that puts some limits on VM operations, and the VM software is basically unaware of it, so some caveats apply. It's trickier for DBs; most DBs rely heavily on their own custom journalling.

          I would argue that being out of tree is irrelevant when userland is almost entirely out of tree and yet, people still use those out of tree components.
          For me, a filesystem is something mandatory, which can't and shouldn't be an external component. You see, I actually BOOT from btrfs as well, because I like snapshots. IMHO, the system drive is actually what needs snapshots most! I can "rewind" my computers between states just like I can rewind VMs. This is very convenient. And it's not like I want to see my boot filesystem fall apart upon installing a newer kernel, etc. ZFS does not fit this use case. Yet I really like the idea of snapshotting the system drive.

          The VFS is a pluggable interface, much like the syscall table that userland uses. Given that ZFS consumes that interface, being out of tree makes it little worse than glibc or any other component that you need for a usable Linux system.
          ...unless my OS fails to boot after a kernel upgrade. That's where I'd get stuck, and that is unacceptable for me. An FS has to be in the mainline kernel to be considered as rootfs, unless one is masochistically inclined. And I really like the idea of snapshotting the system drive, so when something doesn't go the way I like, I can at least quickly get the previous system state back and try again in a matter of seconds, rather than doing some woefully long recovery.
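          A sketch of that "rewind" workflow, assuming the root filesystem lives in a btrfs subvolume named root under the top-level volume mounted at /mnt/toplevel:

```shell
# Before the upgrade, take a cheap snapshot of the root subvolume:
btrfs subvolume snapshot /mnt/toplevel/root /mnt/toplevel/root-pre-upgrade
# ...upgrade the kernel. If the new system misbehaves, roll back by
# swapping the subvolumes (or by pointing rootflags=subvol= at the
# snapshot in the bootloader) and rebooting:
mv /mnt/toplevel/root /mnt/toplevel/root-broken
mv /mnt/toplevel/root-pre-upgrade /mnt/toplevel/root
```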

          That said, the concept of "no future on Linux" means different things to different people.
          Sure. For me it means this:
          1) It will not be accepted into mainline.
          2) It will not work out of the box in most distros.
          3) As a result, it will not be widely used and will bring extra headaches.

          Without any obvious gain to offset all this crap. I really do not get what is so cool about ZFS. It seems to be over-engineered stuff from Sun which has a lot of shortcomings, most of which are really hard to address. So Sun's dev team preferred to "solve" the hardest problems with marketing bullshit. Clearly not the approach I would like to see in my system. I like Linux not because of loud marketing, but because it works for me, and works well.

          In your case, it seems to mean that you would not use it. Being at risk for silent corruption is your prerogative.
          I wouldn't, because the risk of a failed boot sequence on a kernel upgrade is unacceptable for me, and sometimes I don't mind giving a brand-new mainline kernel a try, etc. And btrfs also comes with checksums. One can disable them with NODATACOW, but that is really a special case, only supposed to be used in the few troublesome places which require extra attention anyway (DBs and VMs aren't "just files"). Filesystem checksums are good to have, sure, but they are not a silver bullet.

          And of course I do understand the storage tech I use. So it's not like your silly FUD attempt is going to work :P.

          However, a growing number of people and large organizations are using it, and that is unlikely to change. Among its users are The Weather Channel, Netflix, Johns Hopkins University, the US Government, etcetera. They do not care which components are in Linus' tree so much as they care about things actually working, which ZFS does rather well.
          That's what I call "marketing bullshit" once more. What are those few worth, if btrfs will work on virtually every Linux computer around and is eventually supposed to become the default filesystem in many distros? And all those users will get the snapshots, checksums, etc. As for those using ZFS... it reminds me of the story of Apache and Yahoo using BSD. Sure, they did. But somehow their new hosts tend to run Ubuntu, lol. I guess more or less the same fate awaits ZFS on Linux as well: a good choice for a few corporate Yahoo-like dinosaurs, sure.

          And apart from everything else, I've seen how the mainline kernel devs work. Having them on your side is a really big win; btrfs has it, and ZFS can't. I like software developed by the Linux kernel devs; they rock. And TBH I wouldn't be proud of having DRM-pushing proprietary outfits like Netflix on my side.
          Last edited by SystemCrasher; 07-26-2015, 08:03 AM.

          Comment


          • #35
            Originally posted by AndyChow View Post
            I'm not surprised. BTRFS gets corrupt all the time. Especially if you balance or defrag using compression, on a raid system. ........
            You can tell how stable BTRFS is based on how many times it appears in each changelog of a kernel release. "Fixes corruption problems that ..., Fixes issue from...".
            There have also been corruption bugs fixed in ext4, xfs ...

            Originally posted by AndyChow View Post
            BTRFS claims that it's so much more flexible than ZFS, you can add devices and remove others on the fly, and so on. But try it! ...
            I've had no problems adding/removing devices with single, RAID0 or RAID1 profiles.

            RAID5/6 support is relatively new and there are 'warning signs' around its use.

            Comment


            • #36
              Originally posted by SystemCrasher View Post
              This looks like an attempt to spread Sun's marketing bullshit rather than a technical discussion. ........
              If you honestly think you are an expert on the topic, I suggest you either try writing your own filesystem or contribute to filesystem development, rather than dismissing technical remarks from those who do. You will learn the difference between yourself and an actual filesystem developer very quickly. If you manage to stick with it long enough to do useful things and become a filesystem developer yourself, your opinions will be quite different from what they are now.

              Originally posted by SystemCrasher View Post
              So let's stop this marketing bullshit. Fragments are here, in each and every realistic filesystem working in real-world conditions. ........

              NODATACOW simply removes the whole source of this interference and exposes exactly what DBs/VMs would like to see. ........
              People often find that ZFS outperforms btrfs, even when btrfs is configured to use hacks such as nodatacow. Here is a benchmark where someone put virtual machines on various storage solutions and ran benchmarks inside them; btrfs does incredibly poorly, and nodatacow makes no difference in the database benchmark:

              http://www.ilsistemista.net/index.ph...n.html?start=5

              There is some penalty to CoW, but it is not such a problem that it manifests itself in every workload. If you configure your filesystem to use a recordsize that matches your workload, the write overhead of CoW is zero and you do not need to sacrifice integrity guarantees to achieve it.
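              For example (pool and dataset names are hypothetical), matching the ZFS recordsize to a database's page size, such as InnoDB's 16K pages, before the data files are created:

```shell
# Hypothetical pool/dataset names; requires an existing ZFS pool.
# recordsize only affects newly written blocks, so set it before
# loading any data into the dataset.
zfs create tank/mysql
zfs set recordsize=16k tank/mysql   # match InnoDB's 16K page size
zfs get recordsize tank/mysql       # verify the property took effect
```

With records and database pages the same size, a page rewrite never has to read-modify-write a larger record, which is where most of the CoW write amplification would otherwise come from.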

              Also, ZoL has had 2 releases since those benchmarks were done, and each release tends to improve performance, while btrfs has no such benefit on enterprise distributions where backports are rare. Consequently, the difference between ZFS and the others could have closed, although it is not inconceivable that ZFS might actually outperform them at some point. What happens depends on the workload and the bottlenecks inside the code. There are worst-case scenarios for ZFS' CoW, but real-world workloads typically do not manifest them.

              As for Oracle, they are happy to sell btrfs, but their job descriptions suggest that they predominantly use their proprietary version of ZFS on Solaris to run their business, rather than btrfs on Oracle Linux:

              https://www.reddit.com/r/zfs/comment...ds_zfs_admins/

              It is my understanding that their contributions to btrfs are not anything like what they were when they purchased Sun.

              Originally posted by SystemCrasher View Post
              For me, a filesystem is something mandatory, which can't and shouldn't be an external component. You see, I actually BOOT from btrfs as well, because I like snapshots. IMHO, the system drive is actually what needs snapshots most! So I can "rewind" my computers between states just like I can rewind VMs. This is very convenient. And it's not like I want to watch my boot filesystem fall apart upon installing a newer kernel, etc. ZFS does not fit this use case. Yet, I really like the idea of snapshotting the system drive.

              ...unless my OS fails to boot on a kernel upgrade. That's where I'd get stuck. This is unacceptable for me. An FS has to be in the mainline kernel to be considered for rootfs, unless one is masochistically inclined. And I really like the idea of snapshotting the system drive, so when something does not go the way I like, I can at least quickly get back to the previous system state and try again in a matter of seconds, rather than going through some woefully long recovery.
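              The "rewind" workflow described here is typically done with btrfs subvolume snapshots; a rough sketch, with paths and subvolume layout that are assumptions and will vary by distro:

```shell
# Hypothetical layout: root mounted from a subvolume, snapshots kept
# under /.snapshots. Take a writable snapshot before a risky upgrade:
btrfs subvolume snapshot / /.snapshots/pre-upgrade

# ...kernel upgrade goes wrong...

# From a rescue environment, list subvolumes and boot the snapshot,
# or make it the default subvolume for the next boot:
btrfs subvolume list /
btrfs subvolume set-default <subvolid> /   # <subvolid> from the list above
```

Snapshots of a subvolume are cheap and near-instant because CoW means unchanged blocks are shared rather than copied.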
              This is a problem to take up with your distribution. Your distribution handling updates properly and putting ZFS into Linus' tree are separate matters; one does not imply the other. That said, I do not hear many reports of updates breaking things these days, as distributions that use DKMS have hooks that call DKMS to rebuild the modules and regenerate the initramfs archive whenever a new kernel is installed. There is word of a regression here with 0.6.5 on EL7, but that should be fixed in the near future.
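              For what it's worth, a quick way to check that DKMS actually rebuilt the ZFS modules for a newly installed kernel (assuming a DKMS-packaged zfs/spl):

```shell
# Show per-kernel build state of all DKMS modules (zfs/spl included).
dkms status

# Rebuild and install any modules missing for the running kernel.
dkms autoinstall

# Confirm the zfs module is resolvable for the current kernel.
modinfo zfs | head -n 3
```

If `dkms status` shows the modules built only for the old kernel, running `dkms autoinstall` before rebooting avoids the failed-boot scenario being discussed.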

              Either way, Michael could not boot any kernel after what went wrong on his system. The inconvenience of a kernel update not going 100% smoothly is nowhere near as bad.

              Originally posted by SystemCrasher View Post
              Sure. For me it means this:
              1) Would not be accepted to mainline.
              2) Would not work out of the box in most distros.
              3) As a result, it would not be widely used and would give extra headache.

              Without any obvious gain to offset all this crap. I really do not get what is so cool about ZFS. It seems to be overengineered stuff from Sun that has a lot of shortcomings, most of which are really hard to address. So Sun's dev team preferred to "solve" the hardest problems with marketing bullshit. Clearly not an approach I would like to see in my system. I like Linux not because of loud marketing, but because it works for me and does it well.
              It seems like the only thing that matters to you is #3. The reason I like ZFS is that it has given me fewer headaches than any other filesystem I have ever tried, btrfs included.

              Many people who use Windows make similar claims of seeing no obvious gain in using a Linux distribution, despite having *never* used one themselves, or having tried one in a manner that was doomed to fail so that they could claim to have given it a chance. I think your feelings toward ZFS are the same.

              Originally posted by SystemCrasher View Post
              I wouldn't, because being at risk of a failed boot sequence on a kernel upgrade is unacceptable for me. And sometimes I do not mind giving a brand-new mainline kernel a try, etc. And btrfs also comes with checksums. One can disable them with NODATACOW, but that is really a special case, meant only for the few troublesome places which require extra attention anyway (DBs and VMs aren't "just files"). And filesystem checksums are good to have, sure. But they are not a silver bullet.

              And of course I do understand storage techs I use. So it's not like your silly FUD attempt going to work :P.
              I doubt that given that:

              1. You practically worship nodatacow when the btrfs developers themselves advise against its use (as far as I can tell from #btrfs on freenode).
              2. You are unaware that btrfs performance can be terrible even with nodatacow.

              Checksums in pointers are the closest thing there is to a silver bullet in filesystems when dealing with silent corruption events.
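              The principle is easy to demonstrate in userspace (file names here are made up, and a real FS keeps the checksum in the block pointer rather than in a sidecar file): store a checksum when writing, verify it when reading, and a silently flipped bit is caught instead of being returned as good data:

```shell
# Toy model of checksummed reads: after a silent single-byte flip,
# the data no longer matches the stored checksum, so the corruption
# is detected on the next verification instead of going unnoticed.
echo "important data" > block.dat
sha256sum block.dat > block.sum                  # "pointer-side" checksum

printf 'X' | dd of=block.dat bs=1 count=1 conv=notrunc 2>/dev/null  # silent corruption

if sha256sum -c --quiet block.sum 2>/dev/null; then
    echo "data OK"
else
    echo "corruption detected"
fi
```

What the checksum alone cannot do is repair the data; that is why ZFS and btrfs pair it with redundancy (mirrors, RAID, or `copies=2`) so a bad block can be rewritten from a good copy.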

              Originally posted by SystemCrasher View Post
              That's what I call "marketing bullshit" once more. What are those few worth, if btrfs will work on virtually every Linux computer around and is eventually supposed to become the default filesystem in many distros? And all of them will be able to use these snapshots, checksums, etc. As for those using ZFS... it reminds me of the story of Apache and Yahoo using BSD. Sure, they did. But somehow their new hosts tend to run Ubuntu, lol. I guess more or less the same fate awaits ZFS on Linux as well: a good choice for a few corporate Yahoo-like dinosaurs, sure.

              And apart from everything else, I've seen how the mainline kernel devs work. Having them on your side is a really big win. Btrfs has got it; ZFS can't. I do not like that. I like software developed by the Linux kernel devs. They rock. And TBH I wouldn't be proud of having DRM & proprietary shops like Netflix on my side.
              I would not view the ZFSOnLinux developers and Linux mainline developers as being on different sides. I have a good rapport with various mainline developers. We all believe in OSS after all.

              That said, the chances of ZoL being integrated into mainstream enterprise distributions are very high. It would not surprise me if we see an announcement regarding this from one of them within the next year.
