Benchmarking ZFS On FreeBSD vs. EXT4 & Btrfs On Linux

  • Errors are a fact of life.

    Originally posted by kebabbert View Post
    I haven't looked at the benchmarks, but I am convinced you are correct.

    But instead of benching performance, if we benched data safety, I am sure ZFS destroys them all. BTRFS is not ready yet. I doubt it offers good data protection, read my long post above.

    Performance is not that important. Is your data safe? No? Then it is your choice to use a fast but data corrupting filesystem! I prefer to use a safe filesystem.
    That's just it -- different situations call for different tools. I read about people using ZFS for home media servers and the only thing I can think of is "overkill." Personally, I don't care if one bit gets flipped in a movie file. Seriously: one pixel, one color off, in one frame. I do believe that corruption happens more often than people think, but with GUI OSes, there's a good chance the flipped bit just lands in something harmless like one pixel of a Windows icon. There's just so much non-critical space on a hard drive these days.

    Now, on the production server side, I could see where my work doesn't want a bit flipped in the source code, and they are willing to pay the performance penalty and associated costs for ECC memory, etc. But they are still stuck with NTFS, unless they want to triple IT spending and go with a SAN solution.

    But 100% data integrity doesn't seem possible at a reasonable price point, if you look at CERN's data. Even with ZFS and ECC, other subsystems can induce errors. I suppose you could go the three-systems route and only trust results that two out of three agree upon, but who would pay for that except for a few fringe cases? Certainly not my home media server.
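    For what it's worth, you can approximate the two-out-of-three idea at the file level with nothing but checksums. A rough sketch, assuming three copies of the same file on three separate disks (all paths made up; FreeBSD's base tool is "sha256 -r" rather than sha256sum):

    # Hypothetical three-replica majority vote on one file.
    for f in /mnt/disk1/movie.mkv /mnt/disk2/movie.mkv /mnt/disk3/movie.mkv; do
        sha256sum "$f"
    done | awk '{print $1}' | sort | uniq -c | sort -rn
    # The checksum with count 2 or 3 is the "majority" version; restore the
    # odd copy out from either of the two that agree.

    Hardly enterprise-grade, but it's the same voting principle.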



    • Originally posted by ldillon View Post
      That's just it -- different situations call for different tools. I read about people using ZFS for home media servers and the only thing I can think of is "overkill."
      You wouldn't say that when your BTRFS or MDRAID-6 setup just utterly trashed itself.

      I benchmarked things - and then I started deliberately breaking things. ZFS survives incidents which would have you reaching for the backups in any other circumstance. On top of that it's quite fast when laid out correctly.

      It's not about bit flipping. It's about durability.
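
      For anyone who wants to reproduce that kind of torture test without sacrificing real disks, here's a rough sketch using file-backed vdevs. Pool name, sizes and paths are all made up for illustration:

      # Build a throwaway RAIDZ pool out of three sparse files.
      truncate -s 256M /tmp/vdev1 /tmp/vdev2 /tmp/vdev3
      zpool create testpool raidz /tmp/vdev1 /tmp/vdev2 /tmp/vdev3
      dd if=/dev/urandom of=/testpool/junk bs=1048576 count=100   # victim data
      # Simulate silent corruption on one "disk" (skip the first few MB,
      # where the vdev labels live):
      dd if=/dev/urandom of=/tmp/vdev2 bs=1048576 count=64 seek=32 conv=notrunc
      zpool scrub testpool
      zpool status -v testpool   # should show checksum errors found and repaired
      zpool destroy testpool

      Try the same corruption trick on an md mirror with ext4 on top and compare what you get back.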



      • Originally posted by stoatwblr View Post
        You wouldn't say that when your BTRFS or MDRAID-6 setup just utterly trashed itself.

        I benchmarked things - and then I started deliberately breaking things. ZFS survives incidents which would have you reaching for the backups in any other circumstance. On top of that it's quite fast when laid out correctly.

        It's not about bit flipping. It's about durability.
        I see that you have some pretty strong feelings about this. My point was that I wouldn't use RAID in the first place (and I've read reports of people's ZFS setups failing too).

        I shy away from RAID, etc, in home-usage scenarios and even in some smaller work environments. I've seen a few cases where people lose data by not understanding RAID (thinking it's a backup solution) or by replacing the wrong disk (on a non-server-style enclosure without the blinking lights). For the home user, the data doesn't change much, and they are better off with two (or more) separate disks that run an rsync (or robocopy) every 24 hours or so. This provides a backup (something many forget) and has a much lower level of complexity. The throughput of a single disk is usually enough to saturate gigabit Ethernet, so performance is generally not a concern. The odds of moving the disks to another system or OS if a motherboard fails are also much better (especially versus a "fake-raid" setup). You're not going to temporarily move your sexy ZFS setup to your Linux or Windows box after a hardware failure, whereas you can always mount a plain disk or two.
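
        To be concrete, the whole "backup solution" is a one-line cron job. A minimal sketch, with made-up paths:

        # /etc/crontab entry: mirror the data disk onto the second disk at 03:00.
        0 3 * * * root rsync -a --delete /mnt/data/ /mnt/backup/

        One caveat worth knowing: --delete makes the copy an exact mirror, so a file you delete by accident is gone from the backup after the next run too. Drop --delete (or keep dated snapshots) if that worries you.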

        RAID, ZFS and other enterprise solutions have their place, to be sure, but that place isn't your home media server unless you really know what you are doing (and even then I think people do it for bragging rights or to increase their "geek cred"). RAID is a high-availability solution, and that's a solution looking for a problem in most homes. If setting up an array is how someone wants to spend their time, I'm OK with that, but if you want simple, quick to implement and reliable, it's better to avoid complexity. The only thing you're losing is HA (and a lot of headaches).



        • Originally posted by ldillon View Post
          That's just it -- <SNIP>
          Please don't necro this thread; let it die already. You're posting long after the discussion ended, and it's not cool to revive it.
          There was a separate thread started anyway, plus there are other active (or more recent) related threads nowadays; just search the forum.
          Last edited by jalyst; 01 October 2012, 01:50 AM.



          • Originally posted by Michael View Post
            Most desktop users that get excited about hearing about ZFS + Linux possibilities are running such single-drive setups though, so this testing is aimed at them (like most Phoronix articles, towards desktop users) and not those enterprise installations.
            I'd be surprised if that were the case at all.

            Personally I run a FreeBSD-based SAN, and I'd prefer ZFS on Linux, which is why the stories are mildly interesting, not because I'm also a Linux desktop user.



            • Originally posted by ldillon
              I see that you have some pretty strong feelings about this. My point was that I wouldn't use RAID in the first place (and I've read reports of people's ZFS setups failing too).
              Everything fails; the advantage of these technologies is in how they handle failure. If you're running a RAIDZ and two disks drop, well, you're screwed. The same goes for a RAIDZ2 with 3 disks dropping. I doubt it's been failing much beyond user error, though, but I accept it's not outwith the realms of possibility.

              Originally posted by ldillon
              I shy away from RAID, etc, in home-usage scenarios and even in some smaller work environments. I've seen a few cases where people lose data by not understanding RAID (thinking it's a backup solution) or by replacing the wrong disk (on a non-server-style enclosure without the blinking lights). For the home user, the data doesn't change much, and they are better off with two (or more) separate disks that run an rsync (or robocopy) every 24 hours or so. This provides a backup (something many forget) and has a much lower level of complexity. The throughput of a single disk is usually enough to saturate gigabit Ethernet, so performance is generally not a concern. The odds of moving the disks to another system or OS if a motherboard fails are also much better (especially versus a "fake-raid" setup). You're not going to temporarily move your sexy ZFS setup to your Linux or Windows box after a hardware failure, whereas you can always mount a plain disk or two.
              You're correct in the sense that RAID is not a replacement for backup. Critically, RAID doesn't protect against catastrophic failure such as a fire, massive power failure, or a hardware fault killing numerous drives. The problem is, neither does your solution. What are you going to do with your "copy data every 24 hours" backup solution when single disks fail fairly regularly and each failure eats up to 24 hours of lost work? Using RAID is about minimising your exposure to this, not completely negating it. It's about spending a bit more money and time now to save a lot more money and time in the long run with disk failures. Similarly, what are you going to do when a power surge takes out all the drives in your device at once? Where are your backups then? The problem with statements like "RAID is not a backup" is that you should then have a fair idea of what a backup actually is, should not suggest something that isn't a viable backup, and certainly should not suggest something that is inferior to any decent RAID solution.

              Also, FreeBSD is free, thus importing your data to another system should be pretty easy, assuming you know how and you have the capacity to do so. Even if you don't have another server, and you can't power down the server you do have (this would be fairly stupid, by the way), a virtual machine will see you through. Of course, even if you don't know how, the disaster recovery plan that any business should have, and that you should have been testing periodically, will see you through.
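
              To illustrate, moving a pool between machines is a couple of commands. A minimal sketch, assuming a pool named "tank" (the name is made up):

              zpool export tank     # on the old box, if it still boots
              # move the disks, then on the new box (or a VM with the disks passed through):
              zpool import          # lists pools available for import
              zpool import tank
              zpool import -f tank  # force it if the old box died before exporting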

              Originally posted by ldillon
              RAID, ZFS and other enterprise solutions have their place, to be sure, but that place isn't your home media server unless you really know what you are doing (and even then I think people do it for bragging rights or to increase their "geek creed".) RAID is a high-availability solution and that's a solution looking for a problem in most homes. If setting up an array is how someone wants to spend their time, I'm OK with that, but if you want simple, quick to implement and reliable, it's better to avoid complexity. The only thing you're loosing is HA (and a lot of headaches).
              I actually run RAIDZ2 at home with cheap SATA drives. I don't do it for geek cred; I do it because when a disk failed the other day, all I had to do was bend down, pull the bad disk*, push in a replacement, and add it to the pool. All without any downtime. Of course, that's not my backup for important data if the house goes on fire, but it's convenient nonetheless.

              * Actually, I had the same problem you mentioned above. My N40L doesn't have any blinking lights to tell me which disk failed, so I didn't actually know which one to pull. You know what I did? I read the manual.
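
              For anyone else with an N40L or similar, the whole dance looks roughly like this (pool and device names made up; /dev/ada2 is the FreeBSD-style name, on Linux it'd be /dev/sdX):

              zpool status tank                       # shows which device is FAULTED/UNAVAIL
              smartctl -i /dev/ada2 | grep -i serial  # match the serial number to a physical drive
              # swap the disk, then resilver onto the replacement:
              zpool replace tank ada2
              zpool status tank                       # watch the resilver progress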



              • Everything is a tradeoff

                ownagefool, (I'm not going to quote for brevity's sake)

                (When I write "ZFS" below, I mean "any RAID solution." I'm not picking on ZFS.)

                My comments weren't geared toward someone with your advanced technical skills so much as the average home user. I don't think the "average home user" has 24 hours of productivity in the average day. They might have, on average, a couple of new files that are important, now that most people are using web-based email.

                No, having the second disk in the same PC isn't the ideal situation. It's the cost-effective solution. You could put a second disk in a second PC, or in another building, etc, but I contend that versus the hassles, potential pitfalls, cost and complexity of other solutions, it's a good tradeoff for most home users. (Unless you are a consultant that gets paid by the hour; then the more complex the better </kidding>.) It's less prone to problems than an external drive and much better than nothing, which is what most people have.

                The point about "swappability" was that it doesn't take a reinstall to just put the second drive in another (Linux or Windows) PC that a regular user is more likely to have around. Most people have more than one PC these days, but they are in use. Sure, you could reinstall the ZFS server if you have spare, unused hardware, but most home users don't, and couldn't figure it out. But they could probably manage to move a drive (or I could talk them through it over the phone).

                "What are you going to do when a power surge takes out all the drives in your device at once?"
                So you are assuming that the home user with the ZFS setup doesn't back up to disk, but instead has a tape backup (or something)?
                If we are comparing apples to apples, the ZFS guy has a spare disk(s) that he backs up to, and that gets taken out in your power surge along with the ZFS array. (As a side bet, I'd wager that half the people running ZFS at home don't have any backup of the array at all, because ZFS is so uber-reliable.) In an ideal world we'd all do off-site backups, but the average home user shouldn't be expected to do that. Sure, you could back up "to the cloud" as a secondary backup, but that (apples to apples) would work with the two-disk setup or the ZFS setup.

                Have we talked about cost yet? If the user can only afford two disks, what solution would you suggest for the same amount of money?



                • Originally posted by ldillon View Post
                  ownagefool, (I'm not going to quote for brevity's sake)

                  (When I write "ZFS" below, I mean "any RAID solution." I'm not picking on ZFS.)

                  My comments weren't geared toward someone with your advanced technical skills so much as the average home user. I don't think the "average home user" has 24 hours of productivity in the average day. They might have, on average, a couple of new files that are important, now that most people are using web-based email.
                  ldillon, I mean no disrespect; however, you extended your comments to "some small businesses" and made some incorrect statements. You may actually know the difference, or at least know enough to learn more when you need to, but other people reading what you said may get the wrong idea, so I felt it appropriate to correct them, or at least get you to put them in context. If you were talking purely about home users, I may not have even bothered.

                  Originally posted by ldillon View Post
                  No, having the second disk in the same PC isn't the ideal situation. It's the cost-effective solution. You could put a second disk in a second PC, or in another building, etc, but I contend that versus the hassles, potential pitfalls, cost and complexity of other solutions, it's a good tradeoff for most home users. (Unless you are a consultant that gets paid by the hour; then the more complex the better </kidding>.) It's less prone to problems than an external drive and much better than nothing, which is what most people have.
                  It's only cost-effective in the sense that it costs less; it doesn't effectively protect your data. If your data is important to you, then you need a proper backup solution. For a pure home user, it's likely the data isn't important at all, in which case flying by the seat of your pants is completely OK. If the home user maintains their data is important when you do not believe it really is, quote them the real cost of a proper solution. If they back off, that's fine. Their data wasn't that important after all.

                  Originally posted by ldillon View Post
                  "What are you going to do when a power surge takes out all the drives in your device at once?"
                  So you are assuming that the home user with the ZFS setup doesn't back up to disk, but instead has a tape backup (or something)?
                  If their data is really worth backing up, then it's worth doing correctly. If it's not really worth backing up, you can do whatever you like to make them feel warm and fuzzy, but if they actually NEED their data, you need to go with full disclosure.

                  Originally posted by ldillon View Post
                  If we are comparing apples to apples, the ZFS guy has a spare disk(s) that he backs up to and that gets take out in your power surge, along with the ZFS array. (As a side bet, I'd wager that half the people running ZFS at home don't have any backup of the array at all because ZFS is so uber-reliable) In an ideal world we'd all do off-site backups but the average home user shouldn't be expect to do that. Sure, your could back up "to the cloud" as a secondary backup, but that (apples to apples) would work with the two disk setup or the ZFS setup.
                  Yep, you're 100% correct. A ZFS array is NOT a backup either. If you're not running a backup and your data is important, then you deserve a face palm. The more important, the bigger the face palm.

                  Also, full disclosure: I don't actually back up 99% of my ZFS-based NAS. Outwith my code (I work at home, and that's backed up), none of my files are really important. I have the NAS mostly as a home server for convenience, and mostly-trustworthy is good enough for me (when I lose all my files it'll be a bummer, but it's not like I'll lose any money or productivity).

                  Originally posted by ldillon View Post
                  Have we talked about cost yet? If the user can only afford two disks, what solution would you suggest for the same amount of money?
                  1. If their data isn't important (and let's be honest, most people's data isn't important) but their uptime is, then I'd just leave them with the straight-up RAID1 and make them a restore disk at a push.
                  2. If their data is important, but their uptime less so, then something like Bacula (or the cloud, if you're into that sort of thing) to do offsite backups (assuming they have a decent net connection).
                  3. If both are important, then combine the first two options.
                  4. If neither is important, I'd do nothing (maybe a restore disk; this at least will stop them calling you when their disk eats itself).


                  If you're taking it seriously, you're going to want to run some checks to make sure the RAID array is clean and that it'll email errors (and label the disks!). You'll also want to make sure those backups are working. You can do that on the backup server, but you'll probably want to do a bit of work to check whether the box is up, because you don't want it screaming about missing backups from a PC that hasn't been turned on in three days. As always, the amount of effort depends on the value of the data.
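
                  For the "is the array clean" part, something like this is enough. A rough sketch, run daily from cron (the address is made up):

                  #!/bin/sh
                  # Mail only when zpool reports a problem.
                  if ! zpool status -x | grep -q "all pools are healthy"; then
                      zpool status -v | mail -s "ZFS problem on $(hostname)" admin@example.com
                  fi

                  The mdraid equivalent is built in: mdadm --monitor --scan --mail=admin@example.com. Checking that the backups actually ran is the same idea, just pointed at the backup logs.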


                  On that point: I started my career at a webhosting provider, on the phones no less. Obviously we dealt with servers falling over on occasion. You'd get people calling up (on the cheapest package, no less) screaming down the phone that they were losing $20k per hour because their site was down. The answer was generally the same: "Why on earth are you paying for a best-efforts shared package at $20 a year when you're losing many times the cost of a proper solution every hour?"
                  Last edited by ownagefool; 01 October 2012, 09:12 PM.

