Btrfs RAID 5/6 Code Found To Be Very Unsafe & Will Likely Require A Rewrite


  • Originally posted by kebabbert
    1) ZFS does not need ECC RAM.
    ZFS was designed with BIG HUGE systems where RAM was ECC and also registered in mind.
    The filesystem assumes that what is in RAM is always true, and this is an issue if a bitflip happens.
    I've seen many people unfortunately lose their zpools over this topic, so I'm going to try to provide as much detail as possible. If you don't want to read to the end then just go with ECC RAM. For those of you that want to understand just how destructive non-ECC RAM can be, then I'd encourage...

    You can ignore the highly unlikely case of a fixed hardware fault being hit twice; the data was already trashed by the first error.

    2) ZFS does not need huge amounts of RAM. ZFS runs fine on a Raspberry Pie with 256MB RAM:
    https://github.com/hughobrien/zfs-remote-mirror
    When I usually talk of ZFS I mean RAIDZ1 or RAIDZ2, not single drives without redundancy.



    • Hmm. BIOS doesn't report ECC. I flashed newest one, with same result. EDAC module in kernel reports missing capability.

      I came across this on a forum:

      **********

      It does not work on AM1 platforms. No matter what you do or think, the AM1 APUs and the AM1 boards aren't able to do ECC.

      CPU register D18F3xE8 reports 1F74F00h; you can look that up in the BKDG, where it is documented as read-only. So no ECC for AM1, ever.

      German source:

      http://www.planet3dnow.de/vbulletin/threads/421749-Geruecht-Zen-kommt-zuerst-als-Opteron?p=4988619&viewfull=1#post4988619

      ************


      I've checked the BKDG for Kabini and the register on my board. Content matches.
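
      For anyone who wants to check the same register value, here is a minimal sketch of decoding it. The bit positions are an assumption taken from the NBCAP definitions in the Linux amd64_edac driver (bit 3 for ECC/SECDED capability, bit 4 for chipkill capability); they are not quoted from the post above, so verify them against the BKDG for your CPU family. The function name `decode_nbcap` is hypothetical.

```python
# Assumed bit layout of D18F3xE8 (Northbridge Capabilities), following the
# NBCAP_SECDED / NBCAP_CHIPKILL definitions in the Linux amd64_edac driver.
NBCAP_SECDED = 1 << 3    # assumed: ECC (SECDED) capable
NBCAP_CHIPKILL = 1 << 4  # assumed: chipkill ECC capable


def decode_nbcap(value: int) -> dict:
    """Return the ECC-related capability flags encoded in a register value."""
    return {
        "ecc_capable": bool(value & NBCAP_SECDED),
        "chipkill_capable": bool(value & NBCAP_CHIPKILL),
    }


# Value quoted in the forum post (1F74F00h).
reported = 0x1F74F00
caps = decode_nbcap(reported)
print(f"D18F3xE8 = {reported:#010x} -> {caps}")
```

      Under that assumed layout, both flags come out cleared for the quoted value, which would match the EDAC module reporting a missing capability.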

      And yet people report successful fault injections, etc.

      I'll toy with this later.



      • RAM bit errors are real. They happen all the time.

        For anything that needs to be reliable, get ECC. It is worth it.



        • Originally posted by starshipeleven
          ZFS was designed with BIG HUGE systems where RAM was ECC and also registered in mind.
          The filesystem assumes that what is in RAM is always true, and this is an issue if a bitflip happens.
          I've seen many people unfortunately lose their zpools over this topic, so I'm going to try to provide as much detail as possible. If you don't want to read to the end then just go with ECC RAM. For those of you that want to understand just how destructive non-ECC RAM can be, then I'd encourage...

          You can ignore the highly unlikely case of a fixed hardware fault being hit twice; the data was already trashed by the first error.
          Well, that link of yours is wrong. My link shows that the "facts" in your thread are wrong. My link says, about your link:
          "...As far as I can tell, this idea originates with a very prolific user on the FreeNAS forums named Cyberjock, and he lays it out in this thread here. It’s a scary idea – what if the very thing that’s supposed to keep your system safe kills it? A scrub gone mad! Nooooooo!..."

          And my link shows that the exact link you point to is wrong. Also, at the end of my link, Matt Ahrens, the architect of ZFS, says that “There’s nothing special about ZFS that requires/encourages the use of ECC RAM more so than any other filesystem.”

          So, again, you are wrong. That FreeNAS thread is wrong.


          When I usually talk of ZFS I mean RAIDZ1 or RAIDZ2, not single drives without redundancy.
          But now you see that you can run ZFS on a 256MB RAM Raspberry Pie? Also, I wonder, why do you think that the Raspberry Pie implementation of ZFS does not allow it to run RAIDZ1/2? BTW, as I have told you earlier: I ran a 1GB, 32 bit Pentium4, Solaris PC with raidz1 using 5 discs, for over a year without problems.

          So, again, as I have told you before, it seems that both of your statements about ZFS are wrong:
          A) Your thread about ECC data corruption on ZFS is wrong. Non-ECC RAM does not worsen the ZFS situation in case of corrupt RAM; instead, ZFS with non-ECC RAM is safer than any other filesystem.
          B) ZFS does not require huge amounts of RAM. If you have less RAM, you don't get a large L2ARC diskcache, but that is all. ZFS performance degrades down to disk speed, instead of running at RAM speed.

          Once again, as I have done before, I would kindly ask you to stop spreading these statements. I have posted these links for you earlier, and still you continue to spread these false statements. Are you deliberately FUDing? Are you a FUDer?



          • Originally posted by justmy2cents
            and i only use 5
            btrfs never was ready for your slow and less robust raid level



            • Originally posted by kebabbert
              Well, that link of yours is wrong. My link shows that the "facts" in your thread are wrong.
              I'm just using that to show what a single memory error can do; I already said that the condition in that thread is far-fetched.

              ECC is necessary mostly because ZFS uses so much RAM even at idle (for caching, but also for filesystem structures). Most other filesystems don't, which lessens the risk by a lot.

              But now you see that you can run ZFS on a 256MB RAM Raspberry Pie? Also, I wonder, why do you think that the Raspberry Pie implementation of ZFS does not allow it to run RAIDZ1/2?
              It's Raspberry Pi. "Pi" like the Greek letter, not "pie" like the cake.
              That said, the Raspi has all its connectivity on a SINGLE USB port controller in the SoC, so whatever ports you see on the board all come from a USB hub, and the Ethernet is on that same USB hub.
              So that's what, 3 or 4 drives contending for ONE single USB port over a hub. Plus the Ethernet traffic, of course. Running ZFS's equivalent of RAID5/6.
              It's going to suck so hard that it's more useful as room heating.

              B) ZFS does not require huge amounts of RAM.
              https://wiki.freebsd.org/ZFSTuningGuide
              "To use ZFS, at least 1 GB of memory is recommended (for all architectures) but more is helpful as ZFS needs *lots* of memory. Depending on your workload, it may be possible to use ZFS on systems with less memory, but it requires careful tuning to avoid panics from memory exhaustion in the kernel. A 64-bit system is preferred due to its larger address space and better performance on 64-bit variables, which are used extensively by ZFS. 32-bit systems are supported though, with sufficient tuning. "


              " Everything that I’ve read says ZFS will use all of the RAM in your system that it can, how can it leave 30GB just wasting away doing nothing? "


              8GB* RAM (minimum)
              16GB RAM (recommended)
              If you have less RAM, you don't get a large L2ARC diskcache, but that is all. ZFS performance degrades down to disk speed, instead of running at RAM speed.
              Performance tanks hard. But like... hard. If you have too little RAM you also risk instability.

              are you deliberately FUDing? Are you a FUDer?
              How about we go look at your post history, hmm? Where you claim total bs about very high-end servers and Unix until actual experts show up and nuke you from orbit?



              • Originally posted by johnc
                My friend recently bought a Synology NAS and I was surprised that they were recommending he select btrfs during its setup. And he wanted to do RAID5, too.
                Originally posted by jolebole
                I upgraded from a 2-bay to a 4-bay Synology NAS a couple of months ago, and during setup I was offered the option to set up the volume as BTRFS. I guess it's back to Ext4 lol... or FreeNAS
                Synology has at no time used the btrfs RAID56 feature. Btrfs on Synology is on top of mdraid (and unfortunately also on top of LVM) so it is "insulated" from problems with the underlying hardware. Therefore all of the bugs in btrfs that are related to devices disappearing/reappearing and device replacement and scrubbing will not be a problem for Synology users. And those are the main problems btrfs has - the basic functionality is pretty stable as long as your disks don't screw up.
                (Of course, the advantage of a filesystem that keeps checksums is not realized until they fix their handling of unreliable devices! But snapshots and compression alone are good reasons for btrfs.)



                • Originally posted by starshipeleven
                  I'm just using that to show what a single memory error can do; I already said that the condition in that thread is far-fetched.

                  ECC is necessary mostly because ZFS uses so much RAM even at idle (for caching, but also for filesystem structures). Most other filesystems don't, which lessens the risk by a lot.
                  I'd say it will be necessary only because ZFS performs so many data integrity checks that non-ECC RAM will cause ZFS to keep going back to the hard disk to make sure the data is not corrupted, thus reducing performance to storage speed.

                  So, it is bad from a performance point of view.

                  On the flip side, it's good from a data integrity point of view.

                  Other filesystems would simply use the bad bit and deliver corrupted data to the process.

                  With ZFS, however, data at rest is guaranteed to be safe.
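
                  As a rough illustration of the difference described above, here is a toy sketch of a checksum-verified read that falls back to a redundant copy. It is not ZFS code: `read_verified`, the use of SHA-256, and the two-copy mirror are all assumptions chosen for illustration.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Checksum stored separately from the data block (SHA-256 assumed here)."""
    return hashlib.sha256(data).digest()


def read_verified(copies: list, expected: bytes) -> tuple:
    """Return the first copy whose checksum matches, plus its index.

    A filesystem without checksums would return copies[0] unconditionally.
    """
    for index, data in enumerate(copies):
        if checksum(data) == expected:
            return data, index
    raise IOError("all copies failed checksum verification")


original = b"important block"
stored_sum = checksum(original)

# Simulate a single flipped bit in the first copy.
damaged = bytearray(original)
damaged[0] ^= 0x01

data, source = read_verified([bytes(damaged), original], stored_sum)
print(f"served copy {source}: {data!r}")
```

                  The damaged copy is rejected and the intact one is served instead. This sketch only covers corruption that happens after the checksum was computed.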

