EXT4 Lets Us Down, There Goes Our R600/700 Mesa Tests

  • #61
    Originally posted by phtpht
    Not true. Since the dawn of time, drives have used in-memory "write-back" caches. Your operating system has one, your RAID controller has one, and hell, even most modern consumer drives have one.
    Assuming you're not mad enough to run a hardware RAID controller without battery backup, the only one of those which could lose your files on a typical ext3 configuration is the disk cache. Which will flush itself very quickly, so it's really only an issue in a sudden power failure, combined with a filesystem sync in progress, combined with a drive which reorders writes so that the metadata is updated before the file data, combined with the power going out before the write is complete... in other words, almost never. I guess you could get a partially written block, but I've never seen one in all the power failures I've had in all the computers I've used, so that seems very unlikely.

    Ext4, on the other hand, will lose or corrupt your data in normal operation if the system crashes or power fails, because it doesn't have all the reliability features that ext3 eventually added precisely because an unreliable filesystem is useless as a general purpose filesystem: most people would rather read valid data slowly than corrupt data fast.

    So what do you think happens when you cut off power while the data is still in the cache but not on disk? Data lost.
    And if you're running ext3, that only really matters if the writes were reordered by the drive. If you're running a file system like ext4 which believes it can write random data to random places at random times, well, you're toast.

    If you want to guarantee your data is on the disk, you will use sync, period.
    How does sync magically avoid data loss from the disk cache on a power failure? The only thing it guarantees is that the filesystem tries to write the data to the disk; there's no guarantee that it actually gets there if the system crashes while the sync is in progress. And if the filesystem writes in a random order, there's no guarantee that whatever part of the file does get to disk before a crash will be valid.

    Back in the real world, any general-purpose filesystem which expects every application to call sync every time it writes data to the disk which it doesn't want corrupted is simply broken. Firstly, because at least 90% of applications don't call sync and won't be updated to do so within the next decade. Secondly, because sync is slow and unnecessary on the most common current Linux filesystem, so you're now expecting application developers to check which filesystem they're writing to in order to determine whether they should bother to sync after each write if they don't want to cripple performance. And thirdly, because I strongly suspect that 90% of the 10% of applications which do sync don't sync properly (e.g. syncing the directory as well as the file, when that's required).
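
    For the record, "syncing properly" means something like the minimal C sketch below (error handling abbreviated, and the file names made up for illustration). Note how much ceremony is involved; the directory fsync at the end is the step almost everybody forgets:

        /* Sketch: durably replacing a file the "proper" POSIX way.
           File names here are illustrative only. */
        #define _GNU_SOURCE   /* for O_DIRECTORY on older toolchains */
        #include <fcntl.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>

        int main(void)
        {
            const char *data = "my precious bookmarks\n";

            /* 1. Write the new contents to a temporary file. */
            int fd = open("bookmarks.html.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0) { perror("open"); return 1; }
            if (write(fd, data, strlen(data)) < 0) { perror("write"); return 1; }

            /* 2. fsync() the file: wait until the data and metadata are
                  reported to be on stable storage. */
            if (fsync(fd) < 0) { perror("fsync"); return 1; }
            close(fd);

            /* 3. Atomically replace the old file with the new one. */
            if (rename("bookmarks.html.tmp", "bookmarks.html") < 0) { perror("rename"); return 1; }

            /* 4. fsync() the containing directory too, or the rename itself
                  may not survive a crash: this is the part most programs skip. */
            int dirfd = open(".", O_RDONLY | O_DIRECTORY);
            if (dirfd < 0) { perror("open dir"); return 1; }
            if (fsync(dirfd) < 0) { perror("fsync dir"); return 1; }
            close(dirfd);
            return 0;
        }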

    Worse than that, most applications that write to the disk do actually expect their data to get there, and even those which don't still have to call sync if they don't want the file corrupted by the filesystem writing out the metadata before a crash but not the actual file data. So that means calling sync all the time and crippling performance, all in the name of supporting a cache which exists solely to improve performance: aka 'we had to kill our performance in order to save it'.

    Seriously, you're demanding that programmers return to the stone age of computing where they had to worry all the time about what the hardware was doing underneath them; you might as well demand they make low-level BIOS calls to write files to disk or write their own raw I/O routines and interrupt handlers to read them back.

    I honestly don't know of a better way to point out what a stupid idea this is.

    But that's also why each sync takes sooooooo looooooong on ext3, and that's why people have GIVEN UP on doing that, and that's why IT SEEMS that ext3 is more reliable (unless you use Ubuntu, which uses unsafe defaults).
    No, ext3 _IS_ more reliable, at least by default. That is a simple fact: the default configuration for ext3 on pretty much all distributions is set for reliability over performance, which is what the vast majority of users want for a general purpose filesystem.

    So now people are being told 'dump ext3, which reliably stores your data in 99.999% of cases and replace it with ext4 which will happily corrupt it if your application doesn't use a transactional database model'. And you're surprised that people aren't rushing towards the brave new future of random data loss or lousy performance?

    But because of EXT3 history, people forget to goddamn sync their data when they want it on the disk. Then they whine that if their game freezes they lose their goddamn bookmarks with EXT4.
    A general purpose filesystem exists to reliably store user data on the disk. If a supposed general purpose filesystem deletes my bookmarks when a game crashes, then the filesystem is broken. I don't care that you can save my game 0.1 seconds faster than ext3 if you delete my bookmarks when the game crashes. I would also note that Mozilla took an age to fix Windows deleting bookmarks from NTFS after a crash, and that the consequent move to sqlite instead of a simple HTML file is probably responsible for Firefox's lousy performance on ext3, because sqlite syncs all the time _even though syncs are not required for that application on ext3_.

    That's not to say that such a filesystem doesn't have other uses where reliable data storage is less important than performance, but it certainly should not be pushed for general purpose use like storing user home directories.

    So either you use a sissy filesystem like EXT3 and forget about my lecture here, or you use a real one and learn how to use it properly.
    A real filesystem like ZFS, for example, which is light years ahead of ext4 technically and suffers from none of its data corruption problems so you can actually trust that your data will still be there when you go to read it.

    Or is ZFS for 'sissies' too?



    • #62
      Originally posted by movieman
      Assuming you're not mad enough to run a hardware RAID controller without battery backup
      There's not that much danger in that, now that we have filesystems with the concept of write barriers. As I said, most consumer drives have write-back caches and obviously no tiny batteries.
      And if you're running ext3, that only really matters if the writes were reordered by the drive. If you're running a file system like ext4 which believes it can write random data to random places at random times, well, you're toast.

      How does sync magically avoid data loss from the disk cache on a power failure? The only thing it guarantees is that the filesystem tries to write the data to the disk; there's no guarantee that it actually gets there if the system crashes while the sync is in progress. And if the filesystem writes in a random order, there's no guarantee that whatever part of the file does get to disk before a crash will be valid.

      Seriously, you're demanding that programmers return to the stone age of computing where they had to worry all the time about what the hardware was doing underneath them; you might as well demand they make low-level BIOS calls to write files to disk or write their own raw I/O routines and interrupt handlers to read them back.
      You got it upside down. The concept of block device caches is well known and well documented throughout the (for example POSIX) API. The contract for write() is that it will SOMEHOW make a note that this and that should go here and there, and all it guarantees is that a subsequent read() returns that data; the fsync() specification, on the other hand, says that it will WAIT until the data, the metadata and the directory entry are reported by the DEVICE to have been written to stable storage. If your OS or device behaves differently then it's BROKEN and all bets are off.

      This is what I expect programmers to acknowledge and work with, nothing more, nothing less. It's an abstraction that shields you from the actual implementation, whether the file system writes to disk at once or the data takes a round trip through the solar system.
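
      To make that contract concrete, here is a minimal C sketch (the file name is made up): read() sees the data as soon as write() returns, but nothing is promised about stable storage until fsync() comes back.

          /* Sketch of the write()/read()/fsync() contract described above. */
          #include <fcntl.h>
          #include <stdio.h>
          #include <string.h>
          #include <unistd.h>

          int main(void)
          {
              int fd = open("somefile", O_RDWR | O_CREAT | O_TRUNC, 0644);
              if (fd < 0) { perror("open"); return 1; }

              const char *buf = "data\n";
              /* write() only promises that a subsequent read() returns
                 this data; it may still sit in volatile caches somewhere. */
              if (write(fd, buf, strlen(buf)) < 0) { perror("write"); return 1; }

              char back[8] = { 0 };
              /* A read-back already sees the data, fsync()ed or not. */
              if (pread(fd, back, sizeof back - 1, 0) < 0) { perror("pread"); return 1; }
              printf("read back: %s", back);

              /* fsync() is the durability point: it WAITs until the data
                 and metadata are reported to be on stable storage. */
              if (fsync(fd) < 0) { perror("fsync"); return 1; }

              close(fd);
              return 0;
          }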

      However, the concept of EXT3 "hoping that data won't be reordered" is the exact opposite. You're ASSUMING certain geometry and behavior of the drives that may or may not hold. EXT4 actually steps back from this and treats the disk as a random-write device with unknown details.

      No, ext3 _IS_ more reliable, at least by default. That is a simple fact: the default configuration for ext3 on pretty much all distributions is set for reliability over performance, which is what the vast majority of users want for a general purpose filesystem.
      Oh yes, that's why barriers were initially OFF for ext3, and that's why some distros (Ubuntu) maintain that tradition even though the default was changed after a lengthy debate akin to the one we're having.

      So now people are being told 'dump ext3, which reliably stores your data in 99.999% of cases and replace it with ext4 which will happily corrupt it if your application doesn't use a transactional database model'. And you're surprised that people aren't rushing towards the brave new future of random data loss or lousy performance?
      They are told "leave your illusions and welcome to the real world". And in the spirit of freedom they can always revert to their old ways.

      In other words, "buy a UPS and back up your data, morons".

      A general purpose filesystem exists to reliably store user data on the disk. If a supposed general purpose filesystem deletes my bookmarks when a game crashes, then the filesystem is broken.
      You've reversed cause and consequence. Firefuck not syncing its bookmarks is the result of the broken filesystem the majority used. Again, an abstraction that leaked: if Firefox adhered to the standards, people would whine that their games run slow, because ext3 would also sync stuff from the game.

      That's not to say that such a filesystem doesn't have other uses where reliable data storage is less important than performance, but it certainly should not be pushed for general purpose use like storing user home directories.
      Home dirs are general purpose? C'mon.

      Or is ZFS for 'sissies' too?
      I don't know yet. Enlighten me.



      • #63
        so, what you guys are saying is:
        ext3 is a clusterfuck
        ext4 is broken.

        welcome to my world.



        • #64
          Is ZFS use more widespread nowadays? I used to do tech support for Sun's storage division (left in 2007) and it was very rare to get calls about ZFS, even after being trained for it. Maybe it was too new back then, or maybe those customers weren't calling for a reason? :P



          • #65
            Originally posted by energyman
            ext4 is broken.
            I'd say immature.



            • #66
              Magic SysRq keys

              This may help in the future. Assuming the kernel has not completely hard locked (and maybe even so), you can force filesystems to sync and remount read-only.



              • #67
                Originally posted by decep
                This may help in the future. Assuming the kernel has not completely hard locked (and maybe even so), you can force filesystems to sync and remount read-only.

                http://en.wikipedia.org/wiki/Magic_SysRq_key
                Yeah. Just make sure it works, well before it's needed, as some distros are stupid enough to disable it for you.

                That is, see if /proc/sys/kernel/sysrq contains a 1 (and try it). Also, you can invoke the functions without the keyboard by echoing the character to /proc/sysrq-trigger. Finally, IIRC there is an iptables module that can convert incoming magic packets into magic sysrq.
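
                As a concrete sketch of the /proc/sysrq-trigger part (assuming root, and that /proc/sys/kernel/sysrq allows these functions), the same emergency sync and remount-read-only can be triggered from C:

                    /* Sketch: trigger emergency sync ('s') and remount
                       read-only ('u') via /proc/sysrq-trigger.
                       Needs root, and sysrq must be enabled. */
                    #include <stdio.h>

                    static int sysrq(char c)
                    {
                        FILE *f = fopen("/proc/sysrq-trigger", "w");
                        if (!f) { perror("fopen /proc/sysrq-trigger"); return -1; }
                        fputc(c, f);    /* one function character per write */
                        fclose(f);
                        return 0;
                    }

                    int main(void)
                    {
                        sysrq('s');     /* emergency sync of all filesystems */
                        sysrq('u');     /* remount all filesystems read-only */
                        return 0;
                    }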

