
Thread: The Performance Of EXT4 Then & Now


  1. #1
    Join Date
    Jan 2007
    Posts
    15,657

    Default The Performance Of EXT4 Then & Now

    Phoronix: The Performance Of EXT4 Then & Now

    Over the past week there has been a lot of talk about the EXT4 file-system following the announcement that Google is migrating its EXT2 file-systems to EXT4. The reasons given for the transition are the easy migration process and the fact that Google's engineers are pleased with EXT4's performance. However, as we mentioned in that news post last week and in many other articles over the past weeks and months, EXT4 is no longer as strong a contender as it once was, at least in some tests. Its performance commonly goes down with new kernel releases rather than up, as kernel developers continue to introduce safeguards against the potential data-loss problems that initially plagued some EXT4 users. For our latest EXT4 benchmarks we have numbers showing this file-system's performance on a vanilla 2.6.28 kernel (when EXT4 was marked as stable) and on every major kernel release up through the latest Linux 2.6.33 release candidate.

    http://www.phoronix.com/vr.php?view=14516

  2. #2
    Join Date
    Jan 2010
    Location
    Vienna, Austria
    Posts
    7

    Default

    Are you really complaining that kernel developers have chosen safe over fast defaults?

    I dunno, but I <3 my data and would rather prefer that I can access it even if some unlucky power-out happened on my laptop.

    If you don't mind that, change the mount option.. that's what it's here for. But the defaults are sane, you can't expect a newbie user to manually alter that kind of thing.

  3. #3
    Join Date
    Jul 2009
    Posts
    351

    Default Benchmarking bogus code

    These benchmarks are tough to process by themselves. The older benchmarks are really invalid because the code has a fatal flaw: no data safety. You can make ANY filesystem look fast if you are only pretending to write the data to disk. You might as well benchmark the write performance of /dev/null.

    Comparing to other filesystems would be much more interesting.

    Every filesystem has the problem of committing data to disk. What is interesting is how they handle it.

  4. #4
    Join Date
    Dec 2007
    Posts
    248

    Default

    Quote Originally Posted by frantaylor View Post
    These benchmarks are tough to process by themselves. The older benchmarks are really invalid because the code has a fatal flaw: no data safety. You can make ANY filesystem look fast if you are only pretending to write the data to disk. You might as well benchmark the write performance of /dev/null.
    I think that's not correct... It's just that new ext4 settings for data safety are paranoid.

    I've been using ext4 for a year now and haven't experienced any problems with 2.6.29-2.6.30 (which I'm using right now), while I did have problems with 2.6.28.

    I think the settings from 2.6.31 and up are fine for a paranoid server admin but not for regular desktop use. I plan to stay with 2.6.30 for as long as I can, but at some point new mesa/drm/xorg etc. will probably force me to upgrade.
    Last edited by val-gaav; 01-20-2010 at 03:30 PM.

  5. #5
    Join Date
    Feb 2008
    Posts
    15

    Default

    Quote Originally Posted by kingN0thing View Post
    Are you really complaining that kernel developers have chosen safe over fast defaults?

    I dunno, but I <3 my data and would rather prefer that I can access it even if some unlucky power-out happened on my laptop.

    If you don't mind that, change the mount option.. that's what it's here for. But the defaults are sane, you can't expect a newbie user to manually alter that kind of thing.
    I don't think the discussion at hand is solely about whether the safety mechanisms should be in place.

    I'm pretty certain I read that the original problems with data loss were due to application developers not writing to files properly. Something about depending on the kernel to automatically fsync instead of doing it themselves, but I could be waay off :P

    Anyway, I think there's a chance that if most programmers out there had paid attention to how their apps write to files, we could have kept the impressive speeds of the original benchmarks.

    But then, is there any real expectation that your average non-guru developer will want to think about things like that?

    Alex

  6. #6
    Join Date
    Jan 2010
    Location
    Vienna, Austria
    Posts
    7

    Default

    Quote Originally Posted by jackflap View Post
    But then, is there any real expectation that your average non-guru developer will want to think about things like that?
    Alex
    Nope, I don't believe most developers are capable of doing that, but users can decide that they don't have any useful data and enable the faster (but less safe) behaviour with -o nobarrier. So I don't get the constant fuss about that change.
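
    For reference, a minimal sketch (in C, purely for illustration) of what that remount amounts to is below. It assumes an ext4 volume mounted at /home, which is only an example; in practice you would simply edit /etc/fstab or run "mount -o remount,nobarrier /home" as root rather than call mount(2) yourself.

    Code:
    /* Sketch: remount an existing ext4 mount with write barriers disabled.
     * Roughly equivalent to `mount -o remount,nobarrier /home`.
     * The mount point is an example; this needs root and trades crash
     * safety for speed, which is the whole point of this thread. */
    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* With MS_REMOUNT the source and fstype arguments are ignored;
         * the data string is handed to ext4, which parses "nobarrier". */
        if (mount(NULL, "/home", NULL, MS_REMOUNT, "nobarrier") != 0) {
            perror("mount");
            return 1;
        }
        return 0;
    }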

  7. #7
    Join Date
    Apr 2008
    Location
    Saskatchewan, Canada
    Posts
    466

    Default

    Quote Originally Posted by jackflap View Post
    I'm pretty certain I read that the original problems with data loss were due to application developers not writing to files properly. Something about depending on the kernel to automatically fsync instead of doing it themselves, but I could be waay off :P
    The problem was not so much that data went missing when you didn't fsync(), it was that you could write to a file, rename it on top of an old file, and then after a reboot discover that your file had been truncated to zero bytes rather than being either the old file or the new file. Given that's been the normal mechanism for anyone needing to perform an atomic update since the Stone Age, for any modern file system _not_ to handle such behaviour cleanly is insane.

    As for fsync(), it's all very well to say you have a wonderfully fast file system because you don't write data out to disk unless you have to, but if that then requires every application to call fsync() any time it writes to the disk in order to ensure that the data will actually be there after a reboot then all your performance gains have just been thrown away.

    Worse than that, with the default configuration used by most distributions, fsync() on ext3 is slow and largely unnecessary, so suddenly applications have to look at the file system of the computer they're running on in order to determine whether or not they should be calling fsync() all the time; that's mad.

    Lastly, of course, the odds of getting more than a small fraction of application developers to implement fsync() properly throughout their code are minute (e.g. even if they're syncing the data files, will they also remember to sync the directory when that's required, and do so in the correct order to ensure that the file contains the correct data?), so why force changes to millions of lines of code when you can just do it once in the file system?
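
    For anyone following along, here is a minimal sketch of the replace-by-rename sequence being described: write the new contents to a temporary file, fsync() the file, rename() it over the old name, then fsync() the containing directory so the rename itself is durable. The function name and paths are made up for illustration, and real code would also handle partial writes and clean up the temporary file on failure.

    Code:
    /* Sketch of an atomic file replace with the fsync() calls discussed
     * above. Names and error handling are illustrative only. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Replace dir/name with the given contents; returns -1 on error. */
    int replace_file(const char *dir, const char *name,
                     const char *data, size_t len)
    {
        char tmp[4096], dst[4096];
        snprintf(tmp, sizeof(tmp), "%s/.%s.tmp", dir, name);
        snprintf(dst, sizeof(dst), "%s/%s", dir, name);

        /* 1. Write the new contents to a temporary file. */
        int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return -1;
        if (write(fd, data, len) != (ssize_t)len) { close(fd); return -1; }

        /* 2. Force the data to disk before the rename can be committed. */
        if (fsync(fd) < 0) { close(fd); return -1; }
        close(fd);

        /* 3. Atomically replace the old file with the new one. */
        if (rename(tmp, dst) < 0)
            return -1;

        /* 4. fsync() the directory too, so the rename survives a crash;
         *    this is the step most "fixed" applications forget. */
        int dfd = open(dir, O_RDONLY | O_DIRECTORY);
        if (dfd < 0)
            return -1;
        int rc = fsync(dfd);
        close(dfd);
        return rc;
    }

    The ordering matters: skip the fsync() in step 2 and the rename can reach the journal before the file data does, which is exactly the zero-length-file scenario described above.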

  8. #8
    Join Date
    Jan 2008
    Posts
    772

    Default

    Quote Originally Posted by movieman View Post
    The problem was not so much that data went missing when you didn't fsync(), it was that you could write to a file, rename it on top of an old file, and then after a reboot discover that your file had been truncated to zero bytes rather than being either the old file or the new file.
    AFAIK, that happens because the rename (metadata) can be committed before the write (data), and if you really need the write to be committed first, you're supposed to call fsync() between the two. And, unless I'm completely misunderstanding the scenario you're describing, it's not just "after a reboot", but "after a crash/power loss/other abnormal shutdown that occurs between the rename commit and the data commit".
    Last edited by Ex-Cyber; 01-19-2010 at 01:42 PM.

  9. #9
    Join Date
    Apr 2008
    Location
    Saskatchewan, Canada
    Posts
    466

    Default

    Quote Originally Posted by Ex-Cyber View Post
    AFAIK, that happens because the rename (metadata) can be committed before the write (data), and if you really need the write to be committed first, you're supposed to call fsync() between the two.
    Except no other current file system requires that, and 99.999% of all existing software doesn't do it. And even if much of that software is 'fixed', probably 90% of the people 'fixing' it won't realise that they also need to sync the directory to ensure that it works.

    And one of the common uses is in shell scripts, where you'd have to sync the entire disk just to safely update a two-line file.

    And, unless I'm completely misunderstanding the scenario you're describing, it's not just "after a reboot", but "after a crash/power loss/other abnormal shutdown that occurs between the rename commit and the data commit".
    True, but 99% of Linux systems crash at some point, even if only because of a power failure; and I believe that ext4 as originally implemented could delay the data write up to a couple of minutes after the metadata, so the odds of this happening on a crash were high.

    Applications should be able to rely on some basic, sane behaviour from a file system (such as a 'rename a b' leaving them with either file a or file b on the disk and not an empty file which never existed in the logical filesystem), with a few exceptions like databases which provide explicit guarantees to their users. File systems which don't behave in such a manner simply won't get used for anything which requires reliable storage, because no matter how fast they are they're not performing their most basic function of storing your data.

    In addition, different users and different uses have different thresholds for data reliability: for example, I might not care if I lose a data file that I saved two minutes ago so long as I still have the data file which I wrote out five minutes ago... someone else might be incensed if they lose data that they wrote out two seconds ago. That kind of decision should not have to be made on a per-application basis ('Edit/Preferences/Do you care about your data?'), it should be part of the filesystem configuration.

    The only argument I've seen for this behaviour is that 'Posix doesn't require us to do anything else'. But Posix doesn't require much of anything and I suspect that at least 90% of current software would fail on a system which only implements the absolute minimum Posix requirements.

  10. #10
    Join Date
    Jan 2010
    Location
    Vienna, Austria
    Posts
    7

    Default

    Another thing I remember from earlier Linux days:

    ext2/3 were mostly CPU-bound, which means you can vastly improve performance by adding more CPU power. Other file-systems (e.g. ReiserFS) are I/O-bound, which means you can vastly improve performance by adding a faster disk.

    The test platform (Atom 330) might thus be inherently 'unfair' to ext2/3/4. And don't forget that advances in CPU speed have been an order of magnitude greater than advances in storage technology.
