With Linux 2.6.32, Btrfs Gains As EXT4 Recedes


  • phoronix
    started a topic With Linux 2.6.32, Btrfs Gains As EXT4 Recedes

    Phoronix: With Linux 2.6.32, Btrfs Gains As EXT4 Recedes

    We have published articles containing EXT4 benchmarks many times now, going back to our original real-world benchmarks of EXT4, to when Ubuntu 9.04 received EXT4 support, and when we ran a variety of file-system benchmarks on an Intel X25-E SSD. We have also thrown in EXT4 numbers when benchmarking Btrfs (and again with Btrfs 0.19) along with NILFS2 benchmarks. Each round has been with a different kernel, and the performance of the different Linux file-systems continues to change as each file-system matures and picks up different features. With the Linux 2.6.32 kernel, however, EXT4 performance has changed a great deal due to a change that provides better data integrity on writes, but at a significant performance cost. To see how this changes the Linux file-system landscape, we have a fresh set of benchmarks for EXT3, EXT4, XFS, ReiserFS, and Btrfs atop the latest Linux kernel.

    http://www.phoronix.com/vr.php?view=14445
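
    The integrity-versus-throughput trade-off mentioned above is easy to see in a toy test. Below is a minimal sketch, not the benchmark suite used in the article, that times plain buffered writes against writes forced to disk with fsync(); the file path, block size, and counts are arbitrary examples.

    import os
    import time

    def write_chunks(path, chunks=256, size=64 * 1024, sync_each=False):
        """Write `chunks` blocks of `size` bytes; optionally fsync after each one."""
        buf = b"\0" * size
        start = time.time()
        with open(path, "wb") as f:
            for _ in range(chunks):
                f.write(buf)
                if sync_each:
                    f.flush()
                    os.fsync(f.fileno())  # force the data down to the disk
        os.remove(path)
        return time.time() - start

    buffered = write_chunks("/tmp/fs_test.bin")                  # page cache only
    durable = write_chunks("/tmp/fs_test.bin", sync_each=True)   # integrity on every write
    print(f"buffered: {buffered:.3f}s   fsync per write: {durable:.3f}s")

    On rotating disks the second number is typically many times larger, which is the kind of gap the data-integrity change exposes.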

  • misiu_mp
    replied
    Originally posted by fhj52 View Post
    It has *always* been all the data I wanted
    It's rather "all the data you expected".

    Originally posted by fhj52 View Post
    If you are interested in learning, find the documents (books, white papers and articles) and do the homework (RTFM), because there is obviously much yet to be learned. Once you get the facts straight and do not want to make ridiculous statements about pulling plugs on a UPS, and the like, I may be available for discussion, since I too can be edified. (We all can, about something...).
    Until then, good luck to you.
    ...
    Damn, that was rude. I see you got all the schooling you think you'll ever need already.
    Well, here are some basics: UPSes are devices usually connected to whatever they supply with power by cables, and cables end in plugs. Plugs can be pulled and cables can be cut.
    But that was just a stupid example.
    The *point* is (this time I'll spell it out in black and white to make sure you understand) that these are all machines and they are not fail-proof. There are dozens of ways the power can fail at any time, whatever you do to prevent it. All you can do is add redundancy until the odds are good enough for you.
    And *when* the power fails while the hardware is generating data, you are likely to lose some of that data, whatever your backup method is. It's impossible to have a 100% backup at all times, whatever you think you know.



  • fhj52
    replied
    later you say they can only prevent the backed-up data from being lost, which is never all of the data you want.
    No. I said:
    Any appreciable amount of data should never be lost. That is what backups are for.
    The ability to get snapshots without shutting down the fs is a very important widget in the fs toolset. I find it geekily exciting [for lack of a better phrase] that such a great tool will be available for OSS. One cannot, and should not, rely upon any means other than a backup to protect data. All the other methods are there, IMHO, to keep from having to restore data, because restoration does not always go as planned and backups are only as good as the last backup set.
    I explained "appreciable" as much as possible. ... The latter part is a fair warning that backups should be done as appropriate for the task at hand or they will not provide the desired result. Mirrored RAID using a RAID adapter with a BBU, and a RAM disk with imaging, as well as using an appropriate filesystem, are some of the "other methods" employed.

    It has *always* been all the data I wanted, because when specific backups are performed, methods are used that are appropriate for that specific task. In general, a standard, timed, incremental backup to a separate disk is appropriate and all that is needed.
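
    As a rough sketch of that "timed, incremental backup to a separate disk" idea (the paths and the interval below are made-up examples; a real setup would more likely use rsync, LVM or filesystem snapshots, or dedicated backup software):

    import os
    import shutil
    import time

    SOURCE = "/home/user/work"         # hypothetical working tree
    DEST = "/mnt/backupdisk/work"      # hypothetical separate backup disk
    STAMP = os.path.join(DEST, ".last_backup")

    def incremental_backup():
        """Copy only the files modified since the previous run."""
        os.makedirs(DEST, exist_ok=True)
        last = os.path.getmtime(STAMP) if os.path.exists(STAMP) else 0.0
        for root, _dirs, files in os.walk(SOURCE):
            for name in files:
                src = os.path.join(root, name)
                if os.path.getmtime(src) <= last:
                    continue                       # unchanged since the last set
                dst = os.path.join(DEST, os.path.relpath(src, SOURCE))
                os.makedirs(os.path.dirname(dst), exist_ok=True)
                shutil.copy2(src, dst)             # copy data plus timestamps
        with open(STAMP, "w") as f:                # record when this set finished
            f.write(str(time.time()))

    while True:
        incremental_backup()
        time.sleep(300)                            # e.g. every 300 seconds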
    ...


    If you are interested in learning, find the documents (books, white papers and articles) and do the homework (RTFM), because there is obviously much yet to be learned. Once you get the facts straight and do not want to make ridiculous statements about pulling plugs on a UPS, and the like, I may be available for discussion, since I too can be edified. (We all can, about something...).
    Until then, good luck to you.
    ...



  • misiu_mp
    replied
    Originally posted by fhj52 View Post
    Within that context I do not think it is a contradiction to say backups are for preventing losses, but they are only as good at doing that as the end user makes them (the time of the last backup).

    -Ric
    "Any appreciable amount data should never be lost. That is what backups are for."

    The contradiction was that, apart from the 'appreciable' part, you said that backups will prevent data loss, while later you say they can only prevent the backed-up data from being lost, which is never all of the data you want.

    Some data is always lost in such a case, and if it happens to be the result of a lengthy or unique operation, any amount of it can be appreciable.

    Even when you do 'proper' work on a stable 'production' machine that produces tons of appreciable data every second, you can have a power outage at any time (even if you have a UPS, somebody can accidentally pull the wrong plug). The file system will be the first thing to help you out, not the backup.

    Even the best backup is only as good as the file system it is stored on.
    So what happens when the shit hits the fan while you are backing up? The backup is invalid, and the most recent valid data is most likely whatever the original file system was able to keep on disk. Backups won't help here. They do help if the original file system is faulty and wipes data it had no good reason to wipe.


    Snapshots are neat, but on their own they aren't the ultimate answer to the backup question, since they are stored on the same device (the same file system, even) as the original data. Real backups have to be properly transferred to a separate site, which takes time and makes continuous backups infeasible. You can take backups only as often as it takes to transfer a snapshot (which can indeed be quick if they are incremental, but takes a lot of time on most ordinary file systems).
    If you mirror all the write operations, you could take snapshots on the mirror, which could shrink the backup period even further, but that would require the backup system to be much faster than the working machine (so that you can keep mirroring while the snapshot is taken). Now that has to be the ultimate backup system.
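
    For what it's worth, the incremental-transfer part is exactly what the newer filesystems try to make cheap. Here is a rough sketch of the idea using Btrfs's own tooling (btrfs subvolume snapshot plus btrfs send/receive); the subvolume path, snapshot directory, and the "backuphost" machine are made-up examples, and this is an illustration rather than a hardened backup tool:

    import subprocess
    import time

    SUBVOL = "/data"                      # hypothetical Btrfs subvolume to protect
    SNAPDIR = "/data/.snapshots"          # where read-only snapshots are kept
    REMOTE = "backuphost"                 # hypothetical off-site machine
    REMOTE_PATH = "/backups/data"         # receiving path on the remote side

    def take_and_send(parent=None):
        name = time.strftime("%Y%m%d-%H%M%S")
        snap = f"{SNAPDIR}/{name}"
        # Read-only snapshot: nearly instant thanks to copy-on-write.
        subprocess.run(["btrfs", "subvolume", "snapshot", "-r", SUBVOL, snap],
                       check=True)
        # Stream it to the remote site; with a parent snapshot only the
        # differences are sent, which is what keeps the backup window short.
        send_cmd = ["btrfs", "send"]
        if parent:
            send_cmd += ["-p", parent]
        send_cmd.append(snap)
        send = subprocess.Popen(send_cmd, stdout=subprocess.PIPE)
        recv = subprocess.Popen(["ssh", REMOTE, "btrfs", "receive", REMOTE_PATH],
                                stdin=send.stdout)
        send.stdout.close()
        recv.communicate()
        return snap

    first = take_and_send()               # full transfer
    later = take_and_send(parent=first)   # incremental transfer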



  • fhj52
    replied
    Originally posted by misiu_mp View Post
    I think you just contradicted yourself. I don't know what you mean by an appreciable amount, but in many cases an hour's worth of data, for example, could be called an appreciable amount. Do you do backups every hour? What amount of data is it okay to lose: 10 minutes, 3 minutes, 30 seconds?
    If ext4 had been proper from the start and did not truncate some open files after a crash, the amount of lost data would be limited to what was written since the last flush (say, 30 seconds), and that's it. That's better than backups can do in this case.
    Backups will never be able to completely prevent data loss. They are mainly there to prevent a complete disaster in case of hardware failure or gross human error.
    In day-to-day work with experimental drivers, you should be able to trust your file system to prevent any 'appreciable' amount of data from being lost. That's the ideal we should strive for.
    You are right that by using "appreciable" the loss is left open-ended. Appreciable depends upon the value, whether innate or the time it took to produce, as well as the quantity.
    Some things might be worth a backup or snapshot every 300 seconds or even more frequently. E.g., a RAM disk is saved, incrementally, every 300 seconds. It is possible to make incremental backups of the system the same way.
    The system will be pretty darn busy all the time if it is a server with many clients, but that is what snapshots are all about. ...and, I will add, what the new filesystems are about too, since stopping the fs to make the snapshot is not suitable in such an environment (a server with many clients) or, I would say, for the cautious workstation user either.

    IMHO, in day-to-day work with experimental anything, there should be no appreciable data to lose, because such work should not be done on a "production" (i.e., one's main) system, and suitable methods are, or should be, employed to keep the last few bits or bytes safe (e.g., direct I/O and using a BBU) if those few bytes are important to the task at hand.

    I understand your point that in at least one sense, no matter how little or how relatively unimportant the data might be, one should not lose it due to a fs screwup. I agree. ...

    The point was that, within practicality, the backup or snapshot has to be done frequently enough to balance the cost of the loss against the time and effort needed to prevent it, and that depends on the value and quantity of the data being produced. Within that context I do not think it is a contradiction to say backups are for preventing losses, but they are only as good at doing that as the end user makes them (the time of the last backup).

    -Ric



  • energyman
    replied
    Well, how many supercomputers use Solaris?
    And how many use Linux?

    Since money is not the concern for these machines, but performance is, you should get a clue.



  • kebabbert
    replied
    Originally posted by energyman View Post
    opensolaris and others on the same hardware, solaris sucked:
    http://bulk.fefe.de/lk2006/
    Do you think it is fair to compare a home-brewed, alpha-version OpenSolaris distro put together by some random guy, the alpha release Schillix v0.5.3, against fully released Linux distros in production?

    http://www.eweek.com/c/a/Linux-and-O...how-Promise/3/
    "Schillix development appears to be on the wane: Version 0.5.2 of the distribution came out in April, and the most recent entry in the distros discussion mailing list was from July. Also, there have been no bugs reported since January."


    How about we compare Linux when it was in alpha to a real Unix in production? Would that be fair?



  • misiu_mp
    replied
    Originally posted by fhj52 View Post
    Any appreciable amount of data should never be lost. That is what backups are for.
    [...snip...]
    backups are only as good as the last backup set.
    I think you just contradicted yourself. I don't know what you mean by an appreciable amount, but in many cases an hour's worth of data, for example, could be called an appreciable amount. Do you do backups every hour? What amount of data is it okay to lose: 10 minutes, 3 minutes, 30 seconds?
    If ext4 had been proper from the start and did not truncate some open files after a crash, the amount of lost data would be limited to what was written since the last flush (say, 30 seconds), and that's it. That's better than backups can do in this case.
    Backups will never be able to completely prevent data loss. They are mainly there to prevent a complete disaster in case of hardware failure or gross human error.
    In day-to-day work with experimental drivers, you should be able to trust your file system to prevent any 'appreciable' amount of data from being lost. That's the ideal we should strive for.
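
    For reference, the truncate-after-crash problem was about applications rewriting a file in place; the usual defence is the write-new-file, fsync, then rename pattern sketched below (the file name is just an example, and this shows the application-side pattern, not the kernel-side fix):

    import os

    def atomic_replace(path, data):
        """Replace `path` so a crash leaves either the old or the new contents."""
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())          # new contents are on disk before the rename
        os.replace(tmp, path)             # atomic rename over the old file
        dir_fd = os.open(os.path.dirname(path) or ".", os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)              # make the rename itself durable
        finally:
            os.close(dir_fd)

    atomic_replace("/tmp/settings.conf", b"example contents\n")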



  • fhj52
    replied
    Any appreciable amount of data should never be lost. That is what backups are for.
    The ability to get snapshots without shutting down the fs is a very important widget in the fs toolset. I find it geekily exciting [for lack of a better phrase] that such a great tool will be available for OSS. One cannot, and should not, rely upon any means other than a backup to protect data. All the other methods are there, IMHO, to keep from having to restore data, because restoration does not always go as planned and backups are only as good as the last backup set. If we open-source users can get copy-on-write (CoW) and snapshots without interfering with computing tasks, ...wow!

    [snip JFS(JFS2) is oh so much better than ext* stuff]

    I appreciate the benchmarks posted here despite the shortcomings (e.g., they are "canned benchmarks"). I do wish they would also include JFS and ZFS, as well as perform critical tasks like fsck for those four, but if not I will probably find out on my own, ...unfortunately.
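
    As a small taste of what file-level copy-on-write buys you today: 'cp --reflink=always' on Btrfs creates a copy that shares the original's blocks until either side is modified, so it is effectively instant. The path below is a made-up example and has to live on a filesystem that supports reflinks:

    import subprocess

    src = "/mnt/btrfs/projects/big_dataset.img"   # hypothetical file on a Btrfs volume
    dst = src + ".cow-copy"

    # --reflink=always fails (rather than silently doing a full copy) if the
    # filesystem cannot share blocks, e.g. on ext3/ext4.
    subprocess.run(["cp", "--reflink=always", src, dst], check=True)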



  • misiu_mp
    replied
    Originally posted by fhj52 View Post
    The last part got cut ....
    It should also say:
    But neither is a reality. AFAIK, the differences between any modern filesystems are very small and certainly nowhere near 150MBps (or 550MBps!).
    ...
    When errors happen and they cannot be corrected, speed doesn't matter when your data is LOST.

    If the data is not lost and is recoverable by, say, an fsck, one could argue a fast fs is better. The thing is, though, the speed of fsck is not related to the speed of the file system in normal use. The capacity-to-performance gap of the hardware has forced designers to take fsck performance into account only recently. Before that, no effort was made to design the internal data structures in a way that makes them quick to repair or rebuild with fsck.

