Large HDD/SSD Linux 2.6.38 File-System Comparison


  • #31
    Originally posted by drag View Post
    Yeah sure. Didn't know if anybody cared. :P
    Great elaboration on the subject, drag! I didn't ask for it, but I sure appreciated it.



    • #32
      Originally posted by Michael View Post
      Michael, I meant to ask before but forgot: why don't you also mount and run a generic RAM disk test? If nothing else it serves as a baseline, and it might show whether there are any DMA regressions in a given test set...

      Perhaps you should set one up and add it to the results for that machine.



      • #33
        Originally posted by drag View Post
        If you want a summary of what is the best FS for you to use... use Ext4. It's a safe file system.

        JFS is effectively unsupported. It's a port of a file system from OS/2 Warp... the AIX JFS is an entirely different beast. It was interesting when it was new, but aside from a few fixes here and there it has essentially been unmaintained for years.

        XFS is good if you need big datasets. If you have multiple TB-large file systems then XFS is a good choice. It's fast, it scales well, and it behaves well when dealing with large amounts of data. You'll want to have very good hardware for it... it's not nearly as robust as Ext4 is.

        BTRFS is good if you want something to play around with. Otherwise leave it alone until distros start using it by default.
        I could not agree more. Please don't read this article and start thinking: "Hey! I want to use JFS / NILFS2 because I see on Phoronix that it performs better than ext4." Unless you know what you are dealing with, use ext4; it is solid and actively developed by Google and Theodore Ts'o.



        • #34
          The colors of the EXT4 and JFS bars should be different; they are too similar.



          • #35
            If you would like to recommend a particular set of mount options for each file-system, I would be happy to carry out such tests under those conditions as well to complement the default options.
            Michael, if you could, will you benchmark the various file systems with different schedulers/elevators? I'm performing my own comparisons, but I'd like to see noop, cfq, and deadline compared across the various filesystems.

            I think this would be useful for those of us with multiple types of drives in deciding which scheduler to use for which drive. Should a 2TB drive use the same scheduler as an 80GB drive, and should SSDs use noop or deadline?
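For anyone wanting to try this themselves, here is a small Python sketch of how the active scheduler can be checked per device; the kernel marks the one in use with brackets in sysfs, and the device name "sda" below is just an example:

```python
# A sketch of checking which I/O scheduler is active for a device.
# The kernel marks the active one with brackets in
# /sys/block/<dev>/queue/scheduler, e.g. "noop deadline [cfq]".

def parse_active_scheduler(line: str) -> str:
    """Return the bracketed (active) scheduler name from a sysfs line."""
    for token in line.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active scheduler marked in: %r" % line)

def read_active_scheduler(dev: str = "sda") -> str:
    """Read and parse the scheduler file for a block device."""
    with open("/sys/block/%s/queue/scheduler" % dev) as f:
        return parse_active_scheduler(f.read())
```

Switching schedulers for a test run is then just a matter of writing e.g. `deadline` into that same sysfs file as root.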

            Also, what's the performance difference between relatime, noatime, and atime?

            How much do these disk benchmarks vary with the amount of RAM? I noticed months ago (part of the reason I went with 12GB instead of 6GB of RAM) that the 8GB read tests were WAY higher if you had 8GB or more of RAM.

            Separate issue... can you make it possible for us to choose the colors of each bar on the benchmark graphs? I've done tests with 3 systems where 2 of the 3 have the same color bar, instead of each one having its own color.

            Thanks,

            Skeetre



            • #36
              ext4 settings

              Michael, I read your recent article about why we should leave the default settings. I mostly agree with everything, given that most people are going to use it that way.

              But not in this precise scenario: Ext4 + SSD.

              90% of SSD users run Ext4 with the noatime and discard options. This is almost mandatory to preserve the disk's life. I guess if there were an ssd mode as in btrfs, this would be the default; so in this case I think the default options don't apply, at least in terms of the number of people using them.

              I don't know if the results would change much; I just wanted to point it out.
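As an illustration of that non-default setup, here is a small Python sketch that appends those two options to an ext4 mount-option string. The helper name is made up for illustration; whether `discard` is right for a given drive depends on its TRIM implementation, and running fstrim in a batch is a common alternative:

```python
# Sketch: build the ext4 mount-option string commonly suggested for
# SSDs (noatime to cut access-time writes, discard for online TRIM).
# Illustrative helper only; "discard" is not universally the right
# choice, and periodic fstrim is a common alternative.

def ssd_mount_options(base: str = "defaults") -> str:
    opts = base.split(",")
    for extra in ("noatime", "discard"):
        if extra not in opts:
            opts.append(extra)
    return ",".join(opts)
```

The resulting string, e.g. `defaults,noatime,discard`, is what would go in the options field of an /etc/fstab entry.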



              • #37
                Originally posted by drag View Post
                Yeah sure. Didn't know if anybody cared. :P

                One of the things you need to be very careful about is that the data set size is correct for the test.

                For example, if I am measuring raw I/O speeds for read/write and I have 4 GB of RAM, and the dataset I am working with is only 4-5GB, then you're not really measuring the file system as much as measuring the file system cache.
                ... and if you are testing a copy-on-write or logging file system, such as btrfs or nilfs2, the benchmark needs to write (and delete) at least 2-3 times the free space if you want to fairly test the effects of the garbage collector (in the case of a logging file system) or test how well the file system holds up against fragmentation (in the case of copy-on-write file systems, such as ZFS and btrfs).
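The sizing rules above can be sketched as a quick back-of-the-envelope helper; the 3x and 2x multipliers below just encode the figures from the post, not hard rules:

```python
# Back-of-the-envelope helper for benchmark dataset sizing: the
# dataset should be a multiple of RAM (or you benchmark the page
# cache), and for logging / copy-on-write file systems the benchmark
# should churn a multiple of the free space so garbage collection and
# fragmentation actually come into play.

def min_dataset_bytes(ram_bytes: int, free_bytes: int,
                      cow_or_log_fs: bool) -> int:
    size = 3 * ram_bytes                      # defeat the page cache
    if cow_or_log_fs:
        size = max(size, 2 * free_bytes)      # force GC / fragmentation
    return size
```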

                And of course, if the benchmark isn't testing with the same size files as what you plan to use (small files? typical file sizes as seen by Open Office? MP3 sized files? Video files?), the results may also be not the same.

                Finally, for all file systems, how full the file system is during the test will also make a huge difference. It's very easy for file systems to exhibit good performance if you're only using the first 10-20% of the file system. But how does the file system perform when it's 50% full? 75% full?

                I can tell you that ext4's performance can be quite bad if you take a 1 or 2TB ext4 file system and fill it up to 95% with files that are mostly 8-64 megabytes in size, especially when many 8-64 meg files are being deleted and replaced with other 8-64 meg files as the disk gradually fills up to 95%. Since this workload (using ext4 as a backend store for cluster file systems) is one that I'm paid to care about at $WORK, I'm currently working on an extension to ext4 to help address this particular case.

                Measuring how file systems' performance falls off as disk utilization increases is hard, yes. But the question is whether the benchmarking is being done primarily for entertainment's sake (i.e., to drive advertising dollars, like a NASCAR race without the car crashes), to help users make valid decisions about which file system to use, or to help drive improvements in one or more file systems. Depending on your goals, how you approach the file system benchmarking task will be quite different.

                -- Ted



                • #38
                  Originally posted by tytso View Post
                  And of course, if the benchmark isn't testing with the same size files as what you plan to use (small files? typical file sizes as seen by Open Office? MP3 sized files? Video files?), the results may also be not the same.

                  Finally, for all file systems, how full the file system is during the test will also make a huge difference. It's very easy for file systems to exhibit good performance if you're only using the first 10-20% of the file system. But how does the file system perform when it's 50% full? 75% full?

                  I can tell you that ext4's performance can be quite bad if you take a 1 or 2TB ext4 file system and fill it up to 95% with files that are mostly 8-64 megabytes in size, especially when many 8-64 meg files are being deleted and replaced with other 8-64 meg files as the disk gradually fills up to 95%. Since this workload (using ext4 as a backend store for cluster file systems) is one that I'm paid to care about at $WORK, I'm currently working on an extension to ext4 to help address this particular case.
                  No benchmark, no matter how thorough, is going to cover every scenario and every possible usage of a file system. If you are in a position like yours, with a very narrow set of requirements and usage scenarios, shouldn't you be running your own benchmarks rather than relying on this set? No offense, but your situation doesn't apply to me and it never will. I wouldn't want a benchmark for your particular scenario because I'll never be in a situation like that, and I would venture that a majority of users won't either; any such data would skew opinions of file systems unnecessarily. I don't care how well a Corolla tows a 3 ton camper; just tell me how well it drives in basic conditions (city, highway) and I'll go from there.



                  • #39
                    Originally posted by cruiseoveride View Post
                    Where are the overall performance charts? I've been asking for this sort of chart since PTS was in beta. These sorts of articles are useful for individual metrics but useless for the average reader who is just trying to choose the best overall performer.

                    If you take the time to tabulate and average out relative performance, you will see that NILFS2 was the best overall for an HDD, but reading this article won't tell you that. Is this so much to ask from PTS?
                    I've found this also. The description under the SQLite graph is wrong; NILFS2 was fastest.



                    • #40
                      Originally posted by cruiseoveride View Post
                      Where are the overall performance charts? I've been asking for this sort of chart since PTS was in beta. These sorts of articles are useful for individual metrics but useless for the average reader who is just trying to choose the best overall performer.
                      They aren't shown in Phoronix articles as it would detract from page views... If you want averages, you can go to the respective page on OpenBenchmarking.org and opt to view the averages, see the overall table, calculate geometric / harmonic / aggregate means, etc.
                      Michael Larabel
                      http://www.michaellarabel.com/
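For readers who want to roll their own overall score from the per-test numbers, here is a minimal Python sketch of such aggregate statistics, assuming results are normalized so that higher is better; this is illustrative, not necessarily OpenBenchmarking.org's exact method:

```python
# Illustrative only: aggregate per-test results (normalized so higher
# is better) into a single overall score.
import math

def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))

def harmonic_mean(values):
    return len(values) / sum(1.0 / v for v in values)
```

The geometric mean is the usual choice for ratios of benchmark scores, since it treats a 2x win and a 2x loss symmetrically.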



                      • #41
                        Originally posted by locovaca View Post
                        I wouldn't want a benchmark for your particular scenario because I'll never be in a situation like that, and I would venture that a majority of users won't either; any such data would skew opinions of file systems unnecessarily. I don't care how well a Corolla tows a 3 ton camper; just tell me how well it drives in basic conditions (city, highway) and I'll go from there.
                        Sure, but you're begging the question of what "normal conditions" are. Are you always going to fill a file system to 10% of capacity, then reformat it, then fill it to 10% again? That's what many benchmarkers actually end up testing. And so a file system that depends on its garbage collector for correct long-term operation, but which never has to garbage collect, will look really good. But does that correspond to how you will use the file system?

                        What are "basic conditions", anyway? That's fundamentally what I'm pointing out here. And is performance really all people should care about? Where does safety factor into all of this? And to be completely fair to btrfs, it has cool features --- which is great if you end up using those features. If you don't, then you might be paying for something that you don't need. And can you turn off the features you don't need, and do you get the performance back?

                        For example, at $WORK we run ext4 with journalling disabled and barriers disabled. That's because we keep replicated copies of everything at the cluster file system level. If I were to pull a Hans Reiser and ship ext4 with defaults that disable the journal and barriers, it would be faster than ext2, ext3, and most of the other file systems in the Phoronix file system comparison. But that would be bad for ext4's desktop users, and that to me is more important than winning a benchmark demolition derby.

                        -- Ted



                        • #42
                          weird results

                          Hi! I am looking at http://www.phoronix.com/data/img/res...38_large/2.png
                          with the SQLite results...
                          Is ext3 really 2 times slower on the SSD? How can this be? Is this the effect of a lack of garbage collection, or of not trimming?

                          Thanks for info!
                          Adrian



                          • #43
                            Michael, for the graphs could you put a larger separator between the HDD and the SSD? I see there's a little hash mark, and the colors repeat. But at first glance it was kind of hard to tell where one ends, and the other begins.



                            • #44
                              Originally posted by adrian_sev View Post
                              Hi! I am looking at http://www.phoronix.com/data/img/res...38_large/2.png
                              with the SQLite results...
                              Is ext3 really 2 times slower on the SSD? How can this be? Is this the effect of a lack of garbage collection, or of not trimming?
                              I'm pretty sure that ext3 is winning very big on the SQLite benchmark because it does a large number of random writes to the same blocks --- and since ext3 has barriers off by default, on the hard drive the disk collapses the writes together and most of the writes don't actually hit the disk platter. Good luck to your data if you have a power hit, but that's why ext3 wins really big on an HDD.

                              On an SSD, at least OCZ, it's not merging the writes, and so the random writes result in flash write blocks getting written, so that's why ext3 appears to be much worse on the OCZ SSD. Other SSD's might be able to do a better job of merging writes to the same block, if they have a larger write buffer. This would be very SSD-specific.

                              I suspect that JFS didn't run into this problem, even though it also doesn't use barriers, because its write patterns happened to fit within the OCZ's write cache, so it was able to collapse the writes. Personally I don't think it really matters, since running a database like SQLite which is trying to provide ACID properties without barriers enabled is obviously going to (a) result in a failure of the ACID guarantees, and (b) result in very confusing and misleading benchmark results.



                              • #45
                                Someone said it before, but I don't see the point of benchmarking ext4 on an SSD without the discard option (and maybe noatime).

                                An SSD benchmark would in fact be a good place to tell people they should use discard, for the few who don't already know it.

