Large HDD/SSD Linux 2.6.38 File-System Comparison


  • Azpegath
    replied
    Originally posted by squirrl View Post
    Reiser3 is still the best all around choice.
    * Fault Tolerant
    * Efficient
    * Static
    But sadly it degenerates and fragments like a motherfokker. After a year and a half it's at 20% of the speed it started at, and there's no known way to defragment it except copying all the files off the filesystem and back onto it again.
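    For what it's worth, a rough sketch of that copy-off-and-back workaround, assuming (hypothetically) that the reiser3 volume is /dev/sdb1 mounted at /data and that there's enough spare space under /backup:

        # copy everything off, preserving hard links, ACLs and xattrs
        rsync -aHAX /data/ /backup/data/
        umount /data
        # recreate the filesystem (this destroys everything on /dev/sdb1)
        mkfs.reiserfs /dev/sdb1
        mount /dev/sdb1 /data
        # copy everything back onto the fresh filesystem
        rsync -aHAX /backup/data/ /data/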

    Leave a comment:


  • stqn
    replied
    Someone said it before, but I don't get the point of benchmarking ext4 on an SSD without the discard option (and maybe noatime).

    An SSD benchmark would in fact be a good place to tell people they should use discard, for the few who wouldn't know it already.
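    For the few who don't, a minimal sketch of what that looks like, assuming (hypothetically) that the SSD's root partition is /dev/sda1 formatted as ext4:

        # /etc/fstab entry enabling online TRIM and disabling atime updates
        /dev/sda1  /  ext4  discard,noatime,errors=remount-ro  0  1

        # or apply the options to an already-mounted filesystem without rebooting
        mount -o remount,discard,noatime /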

    Leave a comment:


  • tytso
    replied
    Originally posted by adrian_sev View Post
    Hi! I am looking at http://www.phoronix.com/data/img/res...38_large/2.png
    with the results of SQLite ...
    Is ext3 really two times slower on the SSD? How can this be? Is this the effect of a lack of garbage collection, or of not trimming?
    I'm pretty sure that ext3 is winning very big on the SQLite benchmark because it does a large number of random writes to the same blocks --- and since ext3 has barriers off by default, on the hard drive the disk collapses the writes together and most of the writes don't actually hit the disk platter. Good luck to your data if you have a power hit, but that's why ext3 wins really big on an HDD.

    On an SSD, at least this OCZ drive, it's not merging the writes, and so the random writes result in whole flash write blocks getting rewritten; that's why ext3 appears to be much worse on the OCZ SSD. Other SSDs might be able to do a better job of merging writes to the same block if they have a larger write buffer. This would be very SSD-specific.

    I suspect that JFS didn't run into this problem, even though it also doesn't use barriers, because its write patterns happened to fit within the OCZ's write cache, so it was able to collapse the writes. Personally I don't think it really matters, since running a database like SQLite, which is trying to provide ACID properties, without barriers enabled is obviously going to (a) result in a failure of the ACID guarantees, and (b) result in very confusing and misleading benchmark results.
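    For anyone who wants to repeat the SQLite run with write barriers actually enabled on ext3, a minimal sketch (the device and mount point here are hypothetical):

        # ext3 defaults to barrier=0; turn write barriers on explicitly
        mount -o barrier=1 /dev/sdb1 /mnt/test
        # or flip it on an already-mounted filesystem
        mount -o remount,barrier=1 /mnt/test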

    Leave a comment:


  • pvtcupcakes
    replied
    Michael, for the graphs could you put a larger separator between the HDD and the SSD results? I see there's a little hash mark and the colors repeat, but at first glance it was kind of hard to tell where one ends and the other begins.

    Leave a comment:


  • adrian_sev
    replied
    weird results

    Hi! I am looking at http://www.phoronix.com/data/img/res...38_large/2.png
    with the results of SQLite ...
    Is ext3 really two times slower on the SSD? How can this be? Is this the effect of a lack of garbage collection, or of not trimming?

    Thanks for the info!
    Adrian

    Leave a comment:


  • tytso
    replied
    Originally posted by locovaca View Post
    I wouldn't want a benchmark for your particular scenario because I'll never enter a situation like that, and I would venture that a majority of users would not either; any such data would skew opinions of file systems unnecessarily. I don't care how well a Corolla tows a 3 ton camper, just tell me how well it drives in basic conditions (city, highway) and I'll go from there.
    Sure, but you're begging the question of what "normal conditions" are. Are you always going to fill a file system to 10% of capacity, then reformat it, and then fill it to 10% again? That's what many benchmarkers actually end up testing. And so a file system that depends on its garbage collector for correct long-term operation, but which never has to garbage collect, will look really good. But does that correspond to how you will use the file system?

    What is "basic conditions", anyway? That's fundamentally what I'm pointing out here. And is performance really all people should care about? Where does safety factor into all of this? And to be completely fair to btrfs, it has cool features --- which is cool, if you end up using those features. If you don't then you might be paying for something that you don't need. And can you turn off the features you don't need, and do you get the performance back?

    For example, at $WORK we run ext4 with journalling disabled and barriers disabled. That's because we keep replicated copies of everything at the cluster file system level. If I were to pull a Hans Reiser and ship ext4 with defaults that have the journal and barriers disabled, it would be faster than ext2 and ext3, and most of the other file systems in the Phoronix file system comparison. But that would be bad for ext4's desktop users, and that to me is more important than winning a benchmark demolition derby.
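    Purely to illustrate that configuration (not as a recommendation for desktop use), roughly how one would set it up on a scratch device; the device and mount point are hypothetical:

        # create ext4 with no journal (destroys everything on the device)
        mkfs.ext4 -O ^has_journal /dev/sdc1
        # mount it with write barriers disabled
        mount -o nobarrier /dev/sdc1 /mnt/scratch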

    -- Ted

    Leave a comment:


  • Michael
    replied
    Originally posted by cruiseoveride View Post
    Where are the overall performance charts? I've been asking for these sorts of charts since pts was in beta. These sorts of articles are useful for individual metrics, but are useless for the average reader who is just trying to choose the best overall.
    They aren't shown in Phoronix articles as it would detract from page views... If you want averages, you can go to the respective page on OpenBenchmarking.org and opt to view the averages, see the overall table, calculate geometric / harmonic / aggregate sums, etc.

    Leave a comment:


  • rofrol
    replied
    Originally posted by cruiseoveride View Post
    Where are the overall performance charts? I've been asking for these sorts of charts since pts was in beta. These sorts of articles are useful for individual metrics, but are useless for the average reader who is just trying to choose the best overall.

    If you take the time to tabulate and average out relative performance, you will see that NILFS2 was the best overall for an HDD, but reading this article won't tell you that. Is this so much to ask from pts?
    I've found this also. The description under the SQLite graph is wrong; NILFS2 was the fastest.

    Leave a comment:


  • locovaca
    replied
    Originally posted by tytso View Post
    And of course, if the benchmark isn't testing with the same size files as what you plan to use (small files? typical file sizes as seen by Open Office? MP3 sized files? Video files?), the results may also be not the same.

    Finally, for all file systems, how full the file system is during the test will also make a huge difference. It's very easy for file systems to exhibit good performance if you're only using the first 10-20% of the file system. But how does the file system perform when it's 50% full? 75% full?

    I can tell you that ext4's performance can be quite bad if you take an ext4 file system which is 1 or 2TB and fill it up to 95% with files that are mostly 8-64 megabytes in size, especially if this happens where many 8-64 meg files are getting deleted and then replaced with other 8-64 meg files, and the disk gradually fills up until it's 95% full. Since this workload (using ext4 as a backend store for cluster file systems) is one that I'm paid to care about at $WORK, I'm currently working on an extension to ext4 to help address this particular case.
    No benchmark, no matter how thorough, is going to cover every scenario and every possible usage of a file system. If you are in a position like yours, where you have a very narrow set of requirements and usage scenarios for a file system, shouldn't you be running your own benchmarks and not relying on this set? No offense, but your situation doesn't apply to me and it never will. I wouldn't want a benchmark for your particular scenario because I'll never enter a situation like that, and I would venture that a majority of users would not either; any such data would skew opinions of file systems unnecessarily. I don't care how well a Corolla tows a 3 ton camper, just tell me how well it drives in basic conditions (city, highway) and I'll go from there.

    Leave a comment:


  • tytso
    replied
    Originally posted by drag View Post
    Yeah sure. Didn't know if anybody cared. :P

    One of the things you need to be very careful of is that the data set size is correct for the test.

    For example, if I am measuring raw I/O speeds for read/write and I have 4 GB of RAM, and the dataset I am working with is only 4-5 GB, then you're not really measuring the file system as much as measuring the file system cache.
    ... and if you are testing a copy-on-write or logging file system, such as btrfs or nilfs2, the benchmark needs to write (and delete) at least 2-3 times the free space if you want to fairly test the effects of the garbage collector (in the case of a logging file system) or test how well the file system holds up against fragmentation (in the case of copy-on-write file systems, such as ZFS and btrfs).

    And of course, if the benchmark isn't testing with the same size files as what you plan to use (small files? typical file sizes as seen by Open Office? MP3 sized files? Video files?), the results may also be not the same.

    Finally, for all file systems, how full the file system is during the test will also make a huge difference. It's very easy for file systems to exhibit good performance if you're only using the first 10-20% of the file system. But how does the file system perform when it's 50% full? 75% full?

    I can tell you that ext4's performance can be quite bad if you take an ext4 file system which is 1 or 2TB and fill it up to 95% with files that are mostly 8-64 megabytes in size, especially if this happens where many 8-64 meg files are getting deleted and then replaced with other 8-64 meg files, and the disk gradually fills up until it's 95% full. Since this workload (using ext4 as a backend store for cluster file systems) is one that I'm paid to care about at $WORK, I'm currently working on an extension to ext4 to help address this particular case.

    Measuring how file systems' performance falls off as disk utilization increases is hard, yes. But the question is whether the benchmarking is being done primarily for entertainment's sake (i.e., to drive advertising dollars, like a NASCAR race without the car crashes), or to help users make valid decisions about which file system to use, or to help drive improvements in one or more file systems. Depending on your goals, how you approach the file system benchmarking task will be quite different.
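    On the dataset-size point above, one common precaution (just a sketch, not part of any particular benchmark suite) is to flush the page cache between runs, so that the cache isn't what actually gets measured:

        # flush dirty data to disk, then drop the page cache, dentries and inodes
        sync
        echo 3 | sudo tee /proc/sys/vm/drop_caches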

    -- Ted

    Leave a comment:
