Originally posted by Thetargos
Well actually the point of the benchmark is to remove the subjective element of the platform and try to approach it with a more 'scientific' bent.
Bonnie++ is a nice benchmarking tool, but comparing file systems is very difficult. Much more difficult than it seems at first blush. Which is why, if you're doing FS benchmarks, it's a necessity to publish the configuration and the full output from the application. It's not like benchmarking an application or a game, where you have the same app on both systems.
Our goal here is not to benchmark the benchmarking application, but to measure the performance of the OSes relative to one another.
Let's look into this a bit more:
Now, I don't know this with absolute certainty, but the suggestion is that Mac OS X lies to you about whether or not a 'sync' completed successfully. Let's assume that it does.
The point of 'sync' and similar system calls is that the OS does its best to make sure your data is written out to the disk. This is important for a number of reasons, but mostly in case something bad happens to your system. It's a preventative measure that exists to protect your data.
So if the system is lying about what it's doing in order to make you think it's faster than it really is, then it's putting your data at higher risk of corruption. That is, unless there is some miracle of computer science that Apple figured out.
Put another way: if you didn't think your data was important enough to have a sync done correctly, then what is the point of having sync in the first place, and why use it at all?
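To make the guarantee concrete, here is a minimal Python sketch of what an application relying on sync actually does (the filename is illustrative; this is not code from any benchmark). The whole point is that fsync() is supposed to block until the kernel reports the data has reached the device, so an OS that returns success without flushing breaks this contract:

```python
import os
import tempfile

# Write a small file and force it to stable storage with fsync().
# fsync() is supposed to block until the kernel has pushed the data
# to the device; if the OS returned early without flushing, a crash
# right after this call could still lose the data.
path = os.path.join(tempfile.mkdtemp(), "important.dat")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
try:
    os.write(fd, b"data we cannot afford to lose\n")
    os.fsync(fd)  # the durability guarantee lives or dies here
finally:
    os.close(fd)
```

If fsync() silently succeeds without flushing, this code is no safer than plain buffered writes, which is exactly the risk described above.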
-----------------------
And let's look at Bonnie++ again.
It's designed to do performance analysis of file systems and hard drives.
Most operating systems don't immediately write out data to your hard drive. This is because hard drives are very slow and memory is fast. So if you can read and write data only in main memory, that's like having an SDRAM-based hard drive holding your data.
Of course if the power goes out, that's lights out for your data: corrupted and gone forever in a few seconds (hence the fsync/sync calls).
So unless you're using data sets that you're sure are getting written to disk, you're not really testing anything but memory access.
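The effect is easy to demonstrate. The following Python sketch (not Bonnie++ itself; file names and sizes are arbitrary choices) times the same 8 MiB write twice: once letting the page cache absorb it, and once forcing every block to the device with fsync():

```python
import os
import tempfile
import time

def write_8mb(path, sync_each):
    """Write 8 MiB in 1 MiB blocks; optionally fsync after each block."""
    buf = b"x" * (1 << 20)  # 1 MiB block
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    t0 = time.perf_counter()
    for _ in range(8):
        os.write(fd, buf)
        if sync_each:
            os.fsync(fd)  # block until the OS says it reached the device
    os.close(fd)
    return time.perf_counter() - t0

d = tempfile.mkdtemp()
cached = write_8mb(os.path.join(d, "cached.dat"), sync_each=False)
synced = write_8mb(os.path.join(d, "synced.dat"), sync_each=True)
# On a real disk the cached run typically finishes far sooner,
# because its data never left main memory.
```

A benchmark whose working set fits in the cached case is measuring RAM, not the file system.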
-----------------
To put it another way: Bonnie++ has, as one part of its test, random data access benchmarks. The idea is that it writes out a bunch of data, then immediately begins to read it back in.
In real-world situations that sort of behavior is pretty pathological. It's fairly odd to write out data and then immediately read random bits of it back. You may do it for a database, or you may need to do it when editing video... but even then, the likelihood that the data gets flushed out to disk at some point is very high. So what you're really interested in knowing is how well the file system deals with multi-user or multi-tasking workloads, application start-up times, and that sort of thing. Most of those involve reading stale data back from the disk, and most of them involve data sets larger than your main memory.
I mean, at some point you're going to be reading data randomly from a disk. No matter what. It's why you have a disk. So you actually want to know how well that performs. Unless Bonnie++ really writes out to disk, you have no way of knowing how the system is going to perform under those circumstances.
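The write-then-immediately-read-randomly pattern can be sketched as follows (a toy illustration in Python, not Bonnie++'s actual implementation; block size and counts are made up). Because the 1 MiB file trivially fits in the page cache, every "random disk read" here is served from RAM:

```python
import os
import random
import tempfile

# Toy version of the random-access pattern: write a file, then
# immediately read 4 KiB blocks back at random offsets. Since the
# whole file still sits in the page cache, these reads never touch
# the disk -- which is the measurement pitfall in question.
BLOCK = 4096
NBLOCKS = 256  # 1 MiB file: small enough to live entirely in cache

path = os.path.join(tempfile.mkdtemp(), "bench.dat")
with open(path, "wb") as f:
    f.write(os.urandom(BLOCK * NBLOCKS))

fd = os.open(path, os.O_RDONLY)
random.seed(0)
total = 0
for _ in range(100):
    offset = random.randrange(NBLOCKS) * BLOCK
    total += len(os.pread(fd, BLOCK, offset))  # positioned read
os.close(fd)
```

To measure the disk rather than the cache, the working set would have to be larger than RAM, or the cache would have to be dropped between the write and read phases.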
--------------------------
If that is too long to read, then look at it this way:
It's very easy to tune a system's performance to favor benchmark results. But end users get their performance sacrificed as a result, because real-world usage rarely follows benchmarks, especially FS benchmarks. Real benchmarks would probably involve several weeks of testing per configuration. No fun, too expensive.
So Bonnie++ can be useful and can provide useful information, but not when all you see is a couple of graphs. That's not enough, unfortunately. Depending on what you're actually testing, the configuration can produce very different numbers.
For example, maybe your goal really is testing file system cache performance. That's a worthy goal, and it may expose bugs or other issues. So you use small data sets and try to keep the benchmark small enough that nothing gets written to disk.
But for an end user that isn't as interesting as the actual performance you'll get when reading or writing data on the drive, unless it's very bad or something.