Quick, overall system performance suite?
I am getting a new computer soon, and of course I am drooling about overclocking and stuff ;-)
Long story short, I installed PTS on my old machine to start playing. I think the easy access to Global and the way you can make comparisons with online results is incredibly cool and useful.
What I found lacking is usability for someone who wants a quick test. Has anyone used Geekbench? It is a pleasure. Click, download, run, and you get a score for your machine. And you get it in a minute or two on a slow machine. Granted, it is lacking disk and graphics performance, but I think we could use something like this. I have some ideas, and I am planning to post in the sticky thread for this forum (PTS). But I wonder if such a quick and to-the-point suite already exists? I saw a bunch of "sys performance" suites, but they involve 100 MB or more of downloads, and many dozens of minutes of runtime. Am I missing something?
Thanks so much Michael and all for the great work!
There is currently no system-quick suite, but if you propose a suite with what tests to include and such, I would be happy to make it. Or if you run: phoronix-test-suite build-suite you can create the suite and I would be glad to push it upstream.
Great, Michael. What I had in mind was a bit more involved. I think we could aggregate results with some sort of an average, maybe a geometric mean:
score = (CPU_S * CPU_M * DISK * GRAPH_2D * GRAPH_3D)^(1/5)
What we would average is the score of each component:
- CPU_S = Single Core CPU, indirectly measures memory speed/bandwidth
- CPU_M = Multiple Core CPU, also measures memory
- DISK = hard drive read/write performance.
- GRAPH_2D = 2D Performance
- GRAPH_3D = 3D Performance
Is something like this doable in this framework?
In this mixture you get contributions from the speed of each processor per se (useful when you run single-threaded apps), multithreaded speed, and so forth. I thought memory is already included in the two CPU tests, so adding a memory test separately would give memory itself too much weight.
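A minimal sketch of the averaging idea above, in Python. The function name and the sample numbers are made up for illustration; nothing like this exists in PTS as far as this thread goes. It takes the five component scores and returns their geometric mean, computed through logarithms so the product cannot overflow.

```python
import math

def composite_score(scores):
    """Geometric mean of per-component scores (hypothetical helper).

    `scores` maps a component name (CPU_S, CPU_M, ...) to a positive number.
    """
    if not scores:
        raise ValueError("need at least one component score")
    if any(s <= 0 for s in scores.values()):
        raise ValueError("all component scores must be positive")
    # Mean of the logs, then exponentiate: same as the n-th root of the product.
    log_sum = sum(math.log(s) for s in scores.values())
    return math.exp(log_sum / len(scores))

# Made-up numbers, only to show the call.
example = {
    "CPU_S": 1.2,
    "CPU_M": 2.5,
    "DISK": 0.9,
    "GRAPH_2D": 1.1,
    "GRAPH_3D": 1.4,
}
print(round(composite_score(example), 3))
```

One property worth noting: with a geometric mean, doubling any single component multiplies the total by the same factor (2^(1/5) here), no matter which component it is or what units it started in.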
For tests measured as "lower is better", we would output as a score the inverse of this number. For instance, if the output is execution time, we would output 1 divided by that time (a "frequency").
Also, for the numbers to make sense, scores would need to be normalized to some benchmark machine. In that machine, the score is 1. Maybe a single core older machine you have hanging around ;-)
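The two rules above (invert "lower is better" results, then normalize to a reference machine) could be sketched like this in Python. The function and parameter names are hypothetical, and the reference values are assumed to come from running the same test on the benchmark machine.

```python
def normalized_score(raw, reference_raw, lower_is_better=False):
    """Score relative to a reference machine (hypothetical helper).

    raw            -- result measured on the machine under test
    reference_raw  -- result of the same test on the reference machine
    lower_is_better -- True for results such as execution time
    """
    if raw <= 0 or reference_raw <= 0:
        raise ValueError("results must be positive")
    if lower_is_better:
        # Turn a time into a "frequency" so that bigger always means faster.
        raw = 1.0 / raw
        reference_raw = 1.0 / reference_raw
    # The reference machine scores exactly 1 by construction.
    return raw / reference_raw

# A test that takes 10 s here and 20 s on the reference machine scores about 2.
print(normalized_score(10.0, 20.0, lower_is_better=True))
# A throughput test: 150 units here versus 100 on the reference scores 1.5.
print(normalized_score(150.0, 100.0))
```

Normalizing each component before averaging also takes care of units: seconds, MB/s and frames per second all become plain ratios.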
A natural byproduct would be to have a system-quick-cli with the first three contributions and a system-quick-gui with the other two.
Yes, it can be done (well, it needs to be implemented within pts-core, but I should be able to fit it into PTS 2.0). Should all of them be weighted the same then? If we can start a discussion and get others involved in this thread to provide their thoughts and feedback, it would be great so we can settle for a fair, standard composite scoring system.
If you want to start by proposing some tests, that would be good, etc. Well, for CPU_M the best multicore CPU test in my opinion is graphics-magick. For DISK, IOzone is probably the best but that takes a while to run. So perhaps one of the compression suites.
As soon as the scoring is settled, I can work on the needed support within the framework to offer this.
Thanks a lot Michael!
Yes, getting more people involved is important. Maybe you can make this thread sticky until this is settled?
Scoring: correct, we don't need in principle to change the weights. If we wanted to, the natural way is to use exponents. For instance, if we want to make the disk twice as important, we would square the disk score in the product, and then raise the whole thing to the power 1/6 instead of 1/5.
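To make the exponent idea concrete, here is a small Python sketch of a weighted geometric mean. The names are invented for the example. Each score is raised to its weight, and the product is raised to 1 over the sum of the weights, so a disk weight of 2 with the other four at 1 gives exactly the "square the disk, power 1/6" case.

```python
import math

def weighted_composite(scores, weights):
    """Weighted geometric mean (hypothetical helper).

    scores  -- component name -> positive normalized score
    weights -- component name -> positive weight (the exponent)
    """
    if any(s <= 0 for s in scores.values()):
        raise ValueError("all component scores must be positive")
    total_weight = sum(weights[name] for name in scores)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # sum(w * log(s)) / sum(w) is the log of the weighted geometric mean.
    log_sum = sum(weights[name] * math.log(s) for name, s in scores.items())
    return math.exp(log_sum / total_weight)

# Disk counted twice: exponents add up to 6, hence the power 1/6.
weights = {"CPU_S": 1, "CPU_M": 1, "DISK": 2, "GRAPH_2D": 1, "GRAPH_3D": 1}
scores = {"CPU_S": 1.0, "CPU_M": 1.0, "DISK": 64.0, "GRAPH_2D": 1.0, "GRAPH_3D": 1.0}
print(weighted_composite(scores, weights))  # (64**2) ** (1/6), about 4
```

With all weights equal to 1 this reduces to the plain geometric mean, so the unweighted score is just a special case.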
Composition: we can change the number of tests, what components are tested, etc.
The goal: something that runs in a couple of minutes on a 2 GHz single processor, and perhaps a couple more minutes for download and installation of packages (assuming a broadband connection). Again, this is flexible, but the idea is to allow people to quickly get a number. For detailed analysis we have lots of tests already, and people will keep adding good stuff.
* For CPU_S I would say Super-Pi, it is the most popular test, fast and single threaded.
* For CPU_M: I tried graphics-magick. Just downloading and installing the test took about 15 minutes on my Sempron 2400+. Way too long for this, we'll need something else.
* Disk: what happened to bonnie? Was it any good? (I never tried it and it's not in PTS 1.8). I agree about IOzone, unless we can call it with arguments to make it faster. And just one iteration, even if we lose some accuracy. People can run the whole test a few times if they want an average, but most people won't care for a quick number.
For graphics I still couldn't find good candidates. Except perhaps 2D performance: would one of the gtkperf tests be good for that? I think they measure 2D performance mostly, no?
If things start shaping up I'll keep a clean first post in this thread summarizing the progress.
Last edited by mendieta; 04-12-2009 at 11:42 AM.
One thing that would be nice is for the selected tests to work on Linux, Mac OS X, OpenSolaris, and ideally BSD too.
- Super-Pi. I am not too fond of super-pi. Additionally, the license of super-pi is not clear and it's binary-only. Maybe scimark2 or something similar? Check out the computational suite.
- bonnie disappeared due to a parsing bug I haven't gotten around to fixing. IOzone really isn't accurate though unless the tested size is greater than the system memory size, which ends up needing options or to use some very large size default. As a result, maybe a compression test might end up working better.
gtkperf is good. Or maybe qgears2.
Good point. Also, open source if at all possible. Maybe these core tests could be distributed (the sources) with the PTS, so there is no risk of some of the servers holding the tests being down or slow. Not sure about this, just a thought.
Originally Posted by Michael
I'll look at the tests you suggested and other tests, and see if other people bring ideas/insight over the next few days. I'll also clean up the original post.
Nope, won't happen. I will not begin distributing tests with PTS. However, with PTS Linux Live that is an option...
Originally Posted by mendieta
Yeah, you are right, it's much better to keep the test small and download stuff on demand.
I am looking into this.
* For 2D, maybe the Circles test in gtkperf; it seems good. PixBufs seems good and fast too. Of course these are not "real world" tests, but I doubt we can get real world tests in graphics. Qgears2 didn't run here!
* For 3D I am looking at the GL suites, because real world tests demand downloading large games. GLMark cored here. I'll keep looking.
* For disk, everything I've seen so far takes a long time, including fio. Maybe we can test disk and multicore CPU with a compile (it exercises lots of disk reads and writes). The issue is the time. Compiling Apache (the fastest compile test I've seen) takes 1 minute. Can we do just one compilation instead of 4?
* Single processor: scimark2 depends on Java. That's a biggie, and it also can have issues if your system doesn't have a JIT compiler for Java. But I think we'll find a good single-threaded test which mostly exercises CPU and memory. This way, CPU_S would be mostly 1CPU + MEM, and CPU_M would be MultiCPU + Disk. Seems reasonable. Maybe single-threaded music/video encoding would be good for this.
Last edited by mendieta; 04-12-2009 at 12:46 PM.