Quick, overall system performance suite?

  • #51
    Yeah, there are a few tests which I have found that are relatively quick and have worked on the few systems I am working with. I'm not sure how well they test the GPU though... Renderbench, gtkperf, and jxrendermark. Have you ever looked into any of these?

    edit: these tests have relatively small download sizes as well.

    They all seem to be 2D tests, but I'm not sure how well they represent performance.



    • #52
      Originally posted by channon View Post
      Yeah, there are a few tests which I have found that are relatively quick and have worked on the few systems I am working with. I'm not sure how well they test the GPU though... Renderbench, gtkperf, and jxrendermark. Have you ever looked into any of these?

      edit: these tests have relatively small download sizes as well.

      They all seem to be 2D tests, but I'm not sure how well they represent performance.
      Exactly, these are 2D. AFAICT, qgears exercises OpenGL, so it gives an idea of the 3D capabilities. Glmark2 is another simple OpenGL test, but it takes a long time (why does that head need to turn so many times? One turn would be more than enough).

      Overall, I continue to think that if this idea ever works, it's best to have a quickbench-cli and a quickbench-gui. For the former, something like quickbench could give a balanced view: there is a single-threaded test (scimark2), a multithreaded one (7-zip), and a non-trivial disk stress test that seems to capture real-world speed.
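
      For illustration, such a combined run could be a single command like the one below; the pts/ profile names are my guesses at suitable OpenBenchmarking profiles, not a settled quickbench definition.
      Code:
      phoronix-test-suite benchmark pts/scimark2 pts/compress-7zip pts/fio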

      For the GUI, perhaps some of the tests in gtkperf or similar could cover the 2D part; finding a good 3D test would be more challenging. I am not sure qgears2 is a good test, even in a synthetic sense.

      Cheers!



      • #53
        Hi all, I'm new to this thread so perhaps you've already thought of this & rejected it for some reason, but why not make this as close as possible to the Windows Experience Index reported by Microsoft's Windows System Assessment Tool?
        It gives a user-friendly score for each major area & a final overall score. It's also pretty well documented (see links from http://en.wikipedia.org/wiki/Windows...ssessment_Tool or just search for WSAT), though the exact details of the tests are probably protected. However, as long as the PTS version tested the same aspects (e.g. using the new Unvanquished test for 3D), it could then use a weight to produce a figure comparable to what WSAT would give.
        Although the normal user's view of WEI is a pretty GUI summary, it actually has a command line that allows you to run individual tests, which would make tuning the PTS version to match it a lot easier.
        Microsoft's WEI was introduced in Vista & seems to have been abandoned in Windows 8. It was never back-ported to XP, and I've never seen an advert for a PC mention its WEI score. But I'm sure they did a lot of research into what weights to put on each aspect, and it does do exactly what this thread aims for, so why not aim for comparability? You could call this the PTS-EI.
        As a side note, if Microsoft have abandoned it, they might be willing to disclose more detail.



        • #54
          Hi all, I'm new to this thread so you may have already thought of this and rejected it for some reason, but why not make this comparable to the Windows Experience Index score returned by the Windows System Assessment Tool? http://en.wikipedia.org/wiki/Windows...ssessment_Tool

          Its aims were exactly what we're looking for - a test of each sub-system with an overall score that's quick to understand and compare.
          I'm sure Microsoft put a lot of research into the weights required to yield a simple number that reasonably reflects the actual "feel" or "user experience" of diversely different machines. It's also pretty well documented (Google WSAT) and includes a command-line tool that lets you run individual tests, which would help in calibrating a PTS version to match its scores in each sub-system.

          E.g. for the 3D sub-system, use the new Unvanquished tests, then apply a calibrated weighting so it produces the same score that WEI does.
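
          To make that calibration concrete (numbers purely hypothetical): if a reference machine scores 5.9 in WEI's graphics sub-system and the Unvanquished test yields 43.5 fps on the same machine, the weight is just the ratio of the two:
          Code:
          # weight = WEI sub-score / PTS raw result on a reference machine
          echo "5.9 43.5" | awk '{ printf "weight = %.4f\n", $1 / $2 }'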

          Microsoft introduced WEI with Vista but seems to have dropped it from Windows 8, and it was never ported to XP. But if they have abandoned it, then they may be willing to open up more details.

          How does the Phoronix Test Suite - User Experience Index sound?

          P.S. Sorry if I double posted, but the first try didn't seem to work & I had to re-type it all - Grr!!!



          • #55
            Thanks for the suggestion. I actually don't think you can trust Microsoft to set a standard and cooperate with others, when their whole business model relies on breaking compatibility to force you to upgrade and on refusing to cooperate. Actually, WEI has already been dropped. It was never a good idea, because it had a hardcoded range, so a supercomputer would get the same score as a decent desktop computer. Which is insane.

            I think we are trying to get something more like Geekbench, but based on a geometric mean (to remove scaling issues) of real-world tests, and also including non-trivial disk and graphics components. For instance, I just upgraded my computer, and the Geekbench score got 4 times higher. My quickbench tests for CPU are similar (about 4 times faster for single-threaded, and 4.5 times faster for multithreaded). But I can also look at the disk speed-up: besides a faster CPU, faster RAM, and moving from a SATA II to a SATA III controller with a SATA III SSD, the disk scored 2.5 times higher.
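
            As a quick illustration of how the geometric mean combines scores, using the example speed-ups above (4.0, 4.5, and 2.5):
            Code:
            # Geometric mean of normalized scores, so the overall index
            # is not dominated by any single test's scale. Prints 3.56.
            echo "4.0 4.5 2.5" | awk '{ p = 1; for (i = 1; i <= NF; i++) p *= $i; printf "overall: %.2f\n", p^(1/NF) }'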

            Thanks, again.



            • #56
              PCGA pass / fail

              A long time ago, the PC Gaming Alliance was formed to unify the PC industry in defense against consoles, such as by providing the consumer with some form of assurance that any given PC was up to the job of playing any given game, i.e. a benchmark much like WEI.

              Unfortunately, they hide behind NDAs and to date have never published any recommended hardware specs nor any compliance test.

              I wonder if any Phoronix readers might also be PCGA members. In which case, would it be possible to create a PTS profile that shows PCGA pass/fail compliance without breaching the terms of the NDA?

              I suspect the whole PCGA may become moot once Steam boxes start showing up, as Valve seem pretty keen on keeping things fairly open, so their "Good / Better / Best" grading scheme should be easy to define in a PTS profile.



              • #57
                Not to resurrect a dead thread, but...

                I work at a small school district and have access to a very large range of machines, from Pentium 3s to dual Xeon workstations. This test is exactly what I am looking for: a quick and standardized way for myself and my students to assess the quality of donations, older hardware, and new low-power options under consideration.

                How can I help, Mendieta?

                Side note: The
                Code:
                phoronix-test-suite benchmark mendieta-4549-6954-342
                gave me an invalid argument, but
                Code:
                phoronix-test-suite benchmark 1306113-MEND-QUICKBE80
                after
                Code:
                aptitude install libxrender-dev
                is looking good so far!



                • #58
                  Originally posted by Tijok View Post
                  I work at a small school district and have access to a very large range of machines, from Pentium 3s to dual Xeon workstations. This test is exactly what I am looking for: a quick and standardized way for myself and my students to assess the quality of donations, older hardware, and new low-power options under consideration.

                  How can I help, Mendieta?

                  Side note: The
                  Code:
                  phoronix-test-suite benchmark mendieta-4549-6954-342
                  gave me an invalid argument, but
                  Code:
                  phoronix-test-suite benchmark 1306113-MEND-QUICKBE80
                  after
                  Code:
                  aptitude install libxrender-dev
                  is looking good so far!
                  The former format was for Phoronix Global, which was deprecated several years ago and is not supported by modern versions of PTS; it has been replaced by http://OpenBenchmarking.org since Phoronix Test Suite 3.0.

                  Is there anything else you're looking for out of the testing experience, or any other needs?
                  Michael Larabel
                  https://www.michaellarabel.com/



                  • #59
                    Originally posted by Tijok View Post
                    I work at a small school district and have access to a very large range of machines, from Pentium 3s to dual Xeon workstations. This test is exactly what I am looking for: a quick and standardized way for myself and my students to assess the quality of donations, older hardware, and new low-power options under consideration.

                    How can I help, Mendieta?
                    Hi Tijok

                    Sorry for the delay, crazy week. I'd love to help you with the testing at school. I don't believe this thread ever succeeded in creating a quick and generic performance suite for CPU/disk/graphics, but I believe we dug deep enough to let you achieve your goals.

                    Could you give a bit more detail?
                    • Do you care about all three major components (disk/graphics/CPU), or just some of them?
                    • Do you install a standard Linux distribution on each machine? Or do you plan to test with a Linux Live USB image, etc?
                    • If you care about graphics, do you care about both 2D and 3D?


                    This is what we can do:
                    • I can create a test for you, up on OpenBenchmarking, and help you run against it.
                    • You create an OpenBenchmarking account for the tests, and select a password that your students would reuse with the same account.
                    • Each time you have a new machine, someone goes into OpenBenchmarking, locates the latest test in the stream of tests, and runs against it, using the account information for the test.


                    The results, if you use the Analyze tab on OpenBenchmarking.org, would look like this (note that all these tests look very similar because I am using the same machine with different software/BIOS settings):

                    [Embedded link to OpenBenchmarking.org results]


                    You would get disk speed, single-threaded CPU performance, multi-threaded performance and graphics speed, with the average at the end, all normalized to 1. (I can run it initially on a low-end laptop and you take over from there.)
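
                    For reference, running against an existing result is a single command; the ID below is the one you used earlier in the thread, and each new machine would just use whatever the latest result ID is:
                    Code:
                    phoronix-test-suite benchmark 1306113-MEND-QUICKBE80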

                    Does that sound like a plan? Cheers!



                    • #60
                      Great thread!

                      I propose creating four consecutive standardized test sets which build upon each other, meaning that test set 2 should include all tests from test set 1 and so on:
                      1. Quick screening: Quick and representative test of all essential subsystems: Processor (single-threaded and multi-threaded), Disk, Memory and Network.
                      2. Small screening: Should be able to run from a live medium on systems with a limited amount of RAM, which means the requirements are a small download size, a small on-disk test size and a not-too-long runtime
                      3. Extensive screening: Adds useful tests where download and on-disk test size do not matter
                      4. Long screening: Adds a broad testing range where test runtime does not matter
                      5. (Graphics: Maybe some graphics-related tests, but I'll leave that to the experts out there.)

                      Goals for these test sets should be:
                      1. The tests should be representative for each subsystem
                      2. They should be as timeless as possible in order to be comparable to past and future systems (which rules out compilation tests, as compiling Linux 2.6 vs. Linux 4.11 would produce very different results)
                      3. They should be quite popular among the community, e.g. on OpenBenchmarking.org, so it is easier to compare to other systems
                      4. They should be a healthy mix of theoretical and real-world-usage benchmarks
                      5. The tests should be as self-contained as possible, minimizing dependencies on other packages upon installation.

                      Question is, which tests are most representative for all four (five) test sets?

                      Some ideas, based on this thread and the list of the most downloaded benchmarks (test options in parentheses):

                      Quick screening:
                      Processor (single-threaded):
                      • SciMark (Computational Test: Test All Options): 16 minutes

                      Processor (multi-threaded):
                      • C-Ray: 6 minutes
                      • 7-Zip Compression: 4 minutes
                      • Gzip Compression: 4 minutes

                      Disk:
                      • Flexible IO Tester (Type: Test All Options; IO Engine: POSIX AIO; Buffered: Yes; Direct: No; Block Size: 512 KB): 15 minutes

                      Memory:
                      • RAMspeed SMP (Type: Test All Options; Benchmark: Integer): 26 minutes

                      Network:
                      • Loopback TCP Network Performance: 3 minutes



                      Estimated total runtime: 1:14 hours
                      Approx. Download size: 5 MB
                      Approx. installed size: 22 MB
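
                      For what it's worth, the Quick screening set above could be kicked off in a single pass with something like the command below; the pts/ profile names are my best guesses for the tests listed and worth double-checking on OpenBenchmarking.org.
                      Code:
                      phoronix-test-suite benchmark pts/scimark2 pts/c-ray pts/compress-7zip pts/compress-gzip pts/fio pts/ramspeed pts/network-loopback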



                      Small screening:
                      Processor (single-threaded):

                      Processor (multi-threaded):
                      • FFTE: 4 minutes
                      • BLAKE2: 1 minute
                      • CacheBench (Test: Test All Options): 20 minutes
                      • Gcrypt Library: 4 minutes
                      • GnuPG: 2 minutes
                      • GraphicsMagick (Operation: Test All Options): 16 minutes
                      • John The Ripper (Test: Test All Options): 9 minutes
                      • ebizzy: 2 minutes
                      • FLAC Audio Encoding: 2 minutes
                      • Himeno Benchmark: 3 minutes

                      Disk:
                      • PostMark: 17 minutes
                      • SQLite (Test Target: Default Test Directory): 24 minutes

                      Memory:
                      • Stream (Type: Test All Options): 41 minutes

                      Network:


                      Estimated total runtime: 3:39 hours
                      Approx. Download size: 27 MB
                      Approx. installed size: 101 MB



                      Extensive screening:
                      Processor (single-threaded):

                      Processor (multi-threaded):
                      • LAME MP3 Encoding: 2 minutes
                      • OpenSSL: 2 minutes
                      • x264: 3 minutes

                      Disk:
                      • BlogBench (Test: Test All Options): 1:04 hours

                      Memory:

                      Network:

                      System:
                      • Apache Benchmark: 5 minutes
                      • NGINX Benchmark: 4 minutes
                      • PostgreSQL pgbench: 1:35 hours


                      Estimated total runtime: 6:34 hours
                      Approx. Download size: 469 MB
                      Approx. installed size: 1950 MB



                      Long screening:
                      Processor (single-threaded):

                      Processor (multi-threaded):
                      • BYTE Unix Benchmark (Computational Test: Test All Options): 4:15 hours
                      • NAS Parallel Benchmarks (Test / Class: Test All Options): 39 minutes
                      • Primesieve: 15 minutes

                      Disk:
                      • FS-Mark (Test: Test All Options): 1:03 hours
                      • Iozone (Record Size: Test All Options; File Size: 512MB; Disk Test: Test All Options): 2:13 hours
                      • Dbench (Client Count: 6): 37 minutes

                      Memory:

                      Network:

                      System:
                      • Hierarchical INTegration (Test: Test All Options): 51 minutes


                      Estimated total runtime: 16:27 hours
                      Approx. Download size: 493 MB
                      Approx. installed size: 2010 MB
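
                      Once the sets settle, each tier could be saved as a local suite so a single command runs it; the suite name below is made up for the example.
                      Code:
                      # Interactively bundle the chosen test profiles into a local suite:
                      phoronix-test-suite build-suite
                      # Then run the saved suite by whatever name you gave it:
                      phoronix-test-suite benchmark quick-screening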



                      What do you think? Which tests would you add and which tests would you remove? Are the various test options sane? Are the tests representative?

