The Past 12 Linux Kernels Benchmarked

  • #1

    Phoronix: The Past 12 Linux Kernels Benchmarked

    Taking a break from our graphics excitement last week with the release of AMD's 8.42.3 Display Driver, we have finished our largest (and most time-consuming) Linux performance comparison to date. We have taken the last 12 major kernel releases, from Linux 2.6.12 to Linux 2.6.23, built them from source, and set out on a benchmarking escapade. This testing also includes the Linux 2.6.24-rc1 kernel. From these benchmarks you can see how Linux kernel performance has matured over the past two and a half years.

    http://www.phoronix.com/vr.php?view=11317

  • #2
    The test system must be running without udev/HAL then, since the backend has changed several times, most recently with 2.6.20. Did you use a static kernel?

  • #3
    Most benchmarks vary a bit from run to run under the same conditions. For example, if I run hdparm -t /dev/sda three times in a row I get:
    Timing buffered disk reads: 190 MB in 3.00 seconds = 63.30 MB/sec
    Timing buffered disk reads: 190 MB in 3.02 seconds = 62.82 MB/sec
    Timing buffered disk reads: 194 MB in 3.02 seconds = 64.29 MB/sec

    That is a bigger variation than I might get with different kernels.

    If you could indicate the normal variation on the graphs, that would make them a lot more useful.

    A rigorous method would be to do many runs and show the standard deviation, but even the min and max from 3 or 4 runs would be better than nothing.
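
    A minimal sketch of such a measurement in Python (hypothetical: it assumes hdparm's "= NN.NN MB/sec" output format, root privileges, and /dev/sda as an example device):

    import re, statistics, subprocess

    # Run hdparm a few times and summarize the spread of the throughput figures.
    rates = []
    for _ in range(4):
        out = subprocess.run(["hdparm", "-t", "/dev/sda"],
                             capture_output=True, text=True).stdout
        rates.append(float(re.search(r"=\s*([\d.]+)\s*MB/sec", out).group(1)))

    print("min/max: %.2f / %.2f MB/sec" % (min(rates), max(rates)))
    print("mean:    %.2f MB/sec" % statistics.mean(rates))
    print("stdev:   %.2f MB/sec" % statistics.stdev(rates))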

    thanks

  • #4
    The thing I mean is that you are benchmarking wrong: some subsystems do not work with older kernels when the rest of the environment is newer, and the same applies the other way around when the environment is too old for a newer kernel.

  • #5
    Conclusions

    Good article, but what it lacks are conclusions. You've done a huge job, and the part reviewing the kernels' features is awesome, but... why did you choose these particular tests? The kernel is a huge system. Which subsystems did you want to benchmark? From your tests I can draw one conclusion -- despite the addition of a bunch of features, the kernel doesn't run any slower than previous versions.

  • #6
    I like this benchmark, but I would like it even more if you could compare against Windows XP and Windows Vista on the same hardware with roughly the same benchmarks. That would give a real idea of whether Linux is actually faster than Windows.

    Most of those tests are also possible in Windows.

  • #7
    Damn, isn't this frustrating? Compiling, benchmarking, and evaluating for a few days just to find out that nothing remarkable has happened. I was quite disappointed. Of course a bunch of new features were added, but no speed bumps, neither positive nor negative.

  • #8
    Most of the tests were OS-agnostic (ramspeed, lame), and those that did test kernel aspects (gunzip? hdparm) only exercised the I/O subsystem.
    As long as you run a single process that isn't really system-call-intensive and/or thread/process-intensive, the kernel isn't really playing a role in the game - producing (close-to) identical results across the board.

    On the other hand, if you run simultaneous benchmarks (read: encoding an MP3 file while running UT2K4, running hdparm while running gunzip, etc.) - small changes in the kernel's scheduler, I/O layers and/or driver APIs should show up as variation in the benchmark scores.
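
    If it helps, a rough Python sketch of running two such workloads concurrently (the commands and file names here are placeholders, not the article's actual tests):

    import subprocess, time

    # Launch a CPU-bound encode and an I/O-bound disk test at the same time,
    # then time how long the pair takes to finish.
    workloads = [
        ["lame", "input.wav", "output.mp3"],  # CPU-bound
        ["hdparm", "-t", "/dev/sda"],         # I/O-bound
    ]

    start = time.time()
    procs = [subprocess.Popen(cmd) for cmd in workloads]  # start all at once
    for p in procs:
        p.wait()
    print("combined wall time: %.2f s" % (time.time() - start))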

    - Gilboa
    DEV: Intel S2600C0, 2xE52658V2, 32GB, 4x2TB + 2x3TB, GTX780, F21/x86_64, Dell U2711.
    SRV: Intel S5520SC, 2xX5680, 36GB, 4x2TB, GTX550, F21/x86_64, Dell U2412.
    BACK: Tyan Tempest i5400XT, 2xE5335, 8GB, 3x1.5TB, 9800GTX, F21/x86-64.
    LAP: ASUS N56VJ, i7-3630QM, 16GB, 1TB, 635M, F21/x86_64.

  • #9
    I agree. Although it's promising that the benchmarks don't show a steady upward trend that would imply kernel bloat, they also don't stress the kernel's scheduler and I/O subsystem.

    I am also curious about the configuration of each test kernel. How did you choose to configure the kernels? Did you use defconfig, or did you hand-tune them? I ask because much of the new support in these kernels is turned off by default and needs to be enabled, which might also explain the flat curves.

  • #10
    Another interesting set of tests would be to benchmark the virtualization subsystems in the kernel to see how they have improved over time. For example, create a VM and run it on various host kernels to measure host performance, then lock down a host kernel and run various versions of the test kernel inside the VM. Some of the newer features in the kernel (e.g. tickless timers and all the virtualization work) would hopefully flex their muscles in those tests.

  • #11
    I hate to sound ungrateful; so far no one has done these sorts of benchmarks, and they're useful. However, it's possible to greatly improve how useful these numbers are with a little extra care. First, how many runs did you do per test? Was it 1, 10, 100? If it is stated in the article I missed it, but the average of 100 runs carries a lot more weight than just one. Ideally, an error bar, standard deviation, or confidence interval would also be included: at least some measure of how sure you are that the result is representative of the measurement. This is especially important in cases where the benchmarks are very close, since it lets us test whether the scores are actually statistically different.

    Also, on a more pedantic level, the number of significant figures in the graphs should reflect the precision of the score. The general rule of thumb is that everything but the last digit should be fixed on repeated measurement (or within the standard deviation). A score of 28.1293, for example, means that every run yields 28.129x. Ssam's data above, for example, would only warrant a score of 63 MB/s, even though the actual average of the three readings is 63.47 MB/s, since the units digit varied between measurements and the standard deviation is roughly ~0.75 MB/s.

    KCalc (the standard KDE calculator app) and a wide variety of Linux graphing packages can compute this automatically.
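
    A quick Python check of that claim, using Ssam's three readings from comment #3:

    import statistics

    rates = [63.30, 62.82, 64.29]  # hdparm readings, MB/sec
    print("mean  = %.2f MB/sec" % statistics.mean(rates))   # 63.47
    print("stdev = %.2f MB/sec" % statistics.stdev(rates))  # ~0.75 (sample stdev)
    # The units digit already varies between runs, so quoting more than
    # two significant figures (~63 MB/sec) would overstate the precision.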

  • #12
    Originally posted by Tillin9:
    I hate to sound ungrateful; so far no one has done these sorts of benchmarks, and they're useful. However, it's possible to greatly improve how useful these numbers are with a little extra care. First, how many runs did you do per test? Was it 1, 10, 100?

    It's the average of four runs for each test.
    Michael Larabel
    http://www.michaellarabel.com/

  • #13
    At least we know Linux is not getting slower, like Windows.
