
Linux 2.6.24 Through Linux 2.6.33 Benchmarks


  • Linux 2.6.24 Through Linux 2.6.33 Benchmarks

    Phoronix: Linux 2.6.24 Through Linux 2.6.33 Benchmarks

    At Phoronix we have been benchmarking the Linux kernel on a daily basis using Phoromatic Tracker, a sub-component of Phoromatic and the Phoronix Test Suite. We launched our first system in the Linux kernel testing farm just prior to the Linux 2.6.33 kernel development cycle and found a number of notable regressions during the past three months. Now with the Linux 2.6.34 kernel development cycle getting into full swing, we have added two more systems to our daily kernel benchmarking farm. One of the systems is an Atom Z520 setup, but what makes it more interesting is that it is running a Btrfs file-system; the second new system added to the kernel tracker is a 64-bit setup. However, to provide a historical look at Linux kernel performance, we have run some fresh benchmarks going back to the Linux 2.6.24 kernel and ending with the recently released Linux 2.6.33 kernel.

    http://www.phoronix.com/vr.php?view=14633

  • #2
    I can't believe such big regressions, like the Apache one, were left untouched and unresolved across so many releases. I don't understand either why the PostgreSQL performance regression made it through the RC phase... Either you have just proven to the world that the kernel development model is flawed, or there is a flaw in the tests you ran. In either case, some analysis and explanation would have been nice alongside the results.


    • #3
      Originally posted by Xheyther:
      I can't believe such big regressions, like the Apache one, were left untouched and unresolved across so many releases. I don't understand either why the PostgreSQL performance regression made it through the RC phase... Either you have just proven to the world that the kernel development model is flawed, or there is a flaw in the tests you ran. In either case, some analysis and explanation would have been nice alongside the results.
      Usually those tests are meaningless. If there's an intentional change in the kernel, it's described here as a regression. For example, some file system may have a different mode set as default in a newer kernel, and this can have a big impact on some benchmarks - PostgreSQL etc. The Apache benchmark is meaningless because it's not done properly.
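
      If I recall correctly, ext3's default data journaling mode did change somewhere around 2.6.30, so a fair comparison would need to pin the mount options explicitly rather than inherit each kernel's defaults. A rough Python sketch of what I mean - the device and mount point are just placeholders:

      import subprocess

      # Fix the options across every kernel under test, so a change in the
      # kernel's ext3 defaults (e.g. data=ordered vs. data=writeback) cannot
      # skew the comparison. /dev/sdb1 and /mnt/bench are placeholders.
      MOUNT_OPTS = "data=ordered,barrier=1,noatime"

      def mount_test_partition(device="/dev/sdb1", mountpoint="/mnt/bench"):
          # -o pins the options instead of taking the kernel's defaults
          subprocess.run(
              ["mount", "-t", "ext3", "-o", MOUNT_OPTS, device, mountpoint],
              check=True,
          )

      if __name__ == "__main__":
          mount_test_partition()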


      • #4
        Originally posted by kraftman:
        The Apache benchmark is meaningless because it's not done properly.
        http://lkml.indiana.edu/hypermail/li...9.1/02631.html

        Be careful not to run ab on the same machine as you run apache, otherwise the numerous apache processes can limit ab's throughput. This is the same reason why I educate people not to run a single-process proxy in front of a multi-process/multi-thread web server. Apparently it's not obvious to everyone.
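
        In other words, ab should be driven from a separate client box so it isn't competing with the apache workers for CPU time. A rough Python wrapper to illustrate the idea - the hostname and request counts are made-up placeholders, not values from the article:

        import re
        import subprocess

        def run_ab(url="http://bench-server/", requests=100000, concurrency=100):
            # Run ab against the *remote* server; -n is the total number of
            # requests and -c the concurrency level.
            out = subprocess.run(
                ["ab", "-n", str(requests), "-c", str(concurrency), url],
                capture_output=True, text=True, check=True,
            ).stdout
            # ab reports a line like:
            # "Requests per second:    1234.56 [#/sec] (mean)"
            m = re.search(r"Requests per second:\s+([\d.]+)", out)
            return float(m.group(1)) if m else None

        if __name__ == "__main__":
            print("requests/sec:", run_ab())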


        • #5
          I like this article a lot! Testing from 2.6.24 through 2.6.33 - when I first read about Phoromatic Tracker, I was wondering if that could even be done. Phoromatic Tracker is a superb tool! Well done.

          I always thought that there would be performance regressions from 2.6.24 to 2.6.33, because the kernel keeps adding more APIs, graphics code, and graphics focus... but this article shows me that I was wrong.


          • #6
            I was very excited about PTS at the beginning.
            Now it produces meaningless numbers, and conclusions based on the PTS results could be wrong.
            Like with the HD test: its results lead to a very wrong assumption. A $20 CPU cannot play an HD movie with a high bit rate.
            I don't find that testing file systems with different settings produces anything useful.
            You can use some tests to evaluate some components, but I can't see the point of posting these 2.6.24-2.6.33 tests; they are misleading at best.


            • #7
              Originally posted by Jimbo:
              I always thought that there would be performance regressions from 2.6.24 to 2.6.33, because the kernel keeps adding more APIs, graphics code, and graphics focus... but this article shows me that I was wrong.
              This article mainly shows that tests are often meaningless.


              • #8
                Real-world feedback

                It would be nice to hear heavy users/developers of PostgreSQL chime in on these numbers. There must be businesses out there for which an order-of-magnitude drop in performance would mean disaster.

                Or is this just another case of Phoronix relying on the "default" Ext3 settings for each kernel, and not bothering to compare the same settings across kernels, to generate FUD and page hits?


                • #9
                  I'm interested in the specific commit that made PostgreSQL run so much faster and then slower again. I also miss the feature you described a few weeks ago, where you can see the deviation of every single run by moving the mouse over the graph.

                  All in all interesting, I think - thank you.
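
                  Pinning down that commit would mean bisecting kernels and re-measuring on each step. A rough sketch of the per-kernel measurement using pgbench - the database name and parameters are placeholders, and the scratch database is assumed to already be initialized with pgbench -i:

                  import re
                  import subprocess

                  def pgbench_tps(db="benchdb", clients=8, seconds=60):
                      # Run pgbench on the currently booted kernel, e.g. once
                      # per bisect step, and pull out the transaction rate.
                      out = subprocess.run(
                          ["pgbench", "-c", str(clients), "-T", str(seconds), db],
                          capture_output=True, text=True, check=True,
                      ).stdout
                      # pgbench prints e.g. "tps = 1234.56 (including ...)"
                      m = re.search(r"tps = ([\d.]+)", out)
                      return float(m.group(1))

                  if __name__ == "__main__":
                      kernel = open("/proc/version").read().split()[2]
                      print("tps on", kernel, "=", pgbench_tps())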


                  • #10
                    Skip to the next response in that thread:

                    I turned on apache, and played with ab a bit, and yup, ab is a hog, so any fairness hurts it badly. Ergo, running ab on the same box as apache suffers with CFS when NEW_FAIR_SLEEPERS is turned on. Issuing ab bandwidth to match its 1:N pig nature brings throughput right back.


                    http://lkml.indiana.edu/hypermail/li...9.1/02861.html

                    Remember that you can't test everything, and testing the obvious path will usually result in flat lines, since it represents the 95% path.

                    As indicated above, what has been identified is that in some scenarios CFS completely tanks. ab is just a tool to make this visible.

                    As usual, if there is any benchmark which you believe provides a suitable equivalent scenario but is more "correct", please tell us.
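
                    If anyone wants to reproduce this, on kernels of that era (~2.6.32) the scheduler feature can be flipped at runtime through debugfs. A hypothetical helper, assuming debugfs is mounted and you are root:

                    # Toggle NEW_FAIR_SLEEPERS to see whether the ab throughput
                    # collapse tracks that scheduler feature.
                    FEATURES = "/sys/kernel/debug/sched_features"

                    def set_new_fair_sleepers(enabled: bool) -> None:
                        # Writing the feature name enables it; writing the
                        # NO_-prefixed name disables it.
                        flag = "NEW_FAIR_SLEEPERS" if enabled else "NO_NEW_FAIR_SLEEPERS"
                        with open(FEATURES, "w") as f:
                            f.write(flag)

                    if __name__ == "__main__":
                        set_new_fair_sleepers(False)  # disable, rerun ab, compare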


                    • #11
                      Originally posted by kraftman:
                      Usually those tests are meaningless. If there's an intentional change in the kernel, it's described here as a regression. For example, some file system may have a different mode set as default in a newer kernel, and this can have a big impact on some benchmarks - PostgreSQL etc. The Apache benchmark is meaningless because it's not done properly.
                      A regression is an unexpected change in behavior. If the kernel developers make a change in one area and are not expecting the behavior to change in other areas, then those areas have regressed.

                      Remember that file system tuning (turning features on and off) is a specialist skill. Most people are very wary of making changes they believe they are uninformed about when they may be risking their data. The maintainers of the file systems, and the distros that package them, are the ones that control the default behavior.

                      I'd like you to expand on your "not done properly" if you could.


                      • #12
                        Originally posted by Xheyther:
                        I can't believe such big regressions, like the Apache one, were left untouched and unresolved across so many releases. I don't understand either why the PostgreSQL performance regression made it through the RC phase... Either you have just proven to the world that the kernel development model is flawed, or there is a flaw in the tests you ran. In either case, some analysis and explanation would have been nice alongside the results.
                        It depends on what people are watching for. As mentioned in this thread, all that has been shown is that the ab benchmark as currently set up is extremely sensitive (mind you, in this case in a good way) to the changes going on in the kernel.

                        This doesn't show that anything is flawed. In any system, not all metrics go up monotonically. You make improvements in one area that degrade another. You just want the "average" experience to be on an upward trend.

                        Remember that Linux covers everything from embedded systems through to big-iron servers. Being a generalist is *really* hard to do.


                        • #13
                          Thanks for these tests!

                          I think it is extremely valuable to have these tests in public.

                          It may cause hiccups in some camps, but that is what they're good for: stopping hiccups!


                          • #14
                            Content

                             I appreciate the great job Phoronix does on reporting news in the Linux community, but I find that the benchmarking articles could be much better. I don't need someone to show me a graph and then list the statistics in the text below the graph. The graph shows the statistics already. These articles fail to draw any real conclusions about the results. Rather than saying "these numbers went down, these numbers went up, and these numbers stayed the same," Phoronix should look into *why* changes occur. I'm not saying that you have to research every regression you find, but at least put a little effort into finding a couple of really interesting development notes to provide some solid information along with the figures.


                            • #15
                               Originally posted by maccam94:
                               I appreciate the great job Phoronix does on reporting news in the Linux community, but I find that the benchmarking articles could be much better. I don't need someone to show me a graph and then list the statistics in the text below the graph. The graph shows the statistics already. These articles fail to draw any real conclusions about the results. Rather than saying "these numbers went down, these numbers went up, and these numbers stayed the same," Phoronix should look into *why* changes occur. I'm not saying that you have to research every regression you find, but at least put a little effort into finding a couple of really interesting development notes to provide some solid information along with the figures.
                               Agreed. If it takes too much time, perhaps someone else out there could chip in: you make the graphs and raise some questions, and someone else, maybe someone who works on these software projects, can explain.
