Linux 2.6.38 Kernel Multi-Core Scaling

  • Linux 2.6.38 Kernel Multi-Core Scaling

    Phoronix: Linux 2.6.38 Kernel Multi-Core Scaling

    Last month there were benchmarks on Phoronix looking at the multi-core scaling performance of multiple operating systems, including CentOS 5.5, Fedora 14, FreeBSD 8.1, and OpenIndiana b148. CentOS 5.5 uses the long-term Linux 2.6.18 kernel while Fedora 14 has the more recent Linux 2.6.35 kernel by default, but a number of users asked how the Linux 2.6.38 kernel would fare for multi-core scaling with the removal of the Big Kernel Lock and various other low-level improvements in this forthcoming kernel. Here are some benchmarks showing just that.

    http://www.phoronix.com/vr.php?view=15753

  • #2
    Shouldn't this be tested on something with a really big number of cores/processors to be able to see any differences? Something like 48 cores or more? 6 cores isn't all that much, even if they have HT.


    • #3
      The 48 core systems will be 4x12 cores. That's a slightly different (and expensive) scenario.

      This system represents a single package high core count - which is arguably going to be the typical mid-high end system that people will be getting for the next year or so.

      What is interesting is that Ubuntu did get a reasonable gain from 6 real cores to 12 cores (6 being HT). The PC-BSD and OpenIndiana systems would typically collapse when HT was turned on. You get the benefit of about 1-2 _real_ cores with the 6 HT "cores" being enabled.


      • #4
        Originally posted by mtippett View Post
        The 48 core systems will be 4x12 cores. That's a slightly different (and expensive) scenario.
        Yes, but this scenario is typical for small-scale Linux-based clusters which are typically used for engineering / scientific calculations thus it is of interest to many of us.
        But I do get your point and unfortunately I can't donate a system like that so the situation is unlikely to change. Maybe you could ask Tyan or Supermicro for the reasons I mentioned.


        • #5
          comments on the graphs are rare these days on phoronix.com
          nothing to tell why the big kernel lock patch and the "patch that does wonders" - as proclaimed - hardly make a change?


          • #6
            Originally posted by jakubo View Post
            comments on the graphs are rare these days on phoronix.com
            nothing to tell why the big kernel lock patch and the "patch that does wonders" - as proclaimed - hardly make a change?
            It depends on where the contention lies. For heavy CPU-only loads, removing the BKL won't immediately yield any difference. When you start getting into heavy multi-threaded IO loads, the waiting within the kernel becomes critical. This IO load can be graphics, disk or network.

            Different benchmarks will have different sensitivity.
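
To put rough numbers on that sensitivity, here is a minimal sketch using an Amdahl's-law style estimate. It is not taken from the article or the kernel: the function name `speedup` and both "locked fraction" values are made-up assumptions, chosen only to contrast a CPU-bound load with an IO-heavy one.

```python
def speedup(cores, locked_fraction):
    """Amdahl-style estimate of speedup over a single core.

    locked_fraction is the (assumed) share of the work that can only
    run one thread at a time, e.g. while waiting on a global lock.
    """
    return 1.0 / (locked_fraction + (1.0 - locked_fraction) / cores)


# Hypothetical workloads; the fractions are illustrative, not measured.
workloads = [
    ("CPU-bound, rarely in a locked kernel path", 0.001),
    ("IO-heavy, often waiting on a global lock", 0.20),
]

for label, fraction in workloads:
    for cores in (1, 6, 12):
        print(f"{label}: {cores:2d} cores -> {speedup(cores, fraction):.2f}x")
```

With a tiny locked fraction the estimate stays close to linear, so removing the lock has almost nothing left to win back. With a large one, adding cores stops helping early, which is where a change in locking would show up in a benchmark.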


            • #7
              Originally posted by HokTar View Post
              Yes, but this scenario is typical for small-scale Linux-based clusters which are typically used for engineering / scientific calculations thus it is of interest to many of us.
              But I do get your point and unfortunately I can't donate a system like that so the situation is unlikely to change. Maybe you could ask Tyan or Supermicro for the reasons I mentioned.
              I'm expecting that it will come. Although I doubt that the scalability testing will be done by the vendors, having results from those systems is fully expected.


              • #8
                IDK, The way Intel likes to never ever lower the prices on their higher-end consumer grade chips (i.e. gulftown), and with the relatively low cost of entry-level dual socket boards it might be a very logical upgrade path to grab yourself a second low-end i7 chip and go for a NUMA xeon setup. I'll take a 16-thread NUMA configuration over a $1000 12-thread single-socket gulftown.


                • #9
                  Originally posted by devius View Post
                  Shouldn't this be tested on something with a really big number of cores/processors to be able to see any differences? Something like 48 cores or more? 6 cores isn't all that much, even if they have HT.
                  The improvements should be visible even on 4 cores. Thanks for the test Michael. I wonder why there's no difference? Maybe some funny stuff is disabled or something?


                  • #10
                    Originally posted by jakubo View Post
                    comments on the graphs are rare these days on phoronix.com
                    nothing to tell why the big kernel lock patch and the "patch that does wonders" - as proclaimed - hardly make a change?
                    Weren't those patches aiming at responsiveness?


                    • #11
                      the BKL hasn't mattered in a long time, removing it was nearly purely symbolic unless you were using one of the last few holdouts. So of course it would have no effect on a benchmark. Not sure where the people who can't be bothered to do any proper research and think they know stuff got the idea that BKL removal would affect any benchmarks.

                      here's a link for anyone who really is as useless at research as anyone here.
                      http://halobates.de/blog/p/56

                      As for the 200-line wonder patch, it also has nothing to do with scalability, unless one of the tests is to watch a video while compiling a kernel in a terminal, which is the only case the patch does anything for.


                      • #12
                        Originally posted by devius View Post
                        Shouldn't this be tested on something with a really big number of cores/processors to be able to see any differences? Something like 48 cores or more? 6 cores isn't all that much, even if they have HT.
                        Yes you are right. But 48 cores is a bit low too. You can not really talk about true scalability on as few as 48 cores. You need more cores. Scalable means it scales from few cores up to several 100s.


                        • #13
                          Originally posted by kraftman View Post
                          Weren't those patches aiming at responsiveness?
                          If I understand that patch right, responsiveness is improved through process grouping. CFS would normally allocate CPU resources evenly across tasks: with, for example, 9 make instances and 1 vlc, vlc would get about 10% of the CPU while the 9 make instances would get the other 90%. With the new patch the 9 make instances are allocated CPU resources as a group, so they would get 50% between them and vlc would also get 50% (if it needs that much, of course). For that to work you need cgroups enabled in the kernel. The patch isn't supposed to deliver more performance, but to spread CPU resources evenly and prevent demanding processes from starving the rest.
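
The arithmetic in that example can be sketched in a few lines. This is only a toy model, not the scheduler's actual code: the function names and the group labels "build" and "player" are hypothetical, and it assumes every task is always runnable and equally weighted.

```python
def flat_shares(tasks):
    """Every runnable task gets an equal slice of the CPU (no grouping)."""
    return {name: 1.0 / len(tasks) for name, _group in tasks}


def grouped_shares(tasks):
    """Split the CPU evenly between groups, then evenly inside each group."""
    groups = {}
    for name, group in tasks:
        groups.setdefault(group, []).append(name)
    per_group = 1.0 / len(groups)
    return {
        name: per_group / len(members)
        for members in groups.values()
        for name in members
    }


# 9 make instances in one (hypothetical) group, vlc alone in another.
tasks = [(f"make-{i}", "build") for i in range(9)] + [("vlc", "player")]

flat = flat_shares(tasks)
grouped = grouped_shares(tasks)

print(f"flat:    vlc gets {flat['vlc']:.0%}")
print(f"grouped: vlc gets {grouped['vlc']:.0%}")
```

The total amount of CPU handed out is the same in both cases; only the split changes, which fits the point that the patch is about fairness and responsiveness rather than throughput.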


                          • #14
                            Originally posted by mtippett View Post
                            I'm expecting that it will come. Although I doubt that the scalability testing will be done by the vendors, having results from those systems is fully expected.
                            Actually I meant to ask them for the hardware so you could run the tests but my phrasing was dubious. Sorry about that.


                            • #15
                              Originally posted by airlied View Post
                              the BKL hasn't mattered in a long time, removing it was nearly purely symbolic unless you were using one of the last few holdouts. So of course it would have no effect on a benchmark. Not sure where the people who can't be bothered to do any proper research and think they know stuff got the idea that BKL removal would affect any benchmarks.
                              Errrr.... http://en.wikipedia.org/wiki/Giant_lock

                              Kernel lock: the kernel locks all threads except one, so only one thread runs at a time. Removing this kernel lock means that threads still need to lock, but thread management is no longer totally serial. Now if you have a single core, then no matter what you hack together, only one process runs at a time anyway.

                              Now onto multiple cores; multiple threads at once.

                              Seems like a very simple conclusion to me?

                              Now if that's not the case then Linux really sucks balls at scaling...
