Another Major Linux Power Regression Spotted

  • Another Major Linux Power Regression Spotted

    Phoronix: Another Major Linux Power Regression Spotted

    Since Friday there have been a number of Phoronix articles about a very bad power regression in the mainline Linux kernel. The regression is widespread (Ubuntu 11.04 is one of the affected distributions) and has been deemed a bug of high importance. This yet-to-be-resolved issue affects the Linux 2.6.38 and 2.6.39 kernels and, for many desktop and notebook systems, is causing a 10~30% increase in power consumption. Nevertheless, this is not the only major outstanding power regression in the mainline tree: another dramatic regression has now been spotted that is also yet to be fixed.

    http://www.phoronix.com/vr.php?view=15943

  • #2
    Any word on AMD processors?

    And would that be a Dunkles Weißbier?

    • #3
      Jeez, the difference between 2.6.28 and 2.6.38 could easily mean an extra hour or two of battery life. We are indeed going backwards.
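
To put rough numbers on the "hour or two" estimate, here is a minimal back-of-the-envelope sketch. The 60 Wh battery and 10 W baseline draw are purely illustrative assumptions, not measurements from the article; only the 10~30% range comes from the Phoronix summary above. The function names are hypothetical.

```python
def battery_hours(capacity_wh: float, draw_w: float) -> float:
    """Idealized runtime in hours: battery capacity divided by average draw."""
    if capacity_wh <= 0 or draw_w <= 0:
        raise ValueError("capacity and draw must be positive")
    return capacity_wh / draw_w


def hours_lost(capacity_wh: float, baseline_draw_w: float, increase_fraction: float) -> float:
    """Runtime lost when average draw rises by increase_fraction (0.30 means +30%)."""
    before = battery_hours(capacity_wh, baseline_draw_w)
    after = battery_hours(capacity_wh, baseline_draw_w * (1.0 + increase_fraction))
    return before - after


if __name__ == "__main__":
    # Assumed laptop: 60 Wh battery, 10 W average draw before the regression.
    for increase in (0.10, 0.30):
        lost = hours_lost(60.0, 10.0, increase)
        print(f"+{increase:.0%} power draw costs about {lost:.2f} h of runtime")
```

Under this idealized model, the lost runtime is the original runtime multiplied by x / (1 + x) for a fractional increase x, so the longer a machine used to last on battery, the more hours the same percentage regression takes away.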

      • #4
        Originally posted by FunkyRider
        Jeez, the difference between 2.6.28 and 2.6.38 could easily mean an extra hour or two of battery life. We are indeed going backwards.
        But performance, flexibility and features are going up at the same time. Hmmm.

        Sounds like we need to option-out the stuff that sucks power the most, and maybe over time we'll have a linux-lowpower package in Ubuntu.

        Normally, using more power, to the extent that it's correlated with increased performance, is a "good thing" for servers and other computers running on A/C power. This usually means that you are seeing increased utilization of your hardware. A very hot chip is a busy chip, and if you get corresponding performance increases, that just confirms that you're getting what you paid for.

        But I don't think we need to have any kind of stand-off between those who want performance (energy be damned) and those who want to get 8+ hours of battery life on a ThinkPad X-Series. Instead, we should just isolate those particular things that are most pivotal in determining the power consumption, and then: if they are performance-enhancing things, we should keep them (generally speaking), but provide an option to disable them, ideally at runtime, for power savings. But if it turns out that we're surrendering energy consumption without any performance benefit, that's bad for everyone, so that needs to be fixed, of course.

        Simple non-kernel example: if you're running a composited desktop, it's going to use way more energy than a non-composited desktop, because not only will the display output be awake all the time, but the GPU will always be awake, processing the frames of the composited pipeline. On most laptops with a discrete GPU, keeping this beast awake constantly is a huge drain on battery life. With a pure 2D environment, you use less energy because the GPU (if you even have one) can go to sleep since it has no OpenGL contexts open against it. You just rasterize off the CPU and go to sleep. And since we usually expect much less (in terms of rendering complexity) from a rasterized scene, we significantly save on the number of computations overall, so of course we'll use less power!

        But does that mean we should get rid of composited desktops? No -- because they provide much-desired features and eye candy, and (sometimes) increased performance on certain drivers/workloads. They do consume more power, and the computations that go into compositing are more complex than rasterizing a simple desktop, but we think the cost is worth it. Or at least, we think the cost is worth providing the user with the option of running with or without a compositor, rather than just going one way.

        Recognizing the pivot points in the kernel that most dramatically affect power usage will help the kernel more closely resemble the desktop in terms of allowing users to trade off performance/features and power. We just have to make sure that, at each pivot, we're actually getting some kind of benefit from it -- otherwise it's a bug and should be fixed, no option needed.

        • #5
          Normally, using more power, to the extent that it's correlated with increased performance, is a "good thing" for servers and other computers running on A/C power...
          Most probably this is not the case here; we are dealing with a regression that affects power consumption. If power consumption has increased, it is because your computer's resources are being used more often, unnecessarily, so there are fewer resources left for user applications.

          My opinion here is that correctly solving those regressions will give a "little" more performance to user applications.

          • #6
            Are we absolutely sure this is a kernel bug, and not a piece of user space that's not talking to the kernel properly, like udev or the like?

            I compile my own minimal kernels (about 2MB) with everything I don't use switched off and everything I do use compiled in (rather than compiled as a module).

            I'm tempted to see if I'm as affected by this bug on my laptop as everyone else.

            • #7
              Is it possible to search for this regression at home with pts?
              Can't we just upload the tests to openbenchmarking.org and see which distributions and kernel configurations are affected?

              • #8
                These facts show us that the Phoronix Test Suite is a very useful thing and that it's vital for good development. I think this tool should be developed further and maintained because it is in everybody's interest.

                • #9
                  Originally posted by FireBurn
                  Are we absolutely sure this is a kernel bug, and not a piece of user space that's not talking to the kernel properly, like udev or the like?

                  I compile my own minimal kernels (about 2MB) with everything I don't use switched off and everything I do use compiled in (rather than compiled as a module).

                  I'm tempted to see if I'm as affected by this bug on my laptop as everyone else.
                  Since it was reproduced on the 8.04 userspace, which is now 3 years old, the answer is no. And if it's a piece of userspace that "isn't talking to the kernel properly", then that points to a kernel bug, since the new kernel broke the old userspace code.

                  • #10
                    Originally posted by DeiF
                    Is it possible to search for this regression at home with pts?
                    Can't we just upload the tests to openbenchmarking.org and see which distributions and kernel configurations are affected?
                    Normally, yes, but for this power testing there's lots of uncommitted code as of right now. I don't think I'll have the improvements merged in the public repository this week, unfortunately, since I have to leave on Thursday and have lots of work to still take care of prior to that.
                    Michael Larabel
                    http://www.michaellarabel.com/

                    • #11
                      Originally posted by amphigory
                      And would that be a Dunkles Weißbier?
                      No, just some Hacker-Pschorr. I was low on beer and busy with this testing, so I just walked to a store a block away to pick up some Hacker, whereas the beer I normally drink is about six kilometers away. So I was just being efficient and drinking slightly-less-good-but-still-amazing Munich beer.
                      Michael Larabel
                      http://www.michaellarabel.com/

                      • #12
                        Originally posted by allquixotic
                        Normally, using more power, to the extent that it's correlated with increased performance, is a "good thing" for servers and other computers running on A/C power.
                        That might be true on workstations, but it is certainly not the case for rack-dense servers, where THE problem is maximizing performance per watt, both because of energy cost and because of heat dissipation. Perhaps I'm missing something, but how is this bug not driving the server people nuts? 15% greater power consumption at idle would be a massive problem in a server farm, let alone cloud server containers.

                        • #13
                          Originally posted by mthome
                          That might be true on workstations, but it is certainly not the case for rack-dense servers, where THE problem is maximizing performance per watt, both because of energy cost and because of heat dissipation. Perhaps I'm missing something, but how is this bug not driving the server people nuts? 15% greater power consumption at idle would be a massive problem in a server farm, let alone cloud server containers.
                          Because most server people aren't running a beta OS, let alone a bleeding-edge kernel, on production hardware. You're seeing why.

                          • #14
                            Kernel list/bugzilla anyone?

                            This bug has been open on the Ubuntu bugzilla (Launchpad) for two years

                            https://bugs.launchpad.net/ubuntu/+s...ux/+bug/524281

                            The only way to get it fixed is for someone to attract the kernel developers' attention.

                            These test results should be supplemented with some powertop runs, and the results reported. I am beyond frustrated over this matter. It reflects very badly on Linux.

                            • #15
                              Originally posted by locovaca
                              Because most server people aren't running a beta OS, let alone a bleeding-edge kernel, on production hardware. You're seeing why.
                              Hmm - well, while Ubuntu 10.10 isn't LTS, it isn't beta either, and kernel 2.6.35 is supposed to be production quality (with longterm support). In any case, my point is that I'm surprised that this is still being portrayed as an issue that mainly applies to laptop users when it is precluding anything more recent than 2.6.34 from running on the server side - the traditional linux stronghold.
