Autonomously Finding Performance Regressions In The Linux Kernel


  • #21
    Originally posted by droidhacker View Post
    Amazing.
    That this thing could actually be used to do something *useful*
    Raining hard, is it?

    Comment


    • #22
      Originally posted by mirza View Post
      Wait, does this mean that if I want consistent data on ext4 I must accept a 5x slower PostgreSQL and other write-intensive applications? And the proposed solution is either an obscure mount switch with the possibility of occasional file corruption, or a really _slow_ PostgreSQL? WTF?
      The possibility of data corruption on power failure is there any time write caching is enabled on the disk, regardless of the file system.

      If you have a system whose performance or data integrity you care that much about, you surely have a battery backup and a power-line-conditioning UPS hooked up to it, right? In which case you can safely disable the extra integrity protection.

      Comment


      • #23
        Originally posted by xianthax View Post
        The possibility of data corruption on power failure is there any time write caching is enabled on the disk, regardless of the file system.

        If you have a system whose performance or data integrity you care that much about, you surely have a battery backup and a power-line-conditioning UPS hooked up to it, right? In which case you can safely disable the extra integrity protection.

        Data integrity and performance sit at opposite ends of a scale. The correct "default" slides around from somewhat secure to somewhat fast. The main focus seems to be moving towards: you might lose data, but at least make sure you lose it in a consistent manner.

        Barriers and the like go a long way towards "chunking" the written data, balancing between caching and synchronous writing (a barrier guarantees that the data before the barrier is written before the data after it, rather than forcing every write to hit the disk in order).
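An application can impose the same kind of ordering point itself with fsync(); a minimal Python sketch of a write-ahead pattern, where the file name and record contents are purely illustrative:

```python
import os
import tempfile

# fsync() acts as the application-level barrier: everything written
# before it is on stable storage before anything written after it.
# The journal file and "txn" records here are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "journal.log")
with open(path, "ab") as f:
    f.write(b"BEGIN txn 42\n")
    f.flush()                 # push Python's userspace buffer to the kernel
    os.fsync(f.fileno())      # barrier point: the BEGIN record is durable now
    f.write(b"COMMIT txn 42\n")
    f.flush()
    os.fsync(f.fileno())      # the COMMIT record can only land after BEGIN

print(open(path, "rb").read().decode(), end="")  # prints both records in order
```

Whether fsync() also flushes the drive's own write cache depends on the filesystem issuing barriers/flushes underneath, which is exactly the trade-off being discussed.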

        I don't think we'll ever get to a "right" balance point, it will always be wrong in some way.

        As a user, you have two choices: either accept the default (and hence the "best judgement/best awareness" of the maintainers), or tune to your requirement (either performance at the risk of data, or integrity at the cost of speed).

        Regards,

        Matthew

        Comment


        • #24
          Did you use or recreate 'git bisect run'?

          git has the command 'git bisect run <script>' that will automate the bisect-and-test process (the script returns one code if the test passes, a different one if it fails, and a third if the test could not be run for some unrelated reason).

          I'm curious if you took advantage of this function, or if you ended up recreating it (or most of it) in your code?

          Comment


          • #25
            Originally posted by dlang View Post
            git has the command 'git bisect run <script>' that will automate the bisect-and-test process (the script returns one code if the test passes, a different one if it fails, and a third if the test could not be run for some unrelated reason).

            I'm curious if you took advantage of this function, or if you ended up recreating it (or most of it) in your code?
            No, the git bisect run portion is not being used within this module, since it doesn't suit the requirements for PTS.
            Michael Larabel
            https://www.michaellarabel.com/

            Comment


            • #26
              Originally posted by Michael View Post
              No, the git bisect run portion is not being used within this module, since it doesn't suit the requirements for PTS.
              Interesting.

              What was it lacking? (So that I can pass it on to the git developers as a possible enhancement.)

              How do you handle the case where a kernel picked by the bisect can't compile, crashes on boot, etc.? (This is one thing that git bisect run does have a mechanism to handle.)

              What do you do if new compile options appear as you bisect?

              One extreme case of this, a year or so ago, was that a bunch of compile options were moved into a submenu, with the menu needing to be selected before the other options would work. (This broke a lot of people's processes, and no good solution was ever found, that I know of.)

              Please do not take this the wrong way; I am not trying to attack you for building this feature. I am just trying to point out land mines that other people discovered doing this, so that you can fix them before they blow up on you.

              Comment


              • #27
                Bisection is CM-agnostic; hell, you don't even need CM. You just need the following:
                1. An ordered list with identifiers
                2. A way to set up a system based on an identifier
                3. Something to run that generates a quantitative result
                4. A fulcrum (my term) that you want to detect


                Assuming that the ordered list has a single transition, you can do all sorts of funky things.

                Determine optimum cluster size for a filesystem
                1. Cluster size (512,1k,2k,4k,8k)
                2. mke2fs
                3. PTS doing a benchmark of some sort
                4. A performance threshold you can't fall below


                Somewhat contrived, but if you have hard performance criteria and want to balance them against size, the above would work: just set it up, and a bisection will tell you which cluster size meets your requirement.

                Work out when a driver slowed down a 2D operation
                1. Driver releases (CAT 9.1, 9.2, 9.3)
                2. curl to download, script to install, reboot
                3. PTS doing a benchmark of some sort
                4. A known before and after value for a benchmark
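The four-ingredient recipe above can be sketched in a few lines of Python; the function and the driver-release numbers below are illustrative, not from any real tool:

```python
def bisect_transition(identifiers, setup, measure, is_good):
    """Return the first identifier on the 'bad' side of the fulcrum.

    Assumes the ordered list has a single good->bad transition, with
    identifiers[0] known-good and identifiers[-1] known-bad.
    """
    lo, hi = 0, len(identifiers) - 1   # lo stays known-good, hi known-bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        setup(identifiers[mid])        # e.g. checkout/build/install this version
        if is_good(measure()):         # run the benchmark, apply the fulcrum
            lo = mid
        else:
            hi = mid
    return identifiers[hi]

# Toy usage: hypothetical driver releases whose benchmark score drops at 9.3.
if __name__ == "__main__":
    releases = ["9.1", "9.2", "9.3", "9.4", "9.5"]
    scores = {"9.1": 100, "9.2": 98, "9.3": 60, "9.4": 59, "9.5": 58}
    current = {}
    first_bad = bisect_transition(
        releases,
        setup=lambda r: current.update(release=r),  # stand-in for install+reboot
        measure=lambda: scores[current["release"]], # stand-in for the benchmark
        is_good=lambda score: score >= 90,          # the performance fulcrum
    )
    print(first_bad)   # -> 9.3
```

The same skeleton covers the cluster-size and driver-release examples: only the setup and measure callables change.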



                Of course, defect tracking through bisection is the easiest to grok, and is usually the best value for a developer's time.
                Last edited by mtippett; 22 October 2009, 10:01 PM.

                Comment


                • #28
                  Originally posted by dlang View Post
                  Interesting.

                  What was it lacking? (So that I can pass it on to the git developers as a possible enhancement.)

                  How do you handle the case where a kernel picked by the bisect can't compile, crashes on boot, etc.? (This is one thing that git bisect run does have a mechanism to handle.)

                  What do you do if new compile options appear as you bisect?

                  One extreme case of this, a year or so ago, was that a bunch of compile options were moved into a submenu, with the menu needing to be selected before the other options would work. (This broke a lot of people's processes, and no good solution was ever found, that I know of.)

                  Please do not take this the wrong way; I am not trying to attack you for building this feature. I am just trying to point out land mines that other people discovered doing this, so that you can fix them before they blow up on you.
                  It shouldn't be git-centric. Perforce, svn, cvs, driver releases, etc. can all get some love.

                  The "setup" stage (i.e. download, build, install) can be rigged to do different things for different builds: if after this commit, set up this way; otherwise, set up that way.

                  It does bring in the danger that you end up committing the cardinal sin of modifying two variables (the setup and the commit point), but in some cases you can't avoid it.

                  Comment


                  • #29
                    Originally posted by phoronix View Post
                    Phoronix: Autonomously Finding Performance Regressions In The Linux Kernel

                    Last weekend a few Phoronix benchmarks were underway of the Linux 2.6.32-rc5 kernel when a very significant performance regression was spotted. This regression caused the PostgreSQL server to run at about 18% of the performance found in earlier kernel releases. Long story short, in tracking down this performance regression we have finally devised a way to autonomously locate performance regressions within the Linux kernel and potentially any Git-based project for that matter. Here are a few details.

                    http://www.phoronix.com/vr.php?view=14285
                    This is awesome! Can anyone just run this, or do I need to buy something?

                    Comment


                    • #30
                      Autonomous (by the way, why not "automatic"?) regression testing is super useful.
                      How about extending it to regression-test Wine with a set of Windows programs? That could at least test for crashes. You could add automatic screenshots to get some of the functionality testing done too.

                      Comment
