Google Engineer Reworks Direct I/O In Linux Kernel


  • Google Engineer Reworks Direct I/O In Linux Kernel

    Phoronix: Google Engineer Reworks Direct I/O In Linux Kernel

    A Google engineer working on Linux, Kent Overstreet, has reworked the Linux DIO (Direct I/O) code so that it's "vastly simpler" while also being faster for at least some test cases...

    http://www.phoronix.com/vr.php?view=MTMwMTI

  • #2
    Practical application of this patch set? Like is Direct IO a specific subsystem used by specific programs or is it more like the entire IO subsystem? What kind of programs will this help?


    • #3
      Well I googled Direct I/O and I came up with this. http://unixfoo.blogspot.com/2008/01/...direct-io.html

      Basically it's a way to access files while telling the operating system not to buffer the file's data in RAM (i.e., reads and writes bypass the kernel's page cache, and the program is expected to do its own caching).

      I'm gonna guess that very few applications use this method. (According to the link, databases use this).

      Sooo...
      If you are running a database... this is good news. Free speed boost.
      If you aren't.... at least the kernel is 600 lines smaller in size.
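For anyone curious what "not buffered by the OS" looks like from a program's side, here is a minimal sketch in Python, not taken from the patch set. It assumes Linux, where `os.O_DIRECT` exists, and assumes a 4096-byte alignment (`BLOCK` is a guess; real code should query the device's logical block size). The helper names `read_direct` and `_read_into` are made up for illustration. Direct I/O generally wants the buffer address, the length, and the file offset aligned, so the buffer comes from an anonymous `mmap`, which is page-aligned. Some filesystems reject `O_DIRECT`, so the sketch falls back to a normal buffered read in that case.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment; the real requirement depends on the device and filesystem


def _read_into(path, flags, buf):
    """Open `path` with `flags` and read from offset 0 into `buf`; return the byte count."""
    fd = os.open(path, flags)
    try:
        return os.readv(fd, [buf])
    finally:
        os.close(fd)


def read_direct(path, length=BLOCK):
    """Read up to `length` bytes from the start of `path`, bypassing the page cache if possible."""
    if length <= 0 or length % BLOCK:
        raise ValueError("length must be a positive multiple of BLOCK")
    # O_DIRECT is Linux-specific; on platforms without it this degrades to a plain read.
    direct_flags = os.O_RDONLY | getattr(os, "O_DIRECT", 0)
    buf = mmap.mmap(-1, length)  # anonymous mapping: page-aligned, as direct I/O expects
    try:
        try:
            n = _read_into(path, direct_flags, buf)
        except OSError:
            # Some filesystems refuse O_DIRECT; retry as ordinary buffered I/O.
            n = _read_into(path, os.O_RDONLY, buf)
        return buf[:n]
    finally:
        buf.close()
```

Usage would be something like `data = read_direct("/path/to/datafile", 8192)`. The fallback is a convenience for the sketch; a database would more likely treat a refused `O_DIRECT` as a configuration error.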


      • #4
        Originally posted by Ericg View Post
        Practical application of this patch set? Like is Direct IO a specific subsystem used by specific programs or is it more like the entire IO subsystem? What kind of programs will this help?
        This is for databases. A database and a file system are pretty much the same thing, so putting a database inside a file is redundant. When you set up your database system, you don't format the data drives, you just hand them directly to the database and it manipulates them directly. Direct IO means the database has faster access to the hardware. Many commercial databases operate in this manner.


        • #5
          Originally posted by ua=42 View Post
          at least the kernel is 600 lines smaller in size.
          I stopped worrying about things like this at about the same time I stopped using a boot floppy to start my Linux box. I've got at least 8 GB of RAM in all of my computers, and I stopped thinking about code size years ago. In practical settings it has no relevance. The number of bugs in code is not strongly related to the number of lines of code. The number of bugs in code is much more strongly related to the age of the code and how well it has been tested. Five lines of code can contain 10 bugs while a hundred lines of code that does the same thing can be bug-free.


          • #6
            But the more code there is, the higher the barrier to entry for potential new programmers. Also, the longer it takes to compile. Also, more work and pollution (RAM sticks) is not always the right answer.
            So, well done, Kent.


            • #7
              Originally posted by frantaylor View Post
              I stopped worrying about things like this at about the same time I stopped using a boot floppy to start my Linux box. I've got at least 8 GB of RAM in all of my computers, and I stopped thinking about code size years ago. In practical settings it has no relevance. The number of bugs in code is not strongly related to the number of lines of code. The number of bugs in code is much more strongly related to the age of the code and how well it has been tested. Five lines of code can contain 10 bugs while a hundred lines of code that does the same thing can be bug-free.
              Easy maintenance is another advantage, but what I think ua=42 meant is that it's always good news, even if it doesn't represent an improvement for your specific use.


              • #8
                I would imagine that Google engineers are mainly targeting kernel code that is useful to their own servers or Android... No?


                • #9
                  Originally posted by Otamay View Post
                  Easy maintenance is another advantage, but what I think ua=42 meant is that it's always good news, even if it doesn't represent an improvement for your specific use.
                  I love ripping out dead code but I hate regressions even more. Change is great but it's gotta come with a full regression test suite.


                  • #10
                    Originally posted by frantaylor View Post
                    The number of bugs in code is not strongly related to the number of lines of code. The number of bugs in code is much more strongly related to the age of the code and how well it has been tested. Five lines of code can contain 10 bugs while a hundred lines of code that does the same thing can be bug-free.
                    While that's true, reducing line count is still a big factor, since in general it reduces complexity and makes the code easier to understand. I'm not advocating Perl-one-liner-style coding, but most large codebases can usually stand to lose 10% or more: you run analysis tools over them and find chunks of dead code not called anywhere, utility functions that have been copy-pasted around, that sort of stuff. And that's not counting the just plain bad stuff, where someone's used a hundred lines to achieve what could easily have been done in fewer than half that.

                    And yes, a lot of that does correspond to age, particularly if the codebase is old enough to predate good tooling. Parts of the Java codebase I work on date to the late '90s, when Java was new: no good IDEs then, nor unit testing libraries, code analysis tools, etc. And it shows. Even today, it's not uncommon that we add a bunch of new functionality and the total line count actually goes down from all the cleanup we did in the process. Fixing old warnings, taking out dead code, extracting helper functions for heavily duplicated operations... I reckon I delete a few thousand lines a month on average.
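To give a feel for the "dead code not called anywhere" kind of check, here is a rough sketch using Python's standard `ast` module. It is only a single-file heuristic and not how any particular analysis tool works: it collects every function name defined in a piece of source and reports the ones that are never referenced by name or as an attribute in that same source. The function name `unused_functions` is invented for this example, and real tools do cross-module analysis and handle dynamic calls, which this does not.

```python
import ast


def unused_functions(source):
    """Return sorted names of functions defined in `source` but never referenced in it."""
    tree = ast.parse(source)
    defined = set()
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Name):
            used.add(node.id)       # plain references such as helper()
        elif isinstance(node, ast.Attribute):
            used.add(node.attr)     # attribute references such as obj.helper()
    return sorted(defined - used)
```

Anything it reports is only a candidate for deletion: entry points, callbacks registered by string name, and functions used from other files will all show up as false positives.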
