Google Engineer Reworks Direct I/O In Linux Kernel


  • phoronix
    started a topic Google Engineer Reworks Direct I/O In Linux Kernel

    Phoronix: Google Engineer Reworks Direct I/O In Linux Kernel

    A Google engineer working on Linux, Kent Overstreet, has reworked the Linux DIO (Direct I/O) code so that it's "vastly simpler" while also being faster for at least some test cases...

    http://www.phoronix.com/vr.php?view=MTMwMTI

  • Delgarde
    replied
    Originally posted by frantaylor
    The number of bugs in code is not strongly related to the number of lines of code. The number of bugs in code is much more strongly related to the age of the code and how well it has been tested. Five lines of code can contain 10 bugs while a hundred lines of code that does the same thing can be bug-free.
    While that's true, reducing line count is still a big factor, since in general it reduces complexity and makes the code easier to understand. I'm not advocating Perl-one-liner-style coding, but most large codebases can usually stand to lose 10% or more: run analysis tools over them and you find chunks of dead code not called anywhere, utility functions that have been copy-pasted around, that sort of stuff. And that's not counting the just plain bad stuff, where someone's used a hundred lines to achieve what could easily have been done in less than half that.

    And yes, a lot of that does correspond to age, particularly if the codebase is old enough to predate good tooling. Parts of the Java codebase I work on date to the late '90s, when Java was new: no good IDEs then, nor unit testing libraries, code analysis tools, etc. And it shows. Even today, it's not uncommon that we add a bunch of new functionality and the total line count actually goes down because of all the cleanup we did in the process. Fixing old warnings, taking out dead code, extracting helper functions for heavily duplicated operations... I reckon I delete a few thousand lines a month on average.

  • frantaylor
    replied
    Originally posted by Otamay
    Easy maintenance is another advantage, but what I think ua=42 meant to say is that it is always good news, even if it doesn't represent an improvement for your specific use.
    I love ripping out dead code but I hate regressions even more. Change is great but it's gotta come with a full regression test suite.

  • bobwya
    replied
    I would imagine that Google engineers are mainly targeting kernel code that is useful to their own servers or Android... No??

  • Otamay
    replied
    Originally posted by frantaylor
    I stopped worrying about things like this at about the same time I stopped using a boot floppy to start my Linux box. I've got at least 8 GB of RAM in all of my computers; I stopped thinking about code size years ago. In practical settings it has no relevance. The number of bugs in code is not strongly related to the number of lines of code. The number of bugs in code is much more strongly related to the age of the code and how well it has been tested. Five lines of code can contain 10 bugs while a hundred lines of code that does the same thing can be bug-free.
    Easy maintenance is another advantage, but what I think ua=42 meant to say is that it is always good news, even if it doesn't represent an improvement for your specific use.

  • stqn
    replied
    But the more code there is, the higher the barrier to entry for potential new programmers. Also, the longer it takes to compile. Also, more work and pollution (RAM sticks) is not always the right answer.
    So, well done Kent.

  • frantaylor
    replied
    Originally posted by ua=42
    at least the kernel is 600 lines smaller in size.
    I stopped worrying about things like this at about the same time I stopped using a boot floppy to start my Linux box. I've got at least 8 GB of RAM in all of my computers; I stopped thinking about code size years ago. In practical settings it has no relevance. The number of bugs in code is not strongly related to the number of lines of code. The number of bugs in code is much more strongly related to the age of the code and how well it has been tested. Five lines of code can contain 10 bugs while a hundred lines of code that does the same thing can be bug-free.

  • frantaylor
    replied
    Originally posted by Ericg
    Practical application of this patch set? Like is Direct IO a specific subsystem used by specific programs or is it more like the entire IO subsystem? What kind of programs will this help?
    This is for databases. A database and a file system are pretty much the same thing, so putting a database inside a file is redundant. When you set up your database system, you don't format the data drives, you just hand them directly to the database and it manipulates them directly. Direct IO means the database has faster access to the hardware. Many commercial databases operate in this manner.

  • ua=42
    replied
    Well I googled Direct I/O and I came up with this. http://unixfoo.blogspot.com/2008/01/...direct-io.html

    Basically it's a method to access files while telling the operating system not to buffer the file in RAM (i.e., the program will keep its own copy of the file in memory and won't be reloading it).

    I'm gonna guess that very few applications use this method. (According to the link, databases use this).
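For anyone wondering what "not buffering the file in RAM" looks like from the application side, here is a minimal sketch, assuming Linux and Python 3. It opens a file with the `O_DIRECT` flag and reads one block into a page-aligned buffer. The names `BLOCK`, `_read_block` and `read_first_block` are made up for illustration, the 4096-byte block size is an assumption (real code should query the device), and the fallback to a normal buffered read is there because some filesystems reject `O_DIRECT`. This is not taken from the patch set, which changes the kernel side.

```python
import mmap
import os

# Assumed alignment and transfer size; O_DIRECT generally wants buffer,
# offset and length aligned to the device's logical block size.
BLOCK = 4096


def _read_block(path, flags):
    """Open path with the given flags and read up to BLOCK bytes from offset 0."""
    fd = os.open(path, flags)
    try:
        # An anonymous mmap is page-aligned, which satisfies the
        # buffer-alignment requirement of O_DIRECT on typical systems.
        buf = mmap.mmap(-1, BLOCK)
        try:
            n = os.readv(fd, [buf])
            return buf[:n]  # copy the bytes out before releasing the buffer
        finally:
            buf.close()
    finally:
        os.close(fd)


def read_first_block(path):
    """Return (data, used_direct) for the first block of path.

    Tries O_DIRECT first so the read bypasses the kernel page cache, and
    falls back to an ordinary buffered read if the platform or filesystem
    does not support it.
    """
    o_direct = getattr(os, "O_DIRECT", 0)  # not defined on every platform
    if o_direct:
        try:
            return _read_block(path, os.O_RDONLY | o_direct), True
        except OSError:
            pass  # e.g. EINVAL from a filesystem without direct I/O support
    return _read_block(path, os.O_RDONLY), False
```

Calling `read_first_block("/path/to/datafile")` gives back the bytes plus a flag saying whether the direct path was actually used. A database would do the same thing at a much larger scale and manage its own cache on top, which is why it does not want the kernel caching the same data a second time.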

    Sooo...
    If you are running a database... this is good news. Free speed boost.
    If you aren't.... at least the kernel is 600 lines smaller in size.

  • Ericg
    replied
    Originally posted by phoronix
    Phoronix: Google Engineer Reworks Direct I/O In Linux Kernel

    A Google engineer working on Linux, Kent Overstreet, has reworked the Linux DIO (Direct I/O) code so that it's "vastly simpler" while also being faster for at least some test cases...

    http://www.phoronix.com/vr.php?view=MTMwMTI
    Practical application of this patch set? Like is Direct IO a specific subsystem used by specific programs or is it more like the entire IO subsystem? What kind of programs will this help?
