
Google Engineers Propose "Machine Function Splitter" For Faster Performance


    Phoronix: Google Engineers Propose "Machine Function Splitter" For Faster Performance

    Google engineers have been working on the Machine Function Splitter as their means of making binaries up to a few percent faster thanks to this compiler-based approach. They are now seeking to upstream the Machine Function Splitter into LLVM...

    http://www.phoronix.com/scan.php?pag...ction-Splitter

  • #2
    I'm guessing this will be part of PGO? Otherwise how does it know what's going to be hot or cold?


    • #3
      Originally posted by FireBurn View Post
      I'm guessing this will be part of PGO? Otherwise how does it know what's going to be hot or cold?
      Yep.

      Seems to have been (originally) developed as a countermeasure to LTO + excessive inlining creating lots of rarely executed code.
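
      For reference, the PGO pipeline this hooks into looks roughly like this with Clang (sketch only; `myapp` is a placeholder program, and `-fsplit-machine-functions` is the flag from the proposed patches -- it only has an effect when profile data is present):

      ```shell
      # 1. Build with PGO instrumentation
      clang -O2 -fprofile-generate -o myapp myapp.c

      # 2. Run a representative training workload to collect a profile
      ./myapp --training-workload

      # 3. Merge the raw profiles into a usable form
      llvm-profdata merge -o myapp.profdata default_*.profraw

      # 4. Rebuild using the profile, with the machine function splitter
      #    enabled so cold blocks get moved out of the hot .text region
      clang -O2 -fprofile-use=myapp.profdata -fsplit-machine-functions -o myapp myapp.c
      ```

      Without step 2's profile, the splitter has nothing to go on -- which is exactly why it rides on top of PGO.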


      • #4
        Maybe I'm missing something here, but is it ok to add something as complex as this (with the required maintenance and whatnot) to a compiler in return for only a few percent improvement? Does no one do a cost-benefit analysis?


        • #5
          This doesn't sound all that complex to me. Aside from figuring out which code paths are hot vs. cold, moving the code around is pretty trivial, in my opinion. In fact, this perhaps doesn't need to be inside the compiler at all -- it could make sense as a run-time profiling and self-modifying-code feature.


          • #6
            Originally posted by bug77 View Post
            Maybe I'm missing something here, but is it ok to add something as complex as this (with the required maintenance and whatnot) to a compiler in return for only a few percent improvement? Does no one do a cost-benefit analysis?
            It's done all the time. And it isn't *that* complex... it's just deciding whether code is hot or not based on profiling, and hinting to the compiler which sections of code should share cache lines. Hot code gets grouped with hot code and cold code with cold... so that when a cache line is loaded or prefetched, it's more likely to contain entirely hot code.
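
            The grouping described above can be forced by hand with GCC's `cold` function attribute, which is roughly what the splitter automates using profile data instead of manual annotations (illustrative sketch, not the Machine Function Splitter itself):

            ```shell
            # demo.c: a hot main path plus a cold error path (illustrative only)
            cat > demo.c <<'EOF'
            __attribute__((cold)) void fail(void) { __builtin_trap(); }

            int work(int x) {
                if (x < 0) fail();  /* rarely taken */
                return x * 2;       /* hot path */
            }
            EOF
            gcc -O2 -c demo.c

            # GCC places the cold function in .text.unlikely, away from the
            # hot .text section, so hot cache lines stay purely hot
            objdump -h demo.o | grep -i 'text'
            ```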


            • #7
              Originally posted by bug77 View Post
              Maybe I'm missing something here, but is it ok to add something as complex as this (with the required maintenance and whatnot) to a compiler in return for only a few percent improvement? Does no one do a cost-benefit analysis?
              16-35% from the iTLB and 62-67% from the sTLB -- I don't call that a few percent. That the total benchmark "only" shows a 1.5% performance increase is down to other factors too.
              And the more threads you have, the more important the TLB becomes, and the cache itself.


              • #8
                There is no fully automatic feedback-directed compilation in GCC/LLVM yet. The -fauto-profile GCC option is semi-automatic and requires human interaction and/or additional scripts. Without full automation, the Machine Function Splitter will not achieve wide adoption.
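
                For the curious, the semi-automatic -fauto-profile flow looks something like this (sketch only; `myapp` is a placeholder, and `create_gcov` comes from the separate AutoFDO tool suite rather than GCC itself -- which is exactly the extra manual tooling being complained about):

                ```shell
                # Sample a normal, uninstrumented run using last-branch records
                perf record -b -o perf.data -- ./myapp --normal-workload

                # Convert the perf profile to GCC's format (create_gcov is an
                # external AutoFDO tool, not shipped with GCC)
                create_gcov --binary=./myapp --profile=perf.data --gcov=myapp.afdo

                # Recompile using the converted profile
                gcc -O2 -fauto-profile=myapp.afdo -o myapp myapp.c
                ```

                Every step here -- choosing a representative workload, running perf, converting the profile -- is a human-driven action that a fully automatic pipeline would have to absorb.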


                • #9
                  Originally posted by cb88 View Post
                  deciding if the code is hot or not
                  Just do it mechanical Turk style. Put code snippets on a web page that asks "hot or not?"


                  • #10
                    What about Goldilocks code? Code that's not too hot & not too cold; code that's just right.
