BOLT Merged Into LLVM To Optimize Binaries For Faster Performance


  • #21
    In general, you should read up on what BOLT is. It does not instantly improve the performance of the compiler or of any binary. BOLT itself has been merged into LLVM and can be built along with it through LLVM's CMake configuration, which gives you the functionality BOLT provides. Nothing more.

    If you want to improve the performance of the compiler, a binary, or the kernel, many more steps and much more time are required ...



    • #22
      Originally posted by monkeynut
      Does this improve performance if used on applications?
      It will depend on the app. If the app has well-defined and heavily-optimized hotspots, such as A/V encoding/decoding, then you shouldn't expect any significant wins.

      For other things, such as a compiler or anything of significant size & complexity, with broadly-distributed time, I'd expect so.



      • #23
        Originally posted by Jannik2099

        I haven't worked with it, but the first profile should theoretically be sufficient for both - BOLT cares about what functions are commonly used together, and that stays invariant under PGO.

        Though it's not like any distro does widespread PGO to begin with, since you need a profile after all.

        In general, if you have regressions with PGO/BOLT, then your profile was probably misleading. Deviations would require individual analysis - usually a function that should not have been inlined but was.
        PGO affects inlining, which in turn affects the set of hot functions in the program. Also, BOLT measures more than just function hotness, so I do not think you can use a profile from an instrumented binary on a PGO-optimized binary.
        Also, with LTO+PGO the compiler should be able to do all the transforms BOLT does.



        • #24
          Originally posted by dr_wix
          > BOLT works with gcc-built binaries too, and it works on every compiled program - it's a second compilation pass much like PGO

          > it's a feedback driven optimization pass. You build the binary, collect profiling data, then build it again. Thus it doubles.

          Not sure that is correct: with BOLT you don't need to recompile your code.
          Oh yeah, was kinda sleeping on that sorry - no recompilation needed of course
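
For readers following this exchange, here is a minimal sketch of the "profile, then rewrite the existing binary" flow being described, with no recompilation step. It only builds the command lines and does not run them. The binary name `./app` and the file names `perf.data` and `perf.fdata` are hypothetical, and the flags are written as recalled from the BOLT documentation, so check them against your installed version. BOLT's documentation also recommends linking the original binary with `--emit-relocs` if you want function reordering; that is assumed here rather than shown.

```python
import shlex


def bolt_pipeline(binary, perf_data="perf.data", profile="perf.fdata"):
    """Return the three BOLT workflow steps as argv lists; nothing is executed.

    All file names are placeholders chosen for illustration.
    """
    # 1. Sample the already-built binary while it runs a typical workload
    #    (branch sampling via -j requires LBR-capable hardware).
    record = ["perf", "record", "-e", "cycles:u", "-j", "any,u",
              "-o", perf_data, "--", binary]
    # 2. Convert the perf samples into BOLT's profile format.
    convert = ["perf2bolt", "-p", perf_data, "-o", profile, binary]
    # 3. Rewrite the existing binary using the profile; the source is not rebuilt.
    optimize = ["llvm-bolt", binary, "-o", binary + ".bolt",
                f"-data={profile}",
                "-reorder-blocks=ext-tsp",
                "-reorder-functions=hfsort",
                "-split-functions",
                "-dyno-stats"]
    return [record, convert, optimize]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before being run by hand.
    for argv in bolt_pipeline("./app"):
        print(shlex.join(argv))
```

The point of the sketch is the shape of the flow: the only input to the last step is the binary that already exists plus a profile, which is why no second compile is involved.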



          • #25
            Originally posted by hubicka

            PGO affects inlining, which in turn affects the set of hot functions in the program. Also, BOLT measures more than just function hotness, so I do not think you can use a profile from an instrumented binary on a PGO-optimized binary.
            Also, with LTO+PGO the compiler should be able to do all the transforms BOLT does.
            LLVM discards invalid parts of a profile, so the profile would still work, it just would not be 100% accurate.

            BOLT also does some post link time reordering & refusing of sections - in theory it should all be perfect after LTO+PGO, but evidently it is not. Though I couldn't explain why that is.



            • #26
              I wonder if it can be used on PIE binaries that you don't have the source code for? That might let you goose a few percent more UPS out of Factorio.
