A Fresh Look At The PGO Performance With GCC 8


  • A Fresh Look At The PGO Performance With GCC 8

    Phoronix: A Fresh Look At The PGO Performance With GCC 8

    It's been a while since we last ran GCC PGO benchmarks. Profile-guided optimization (PGO), also known as feedback-directed optimization, uses profiling data collected at run time to improve the performance of recompiled binaries. Here are some fresh benchmarks of the impact of GCC PGO on a Xeon Scalable server using the newly released GCC 8.2 release candidate.

    http://www.phoronix.com/vr.php?view=26589

  • #2
    I guess it's useful if you need to squeeze the last few percent of performance out of your program, as in HFT. Feeding it realistic workloads is the important part.


    • #3
      Nice benchmark Michael, a couple of things:

      You don't need to use the '-fprofile-dir=foo/' parameter, you can just do '-fprofile-generate=foo/' and likewise '-fprofile-use=foo/'
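A rough sketch of the two-build flow this implies, written as a small Python helper that only assembles the command lines and does not run GCC. The file names `app.c` and `app`, the `foo/` directory, and the helper name `pgo_commands` are all hypothetical placeholders.

```python
import shlex


def pgo_commands(source, output, profile_dir, opt="-O3"):
    """Return (instrumented_build, optimized_build) as argv lists.

    source, output and profile_dir are hypothetical placeholders;
    substitute your own project's values.
    """
    base = ["gcc", opt, source, "-o", output]
    # Build 1: instrumented binary. Running it writes profile data
    # under profile_dir.
    generate = base + [f"-fprofile-generate={profile_dir}"]
    # Build 2: recompile, reading the profile back from the same directory.
    use = base + [f"-fprofile-use={profile_dir}"]
    return generate, use


if __name__ == "__main__":
    generate, use = pgo_commands("app.c", "app", "foo/")
    print(shlex.join(generate))
    print("# run the instrumented binary on a representative workload here")
    print(shlex.join(use))
```

The same directory is passed to both builds, which is why a separate '-fprofile-dir' is not needed in this setup.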

      In the 'm-queens v1.1' benchmark you listed the following options: -fopenmp -O3 -march=native -O2

      The -O2 at the end will override -O3 at the start, which is probably not what you wanted.
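GCC uses the last -O option on the command line when several are given. A minimal sketch of that rule, assuming a plain list of flags; the helper name `effective_opt_flag` is made up for illustration and is not part of any tool:

```python
import re

# Matches -O, -O0..-O3, -Os, -Og and -Ofast.
_OPT_FLAG = re.compile(r"^-O(\d+|s|g|fast)?$")


def effective_opt_flag(flags):
    """Return the last -O option in flags, or None if there is none."""
    effective = None
    for flag in flags:
        if _OPT_FLAG.match(flag):
            effective = flag  # a later -O option replaces an earlier one
    return effective


if __name__ == "__main__":
    listed = "-fopenmp -O3 -march=native -O2".split()
    print(effective_opt_flag(listed))
```

Dropping the trailing -O2, or moving -O3 after it, would leave -O3 as the effective level.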

      Overall, the amount of benefit from PGO depends on how well the compiler manages to guess the correct optimization strategies without profile data. One piece of software I've compiled that has benefited greatly is Blender (CPU rendering, naturally), where I've seen a ~15-20% improvement, so it was interesting to see that the largest benefit in these tests was in the C-Ray renderer, at ~17%.


      • #4
        PGO is nice to have; I do PHP 7 PGO compiles for that extra bit of performance.

        Michael, did you compile GCC 8.2 from the http://www.netgull.com/gcc/snapshots/8.2.0-RC-20180719/ snapshot? I just tried it, and the gcc version is still reported as 8.1.1.


        • #5
          Try -O2 or -Os with PGO. It can do much of the same as -O3 without blowing up the binary size.


          • #6
            Originally posted by eva2000
            PGO is nice to have; I do PHP 7 PGO compiles for that extra bit of performance.

            Michael, did you compile GCC 8.2 from the http://www.netgull.com/gcc/snapshots/8.2.0-RC-20180719/ snapshot? I just tried it, and the gcc version is still reported as 8.1.1.
            Yes, that snapshot reports 8.1.1.
            Michael Larabel
            http://www.michaellarabel.com/


            • #7
              Originally posted by carewolf
              Try -O2 or -Os with PGO. It can do much of the same as -O3 without blowing up the binary size.
              I disagree. The binary size differences between -Os, -O2, and -O3 aren't large enough to matter on the vast majority of code unless you are on extremely constrained hardware. Also, one of the optimizations that PGO enables is loop unrolling (-funroll-loops), which has a very large impact on binary size.

              Besides that, if you go to the trouble of using PGO, you are most likely looking for the best possible performance, which, with very few exceptions, is something you get from -O3 / -Ofast.


              • #8
                Originally posted by Grinch

                I disagree. The binary size differences between -Os, -O2, and -O3 aren't large enough to matter on the vast majority of code unless you are on extremely constrained hardware. Also, one of the optimizations that PGO enables is loop unrolling (-funroll-loops), which has a very large impact on binary size.

                Besides that, if you go to the trouble of using PGO, you are most likely looking for the best possible performance, which, with very few exceptions, is something you get from -O3 / -Ofast.
                The point is that one of PGO's tricks is enabling many of the -O3 optimizations on demand, so it will use them in all hot code, as if you had specified -O3 even if you didn't.


                • #9
                  Originally posted by Michael

                  Yes, that snapshot reports 8.1.1.
                  Cheers, so it isn't my eyes playing tricks on me.


                  • #10
                    Originally posted by carewolf

                    The point is that one of PGO's tricks is enabling many of the -O3 optimizations on demand, so it will use them in all hot code, as if you had specified -O3 even if you didn't.
                    Hmmm... PGO does not enable any of the -O3 optimizations, which in turn are '-finline-functions', '-fweb', '-frename-registers'.

                    PGO enables '-fbranch-probabilities', '-fvpt', '-funroll-loops', '-fpeel-loops', '-ftracer'.

                    If you are right, then that is very interesting, but I've seen no such information. Are you sure?
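One way to settle this on a given GCC install, rather than from memory, is to compare the reports printed by `gcc -O2 -Q --help=optimizers` and `gcc -O2 -fprofile-use -Q --help=optimizers`. The sketch below only diffs two such reports that have already been captured as text. It assumes lines of the form `-fflag [enabled]` or `-fflag [disabled]`, and that the report reflects flags implied by -fprofile-use; the exact layout can differ between GCC versions, and the function names are hypothetical.

```python
import re

# Assumed report line shape: "  -funroll-loops    [enabled]"
_REPORT_LINE = re.compile(r"^\s*(-f[\w=-]+)\s+\[(enabled|disabled)\]\s*$")


def enabled_flags(report):
    """Collect the flags marked [enabled] in one report."""
    flags = set()
    for line in report.splitlines():
        match = _REPORT_LINE.match(line)
        if match and match.group(2) == "enabled":
            flags.add(match.group(1))
    return flags


def newly_enabled(baseline_report, pgo_report):
    """Flags enabled in the PGO report but not in the baseline report."""
    return sorted(enabled_flags(pgo_report) - enabled_flags(baseline_report))
```

Feeding it the two captured reports lists whatever -fprofile-use switched on beyond the chosen -O level for that compiler version.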
