Squeezing More Juice Out Of Gentoo With Graphite, LTO Optimizations

  • #11
    Originally posted by darkbasic View Post
    Some packages do not benefit from -O3 optimizations, on the contrary they will perform worse. It will be needed to filter them out from being further optimized.
    Anyway I love the idea of a centralized repository with all the LTO/Graphite overrides for known broken packages.
    Packages that are known to cause problems when built with -O3 have the flag filtered out by the ebuild for that package. Gentoo devs have spent many years dealing with build failures, so they started long ago filtering out flags that don't work for specific ebuilds. Most of the time, even if you have -O3 set in make.conf, the ebuilds themselves will override it.
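
    For reference, ebuilds do this through flag-o-matic.eclass; a minimal sketch (the package is hypothetical, but filter-flags and replace-flags are the actual eclass functions):

    ```shell
    # Hypothetical fragment of an ebuild for a package known to
    # miscompile at -O3: strip the flag before the build runs.
    inherit flag-o-matic

    src_configure() {
        # Downgrade -O3 to -O2 if the user set it in make.conf...
        replace-flags -O3 -O2
        # ...and drop a Graphite flag that breaks this particular package.
        filter-flags -floop-nest-optimize
        default
    }
    ```

    This is why setting aggressive flags globally is usually safe on Gentoo: the per-package overrides sit in the tree, not on the user's side.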

    Comment


    • #12
      Originally posted by MNKyDeth View Post
      Does Arch still have srcpac?
      You could always set the optimizations and recompile your installed packages and newly installed ones with that instead of pacman.

      I haven't used Arch for almost 10 years now, so I'm not exactly sure what they have for building from source anymore, if anything.
      Not sure whether that still exists (can't find it without Googling), but you can download the PKGBUILD and manually build and install the packages. I don't know of an automated way to do this on reinstall, though.
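
      For what it's worth, the manual route looks roughly like this (the package URL is just an example; flags go in makepkg.conf, which overrides the environment):

      ```shell
      # Fetch a PKGBUILD (here from the AUR; the URL is an example) and
      # rebuild the package locally with makepkg.
      git clone https://aur.archlinux.org/example-package.git
      cd example-package
      # Custom optimization flags belong in /etc/makepkg.conf, e.g.:
      #   CFLAGS="-O3 -march=native"
      #   CXXFLAGS="${CFLAGS}"
      makepkg --syncdeps --install   # build, then install via pacman
      ```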

      Comment


      • #13
        I used to compile my Gentoo installs with -O3 in my CFLAGS, but I would encounter random issues, not just at compile time but at runtime. Those never allowed me to be certain whether I'd found a bug in the program or a bug resulting from using -O3; I always had to recompile with -O2, and that often fixed the problem. Sure, some ebuilds filter -O3, but not all do.

        Doesn't seem worth it, but I still like Gentoo because -O2 -march=native is still better than the generic crap you get in most distros.

        Edit: Haven't tried LTO yet, but I will eventually. I'm waiting for it to mature a bit, and I appreciate that others are testing (not to mention developing) this support.
        Last edited by Holograph; 11 September 2017, 01:18 PM.

        Comment


        • #14
          Originally posted by Brane215 View Post
          gcc-7.2.0 has those bugs fixed. I have graphite on all my packages ( and LTO on selected) and with gcc-7.2.0 I can't remember when a package build has failed due to graphite...
           No, it hasn't. Just off the top of my head, ffmpeg fails to build with an ICE.

           I thought the same as you at first, but then ran into the ICE with 7.2.1 (git). I searched the bug database but didn't find anything (I wasn't searching for the word "graphite" directly), so I reported it. Then someone pointed out that there is an entire list of bugs around Graphite grouped into a "meta bug":



           So, yeah, this looks bad in my opinion and shouldn't be used until it receives some love. A rather simple piece of code dating back to early 2016 and last known to work with gcc-5 is still triggering ICEs today. The gcc folks just won't touch this stuff, and whoever started the work on Graphite appears to have abandoned it.
          Last edited by sdack; 11 September 2017, 11:21 AM.

          Comment


          • #15
            I was using LTO for a while, but I've switched it back off now. I've also switched back to -O2 and back to GCC 6.4.0, as 7.2.0 is too unstable. I haven't really noticed much of a difference between any of the configs apart from the compilation time.

            Comment


            • #16
              Originally posted by darkbasic View Post
              Some packages do not benefit from -O3 optimizations, on the contrary they will perform worse. It will be needed to filter them out from being further optimized.
               This is increasingly unusual. I've done quite a bit of benchmarking on current amd64 systems, and rarely if ever will -O3 actually reduce performance. There are some quite unrelated flags pulled into -O3, albeit mostly loop optimizations. If there's ever a real performance problem, it would likely be just one or two of them that are actually to blame. I suppose the compiler might sometimes make bad guesses WRT function inlining.

              Code:
                { comm -1 -3 \
                    <(gcc -v -Q -O2 -march=native --help=optimizers -x c++ /dev/null | sed -nr 's/^[[:space:]]+-/-/p' | sort -u) \
                    <(gcc -v -Q -O3 -march=native --help=optimizers -x c++ /dev/null | sed -nr 's/^[[:space:]]+-/-/p' | sort -u); } 2>/dev/null
              -fgcse-after-reload                     [enabled]
              -finline-functions                      [enabled]
              -fipa-cp-clone                          [enabled]
              -fpeel-loops                            [enabled]
              -fpredictive-commoning                  [enabled]
              -fsplit-loops                           [enabled]
              -fsplit-paths                           [enabled]
              -ftree-loop-distribute-patterns         [enabled]
              -ftree-loop-vectorize                   [enabled]
              -ftree-partial-pre                      [enabled]
              -ftree-slp-vectorize                    [enabled]
              -funswitch-loops                        [enabled]
               I also don't know that a whole lot of thought goes into which flags are in -O3 vs -O2. The gcc manual has always noted that code size is the metric, which doesn't necessarily have anything to do with performance. I'd say -O3 is the better default for current servers and desktops. Embedded or very old systems might benefit from -O2 or -Os somewhat more often.
              Last edited by ormaaj; 11 September 2017, 12:36 PM.

              Comment


              • #17
                Originally posted by MNKyDeth View Post


                Does Arch still have srcpac?
                 You could always set the optimizations and recompile your installed packages and newly installed ones with that instead of pacman.

                 I haven't used Arch for almost 10 years now, so I'm not exactly sure what they have for building from source anymore, if anything.


                As for Gentoo... It's still one of the best around to learn from. And even hitting up lfs and doing a distro from scratch and then tweaking it to your specific wants is a ton of fun the first couple times around when learning.

                Go nuts on optimizations and see what works or doesn't then tweak each program for the best performance flags as wanted.

                Wish I had the time to still do that sort of stuff.... Sigh.....
                 I don't think it does, no.

                 Hmmm, well, makepkg does not support these, so you'd have to patch each PKGBUILD (or patch makepkg), which is a bit annoying.
                 I also wish it were done by the distribution so that users wouldn't have to recompile.
                 I don't mind recompiling some packages, but I don't want to recompile everything for a 2% gain.

                Comment


                • #18
                  Originally posted by ormaaj View Post
                   I also don't know that a whole lot of thought goes into which flags are in -O3 vs -O2. The gcc manual has always noted that code size is the metric, which doesn't necessarily have anything to do with performance. I'd say -O3 is the better default for current servers and desktops. Embedded or very old systems might benefit from -O2 or -Os somewhat more often.
                   That is my experience too. It seems to me that if your CPU has plenty of cache, the -O3 optimizations are always better.

                  Comment


                  • #19
                    In the cases where code actually breaks at -O3, it is almost always a problem with the code itself. Occasionally it is a GCC bug, but when I ran Gentoo it was shocking how often code abused floating-point equality or relied on signed integer overflow. Even at -O2, violating the pointer aliasing rules is an error, but there was still code doing it. It gets much worse with more CPU registers; AMD64 and Itanium revealed tons of bugs.

                    I haven't run Gentoo lately but I'm sure it is still doing the job of revealing everyone's bad C and C++ code.
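
                    To make the floating-point-equality point concrete, here is a minimal C sketch of the kind of comparison that bites people; the comparison is simply wrong at any -O level, and historically (x87 excess precision on 32-bit x86) its outcome could even flip depending on optimization flags:

                    ```c
                    #include <stdio.h>

                    /* The classic floating-point equality trap: 0.1 and 0.2 have no
                     * exact binary representation, so their sum is not exactly 0.3.
                     * Code that relies on such a comparison "working" is broken
                     * regardless of the optimization level. */
                    int main(void) {
                        double sum = 0.1 + 0.2;
                        printf("sum == 0.3 ? %s\n", (sum == 0.3) ? "yes" : "no"); /* prints: no */
                        printf("difference: %.20f\n", sum - 0.3);
                        return 0;
                    }
                    ```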

                    Comment


                    • #20
                      Originally posted by FireBurn View Post
                      I was using LTO for a while, I've switched it back off now though. Also switched back to O2 and back to GCC 6.4.0 as 7.2.0 is too unstable. I've not really noticed that much of a difference between any of the different configs bar the compilation time
                      LTO is still at an early stage, but it's definitely the future. I say definitely, because it increases the scope over which a compiler can optimize code. The first optimizations were only made on expressions, followed by code blocks and later entire functions. Now compilers can optimize over the entire scope of a file and its includes. LTO follows this trend, widening the scope further and extending it onto libraries.

                      Some gains can already be had from using LTO, but it requires libraries to be compiled with LTO first before a compiler gets to see the full scope of a program. This is why it's good to start compiling entire distros with LTO, to get to the point where this can happen.

                      Once this is understood, one can see why LTO doesn't do much for small applications. It's applications that link against hundreds of libraries that gain the most from it. LTO will become more important in the future. When you look at an application such as Liferea (a news reader), which looks simple enough but links against 100+ libraries, it becomes obvious why it's needed.
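
                      A tiny self-contained sketch of the library point: with -flto the object files carry the compiler's intermediate representation, so the linker can optimize (e.g. inline) across the library boundary (gcc-ar is used so the archive keeps that bytecode usable):

                      ```shell
                      # Build a static library and a program with LTO enabled.
                      cat > libmsg.c <<'EOF'
                      const char *msg(void) { return "hello from lto"; }
                      EOF
                      cat > main.c <<'EOF'
                      #include <stdio.h>
                      const char *msg(void);
                      int main(void) { puts(msg()); return 0; }
                      EOF
                      gcc -O2 -flto -c libmsg.c
                      gcc-ar rcs libmsg.a libmsg.o   # gcc-ar keeps the LTO bytecode intact
                      gcc -O2 -flto main.c libmsg.a -o lto-demo
                      ./lto-demo                     # prints: hello from lto
                      ```

                      If the library had been built without -flto, the linker would only see opaque machine code and the cross-library optimization opportunity would be lost, which is exactly why distro-wide LTO matters.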

                      Comment
