LTO'ing Mesa Is Getting Discussed For Performance & Binary Size Reasons

  • LTO'ing Mesa Is Getting Discussed For Performance & Binary Size Reasons

    Phoronix: LTO'ing Mesa Is Getting Discussed For Performance & Binary Size Reasons

    Enabling compiler Link-Time Optimizations (LTO) by default for Mesa in non-debug builds is being discussed in the name of performance and binary size...


  • #2
    great idea
    Last edited by CochainComplex; 31 May 2016, 12:17 PM. Reason: edit



    • #3
      LTO is always great. It's surprising how long it still takes to utilize all the compiler optimizations from the last 50 years in off-the-shelf compilers.



      • #4
        I have tried compiling mesa with lto recently and didn't have any problems - except compilation takes a lot longer. And the worst of it is that for incremental builds, it takes that long every time it links mesa. Is there something gcc can do to cache lto optimizations? Or does it already do that and it's just a high chance that any of the stuff linked together has changed so it needs to relink everything?



        • #5
          Originally posted by haagch
          I have tried compiling mesa with lto recently and didn't have any problems - except compilation takes a lot longer. And the worst of it is that for incremental builds, it takes that long every time it links mesa. Is there something gcc can do to cache lto optimizations? Or does it already do that and it's just a high chance that any of the stuff linked together has changed so it needs to relink everything?
          It is called LTO (link-time optimization) for a reason. If you have to recompile that often, disable it.



          • #6
            Originally posted by atomsymbol

            I am hoping that somebody finally writes a C/C++ JIT compiler that needs to be passed the source code once and all subsequent optimizing recompilations are fully automatic.
            Define "recompilations".

            With lto enabled gcc produces GIMPLE (intermediate language) object files, which are then compiled to binaries at link time.
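
A minimal sketch of that two-step flow, assuming a gcc with LTO support is on PATH. The file names and the `lto_demo` helper are made up for illustration:

```shell
# Sketch: build a hypothetical two-file program with LTO in two steps.
lto_demo() {
    workdir=$(mktemp -d) || return 1

    # Two tiny translation units (illustrative only).
    printf '%s\n' 'int answer(void) { return 42; }' > "$workdir/answer.c"
    printf '%s\n' '#include <stdio.h>' \
        'int answer(void);' \
        'int main(void) { printf("%d\n", answer()); return 0; }' \
        > "$workdir/main.c"

    # Compile step: with -flto the objects carry GIMPLE bytecode.
    gcc -O2 -flto -c "$workdir/answer.c" -o "$workdir/answer.o" &&
    gcc -O2 -flto -c "$workdir/main.c" -o "$workdir/main.o" &&
    # Link step: machine code for the whole program is generated here.
    gcc -O2 -flto "$workdir/answer.o" "$workdir/main.o" -o "$workdir/demo" &&
    "$workdir/demo"
    status=$?

    rm -rf "$workdir"
    return "$status"
}
```

Calling `lto_demo` builds and runs the program in a temporary directory. Because code generation is deferred to the link step, touching either source file means the final link repeats that work.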



            • #7
              Originally posted by atomsymbol

              For example: Conversion of position-independent code to fixed-position code when the JIT determines the code is used very often. The latter code is slightly faster than the former.
              That could hypothetically be done already, but that's the linking phase, not recompilation (even if, technically, with LTO the linking phase compiles too).

              If fixed-position code were always faster, it would be used with or without LTO (unless PIC is needed or requested for technical reasons). The claim is generally wrong, though: it depends highly on the architecture (whether it allows PC-relative addressing) and on the OS and toolchain.



              • #8
                I'm using -flto=8 in my CFLAGS and CXXFLAGS and -Wl,-flto=8 in my LDFLAGS; it seems to be working fine with gcc 6.1.

                Had to set:

                Code:
                AR="gcc-ar"
                NM="gcc-nm"
                RANLIB="gcc-ranlib"
                In my make.conf too so it would work

                Here's my current setup:

                Code:
                CFLAGS="-O3 -march=native -pipe -floop-interchange -ftree-loop-distribution -floop-strip-mine -floop-block -flto=8 -Wno-narrowing"
                CXXFLAGS="${CFLAGS} -fno-delete-null-pointer-checks -flifetime-dse=1 -fpermissive"
                LDFLAGS="-Wl,-O1 -Wl,--hash-style=gnu -Wl,--as-needed -Wl,-flto=8"
                And here the packages I have to switch some of that off for: (package.env)

                Code:
                app-arch/cpio                           no-graphite.conf
                app-arch/tar                            no-lto.conf no-graphite.conf
                app-emulation/wine                      no-lto.conf no-graphite.conf
                app-office/libreoffice                  no-lto.conf
                app-text/convertlit                     no-lto.conf
                dev-lang/python                         no-graphite.conf
                dev-libs/libgcrypt                      no-lto.conf
                dev-qt/designer                         no-lto.conf
                dev-qt/qtdeclarative                    no-lto.conf
                dev-qt/qtgui                            no-lto.conf
                dev-qt/qtscript                         no-lto.conf
                dev-util/ragel                          no-lto.conf
                games-action/minetest                   no-graphite.conf
                games-fps/worldofpadman                 no-lto.conf
                kde-frameworks/kdoctools                no-lto.conf
                media-libs/alsa-lib                     no-lto.conf
                media-libs/flac                         no-graphite.conf
                media-libs/freeglut                     no-graphite.conf
                media-libs/lcms                         no-graphite.conf
                media-libs/libsndfile                   bfd.conf
                media-libs/mediastreamer                bfd.conf
                media-libs/vulkan-base                  no-lto.conf
                media-libs/x264                         no-lto.conf
                media-sound/pulseaudio                  no-lto.conf
                media-sound/twolame                     no-graphite.conf
                media-video/ffmpeg                      no-graphite.conf
                media-video/ffmpeg                      o2.conf
                media-video/handbrake                   no-lto.conf
                sys-apps/gawk                           no-graphite.conf
                sys-apps/groff                          no-graphite.conf
                sys-apps/pciutils                       no-lto.conf
                sys-devel/binutils                      gold.conf
                sys-devel/gettext                       no-lto.conf
                sys-fs/fuse                             bfd.conf no-graphite.conf
                sys-libs/ncurses                        no-lto.conf
                www-client/chromium                     o2.conf no-lto.conf no-graphite.conf
                x11-base/xorg-server                    no-lto.conf
                x11-drivers/xf86-video-intel            no-lto.conf
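
The contents of those .conf files aren't shown in the post. As a guess at what a `no-lto.conf` under /etc/portage/env could look like (an assumption, not the poster's actual file), it would only need to append gcc's `-fno-lto`, which should override the earlier `-flto=8`:

```shell
# Hypothetical /etc/portage/env/no-lto.conf (assumed contents).
# Appending -fno-lto after the global flags turns LTO back off
# for the packages that list this file in package.env.
CFLAGS="${CFLAGS} -fno-lto"
CXXFLAGS="${CXXFLAGS} -fno-lto"
LDFLAGS="${LDFLAGS} -fno-lto"
```

The other files in the list (no-graphite.conf, bfd.conf, gold.conf, o2.conf) would presumably follow the same pattern with their own flag changes.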



                • #9
                  Originally posted by atomsymbol
                  Are you using -flto=$(nproc) ?
                  I did not know this was an option, thanks.

                  I also learned about other flags from https://lists.freedesktop.org/archiv...ay/118929.html so I'm now compiling with

                  export CFLAGS="$CFLAGS -O3 -flto=9 -ffat-lto-objects -flto-odr-type-merging"
                  export CXXFLAGS="$CFLAGS"
                  export LDFLAGS=" -flto=9"

                  Now the lto1 binary indeed uses all my cores. I still do not have any problems with mapi with these flags. My mesa installation is 9.24 MiB bigger than my previous build with -O2 and no lto. The message says that removing --enable-glx-tls also helps against his build failure, but I have it enabled and still no problems. Strange.
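
To avoid hard-coding the job count, the flags can be derived from `nproc` instead. A small sketch using only a subset of the flags above; `lto_flags` is a hypothetical helper, not something Mesa or gcc provides:

```shell
# Sketch: build LTO flags from the number of available cores.
# lto_flags is a made-up helper; pass a number to override nproc.
lto_flags() {
    jobs=${1:-$(nproc)}
    printf '%s\n' "-O3 -flto=${jobs} -ffat-lto-objects"
}

# Append to any flags already set in the environment.
export CFLAGS="${CFLAGS:+$CFLAGS }$(lto_flags)"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="${LDFLAGS:+$LDFLAGS }-flto=$(nproc)"
```

`-flto=N` only controls how many parallel jobs the link-time stage may use, so it affects build time rather than the generated code.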

                  Anyway, the build took about 15 minutes with radeonsi, r600, swrast, ilo, anv, and most features enabled. I'm not sure how long it takes without LTO, but I guess somewhere between 5 and 10 minutes.



                  • #10
                    Originally posted by atomsymbol

                    Ok, but LTO != JIT
                    Thankfully, yes.
                    Just use Java if you want JIT. It likely won't ever be as fast as compiled code (unless you carefully construct a problem for this "solution"), and the measurements and heuristics used to optimize the running code can just as well result in degraded performance.
