GCC's Potential GSoC Projects Include Better Parallelizing The Compiler


  • GCC's Potential GSoC Projects Include Better Parallelizing The Compiler

    Phoronix: GCC's Potential GSoC Projects Include Better Parallelizing The Compiler

    While in some areas it's still an extremely cold winter, many open-source projects are already preparing for their participation in Google's annual Summer of Code initiative. The GNU Compiler Collection (GCC) crew, which always tends to see at least a few slots for interested student developers, has begun formulating some potential project ideas...


  • #2
    So this is about parallelizing compilation of a single TU? How would this interact with parallel build systems?
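    (For what it's worth, GCC already has one precedent for this kind of cooperation: with -flto=jobserver, the parallel link-time-optimization jobs take their slots from GNU make's jobserver, so LTO shares the -j budget with the rest of the build instead of oversubscribing the cores; per the GCC docs, the Makefile rule has to be marked recursive with a leading + for the jobserver to be visible. A parallel front end would presumably need a similar handshake.)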

    • #3
      This speedup will always be sublinear, while compiling multiple files in parallel can scale linearly (assuming no shared bottlenecks like I/O, caches, or memory).
      The related cleanups are still useful, though, and might help if there are plans for a clangd-like compile server.
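      As a rough, made-up illustration of that ceiling: if a fraction s of a single TU's compile time is inherently serial, Amdahl's law caps the speedup at 1 / (s + (1 - s)/N). With s = 0.3 and N = 12 threads, that is 1 / (0.3 + 0.7/12) ≈ 2.8x, versus the ~12x a build system can approach by compiling 12 independent files at once.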

      • #4
        Originally posted by atomsymbol

        What about the case when multiple C/C++ files being compiled in parallel #include a common include file, such as <QString>? In such a case, parsing <QString> a single time instead of 12 times on a 12-core CPU is faster, in total elapsed wall-clock time of the parallel build, than parsing <QString> 12 times in parallel, because the 12-1=11 other processes could be doing something else while <QString> is being parsed.
        I would say that the parsing of include files pales in comparison with the actual compilation of the source file, so much so that optimizing it would probably not save much real runtime.
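        (This is measurable rather than a matter of opinion: compiling with something like g++ -c -ftime-report foo.cpp makes GCC print a per-phase time breakdown, so you can check for your own code whether parsing or the later optimization and code-generation phases dominate.)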

        • #5
          Originally posted by discordian View Post
          This speedup will always be sublinear, while compiling multiple files in parallel can scale linearly (assuming no shared bottlenecks like I/O, caches, or memory).
          The related cleanups are still useful, though, and might help if there are plans for a clangd-like compile server.
          How does the clangd compile server compare with the age-old ccache? Or can they not be compared at all?

          • #6
            Originally posted by atomsymbol

            What about the case when multiple C/C++ files being compiled in parallel #include a common include file, such as <QString>? In such a case, parsing <QString> a single time instead of 12 times on a 12-core CPU is faster, in total elapsed wall-clock time of the parallel build, than parsing <QString> 12 times in parallel, because the 12-1=11 other processes could be doing something else while <QString> is being parsed.
            That's not happening; build systems invoke GCC once per file. Threading the compilation of a single file will take more cumulative time than just running over it serially.

            Originally posted by F.Ultra View Post

            How does the clangd compile server compare with the age-old ccache? Or can they not be compared at all?
            Not at all; clangd is currently a server for code completion in editors and the like. For GCC to be turned into a server, you would need to rip out the global state (which is part of the GSoC task, as far as I understand).

            ccache runs per file; a compile server could do much more, as Guest implied. The build system would just queue up a lot of stuff to build, and a single server would then be able to sort common includes, template instantiations, and code snippets for the whole project (and even previous runs), then intelligently cache common constructs at various levels (files, preprocessed and pre-optimized code, classes, ...).
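            A rough sketch of one redundancy such a server could cache away, and the manual workaround that already exists today (all names hypothetical): every TU that uses a template instantiates it again, and the linker later discards the duplicates. An explicit instantiation moves that cost into a single TU:

                // big_table.h (hypothetical): a template that is expensive to instantiate
                template <typename T>
                struct BigTable {
                    T cells[1024];
                    T sum() const { T s{}; for (const T& c : cells) s += c; return s; }
                };

                // Without this declaration, every TU using BigTable<int> re-instantiates
                // it, and the linker throws away all but one copy of the generated code:
                extern template struct BigTable<int>;

                // big_table.cpp: the one TU that pays the instantiation cost for everyone
                #include "big_table.h"
                template struct BigTable<int>;

            A compile server could in principle perform this kind of de-duplication automatically, project-wide, without header authors opting in.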

            • #7
              Originally posted by discordian View Post
              This speedup will always be sublinear, while compiling multiple files in parallel can scale linearly (assuming no shared bottlenecks like I/O, caches, or memory).
              This depends. One case I hit recently is a binary built from a bunch of header-only libraries:
              "make -j" finishes the other parts quickly, but then I have to wait ~1 minute for this single binary.

              • #8
                Originally posted by atomsymbol

                What about the case when multiple C/C++ files being compiled in parallel #include a common include file, such as <QString>? In such a case, parsing <QString> a single time instead of 12 times on a 12-core CPU is faster, in total elapsed wall-clock time of the parallel build, than parsing <QString> 12 times in parallel, because the 12-1=11 other processes could be doing something else while <QString> is being parsed.
                That is already a solved problem with precompiled headers.
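                For anyone who hasn't used them with GCC, a minimal sketch (file names hypothetical): precompile the big shared header once, and later compiles that include it pick up the binary version instead of re-parsing the text.

                    // common.h (hypothetical): the expensive header shared by many TUs,
                    // playing the role of <QString> in the discussion above
                    #include <string>
                    #include <vector>

                    // Precompile once:  g++ -x c++-header common.h   (writes common.h.gch)
                    // Afterwards:       g++ -c a.cpp                 (a.cpp does #include "common.h")
                    // GCC loads common.h.gch automatically, but only if the compile options
                    // and predefined macros match the ones used when the .gch was built.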

                • #9
                  Originally posted by zxy_thf View Post
                  This depends. One case I hit recently is a binary built from a bunch of header-only libraries:
                  "make -j" finishes the other parts quickly, but then I have to wait ~1 minute for this single binary.
                  You need more CPU time (all cores combined) than doing everything naively on one core, hence sublinear scaling.
                  The assumption is that you can do something else, like compiling other files, at the same time, which should be true most of the time.

                  In your specific example, you can't easily parallelize processing the headers, as the order in which they are included matters (or could matter; the compiler has to assume it matters and guarantee correct results).
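                  A contrived illustration of that ordering problem (hypothetical headers), where the second header's meaning depends on preprocessor state left behind by the first:

                      // config.h
                      #define USE_DOUBLE 1

                      // math_types.h: what this declares depends on what came before it
                      #if USE_DOUBLE
                      typedef double real_t;
                      #else
                      typedef float real_t;
                      #endif

                      // main.cpp: swapping the two #include lines silently changes real_t,
                      // so the compiler has to process the headers in order to stay correct
                      #include "config.h"
                      #include "math_types.h"
                      int main() { real_t x = 0; return (int)x; }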

                  • #10
                    Originally posted by carewolf View Post
                    That is already a solved problem with precompiled headers.
                    It's far from transparent to the build system, compiler-specific, and easy to break (differing macros or settings), and barely used because of that.

                    C++ modules ought to tackle this issue. I hope for the best, but I somewhat expect it to take a long time until all the kinks are known and solved for build systems.
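                    For reference, a minimal sketch of the modules version of the shared-header scenario (C++20 syntax; in GCC this was still experimental behind -fmodules-ts, and file naming conventions vary, so treat the details as illustrative):

                        // string_utils.cc (hypothetical): compiled once into a binary module
                        // interface that importers consume instead of re-parsing header text
                        export module string_utils;

                        export int count_words(const char* s) {
                            int n = 0;
                            bool in_word = false;
                            for (; *s; ++s) {
                                if (*s != ' ' && !in_word) { ++n; in_word = true; }
                                else if (*s == ' ') in_word = false;
                            }
                            return n;
                        }

                        // consumer.cc: the import reads the precompiled interface; the
                        // importer's own macros cannot change the module's meaning
                        import string_utils;
                        int main() { return count_words("hello world") == 2 ? 0 : 1; }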
