Fast Kernel Headers v2 Posted - Speeds Up Clang-Built Linux Kernel Build By ~88%


  • #31
    Originally posted by atomsymbol
    Apart from the fact that modules are a C++ language feature, in the first patch set Ingo Molnar reported:

    [...]
    As to other approaches attempted: I also tried enabling the kernel to use pre-compiled headers (with GCC), which didn't bring a performance improvement beyond a 1-2% for repeat builds. Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again.
    [...]




    So it seems that the speed increase is mostly due to the header reorganization.

    Comment


    • #32
      Originally posted by CochainComplex View Post

      *cough* ...as a German I'm obliged to point out that there are plenty of high-quality options available. Just to name the obvious ones: Porsche, Mercedes, BMW, VW... and then there are the acquired brands running on German engines: Lamborghini, Bugatti... but with the latter ones you have issues with the shipment time as well.
      Ford "invented" the assembly line for high-speed, mass-produced vehicles and, because of that, was able to ship out vehicles 88% faster.

      Comment


      • #33
        Originally posted by sinepgib View Post

        Because it also changed some functions to no longer be suggested for inlining. Depending on whether the compiler decided to follow the hint, and whether that hint was clever or not, that could make runtime performance slightly better or slightly worse. I doubt the change would be huge, though.
        Another thing that might change is auto-inlining. I don't know if or how often the Linux kernel uses static definitions in the headers, but if those definitions have been moved to C files, they can now only be auto-inlined if LTO is enabled.

        Comment


        • #34
          Originally posted by oleid View Post

          Why would it change runtime speed? This is about changing the include hierarchy. There is zero chance it will have a runtime impact.
          He is also touching the inlining/un-inlining of functions:


          [...]
          - Uninlining: there's a number of unnecessary inline functions that also couple otherwise unrelated headers to each other. The fast-headers tree contains over 100 uninlining commits.
          [...]

          https://lwn.net/ml/linux-kernel/[email protected]/


          It is hard to say if and how much it will impact performance, but theoretically it is possible to have a regression (or an improvement!).

          Comment


          • #35
            Originally posted by kreijack View Post
            He is also touching the inlining/un-inlining of functions:

            [...]
            - Uninlining: there's a number of unnecessary inline functions that also couple otherwise unrelated headers to each other. The fast-headers tree contains over 100 uninlining commits.
            [...]

            https://lwn.net/ml/linux-kernel/[email protected]/
            Yup, having too many include dependencies suggests insufficient use of abstractions. And a very basic abstraction is the function call.

            It's nice to have some general guidelines around function inlining. In userspace code, I typically start with a policy that functions calling 2 or more non-inlinable functions should not themselves be inline (or, preferably, even be defined in a header-based class definition). And in C++, you have to remember that every expression which can allocate from the heap implies a non-inlinable function call.

            Another technique for reducing header file dependencies is to hide the types used to implement a given programming interface, which means your public headers don't need to include the lower-level ones. That tends to increase reliance on the heap, which potentially comes at some runtime cost. Of course, C99's support for variable-length arrays means you could have an opaque type that callers allocate on the stack!

            Code:
            /* Caller-allocated opaque object: struct Obj's definition stays hidden. */
            char storage[obj_size()];               /* C99 VLA, sized at runtime */
            struct Obj *p_obj = obj_init(storage);  /* construct in caller's stack space */
            
            // use p_obj ...
            
            obj_cleanup(p_obj);
            Obviously, if the definition of struct Obj is hidden, then you have 3 non-inline function calls, whereas making it public might've enabled having just 2 inline functions (obj_init() and obj_cleanup()). However, assuming it's a heavy-weight type, that's not much overhead. If it's actually a lighter-weight type, then calling those + obj_size() is still a heck of a lot cheaper than a pair of calls to malloc() + free()!

            Again, making the type opaque reduced header file dependencies, since the caller doesn't need to see the definitions of the types used inside of struct Obj. It also preserves flexibility for the implementation, enabling it to add/remove/re-arrange members in struct Obj without the caller having to recompile.

            You could even hide the storage + init in a macro, but hiding too much in macros creates opportunities for unintentional misuse.
            Last edited by coder; 09 January 2022, 01:55 PM.

            Comment


            • #36
              Originally posted by atomsymbol
              - Modules/packages are a feature that can be added to most programming languages that do not already have modules/packages. It could be added to C as well.
              C being such a low-level/zero-overhead language, I doubt it will get modules.

              Depending on how these hypothetical C modules are implemented, they could just push most of the work from compilation to the linking phase.

              Originally posted by atomsymbol
              - Ingo's statement "Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again" is a testament to how inefficient/primitive the algorithms related to precompiled headers in the tested C/C++ compilers (gcc) are. A more appropriate name for gcc's "precompiled headers" would be "non-incremental preparsed headers". (Note: I developed, as an experiment only, an incremental compiler in the past.)
              Carrying around more persistent state seems like it'd significantly increase the complexity of the compiler, as well as multiplying the opportunities for things to go wrong.

              I'd gladly design my code around the limitations of a more primitive toolchain than try to debug a compiler that's collapsing under the weight of its own complexity.

              Comment


              • #37
                Originally posted by coder View Post
                That's a bit simplistic. I'm not sure there aren't a few things in C that make work harder for compilers than it needs to be, but I also wonder if the comment about Pascal isn't presuming the same degree of optimization.
                Of course it's simplistic, it's a post in a forum.

                Originally posted by coder View Post
                It's also worth considering that how you use a language has a lot to do with compile times. I once had a template-heavy C++ file that took a couple minutes to compile and doing so used a couple GB of memory. Once I eliminated some unnecessary template parameter type deduction, it took only a few seconds to compile and I think memory usage dropped accordingly. Although this particularly bad example involves C++, I can imagine things one might do in C that also create needless burden, such as having lots of overlong and inline functions.

                Part of the software engineering discipline is understanding how to use programming language features in a scalable and maintainable way.
                That's what I was hinting at. You need lightning-fast feedback, you use a scripting language. You need features, you pick a language that offers them. Know your tool, bend it to your will. Instead, I keep hearing whining about "this is slow to compile" or "this language doesn't have that feature". Makes me sad.
                Last edited by bug77; 09 January 2022, 06:57 PM.

                Comment


                • #38
                  Originally posted by atomsymbol
                  I don't understand why you believe what you believe, because:
                  - Modules/packages are a feature that can be added to most programming languages that do not already have modules/packages. It could be added to C as well.
                  I believe what I believe because modules don't exist for C, and you confirmed that. I suppose that Ingo doesn't have any wish to develop a better C compiler; instead, he is trying (with very good results) to rearrange the headers to get the speed improvement he wants.

                  Originally posted by atomsymbol
                  - Ingo's statement "Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again" is a testament to how inefficient/primitive the algorithms related to precompiled headers in the tested C/C++ compilers (gcc) are. A more appropriate name for gcc's "precompiled headers" would be "non-incremental preparsed headers". (Note: I developed, as an experiment only, an incremental compiler in the past.)
                  Ingo showed that the current state of the art (precompiled headers) didn't give any advantage. If you know of a compiler with better support for precompiled headers/modules..., show the numbers and everybody (me first) will be happy.

                  E.g., if the fault is in GCC, LLVM should show better results (but I assume that Ingo already tested it).

                  Anyway, reading Ingo's statement a second time, I don't understand what he means. The Linux sources are about 8GB (source + git + .o files), so even considering the memory usage of parallel compilers, a high-end machine (e.g. 64GB of RAM, which is not an impossible target by today's standards) should be enough not to show a "cold cache" problem.

                  BR
                  G.Baroncelli

                  Comment


                  • #39
                    Originally posted by Michael View Post

                    Unfortunately I don't have any Model T or other car pictures I've taken that would be relevant... when thinking of what to add for an image around speed, Ferrari came to mind, as I had some pictures from an AMD party at Ferrari HQ a few years ago.
                    Here, you can borrow one I took. It's only fitting, as I'm about to build this on a PowerBook G4 1.67GHz; it took well over a day to build a Debian kernel last time. Cheers, I'll follow up on how it goes. IMG_20200427_120506-2.jpg

                    Comment


                    • #40
                      Originally posted by tildearrow View Post
                      Problem is, Phoronix is a very technical site, and for technical articles it is difficult to associate images with them, especially when the concepts are so abstract that they don't exist or cannot be represented with an object.
                      I also find it problematic to have pictures in an article just for the sake of having a picture there. When users click, they expect something, but if there is no Ferrari, or even a car, mentioned in the whole article, then it probably was the wrong picture, or it would have been better not to have any picture at all.

                      I regularly visited a German website for IT news, and at some point they started to include pictures in every article because it somehow got them a better score from Google, or something along those lines. The problem was they didn't even select the pictures themselves; an AI did it using some keywords from the text. Needless to say, barely any picture really fit.

                      Comment
