Fast Kernel Headers v2 Posted - Speeds Up Clang-Built Linux Kernel Build By ~88%
-
Originally posted by atomsymbol View Post
[...]
As to other approaches attempted: I also tried enabling the kernel to use pre-compiled headers (with GCC), which didn't bring a performance improvement beyond a 1-2% for repeat builds. Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again.
[...]
https://lwn.net/ml/linux-kernel/[email protected]/
So it seems that the increase of speed is due mostly to the header re-organization.
- Likes 1
-
Originally posted by CochainComplex View Post
*cough* ..as a German I'm obliged to point out that there are plenty of high-quality options available. Just to name the obvious ones: Porsche, Mercedes, BMW, VW ....and then there are the acquired brands running on German engines: Lamborghini, Bugatti...but with the latter ones you have issues with the shipment time as well.
- Likes 3
-
Originally posted by sinepgib View Post
Because it also changed some functions to not be suggested for inlining. Depending on whether the compiler decided to follow the hint and whether that hint was clever or not, that could make runtime performance slightly better or slightly worse. I doubt the change would be huge tho.
- Likes 1
-
Originally posted by oleid View Post
Why would it change runtime speed? This is about changing the include hierarchy. There is zero chance it will have a runtime impact.
[...]
- Uninlining: there's a number of unnecessary inline functions that also couple otherwise unrelated headers to each other. The fast-headers tree contains over 100 uninlining commits.
[...]
https://lwn.net/ml/linux-kernel/[email protected]/
It is hard to say whether, and by how much, it will affect performance, but in theory a regression (or an improvement!) is possible.
- Likes 3
-
Originally posted by kreijack View Post
He is also touching the inlining/not inlining of functions
[...]
- Uninlining: there's a number of unnecessary inline functions that also couple otherwise unrelated headers to each other. The fast-headers tree contains over 100 uninlining commits.
[...]
https://lwn.net/ml/linux-kernel/[email protected]/
It's nice to have some general guidelines around function inlining. In userspace code, I typically start with a policy that functions calling 2 or more non-inlinable functions should not themselves be inline (and preferably should not even be defined inside a header-based class definition). And in C++, you have to remember that every expression which can allocate on the heap implies a non-inlinable function call.
Other techniques to reduce header file dependencies tend to increase reliance on the heap, which potentially comes at some runtime cost. Allocating objects on the heap behind an opaque pointer lets you hide the types used to implement a given programming interface, which means your public headers don't need to include the lower-level ones. Of course, C99's support for variable-length arrays means you could have an opaque type that callers allocate on the stack!
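To make the uninlining idea concrete, here is a minimal sketch in C, collapsed into one file for brevity. Every name in it (struct conn, struct stats, conn_record_rx) is hypothetical and not taken from the kernel tree. It only illustrates the general pattern: once the function body moves out of the header, the header can get by with a forward declaration instead of including the lower-level header.

```c
#include <assert.h>
#include <stddef.h>

/* ---- "conn.h" (hypothetical public header) ----
 * Before uninlining, conn_record_rx() would have been a static inline
 * defined here, forcing this header to include the full definition of
 * struct stats. After uninlining, a forward declaration is enough. */
struct stats;                    /* incomplete type: no #include needed */

struct conn {
    struct stats *stats;
    int id;
};

void conn_record_rx(struct conn *c, size_t bytes);   /* now out of line */

/* ---- "stats.h" (hypothetical; only conn.c needs to include it) ---- */
struct stats {
    size_t rx_bytes;
    size_t rx_packets;
};

/* ---- "conn.c" ---- */
void conn_record_rx(struct conn *c, size_t bytes)
{
    assert(c != NULL && c->stats != NULL);
    c->stats->rx_bytes += bytes;
    c->stats->rx_packets += 1;
}
```

The trade-off is the one discussed in this thread: callers now pay for a real function call unless the compiler or LTO decides otherwise, so the runtime effect can go either way.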
Code:
char storage[obj_size()];
struct Obj *p_obj = obj_init(storage);
// use obj ...
obj_cleanup(p_obj);
Again, making the type opaque reduces header file dependencies, since the caller doesn't need to see the definitions of the types used inside of struct Obj. It also preserves flexibility for the implementation, enabling it to add/remove/re-arrange members in struct Obj without the caller having to recompile.
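A slightly fuller sketch of the same pattern, in C and again collapsed into one file. Only obj_size, obj_init, obj_cleanup and struct Obj come from the snippet above. The members of struct Obj, plus obj_add, obj_total and obj_demo, are assumptions added purely so there is something to run. The sketch also assumes C11 for max_align_t.

```c
#include <assert.h>
#include <stddef.h>

/* ---- "obj.h" (hypothetical public header) ---- */
struct Obj;                            /* opaque: callers never see members */
size_t obj_size(void);                 /* bytes of storage an Obj needs */
struct Obj *obj_init(void *storage);   /* storage must be suitably aligned */
void obj_add(struct Obj *o, int n);
int obj_total(const struct Obj *o);
void obj_cleanup(struct Obj *o);

/* ---- "obj.c" (private; members can change without touching obj.h) ---- */
struct Obj {
    int total;
    int live;
};

size_t obj_size(void) { return sizeof(struct Obj); }

struct Obj *obj_init(void *storage)
{
    struct Obj *o = storage;
    assert(o != NULL);
    o->total = 0;
    o->live = 1;
    return o;
}

void obj_add(struct Obj *o, int n) { o->total += n; }
int obj_total(const struct Obj *o) { return o->total; }
void obj_cleanup(struct Obj *o) { o->live = 0; }

/* ---- caller: only needs what "obj.h" declares ---- */
int obj_demo(void)
{
    /* VLA sized at run time; max_align_t elements give the buffer the
     * strictest fundamental alignment. Round the byte count up. */
    size_t units = (obj_size() + sizeof(max_align_t) - 1) / sizeof(max_align_t);
    max_align_t storage[units];

    struct Obj *p_obj = obj_init(storage);
    obj_add(p_obj, 40);
    obj_add(p_obj, 2);
    int result = obj_total(p_obj);
    obj_cleanup(p_obj);
    return result;
}
```

A plain char array carries no alignment guarantee for struct Obj, which is why this version uses max_align_t elements. Strictly speaking, the C effective-type rules also make it questionable to place a struct inside an array declared with another type, so treat this as an illustration of the header-decoupling idea rather than a portable recipe.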
You could even hide the storage + init in a macro, but hiding too much in macros creates opportunities for unintentional misuse.
Last edited by coder; 09 January 2022, 01:55 PM.
-
Originally posted by kreijack View Post
Apart from the fact that modules are a C++ language feature, in the first patch set Ingo Molnar reported:
[...]
As to other approaches attempted: I also tried enabling the kernel to use pre-compiled headers (with GCC), which didn't bring a performance improvement beyond a 1-2% for repeat builds. Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again.
[...]
https://lwn.net/ml/linux-kernel/[email protected]/
So it seems that the increase of speed is due mostly to the header re-organization.
- Modules/packages are a feature that can be added to most programming languages that do not already have modules/packages. It could be added to C as well.
- Ingo's statement "Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again" is a testament to how inefficient/primitive the algorithms related to precompiled headers in the tested C/C++ compilers (gcc) are. A more appropriate name for gcc's "precompiled headers" would be "non-incremental preparsed headers". (Note: I developed, as an experiment only, an incremental compiler in the past.)
-
Originally posted by atomsymbol View Post
- Modules/packages are a feature that can be added to most programming languages that do not already have modules/packages. It could be added to C as well.
Depending on how these hypothetical C modules are implemented, they could just push most of the work from compilation to the linking phase.
Originally posted by atomsymbol View Post
- Ingo's statement "Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again" is a testament to how inefficient/primitive the algorithms related to precompiled headers in the tested C/C++ compilers (gcc) are. A more appropriate name for gcc's "precompiled headers" would be "non-incremental preparsed headers". (Note: I developed, as an experiment only, an incremental compiler in the past.)
I'd gladly design my code around the limitations of a more primitive toolchain than try to debug a compiler that's collapsing under the weight of its own complexity.
-
Originally posted by coder View Post
That's a bit simplistic. I'm not sure there aren't a few things in C that make work needlessly harder for compilers, but I also wonder if the comment about Pascal isn't presuming the same degree of optimizations.
Originally posted by coder View Post
It's also worth considering that how you use a language has a lot to do with compile times. I once had a template-heavy C++ file that took a couple minutes to compile and doing so used a couple GB of memory. Once I eliminated some unnecessary template parameter type deduction, it took only a few seconds to compile and I think memory usage dropped accordingly. Although this particularly bad example involves C++, I can imagine things one might do in C that also create needless burden, such as having lots of overlong and inline functions.
Part of the software engineering discipline is understanding how to use programming language features in a scalable and maintainable way.
Last edited by bug77; 09 January 2022, 06:57 PM.
- Likes 1
-
Originally posted by atomsymbol View Post
I don't understand why you believe what you believe, because:
- Modules/packages are a feature that can be added to most programming languages that do not already have modules/packages. It could be added to C as well.
Originally posted by atomsymbol View Post
- Ingo's statement "Plus during development most of the precompiled headers are cache-cold in any case and need to be generated by the compiler again and again" is a testament to how inefficient/primitive the algorithms related to precompiled headers in the tested C/C++ compilers (gcc) are. A more appropriate name for gcc's "precompiled headers" would be "non-incremental preparsed headers". (Note: I developed, as an experiment only, an incremental compiler in the past.)
E.g., if the fault is in gcc, LLVM would show better results (but I assume that Ingo already tested it).
Anyway, reading Ingo's statement a second time, I don't understand what he meant. The Linux source tree is about 8 GB (source + git + .o files), so even accounting for the memory used by parallel compiler processes, a high-end machine (e.g. 64 GB of RAM, which is not out of reach by today's standards) should be enough to avoid a "cold cache" problem.
BR
G.Baroncelli
- Likes 1