Massive ~2.3k Patch Series Would Improve Linux Build Times 50~80% & Fix "Dependency Hell"


  • #31
    Originally posted by coder View Post
    I just wonder why this wasn't broken up into more stages of re-factoring. Perhaps the initial goal was more modest, but each new round of cleanups exposed more opportunities and the changes simply ballooned until they touched nearly everything.

    It'd be easier, less painful, and lower-risk to make such changes in multiple stages.
    That is probably what will happen; look at the recent conversation on the mailing list.
    I think the point is to draw attention to the overall 70% build time reduction. If you sent small batches of patches where some only improve things by a few percent, some might say it isn't worth the churn and reject them. Reading the technical explanation in that patch series, I'd say the commits work like a snowball rolling down a hill: it starts small and gets bigger and bigger, but without the small start you wouldn't have anything.

    Comment


    • #32
      He got 4 big architectures working on this patch set; he's going to need help/time on the ARM variants and other rarer arches.

      Comment


      • #33
        Originally posted by bofkentucky View Post
        He got 4 big architectures working on this patch set; he's going to need help/time on the ARM variants and other rarer arches.
        If he'd made changes with more restraint, such that the executable sections in the resulting object files wouldn't change, then he could've validated the changes by dumping the section contents with a tool like readelf and diffing them.
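        Something like this rough Python sketch is what I mean. It assumes you kept a copy of an object file built before and after the change (the file names are placeholders) and it only compares the .text section; a thorough check would cover the other allocated sections too.

        Code:
        # Hedged sketch: check that a supposedly header-only change left the
        # generated code untouched, by hex-dumping .text with readelf and
        # comparing the dumps.
        import subprocess
        import sys

        def dump_section(obj_path, section=".text"):
            # `readelf -x <section> <file>` prints a hex dump of that section.
            result = subprocess.run(["readelf", "-x", section, obj_path],
                                    capture_output=True, text=True, check=True)
            return result.stdout

        # e.g. python3 check_text.py sched.o.before sched.o.after
        before, after = sys.argv[1], sys.argv[2]

        if dump_section(before) == dump_section(after):
            print(".text is identical -- the change didn't touch generated code")
        else:
            print(".text differs -- this was not a pure header cleanup")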

        Comment


        • #34
          Originally posted by coder View Post
          I just wonder why this wasn't broken up into more stages of re-factoring. Perhaps the initial goal was more modest, but each new round of cleanups exposed more opportunities and the changes simply ballooned until they touched nearly everything.

          It'd be easier, less painful, and lower-risk to make such changes in multiple stages.
          I came here to say the exact same thing. Assuming these patches aren't each directly tied to at least one of the others, this really should have been split up more. While I'm always in favor of removing inefficient, outdated, obsolete, or orphaned code, doing this much all at once basically makes it impossible to know what exactly was the cause if anything goes wrong. And with over 2k patches, there will be problems. I trust Ingo is no fool and that the vast majority of his work is valid, but he's not perfect.

          But, if all goes smoothly, this work will be very valuable.
          Last edited by schmidtbag; 03 January 2022, 03:57 PM.

          Comment


          • #35
            Originally posted by atomsymbol
            I don't understand why you believe it is amazing. It still remains 1000 times slower than what is theoretically possible.
            And I don't understand why you don't just rescue the human race from its pathetic existence within reality and haul us off into the theoretical Q-continuum. It should only take you a snap of your fingers.

            Comment


            • #36
              Originally posted by coder View Post
              I just wonder why this wasn't broken up into more stages of re-factoring. Perhaps the initial goal was more modest, but each new round of cleanups exposed more opportunities and the changes simply ballooned until they touched nearly everything.

              It'd be easier, less painful, and lower-risk to make such changes in multiple stages.
              Besides what dragonn said, it may simply be that some changes to headers cascade down into lots of other files. That's happened to me even in small projects. It's one of the reasons header dependency hell is so annoying: it makes self-contained changes that much harder to achieve.
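              You can get a feel for the blast radius of a single header with a throwaway script like the one below. It assumes a fully built kernel tree in the current directory, touches one widely included header (include/linux/sched.h here, purely as an example) and counts how many compile commands a dry-run make would re-issue; the counting heuristic is crude, so treat the number as a rough estimate.

              Code:
              # Rough sketch: estimate how many objects would rebuild after touching
              # one header, using `make -n` (dry run) and counting compile lines.
              import os
              import subprocess

              header = "include/linux/sched.h"  # example; pick any header you like

              os.utime(header)  # same effect as `touch`: bump the mtime

              dry_run = subprocess.run(["make", "-n"],
                                       capture_output=True, text=True)

              # Crude heuristic: a recompile shows up as a compiler invocation with -c.
              recompiles = [line for line in dry_run.stdout.splitlines()
                            if " -c " in line and ".o" in line]

              print(f"touching {header} would trigger ~{len(recompiles)} recompiles")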

              Comment


              • #37
                Originally posted by coder View Post
                I just wonder why this wasn't broken up into more stages of re-factoring. Perhaps the initial goal was more modest, but each new round of cleanups exposed more opportunities and the changes simply ballooned until they touched nearly everything.

                It'd be easier, less painful, and lower-risk to make such changes in multiple stages.
                This was addressed directly in the cover letter.

                A justified question would be: why on Earth 2,200 commits??

                Turns out it's not easy to reduce header dependencies, at all:

                - When I started this project, late 2020, I expected there to be maybe 50-100 patches. I did a few crude measurements that suggested that about 20% build speed improvement could be gained by reducing header dependencies, without having a substantial runtime effect on the kernel. Seemed substantial enough to justify 50-100 commits.

                - But as the number of patches increased, I saw only limited performance increases. By mid-2021 I got to over 500 commits in this tree and had to throw away my second attempt (!), the first two approaches simply didn't scale, weren't maintainable and barely offered a 4% build speedup, not worth the churn of 500 patches and not worth even announcing.

                - With the third attempt I introduced the per_task() machinery which brought the necessary flexibility to reduce dependencies drastically, and it was a type-clean approach that improved maintainability. But even at 1,000 commits I barely got to a 10% build speed improvement. Again this was not something I felt comfortable pushing upstream, or even announcing. :-/

                - But the numbers were pretty clear: 20% performance gains were very much possible. So I kept developing this tree, and most of the speedups started arriving after over 1,500 commits, in the fall of 2021. I was very surprised when it went beyond 20% speedup and more, then arrived at the current 78% with my reference config. There's a clear super-linear improvement property of kernel build overhead, once the number of dependencies is reduced to the bare minimum.

                Incremental builds while doing kernel development benefit even more.
                In other words, fixing this stuff doesn't make much of a difference until it's almost all done, and nobody would agree to make all these changes without seeing a substantial payoff at the end to make it seem worth it.

                Of course, now that the work has been done to show what's possible, it's entirely possible this will get broken up into multiple stages rather than being merged all at once. This is just an RFC (Request For Comments) to see what other devs think. If they say it looks good but should be spread over 5 different kernel releases, I'm sure Ingo would consider that.
                Last edited by smitty3268; 03 January 2022, 04:29 PM.

                Comment


                • #38
                  Originally posted by kiffmet View Post
                  I can almost feel the future despair from users of out-of-tree modules. Bye nvidia-legacy drivers, Realtek USB Wlan, zenpower, hid-xpadneo, ZFS, PDS, fsync, AMD p-state … seems like a forceful relationship pause is incoming this year…

                  Well, I am hopeful that it won't turn out *that* bad, but using 30 patches on top of gentoo-sources has me a bit worried in this regard.
                  The solution to poorly supported hardware / drivers / modules:
                  1) use a compatible LTS kernel;
                  2) submit support requests, cross your fingers, and wait;
                  3) port/contribute it yourself;
                  4) stop using those products, and ideally let the manufacturer / supplier / developers know why they are losing customers.

                  That's a lot of options... not really a "relationship pause" unless you are a bleeding-edge user... in which case you should be happily prepared to adopt option 3.

                  Comment


                  • #39
                    Originally posted by smitty3268 View Post

                    I think that's atomsymbol's point, that the C language is bad about this.

                    C++ modules are supposed to improve it, but that's been pretty slow to materialize and may never end up in C.
                    It would be great if there were a compiler option to warn about unused header files.

                    Detecting whether an #include is actually used (i.e. from tools outside the compiler) is tricky, since even "does it still build?" is insufficient: an #include can merely change a static value already provided elsewhere, resulting in different behaviour even though it built. So instead you need to bisect builds with individual headers removed, comparing the object output to make sure it stays identical.

                    A compiler warning along the lines of "nothing referenced from this #include" would be a hell of a lot less expensive for a large project.
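                    A bare-bones version of that bisection could look like the sketch below. The source file and the single-object make target are placeholders; it blanks one #include at a time (keeping the line count stable so __LINE__ and debug info don't shift), rebuilds, and compares the object's hash against a baseline.

                    Code:
                    # Rough sketch: flag #includes whose removal leaves the object file
                    # byte-identical. The paths and the build command are placeholders.
                    import hashlib
                    import pathlib
                    import re
                    import subprocess

                    SRC = pathlib.Path("drivers/example.c")  # hypothetical source file
                    OBJ = pathlib.Path("drivers/example.o")

                    def obj_digest():
                        subprocess.run(["make", str(OBJ)], check=True, capture_output=True)
                        return hashlib.sha256(OBJ.read_bytes()).hexdigest()

                    original = SRC.read_text()
                    baseline = obj_digest()

                    for line in original.splitlines():
                        if not re.match(r"\s*#\s*include\b", line):
                            continue
                        # Blank the line rather than deleting it, so line numbers don't move.
                        SRC.write_text(original.replace(line, "", 1))
                        try:
                            if obj_digest() == baseline:
                                print(f"possibly unneeded: {line.strip()}")
                        except subprocess.CalledProcessError:
                            pass                      # build broke: the include is needed
                        finally:
                            SRC.write_text(original)  # restore before the next trial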
                    Last edited by linuxgeex; 03 January 2022, 08:29 PM.

                    Comment


                    • #40
                      I wonder what kind of speed-up pre-compiled headers would bring on top of that.
                      But considering that the kernel build uses raw Makefiles, I guess it would be too much of a pain to set up (well, at least from my perspective).

                      Comment
