MuQSS/CK's Con Kolivas Becoming Concerned Over The Increasing Size Of The Linux Kernel

  • #41
    Originally posted by set135
    but the number of developers and companies contributing just keeps increasing, and given the amount of code integrated, it seems to be holding up phenomenally.
    While what you're saying is true, you are also ignoring something important: The cost of contributing.

    Linux got so complex that even small changes take a lot of effort, while large ones aren't really possible anymore. This is due to its poor design.

    I have no doubt that something else will gain traction and the industry will leave Linux behind. What I do not know is when, but I suspect it's not going to take long.


    • #42
      Originally posted by xfcemint
      I didn't read that yet, so thanks for pointing it out, I'll certainly take a look at it.

      I don't have any idea how this is possible, I mean, how does the microkernel mitigate the cost of all the required context switches? I think that context switches mostly trash the cache on current gen x86 CPUs. And all the CPU registers need to be saved, etc etc. So I don't get it.

      Perhaps Liedtke is being too optimistic?
      Re: Liedtke, he is not optimistic anymore; he is dead.

      Once you're done with Liedtke's paper (or before; that paper is a heavy read), check http://sigops.org/s/conferences/sosp...lphinstone.pdf and https://sel4.systems/About/seL4-whitepaper.pdf.


      • #43
        Originally posted by xfcemint
        The major advantage of Linux: it was working, while others were not.
        Do read about the BSD lawsuit. That's the real reason Linux took off, rather than some BSD.


        • #44
          Originally posted by Danny3

          Hmm... this is weird and I don't understand how this is possible.
          The ticket is still open and there's no new VirtualBox release saying that they have added support for 5.8.
          Which distro are you using?
          /usr/src/linux-5.8.1$ uname -srvmpio
          Linux 5.8.1 #2 SMP PREEMPT Tue Aug 11 22:48:17 CEST 2020 x86_64 Intel(R) Core(TM) i5-7300HQ CPU @ 2.50GHz GenuineIntel
          /usr/src/linux-5.8.1$ vboxmanage --version
          6.1.13r139785

          Slackware. Usually. I've been rolling my own kernels every release for the past... 22 years?
          Use the latest 6.1.x daily build. Make sure you use the latest extpack test build if you're using USB passthrough.
          https://www.virtualbox.org/download/...inux_amd64.run
          https://www.virtualbox.org/download/...0.vbox-extpack
          I don't use the distro-packaged VirtualBox. It's probably less integrated, but I only use vanilla installs.
          At least it's somewhat distro agnostic and I don't have to wait for slow distro management to release new packages.
          Use at your own risk. I can't guarantee it won't chew your computer to bits.

          Can't remember how I solved Nvidia. I think it requires some IOMMU definitions. I just fixed it.
          I usually didn't have IOMMU pagetable support enabled in my custom kernels. They are really minimalistic.

          Something like:
          < # Generic IOMMU Pagetable Support
          < #
          < # end of Generic IOMMU Pagetable Support
          <
          < # CONFIG_IOMMU_DEBUGFS is not set
          < # CONFIG_IOMMU_DEFAULT_PASSTHROUGH is not set
          < # CONFIG_AMD_IOMMU is not set
          < CONFIG_DMAR_TABLE=y
          < CONFIG_INTEL_IOMMU=y
          < CONFIG_INTEL_IOMMU_SVM=y
          < CONFIG_INTEL_IOMMU_DEFAULT_ON=y
          < CONFIG_INTEL_IOMMU_FLOPPY_WA=y
          < CONFIG_INTEL_IOMMU_SCALABLE_MODE_DEFAULT_ON=y
          < CONFIG_IRQ_REMAP=y

          I think that defines:
          < CONFIG_MMU_NOTIFIER=y
          Required by the nvidia driver or something.
          Don't quote me on it. Memory is very short.
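
If you want to check your own .config against the options listed above, a small Python sketch like the one below might help. It assumes the usual kernel .config text format (`CONFIG_FOO=y` or `# CONFIG_FOO is not set`). The function names and the sample text are made up for illustration, and whether these options are really what the Nvidia driver needs remains unverified.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_[A-Za-z0-9_]+) is not set$")


def parse_kconfig(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def missing_options(text, wanted):
    """List the wanted options that are neither built in (y) nor modular (m)."""
    options = parse_kconfig(text)
    return [name for name in wanted if options.get(name, "n") not in ("y", "m")]


# Hypothetical sample text, not a real config file.
SAMPLE = """\
# CONFIG_AMD_IOMMU is not set
CONFIG_INTEL_IOMMU=y
CONFIG_IRQ_REMAP=y
"""

if __name__ == "__main__":
    wanted = ["CONFIG_INTEL_IOMMU", "CONFIG_IRQ_REMAP", "CONFIG_MMU_NOTIFIER"]
    print(missing_options(SAMPLE, wanted))
```

To run it against a real tree, read the .config file into a string and pass that instead of SAMPLE. Options that do not appear in the file at all are treated as not set.
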
          Last edited by milkylainen; 21 August 2020, 04:28 PM.


          • #45
            Originally posted by Danny3

            Hmm... this is weird and I don't understand how this is possible.
            The ticket is still open and there's no new VirtualBox release saying that they have added support for 5.8.
            Which distro are you using?
            I personally am using VirtualBox on Arch Linux. I use the in-repo DKMS driver with a 5.8.x kernel I built. Just had to find a compatible extpack for USB support. Seems to work fine.


            • #46
              Originally posted by xfcemint
              Immediately, I'm not a fan of the minimality principle. I don't like when any "principle" is introduced and then stubbornly raised to a level of unmistakable proclamation of God.

              So, in my opinion, minimality is good for fitting kernel code into the cache. It's bad when it slows processes down due to being too restrictive. So it's a tradeoff, not some God-given principle.

              Some good things about L4:
              - tries not to invoke the scheduler for IPC (well, that's quite an obvious optimization)
              - immediately passes the message to the receiver, while the caller blocks (seems like a somewhat good idea, in many circumstances)
              - can do some memory copies as part of IPC, but avoids buffering

              Perhaps the cost of L4 IPC is negligible, but it is still slower than a monolithic kernel.
              Not an "unmistakable proclamation of God", but solid research. This is in good part why context switches are so fast in L4 and seL4. You'll have to elaborate on what you think "restrictive" is. This realization (minimality) is most of what makes 2nd generation microkernels leave the 1st generation ones in the dust.

              As for "still slower than a monolithic kernel", I'll leave it as an exercise for you to figure out how many extra context switches seL4 would have to make in a pure microkernel multiserver operating system to be slower than ONE context switch on Linux. (spoiler: A lot. There's an order of magnitude difference in IPC cost)
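
For anyone who wants to do the exercise, here is a back-of-the-envelope sketch in Python. The cycle counts are placeholders picked only to reflect the "order of magnitude" ratio mentioned above. They are not measurements, and the function name is made up, so substitute real figures for your hardware.

```python
def break_even_ipcs(monolithic_switch_cycles, microkernel_ipc_cycles):
    """How many whole microkernel IPCs fit into the cost of one
    monolithic-kernel context switch (integer division)."""
    if microkernel_ipc_cycles <= 0:
        raise ValueError("IPC cost must be positive")
    return monolithic_switch_cycles // microkernel_ipc_cycles


# Placeholder figures, NOT measurements: chosen only to give a 10x ratio.
MONOLITHIC_SWITCH_CYCLES = 10_000
MICROKERNEL_IPC_CYCLES = 1_000

if __name__ == "__main__":
    n = break_even_ipcs(MONOLITHIC_SWITCH_CYCLES, MICROKERNEL_IPC_CYCLES)
    print(f"{n} IPCs cost about as much as one context switch")
```

With a 10x ratio the break-even point is 10 IPCs per context switch; the multiserver system only comes out slower once it needs more than that.
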


              • #47
                Originally posted by xfcemint
                So, for example, an IPC call has a high chance of trashing the cache, especially if using Spectre mitigations. Also, without careful cache control mechanisms, an IPC call can apparently complete in a small number of cycles, but later kill the performance of the calling application by flushing its cache (even when not mitigating Spectre). Has anyone even tried to measure real IPC in such situations?
                There's Meltdown mitigation support in some seL4 versions for affected Intel CPUs. Those make context switches slower, unfortunately. Intel is to blame; this has nothing to do with seL4.

                Originally posted by xfcemint
                So, before accepting all the claims from academics and their "solid research", I would like an explanation: how do L4-like kernels avoid flushing the data cache, code cache and TLBs on an IPC call, especially in the case when the receiver-server has a long code path and has to call other servers before returning from the call?

                - the cycle counts of IPC do not include the cache refilling costs
                Thanks to the principle of minimality, the kernel barely touches any cache lines. TLBs are a problem, but many CPUs have mechanisms to deal with address space bouncing. Do note Linux doesn't leverage these.
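
A rough way to put numbers on the cache argument is a linear model: the direct cycle count of the IPC plus a refill penalty for every cache line the kernel path itself evicted. The Python below is only a sketch of that model. The function name and all inputs are hypothetical, and it deliberately ignores lines evicted by the server's own code and any TLB effects.

```python
def effective_ipc_cycles(direct_cycles, kernel_lines_evicted, miss_penalty_cycles):
    """Direct IPC cost plus the cost of refetching the cache lines
    that the kernel's IPC path evicted (simple linear model)."""
    if min(direct_cycles, kernel_lines_evicted, miss_penalty_cycles) < 0:
        raise ValueError("all inputs must be non-negative")
    return direct_cycles + kernel_lines_evicted * miss_penalty_cycles


if __name__ == "__main__":
    # Placeholder inputs, NOT measurements.
    for lines in (4, 64, 1024):
        print(lines, effective_ipc_cycles(300, lines, 100))
```

In this model the refill term dominates as soon as the kernel path touches more than a handful of lines, which is why the cache footprint of the kernel path matters so much.
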

                Originally posted by xfcemint
                Oh, I get that. But things are not that simple when moved outside research projects and experimental OS-es.

                These kinds of concerns are rarely covered in academic papers, which are frequently overly optimistic and fail to address some important concerns that appear in practical usage.
                seL4 isn't an "academic microkernel", but rather the result of decades of experience among its authors in real-life microkernel usage (OKL4). Refer to the microkerneldude blog (I have read it end to end and recommend doing the same).

                Originally posted by xfcemint
                Examples of "restrictive": only supporting blocking calls, or kernel cannot buffer IPC messages, or kernel cannot copy IPC messages...

                And I would certainly like for a microkernel to have a more versatile IPC, in particular, I want the microkernel to be able to COPY the message of arbitrary size from the caller to the receiver, and also the IPC call to be made non-blocking (asynchronous) if the caller so desires.
                Non-blocking aside (seL4 supports that; they call them notifications), realize that these aren't necessary, not even for performance. Do think about the effect they would have on the kernel's ability to provide real-time behavior and WCET analysis, even ignoring the exponentially increased cost of formal verification and the cache line usage penalty on performance that comes with increased code size.
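
To illustrate why notifications need no kernel-side buffering: a seL4 notification is essentially a single word of bits. Signalling ORs the sender's badge into that word, and a waiter reads the word and clears it. The Python below is a toy model of that idea only; the class and method names are made up and this is not the seL4 API.

```python
class Notification:
    """Toy model of a notification word: signals OR bits in,
    a poll reads the accumulated bits and clears them."""

    def __init__(self):
        self.word = 0

    def signal(self, badge):
        # The sender never blocks; repeated signals with the same
        # badge coalesce into the same bits, so nothing is queued.
        self.word |= badge

    def poll(self):
        # Non-blocking read: return the accumulated bits and reset the word.
        bits, self.word = self.word, 0
        return bits


if __name__ == "__main__":
    n = Notification()
    n.signal(0b01)
    n.signal(0b10)
    n.signal(0b01)          # coalesces with the first signal
    print(bin(n.poll()))    # both senders' bits are visible at once
    print(n.poll())         # word was cleared by the previous poll
```

The state is one fixed-size word regardless of how many signals arrive, which keeps the kernel path short and bounded. The price is that the receiver learns which badges signalled, not how many times.
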

                Originally posted by xfcemint
                my estimation is that it is likely slower in general than a monolithic kernel like Linux.
                "Microkernels are slow" is unfortunately a widespread misconception, very tightly tied to the popularity of the very-slow very-academic-never-should-have-been-used-in-practice Mach system (XNU/Darwin, etc.).

                There's a neat (yet quite incisive) article from 2016 on this subject: https://blog.darknedgy.net/technology/2016/01/01/0/
                Last edited by ayumu; 22 August 2020, 11:46 AM. Reason: non-blocking ipc = notifications


                • #48
                  Originally posted by mazumoto
                  Well ... maybe 5.8.x is too large, at least for me. It's the first kernel in a long time that doesn't boot for me (neither 5.8.0 nor 5.8.1) and since it just stops at "Starting linux 5.8.x" and I can't see any further log messages or use my keyboard, I'm not too eager to try to find the reason.

                  (my System: Ryzen 3950X on a Gigabyte Aorus X570 Master, Sapphire R9 290X (Hawaii) with AMDGPU)
                  I wonder if it's the GPU or related kernel-level commands causing the problem? My R9 390 is partially broken by 5.8; the clocks are no longer reported. I suspect other control- and monitoring-related issues are present and likely new to 5.8 for second-gen and possibly first-gen GCN GPUs, what with their supported-but-not-actually status.

                  Because of this, I'm now on linux-ck 5.7. Hope it stays on 5.7 for a bit...
                  (Ryzen 3500x, MSI B450M Gaming, XFX R9 390 (Hawaii) with AMDGPU)
                  Last edited by HenryM; 27 August 2020, 05:28 AM.


                  • #49
                    Originally posted by kpedersen
                    The kernel gets a fair number of additions for niche SoCs and other hardware that normal users can't even buy or the company has sold in tiny batches and ceased manufacturing. This support benefits a tiny group of users and yet bloats the kernel a fair amount.

                    This stuff shouldn't really be accepted into the tree in the first place.

                    Same with additions to GCC to support things like architectures for proprietary games consoles, etc.
                    You literally have no idea what you are talking about.
                    SoCs and drivers have zero effect outside their local sphere. E.g. changes to SPARC will have zero effect on x86_64 and ARM64, and adding 50 network drivers will have zero effect on existing network drivers and/or other devices.
                    The kernel developers do make substantial changes to the internal APIs, making life outside the tree very difficult at times, but this has nothing to do with the absurd comments people throw around in this thread.

                    Source: Me, I maintain a fairly large out-of-linux-tree project.
                    oVirt-HV1: Intel S2600C0, 2xE5-2658V2, 128GB, 8x2TB, 4x480GB SSD, GTX1080 (to-VM), Dell U3219Q, U2415, U2412M.
                    oVirt-HV2: Intel S2400GP2, 2xE5-2448L, 120GB, 8x2TB, 4x480GB SSD, GTX730 (to-VM).
                    oVirt-HV3: Gigabyte B85M-HD3, E3-1245V3, 32GB, 4x1TB, 2x480GB SSD, GTX980 (to-VM).
                    Devel-2: Asus H110M-K, i5-6500, 16GB, 3x1TB + 128GB-SSD, F33.


                    • #50
                      Originally posted by birdie

                      And that brings about a fundamental problem with the Linux kernel development: it's either all-in or it's out of the tree and it's withering away because it requires an insane amount of work to be maintained. All because the kernel lacks stable APIs/ABIs.

                      And that's the reason why the Linux kernel overall features a ... very bad model of development. It must include support for pretty much everything under the Sun which is not possible, feasible or even rational.
                      As the saying goes: "You live by the knife, you die by the knife."
                      You *choose* to live out of tree, you suffer the consequences.
                      Having a stable API puts huge chains on the kernel development, and can be easily achieved by downstream distributions (E.g. RHEL).

                      Gilboa
