SysVinit 3.11 Released With An "Important Feature" At Long Last

  • wertigon
    Senior Member
    • Jan 2020
    • 299

    #61
    Originally posted by Akiko View Post
    Okay, so let me share some of my experience as a developer who worked at one of the biggest Linux distributors building professional distributions for (at that time) unique systems. In my office back then I was surrounded by an AlphaServer DS20 and DS20E, HP C3750 and about 2 years ago I killed my HP C8000. I was surrounded by AMD Sledgehammer engineering samples and UltraSPARC systems. I had remote access to SGI Altix systems with 512 and 1024 CPUs (Itanium) to investigate and fix bugs. We had a bug where the customer reported this: SGI Altix 1024 CPUs, 1024 GiB RAM, runs HPC application over weeks, highly tuned to run with exactly 1024 threads, but sometimes a CPU runs two threads and another runs nothing, the HPC runtime increased about 25% because of this ... please fix this scheduling issue, because CPU time is expensive.

    To make it short: On an HPC system doing serious work, running for weeks or months, you really want predictable runtime behavior. Having daemons running on the system that start doing work like deleting or swapping out journals, eating into your IO or CPU time, can become a nightmare. This is the reason why running systemd on an HPC system may be a bad idea. And yes, I know this is an extreme example. But I want to demonstrate to you that I do not throw bullshit around. I tell this because I encountered these issues. I really try to see the good and the bad.
    I hope you are aware of the PREEMPT_RT, nice, cpulimit and cgroup tools/technology that, combined, nullify like 99.99999% of your issues?

    Even then, in HPC you have like, what? 32 threads or more to work with? Is it so bad lending one of those threads once every 100th hour for system maintenance work? If you need a consistent throughput, why not simply lock system work to thread #0 and #1 always and use the other 30 threads for workloads?

    It smells like this is a skill issue or a badly configured system issue, not a problem with the technology itself. I have had PREEMPT_RT systems in soft realtime work for months without a single missed deadline, and systemd was part of that setup, so...
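To illustrate the pinning idea above (system work on threads #0 and #1, workloads on the rest), here is a minimal Python sketch. The helper names `split_cpus` and `pin_current_process` are hypothetical; `os.sched_setaffinity` is Linux-only, and on a real systemd machine you would more likely express this through configuration such as `CPUAffinity=` than through code.

```python
import os


def split_cpus(cpus, reserved=2):
    """Split CPU ids into (system, workload) sets.

    The lowest `reserved` ids are set aside for system work and the
    remainder is left for the workload. Hypothetical helper for illustration.
    """
    ordered = sorted(cpus)
    if not 0 < reserved < len(ordered):
        raise ValueError("reserved must leave at least one CPU on each side")
    return set(ordered[:reserved]), set(ordered[reserved:])


def pin_current_process(cpus):
    """Restrict the calling process to the given CPU ids (Linux only)."""
    os.sched_setaffinity(0, cpus)  # pid 0 means "the calling process"
```

A workload launcher could, for example, call `pin_current_process(split_cpus(os.sched_getaffinity(0))[1])` before starting its threads, while maintenance jobs are pinned to the first set.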


    • tobias
      Phoronix Member
      • Nov 2016
      • 57

      #62
      Originally posted by ahrs View Post
      It doesn't complicate anything. The two are functionally identical. As long as both result in the process getting restarted after failure then does it really matter?
      You have one more process, (a bit of) code that runs in that process, the build system to build that extra binary, the installer that copies the extra binary, ... . Why have any of that when -- as you say -- the functionality is identical?

      Originally posted by ahrs View Post
      On stability and security, you could argue that taking process supervision out of PID1 simplifies things
      That's my point: How is it simpler to have a process over *not* having that process?

      Originally posted by ahrs View Post
      and makes it easier to assert that the system is secure (a flaw in the process supervisor can't result in the highly privileged PID1 - which is running as root with the keys to your entire system - being compromised).
      It makes it easier to assert that PID1 is secure. It's always easy to argue over the security of a process that does nothing productive for the system.

      It in no way helps to argue that PID2 is secure, and it probably makes arguing over the security of PID1 + PID2 a tiny bit harder. And you said before that PID2 may not crash and it obviously also needs to be secure as it also runs as root. So those are the same requirements you want for PID1.


      • Akiko
        Phoronix Member
        • Sep 2017
        • 60

        #63
        Originally posted by wertigon View Post
        I hope you are aware of the PREEMPT_RT, nice, cpulimit and cgroup tools/technology that, combined, nullify like 99.99999% of your issues?

        Even then, in HPC you have like, what? 32 threads or more to work with? Is it so bad lending one of those threads once every 100th hour for system maintenance work? If you need a consistent throughput, why not simply lock system work to thread #0 and #1 always and use the other 30 threads for workloads?
        I give you a hint: Altix systems, 2004/2005, kernel 2.4 or 2.6, not x86 ...

        Originally posted by wertigon View Post
        It smells like this is a skill issue or a badly configured system issue, not a problem with the technology itself. I have had PREEMPT_RT systems in soft realtime work for months without a single missed deadline, and systemd was part of that setup, so...
        To me it looks like a reading comprehension/"search engine usage" issue...
        Last edited by Akiko; 23 October 2024, 04:54 AM.


        • ahrs
          Senior Member
          • Apr 2021
          • 550

          #64
          Originally posted by tobias View Post
          And you said before that PID2 may not crash and it obviously also needs to be secure as it also runs as root. So those are the same requirements you want for PID1.
          Your process supervisor does not need to run as root. Obviously, if it is supervising something that runs as root and that thing does not use capabilities(7) then it needs to be run as root and the same security issues that apply to PID1 apply to the supervisor too, but in the common case you can run the supervisor unprivileged.


          • tobias
            Phoronix Member
            • Nov 2016
            • 57

            #65
            Originally posted by ahrs View Post
            Your process supervisor does not need to run as root. Obviously, if it is supervising something that runs as root and that thing does not use capabilities(7) then it needs to be run as root and the same security issues that apply to PID1 apply to the supervisor too, but in the common case you can run the supervisor unprivileged.
            A service supervisor can by definition manage services on the system. That makes it a security critical task, independent of what user it runs as.


            • access
              Senior Member
              • Dec 2019
              • 195

              #66
              Originally posted by Akiko View Post

              I give you a hint: Altix systems, 2004/2005, kernel 2.4 or 2.6, not x86 ...
              Sir, this is 2024. We have better tools now.


              • wertigon
                Senior Member
                • Jan 2020
                • 299

                #67
                Originally posted by Akiko View Post

                I give you a hint: Altix systems, 2004/2005, kernel 2.4 or 2.6, not x86 ...
                Wait, so your gripe is that systemd does not support 20 year old legacy systems?

                First, you do know "HPC" stands for "High Performance Computing" right? The i3 14100 based system I built for my dad is not only running circles around those ancient systems, it is doing it while cartwheeling and juggling fire. Those ancient systems were good for their time but today they are basically e-waste.

                These days 16 cores / 32 threads is a bare minimum threshold for HPC. Even the last-gen Raspberry Pi 4 has 4 threads to play around with.

                There was a time when systemd critique was justified, but these days it just works and excels at most tasks. And that is why it is slowly becoming as embedded into the Linux world as it is. Because it is just working.

                If you want to run a Linux system on an Arduino or FPGA system with < 64 MB flash storage, then yes, sure, systemd is not a good fit for that (although, one could argue, neither is Linux - go check out FreeRTOS). For all other use cases, it is just amazing compared to the competition.


                • intelfx
                  Senior Member
                  • Jun 2018
                  • 1083

                  #68
                  Originally posted by Akiko View Post
                  You don't really read the thread, do you? I already explained that because systemd became a standard, other software dropped init scripts (which you now have to rewrite) and even introduced a dependency on systemd (getting this removed is even harder). You need to do more customization.
                  Oh, I did read the thread. And I exceeded my daily facepalm quota while doing so.

                  If you're at a "remove udev" level of customization, then the requirement to write a bunch of init scripts (won't be many, for obvious reasons) for your custom init won't even be a blip on the amount of work you already need to do.

                  Originally posted by Akiko View Post
                  I clearly stated that I had a customer who did serious HPC work. Why do you twist my words?
                  Because it's a distinction without difference. If you worked for a customer who did serious HPC work, then you still oughta know all of this.

                  Originally posted by Akiko View Post
                  Yeah, I see your problem. You got the "I have a hammer and everything looks like a nail" problem, well, in that case "I'm an admin and now I can fix everything by configuration/administration". See, I gave you an example and you should have looked up what an Altix system is, when it was used, what kernels were used at that time by professional distributions (we talk about certifications here which take months, and a change to a single piece of software would nullify the certification), and then look up what was possible in these kernel versions. I know that today you have a lot more features you can use.
                  No sir, I have an "I'm a generalist" problem. Which means that I have knowledge and know how to apply it at (almost) every level of the technology stack, simultaneously. And it irks me when people who clearly have less of that knowledge talk and opine as if they had more.

                  And I know enough about that brand to realize instantly that it was a meaningless word salad example, simply because those systems hadn't contained _any_ of the tech we're discussing here now, and therefore I didn't need to waste my time looking any deeper. Which you have just proven, thanks.

                  Originally posted by Akiko View Post
                  Do you want to fight and get personal or do you actually want to learn something?
                  I always want to learn something. Your posts here simply don't contain anything worth learning.

                  Originally posted by Akiko View Post
                  If the latter, just dig into the rabbit-hole "what the introduction of CPU caches, out of order execution and speculative execution means for predictable behavior".
                  Dude. I taught a university-level course on parallel/concurrent programming, with an addendum on microarchitectural effects in this context. And I taught a course on Linux architecture. And I can confidently say that you've just thrown out a word salad that's totally, absolutely, incontrovertibly irrelevant to the scale of effects we are talking about.

                  That is to say, given a process that wakes up as infrequently as your typical systemd daemon (of the variety that actually will be present on an HPC cluster, if it was designed by someone other than a complete inept noob), its average effects on microarchitectural state will be exactly nil.

                  And you still haven't said anything about unbound kthreads, which suggests you don't know anything about them.

                  I'll give you a hint, though. You know what will have greater microarchitectural effects? The goddamn timer tick. Unless the CPU is running full tickless, that is (which it actually should, if you've really got an HPC cluster of the scale and sensitivity you're talking about), in which case there simply won't be any other processes scheduled on that CPU, by definition, because full tickless CPUs are (must be) non-schedulable.
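Whether a machine actually runs any CPUs in full tickless mode can be checked from sysfs. The sketch below is an illustration under stated assumptions: the helper names `parse_cpulist` and `read_cpulist` are hypothetical, the files read are `nohz_full` and `isolated` under `/sys/devices/system/cpu`, and only the plain `0-1,4,8-11` range format is handled.

```python
from pathlib import Path


def parse_cpulist(text):
    """Parse a kernel-style CPU list such as '0-1,4,8-11' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def read_cpulist(name, base="/sys/devices/system/cpu"):
    """Read a CPU list file such as 'nohz_full' or 'isolated'.

    Returns [] if the file is missing, unreadable, or not in range format.
    """
    try:
        return parse_cpulist((Path(base) / name).read_text())
    except (OSError, ValueError):
        return []
```

Calling `read_cpulist("nohz_full")` and `read_cpulist("isolated")` on the cluster nodes shows which CPUs the kernel reports in each set; an empty list means none are configured, or the kernel does not expose the file.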

                  Originally posted by Akiko View Post
                  And I will ignore you now until you show some decent human behavior.
                  Oh by all means, go ahead! My responses in this thread are not for you. They are to combat dis-/misinformation for everyone else.


                  • ahrs
                    Senior Member
                    • Apr 2021
                    • 550

                    #69
                    Originally posted by tobias View Post

                    A service supervisor can by definition manage services on the system. That makes it a security critical task, independent of what user it runs as.
                    A service manager, a la Systemd, does, yes, but a process supervisor does not. A process supervisor simply runs the same task repeatedly. Service management is delegated to the init system. It's a very different way of doing things even though the outcome is the same.
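As a rough illustration of "simply runs the same task repeatedly", a bare process supervisor can be sketched in a few lines of Python. This is a sketch only, not how any particular supervisor is implemented: `supervise` and `FAILING_TASK` are hypothetical names, and the restart count is bounded so the example terminates, whereas a real supervisor would typically loop forever.

```python
import subprocess
import sys
import time

# Hypothetical demo task: a child process that always exits with status 3.
FAILING_TASK = [sys.executable, "-c", "raise SystemExit(3)"]


def supervise(argv, max_restarts=3, delay=0.0):
    """Run `argv`, restarting it after each failure.

    Stops when the task exits with status 0 or after `max_restarts`
    restarts. Returns the list of exit codes that were observed.
    """
    codes = []
    for attempt in range(max_restarts + 1):
        code = subprocess.run(argv).returncode
        codes.append(code)
        if code == 0:
            break
        if attempt < max_restarts:
            time.sleep(delay)  # back off before the next restart
    return codes
```

Note that nothing here needs root: the loop runs with whatever privileges the supervised task itself requires, and it knows nothing about dependencies, ordering, or other services.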


                    • tobias
                      Phoronix Member
                      • Nov 2016
                      • 57

                      #70
                      Originally posted by ahrs View Post

                      A service manager, a la Systemd, does, yes, but a process supervisor does not. A process supervisor simply runs the same task repeatedly. Service management is delegated to the init system. It's a very different way of doing things even though the outcome is the same.
                      So we have PID1, PID2 and PID3 now, with PID2 and PID3 sharing some code to restart processes. That gets more and more complex. In the end the code providing the functionality you want needs to be *somewhere*. We do agree that the functionality provided is similar. Let's assume providing this functionality requires a certain amount of complexity. That is the minimum amount of complexity we have in the system, independent of how we implement that functionality. Any implementation will be at least as complex as that, and in practice it will be more complex, as the implementation itself adds something on top. We just disagree about that "on-top complexity".

                      I understand your position to be that we need small and simple units of code to review, separated by strong process boundaries. My position is that those process boundaries themselves add complexity and can be avoided unless there is also a security boundary between those bits of code.

                      To me systemd does enough using SW design to keep bits of functionality separated from other bits of functionality, so that I can review small bits of code at a time. That way they avoid the process separation "on top complexity", but they of course add SW design "on top complexity".

                      I doubt we will agree on what is less "on top complexity" overall. I think our individual backgrounds play into this too strongly.

