"pkill_on_warn" Proposed For Killing Linux Processes That Cause A Kernel Warning


  • #11
    Originally posted by drjohnnyfever

    A WARN means an assertion was tripped in the kernel. That is the definition of doing something wrong.
    Right, I meant more a buggy kernel module that trips up the kernel. I could be totally wrong by the way in my assumptions. Newbie here. Trying to ask dumb questions so I can learn


    • #12
      Originally posted by perpetually high

      Right, I meant more a buggy kernel module that trips up the kernel. I could be totally wrong by the way in my assumptions. Newbie here. Trying to ask dumb questions so I can learn
      A bad program should segfault and die. A kernel warning means some code in the kernel did something unexpected. As far as I understand that shouldn't happen unless the kernel strayed outside intended behavior. It might well be possible for WARNS to be triggered by some other less serious condition but I think there are INFO messages for that kind of thing.


      • #13
        Originally posted by drjohnnyfever

        A bad program should segfault and die. A kernel warning means some code in the kernel did something unexpected. As far as I understand that shouldn't happen unless the kernel strayed outside intended behavior. It might well be possible for WARNS to be triggered by some other less serious condition but I think there are INFO messages for that kind of thing.
        I see. Thanks for explaining 👍


        • #14
          In any case, I'd think the kernel logs should not be readable by unprivileged users. (And hopefully that wouldn't depend on the distribution.)

          (In general, not only because there might be ways to trigger warnings in other processes and keep running...)


          • #15
            It might well be possible for WARNS to be triggered by some other less serious condition but I think there are INFO messages for that kind of thing.
            Those INFO messages are the security liability the patch proposer cited as the reason for adding the new optional pkill behaviour. Until now the choice was between an optional panic (not the default, just an option) and the default of only logging an INFO message, which will remain the default.

            Sounds like a perfectly sane default and perfectly reasonable hardening choices depending on the deployment environment.


            • #16
              Originally posted by drjohnnyfever

              A bad program should segfault and die. A kernel warning means some code in the kernel did something unexpected. As far as I understand that shouldn't happen unless the kernel strayed outside intended behavior. It might well be possible for WARNS to be triggered by some other less serious condition but I think there are INFO messages for that kind of thing.
              AFAIU you have BUG and BUG_ON, which print a stack trace and oops the kernel (because state has become corrupted and it is dangerous to continue). Then you have WARN and WARN_ON, which are recoverable, but where you still want to see a stack trace and register dump to aid in debugging. The priority of a kernel message is part of the printk() call, which may or may not occur with either of those macros. I certainly do not want my kernel to panic if it can continue safely. (Perhaps disable or reset misbehaving hardware, refuse to mount a corrupt or malicious image, remount read-only, or whatever.)
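
To make the recoverable-versus-fatal distinction above concrete, here is a minimal userspace sketch in C. It is not kernel code: `sketch_warn_on`, `sketch_bug_on`, `sketch_copy_len` and `warn_count` are made-up names, and the real macros also dump a stack trace and registers. The one property it borrows is that WARN_ON() evaluates to its condition, so the caller can back out and keep running.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace imitation only; all names here are hypothetical. */
int warn_count = 0;

/* Like WARN_ON(): report the problem, then hand the condition back
 * so the caller can choose a recovery path. */
int sketch_warn_on(int cond, const char *what)
{
    assert(what != NULL);
    if (cond) {
        warn_count++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/* Like BUG_ON(): report the problem and stop, because continuing
 * is considered unsafe. */
void sketch_bug_on(int cond, const char *what)
{
    assert(what != NULL);
    if (cond) {
        fprintf(stderr, "BUG: %s\n", what);
        abort();
    }
}

/* Hypothetical caller: refuse an oversized request instead of crashing.
 * Returns 0 on success, -1 if the request was refused. */
int sketch_copy_len(int len, int max)
{
    if (sketch_warn_on(len > max, "length exceeds buffer"))
        return -1; /* recover: reject the request and keep running */
    return 0;
}
```

Calling `sketch_copy_len(16, 8)` logs a warning and returns -1, while the same check written with `sketch_bug_on` would abort the whole program.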


              • #17
                I think there's some unintentional mischaracterization going on: when a warn occurs, shit has already gone horribly wrong. Not bad enough to bring down the system immediately, but it is entirely possible that one of two things is happening:

                1) A process on the system that triggered the warn is actively exploiting the undesired behavior to do something evil, like privilege escalation or leaking data

                2) Whatever went wrong is now causing the kernel to clobber important data, for example by writing the wrong pages to disk and corrupting the filesystem or returning incorrect results to processes

                Panicking on warn() is not a bad way to deal with either of those scenarios, which may or may not be happening (you don't know, and the kernel definitely doesn't). If you panic the system, then any exploit that triggers a warn is no worse than a denial of service. If you panic the system early, then hopefully damage from a rampaging kernel will be limited. You have *no way* to know how "safe" the system is in its broken state.

                I can see a couple of issues with this proposal though, which some others have picked up on:

                1) how do you know what process triggered the warn? how do you know that a process triggered it at all?

                It's entirely possible that an attacker fork()'d before doing evil and so you've only killed the child. It may even be mission accomplished at that point, as the child has succeeded in creating the bad state needed for the exploit and the parent can safely finish the job. You may also get a confused deputy situation, where the exploit works by getting some other process to trigger the warn(). And for a variety of reasons, the condition that was violated might not even be traceable back to the process that caused it at all. You might just kill a process that happened to be the next to make a system call.

                2) you're just going to let a system that's known to be in a bad state hang around? possibly even continue "doing work"?

                In theory if this were a webserver it could continue to (mis-)handle requests! How about we serve random incorrect file contents to users?

                I mentioned above how the system might be open to further attack, or might actively be corrupting data. I get that you might want to examine the system while it's still live to preserve evidence, but it probably wouldn't be a terrible idea to sleep every process and drop into single-user mode and write out a bunch of logs (or even a whole memory dump). It's wrong to be sentimental and try to heal a system with a kernel that has unknown deep-rooted brokenness.
                Developer12
                Senior Member
                Last edited by Developer12; 29 September 2021, 10:44 PM.


                • #18
                  Just one process?
                  So, instead of
                  Code:
                  // do something causing warning
                  // read kernel log
                  attacker will use
                  Code:
                  #include <sys/wait.h>
                  #include <unistd.h>

                  pid_t pid = fork();
                  if (pid == 0) {
                      // child: do something causing warning
                  } else {
                      waitpid(pid, NULL, 0);  // parent: wait for the child to be killed
                  }
                  // read kernel log
                  Sure, that would fix everything. Stricter options, like killing every process in the group or all processes of this user, will be either useless or almost as destructive as a kernel panic.


                  • #19
                    Originally posted by indepe
                    In any case, I'd think the kernel logs should not be readable by unprivileged users. (And hopefully that wouldn't depend on the distribution.)

                    (In general, not only because there might be ways to trigger warnings in other processes and keep running...)
                    /dev/kmsg (Linux kernel log buffer) permissions come from udev, like everything else in that filesystem. From memory, distros are fairly split on the world-readable bit out of the box, but it's not impossible or even slightly difficult to change.
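
For anyone who wants to see what their own distro ships, here is a minimal sketch in C that checks the world-readable bit with stat(). The helper names `world_readable` and `report_kmsg` are made up for illustration. Mode bits are only part of the story: as far as I know, the kernel.dmesg_restrict sysctl can also block unprivileged reads regardless of them, and this sketch does not look at that.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if the "other" read bit is set on path, 0 if it is not,
 * and -1 if stat() fails (for example, the path does not exist). */
int world_readable(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & S_IROTH) ? 1 : 0;
}

/* Print what the mode bits say about the kernel log device. */
void report_kmsg(void)
{
    int r = world_readable("/dev/kmsg");

    if (r < 0)
        printf("/dev/kmsg: could not stat\n");
    else
        printf("/dev/kmsg: %s\n", r ? "world-readable" : "not world-readable");
}
```

Calling `report_kmsg()` from a small main() prints one line describing the current setting; changing the setting itself is a matter for a udev rule or a plain chmod.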

                    One thing that continues to sadden me about much of the wider modern Linux community these days is this pervasive attitude that distro configuration defaults are somehow gospel and/or unchangeable. I think this attitude has either stemmed from or is the result of the fact that there are now hundreds (if not more) of distributions out there.

                    Just feels like a big part of why I think open source and unix is great is getting lost...


                    • #20
                      This is great. Before, we had to choose between reliability and security; this is a good middle step. Ideally a tolerance could be set up as well.

                      For example: 3 pkill-on-warn events before falling back to panic on warn. Add this, together with a good setup.

                      The optimal behavior for pkill on warn would obviously be to kill anything not on a whitelist, plus the offending app. So kill anything that isn't trusted. That's not always viable, but it would be great for security.
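
The tolerance idea can be sketched in a few lines of C. This is purely hypothetical: no such knob exists in the proposed patch as far as this thread says, and `warn_policy`, `on_warn` and the `WARN_ACTION_*` names are invented for illustration. It only shows the decision logic, which is to kill the offending process for the first few warnings and panic after that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical responses to a kernel warning. */
enum warn_action { WARN_ACTION_PKILL, WARN_ACTION_PANIC };

/* Hypothetical policy state. */
struct warn_policy {
    unsigned int tolerance; /* process kills allowed before panicking */
    unsigned int seen;      /* warnings answered with a kill so far */
};

/* Record one warning and pick the response: kill the offending
 * process for the first `tolerance` warnings, panic on the next one. */
enum warn_action on_warn(struct warn_policy *p)
{
    assert(p != NULL);
    if (p->seen < p->tolerance) {
        p->seen++;
        return WARN_ACTION_PKILL;
    }
    return WARN_ACTION_PANIC;
}
```

With `tolerance` set to 3, the first three calls to `on_warn` return `WARN_ACTION_PKILL` and the fourth returns `WARN_ACTION_PANIC`. A tolerance of 0 behaves like plain panic on warn.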
