DRM Panic Handling Is Back To Being Talked About

  • #11
    Originally posted by ssokolow

    This is about kernel panicking. It's for when the only way to recover the kernel is to reboot, the filesystem drivers are in an unknown state that could potentially corrupt data if called, and writing crash dumps to UEFI non-volatile storage to be recovered on next boot is still effectively a pie-in-the-sky idea.

    Of course, the most impressive approach I've seen is the Haiku (open-source BeOS successor) solution. A crash dumps you into KDL (Kernel Debugging Land) and KDL has a command which you can type to convert the debug output into a QR code for easy copy-pasting via smartphone.
    I know... Ideally, the system would just recover from any error, or at least write something to the system log about what happened. However, whenever that is not possible, it is nice to be able to get some information about what happened before rebooting.


    • #12
      Originally posted by waxhead

      I am not so sure it is not possible. Have a look at Minix 3 (don't forget the 3). That is in essence a self-healing kernel, and that may very well be the future for operating systems.
      Minix is a microkernel, which means that what it is "self-healing" from in those cases aren't kernel panics but userspace crashes, where it can restart the filesystem server and resend it the data to write, or whatever. With Linux and other monolithic kernels, a crashing driver can crash the kernel, so it's not really the same situation at all.


      • #13
        Originally posted by karolherbst

        What are you talking about? This already works, and I have used it many times on my system already...
        I call it "pie in the sky" because it's lovely when it works, but there are far too many situations where factors outside the kernel developers' control prevent it from working. We need a fallback and will continue to for a while.

        (x86 systems too old for UEFI, non-x86 systems without equivalent functionality, buggy UEFI systems where that functionality is broken, etc.)


        • #14
          Originally posted by Luke_Wolf

          Minix is a microkernel, which means that what it is "self-healing" from in those cases aren't kernel panics but userspace crashes, where it can restart the filesystem server and resend it the data to write, or whatever. With Linux and other monolithic kernels, a crashing driver can crash the kernel, so it's not really the same situation at all.
          That's one way to think about it. Another way is to recognize that both the Linux kernel and the Minix 3 userspace have GPU drivers. When Minix 3's GPU drivers fail, they crash like a normal process and are restarted (if the system is configured for that). When Linux's GPU drivers fail, the entire system unschedules all tasks, prints a useless error message to the console (if it's even attached), and reboots.

          The whole point of Minix 3 is that these features are enabled by its architecture; it is the same situation handled differently.
          Last edited by microcode; 11 August 2016, 03:12 AM.


          • #15
            Originally posted by microcode
            The whole point of Minix 3 is that these features are enabled by its architecture; it is the same situation handled differently.
            This is only partially true; I think you misunderstood Luke_Wolf's point. The architecture gives you a nice restart feature in case of a crash, that's undeniable. But if your driver has a bug, or if you put the hardware in a bad state, for example, you can restart it over and over again without much success. So basically, panic != crash. Here is an example: in the early days of the nouveau driver, I had my fair share of kernel panics of the form "GPU lockup". When the GPU locks up, the only way to recover is to reboot; this is exactly what panic means.
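
            To make the crash-versus-panic distinction concrete, here is a minimal Python sketch of a restart supervisor. It is purely a simulation and not how Minix 3 or any real kernel implements this; the names `supervise`, `DriverCrash`, `make_transient_fault_driver` and `wedged_hardware_driver` are all made up for illustration. A transient software fault goes away after one restart, while a device that is stuck keeps crashing until the supervisor gives up.

```python
class DriverCrash(Exception):
    """Raised when the (simulated) driver process dies."""


def supervise(start_driver, max_restarts=3):
    """Run start_driver(), restarting it after a crash up to max_restarts times.

    Returns ("ok", restarts) once the driver comes up, or
    ("unrecoverable", restarts) when every allowed restart crashed as well.
    """
    restarts = 0
    while True:
        try:
            start_driver()
            return ("ok", restarts)
        except DriverCrash:
            if restarts == max_restarts:
                # Restarting did not help: the equivalent of a panic.
                return ("unrecoverable", restarts)
            restarts += 1


def make_transient_fault_driver():
    """Hypothetical driver that crashes once, then works after a restart."""
    state = {"crashed_once": False}

    def start():
        if not state["crashed_once"]:
            state["crashed_once"] = True
            raise DriverCrash("software bug hit once")

    return start


def wedged_hardware_driver():
    """Hypothetical driver that crashes every time: the device is locked up."""
    raise DriverCrash("GPU lockup")


print(supervise(make_transient_fault_driver()))
print(supervise(wedged_hardware_driver))
```

            The first call recovers after a single restart; the second exhausts its restart budget, which is the case where only a reboot helps.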


            • #16
              Originally posted by ssokolow

              I call it "pie in the sky" because it's lovely when it works, but there are far too many situations where factors outside the kernel developers' control prevent it from working. We need a fallback and will continue to for a while.
              If distributions ship stupid setups, one could open bug reports so that they fix that. But I don't see why anything in the kernel should be considered "nonexistent" just because distributions are silly. I've had it working for a long time, and all that was required was to enable some config options in the kernel, nothing more. It ain't that hard.

              Originally posted by ssokolow
              (x86 systems too old for UEFI, non-x86 systems without equivalent functionality, buggy UEFI systems where that functionality is broken, etc.)
              Well, you talked about UEFI, so of course the first and second points are invalid then. And the third one means broken efivars, which is really unlikely.


              • #17
                I'm not sure how it works, but Windows can also restart GPU drivers when they misbehave.


                • #18
                  Originally posted by GrayShade
                  I'm not sure how it works, but Windows can also restart GPU drivers when they misbehave.
                  Much the same way. Since the Vista driver model (WDDM), a large part of a Windows GPU driver runs in userspace, and the system can reset a hung driver instead of crashing.

                  Thankfully, if I might add. There were times when some drivers crashed twice a day, and all I got were some black screens for a few seconds; back on XP or earlier it would have been a BSOD.


                  • #19
                    I am still looking forward to the pretty QR code I was promised a while back.


                    • #20
                      Originally posted by Luke_Wolf

                      Minix is a microkernel, which means that what it is "self-healing" from in those cases aren't kernel panics but userspace crashes, where it can restart the filesystem server and resend it the data to write, or whatever. With Linux and other monolithic kernels, a crashing driver can crash the kernel, so it's not really the same situation at all.
                      Yes, absolutely. I was a bit eager when commenting there, and using the word 'kernel' was a mistake.

                      http://www.dirtcellar.net
