GNOME Mutter Switches To High Priority KMS Thread To Avoid Crashes

  • varikonniemi
    Senior Member
    • Jan 2012
    • 1070

    #11
    Originally posted by bug77 View Post

    It doesn't seem to be an issue of the initial implementation (at least not directly). See this bit:


    The way I read it, it was happening in weird (rare?) situations. I mean, it's KMS, who would keep calling that function to the point Gnome's thread is kept in waiting? Stranger things have happened though...
    Well, that tells you the compositor is doing something wrong, flooding the kernel with instructions it is not able to handle in real time. If it throttled its commands, only sending more once the previous ones had finished, there would be no problems.

    Comment

    • intelfx
      Senior Member
      • Jun 2018
      • 1083

      #12
      Originally posted by varikonniemi View Post

      You don't understand. It would not be killed if every piece of work was split up in manageable portions. It's only when an RT-marked task spends too much time doing something that it gets killed, as it's assumed to have malfunctioned, since it was supposed to be RT.
      I'm afraid it's you who doesn't understand. This potentially unbounded time is spent within the kernel atomic modesetting API calls, i.e. it's ultimately a kernel bug, not Mutter's.
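
      For reference, the kill being argued about is consistent with Linux's RLIMIT_RTTIME resource limit, which caps how much CPU time a real-time-scheduled task may consume without making a blocking system call. That this is the exact limit involved here is an assumption, since the thread does not name it. A minimal Python sketch for inspecting the limit on the current process (the helper name describe_rttime_limit is hypothetical):

```python
import resource


def describe_rttime_limit():
    """Return (soft, hard) RLIMIT_RTTIME in microseconds.

    A value of None means "unlimited". (None, None) is also returned when
    the platform does not expose RLIMIT_RTTIME at all (it is Linux-specific).
    """
    limit = getattr(resource, "RLIMIT_RTTIME", None)
    if limit is None:
        return None, None
    soft, hard = resource.getrlimit(limit)

    def normalize(value):
        # RLIM_INFINITY is the sentinel for "no limit".
        return None if value == resource.RLIM_INFINITY else value

    return normalize(soft), normalize(hard)


if __name__ == "__main__":
    soft, hard = describe_rttime_limit()
    print(f"RLIMIT_RTTIME soft={soft} hard={hard} (microseconds, None = unlimited)")
```

      Per the getrlimit(2) man page, exceeding the soft limit delivers SIGXCPU and reaching the hard limit delivers SIGKILL; the counter restarts whenever the task makes a blocking system call.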

      Comment

      • Modu
        Phoronix Member
        • Jul 2014
        • 99

        #13
        Originally posted by woddy View Post
        ...ok so even GNOME had a series of bugs...no, because it seemed that only Plasma had bugs!
        It was the only bug I encountered in all my years with Gnome. And I could fix it by just adding an environment variable:

        Comment

        • energyman
          Senior Member
          • Jul 2008
          • 1750

          #14
          Originally posted by Topolino View Post
          All projects have bugs.

          Some projects also have proper bug management, code reviews, CI, testing and deployment and packaging on the larger distributions. This is the case here.
          No, what we are seeing is the opposite. A piece so broken, they do not bother fixing it. Instead they disable some performance relevant code and hope that their broken mess stops exploding.

          But then, gtk is a well known bad basis for development. No surprise that gnome is in a bad shape. 20 years of removing features, choice and freedom and it is still unstable.

          Comment

          • Topolino
            Phoronix Member
            • Jun 2024
            • 97

            #15
            Originally posted by energyman View Post

            A piece so broken, they do not bother fixing it. Instead they disable some performance relevant code and hope that their broken mess stops exploding.
            The mutter code is fine. The problem is it exposes a kernel or AMD driver problem. In most cases the focus should be fixing the parent bug but this time the workaround is actually better. Some pretty smart people figured out the KMS thread priority can be lowered at almost zero cost.

            Comment

            • varikonniemi
              Senior Member
              • Jan 2012
              • 1070

              #16
              Originally posted by intelfx View Post

              I'm afraid it's you who doesn't understand. This potentially unbounded time is spent within the kernel atomic modesetting API calls, i.e. it's ultimately a kernel bug, not Mutter's.
              I'm afraid it's you who doesn't understand.

              From your description it would mean that gnome found a kernel bug where AMS is not compatible with RT.

              But let's not get too gnomey in our fandom. The real issue is in how badly gnome is written to trigger such timeouts, not in a kernel bug. Would you say TCP/IP is broken because DDoS is possible? No, it's a feature, not a bug. DDoS leverages malicious code requesting a single service wrongly. Gnome leverages buggy code requesting a kernel interface wrongly.
              Last edited by varikonniemi; 16 November 2024, 07:27 AM.

              Comment

              • Topolino
                Phoronix Member
                • Jun 2024
                • 97

                #17
                Originally posted by varikonniemi View Post
                From your description it would mean that gnome found a kernel bug where AMS is not compatible with RT.
                Try searching for AMD drmModeAtomicCommit() scheduler bugs and you will see that other desktops and even games suffer from this. Workarounds are all over.

                In this case the workaround is well understood, justified and benign. It might even be beneficial.

                Comment

                • MrCooper
                  Senior Member
                  • Aug 2008
                  • 622

                  #18
                  Originally posted by bug77 View Post
                  I had a good laugh looking at the comments in that ticket. Whoever wrote them did not know stddev is meaningless if your distribution isn't normal.
                  Originally posted by jabl View Post
                  Hey, at least I didn't just compare one data point each and claim an x.yzw...% difference, as I see done too often.

                  Originally posted by energyman View Post
                  No, what we are seeing is the opposite. A piece so broken, they do not bother fixing it.
                  The fundamental issues are in the kernel, they can't be fixed in mutter.

                  Instead they disable some performance relevant code and hope that their broken mess stops exploding.
                  This doesn't disable anything, it just switches the KMS thread from real-time scheduling priority to the highest non-real-time one, to avoid getting killed due to the kernel issues.
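
                  To illustrate the kind of switch described above, here is a rough Python sketch that moves the calling thread to the normal SCHED_OTHER policy and then sets a nice value, where -20 is the highest non-real-time priority. This is not Mutter's code (Mutter is written in C and its change is not shown in this thread); the names demote_from_realtime and policy_name are hypothetical, and a nice value of -20 normally requires elevated privileges.

```python
import os

# Illustrative only; not taken from Mutter.
HIGHEST_NICE = -20  # lowest nice value = highest priority under SCHED_OTHER


def policy_name(policy):
    """Map a scheduling policy constant to a readable name."""
    names = {
        os.SCHED_OTHER: "SCHED_OTHER",
        os.SCHED_FIFO: "SCHED_FIFO",
        os.SCHED_RR: "SCHED_RR",
    }
    return names.get(policy, f"policy {policy}")


def demote_from_realtime(nice=HIGHEST_NICE):
    """Switch the caller to SCHED_OTHER, then apply the given nice value.

    Returns True if both steps succeeded, False if the OS refused either
    one (for example, lowering the nice value without CAP_SYS_NICE).
    """
    try:
        # SCHED_OTHER requires a static priority of 0.
        os.sched_setscheduler(0, os.SCHED_OTHER, os.sched_param(0))
        os.setpriority(os.PRIO_PROCESS, 0, nice)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("before:", policy_name(os.sched_getscheduler(0)))
    print("demoted:", demote_from_realtime())
    print("after:", policy_name(os.sched_getscheduler(0)),
          "nice", os.getpriority(os.PRIO_PROCESS, 0))
```

                  A thread under SCHED_OTHER is not subject to the real-time CPU time limit, which is why this kind of switch avoids the kill while still favoring the thread over ordinary tasks.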

                  But then, gtk is a well known bad basis for development.
                  mutter isn't based on GTK, and doesn't use GTK in the compositor process anymore.

                  Originally posted by Topolino View Post
                  The mutter code is fine. The problem is it exposes a kernel or AMD driver problem. In most cases the focus should be fixing the parent bug but this time the workaround is actually better.
                  The workaround isn't better per se, real-time priority might still be better in some cases, e.g. when there are other processes with same or higher scheduling priority contending for the CPU.

                  The problem is that it's not just a single kernel issue but many separate ones, and there seems to be very little if any interest in fixing them on the kernel side.

                  Originally posted by varikonniemi View Post
                  From your description it would mean that gnome found a kernel bug where AMS is not compatible with RT.
                  That's correct.

                  The real issue is in how badly gnome is written to trigger such timeouts.
                  That's not the case. The issue is that a single drmModeAtomicCommit call for a simple page flip sometimes results in the kernel keeping the calling thread busy (any sleep would "reset the clock" and avoid the issue) for 100s of milliseconds. It should be obvious that this is quite bad even if it doesn't result in the kernel killing the process (to add insult to injury). Even at 60 Hz refresh rate, it means missing at least 6 display refresh cycles. The colloquial term for that is "stutter".
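
                  The arithmetic behind the "at least 6" figure can be sketched as follows: one refresh cycle lasts 1000 / refresh_hz milliseconds, so a stall of stall_ms covers stall_ms * refresh_hz / 1000 cycles, rounded down for a lower bound. The function name missed_refresh_cycles is hypothetical, and the 120 Hz and 144 Hz rates are added only for comparison.

```python
def missed_refresh_cycles(stall_ms: int, refresh_hz: int) -> int:
    """Lower bound on whole refresh cycles that elapse during a stall.

    One cycle lasts 1000 / refresh_hz milliseconds, so the number of
    complete cycles inside the stall is floor(stall_ms * refresh_hz / 1000).
    """
    return (stall_ms * refresh_hz) // 1000


if __name__ == "__main__":
    for hz in (60, 120, 144):
        print(f"100 ms stall at {hz} Hz: at least "
              f"{missed_refresh_cycles(100, hz)} missed cycles")
```

                  At 60 Hz a cycle lasts about 16.7 ms, so a 100 ms stall spans 6 full cycles; higher refresh rates lose proportionally more.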

                  Comment

                  • blackiwid
                    Senior Member
                    • Jul 2008
                    • 2049

                    #19
                    Originally posted by energyman View Post
                    But then, gtk is a well known bad basis for development.
                    By you and some other biased KDE Fanboys, sure.

                    Qt is for people with ADHD: huge, unpredictable, inconsistent, a million buttons ordered randomly and differently in every app. Even the programming language, mostly C++, is a horrible implementation of a horrible idea (OOP) in the first place. And even if you have some Python bindings (which are also much better done with C as the basis), they suck, because Python also sucks.

                    It feels like cocaine: these aggressive bright colors and everything. And yes, it goes deeper than just the default colors of the usual themes; it's the button click animations and everything. It would nearly give me epilepsy if I really had that, but in reality it just tires me more.

                    Now don't get me wrong: even though I like GNOME, I don't use it that much, because I prefer tiling window managers, at least for non-gaming. But I like, for example, Firefox (or more specifically librefox) better than Chromium, sure, partially because of the feature set, and I like GIMP more than whatever you use with Qt (Krita?). mpv seems to use something else, and I guess I use the GTK version of Emacs. I never thought: oh, I need a certain tool, let's find a Qt alternative for it.

                    At best you need (semi- or pseudo-) exclusive features like KDE Connect, or I think Amarok was at some point highly praised for its feature set. I don't need that, but you would then use those tools DESPITE, not because, they are written in Qt.

                    But I addressed users; what about developers? The tooling for interfaces might be better for Qt (I can't remember ever using Glade, but I have used Qt Designer), but when I hear "as base" I don't think the tooling or documentation is meant, but rather how stable it is. Also, I am not even sure that Mutter, which would be the problem here, even uses GTK itself.
                    It became the default window manager in GNOME 3, replacing Metacity which used GTK for rendering. "Mutter" is a combination of "Metacity" and "Clutter".
                    Maybe it uses GTK for something other than rendering, but for rendering it uses "Clutter". I guess it uses GTK for something, but not that much, and I doubt that GTK caused the problem. But maybe I am wrong. Even so, I would expect that porting software from a non-RT kernel to an RT kernel creates new room for bugs, and some seem to mention that Plasma also had bugs recently, maybe also with RT?

                    If it were up to me, I'd do everything in Lisp, but yes, something about GUIs seems to be catered very much to OOP, which of course you can somehow also do with Lisp.

                    Also, I think you are starting to overdramatize all that. Probably only an unstable development version had these bugs, or you get them if you manually combine it with a very new kernel or something; as a normal distro user you never get the combination that causes the bugs on your PC.
                    Last edited by blackiwid; 16 November 2024, 09:57 AM.

                    Comment

                    • woddy
                      Senior Member
                      • Feb 2023
                      • 273

                      #20
                      Originally posted by blackiwid View Post
                      By you and some other biased KDE Fanboys, sure.

                      Qt is for people with ADHD: huge, unpredictable, inconsistent, a million buttons ordered randomly and differently in every app. Even the programming language, mostly C++, is a horrible implementation of a horrible idea (OOP) in the first place. And even if you have some Python bindings (which are also much better done with C as the basis), they suck, because Python also sucks.

                      It feels like cocaine: these aggressive bright colors and everything. And yes, it goes deeper than just the default colors of the usual themes; it's the button click animations and everything. It would nearly give me epilepsy if I really had that, but in reality it just tires me more.

                      Now don't get me wrong: even though I like GNOME, I don't use it that much, because I prefer tiling window managers, at least for non-gaming. But I like, for example, Firefox (or more specifically librefox) better than Chromium, sure, partially because of the feature set, and I like GIMP more than whatever you use with Qt (Krita?). mpv seems to use something else, and I guess I use the GTK version of Emacs. I never thought: oh, I need a certain tool, let's find a Qt alternative for it.

                      At best you need (semi- or pseudo-) exclusive features like KDE Connect, or I think Amarok was at some point highly praised for its feature set. I don't need that, but you would then use those tools DESPITE, not because, they are written in Qt.

                      But I addressed users; what about developers? The tooling for interfaces might be better for Qt (I can't remember ever using Glade, but I have used Qt Designer), but when I hear "as base" I don't think the tooling or documentation is meant, but rather how stable it is. Also, I am not even sure that Mutter, which would be the problem here, even uses GTK itself.


                      Maybe it uses GTK for something other than rendering, but for rendering it uses "Clutter". I guess it uses GTK for something, but not that much, and I doubt that GTK caused the problem. But maybe I am wrong. Even so, I would expect that porting software from a non-RT kernel to an RT kernel creates new room for bugs, and some seem to mention that Plasma also had bugs recently, maybe also with RT?

                      If it were up to me, I'd do everything in Lisp, but yes, something about GUIs seems to be catered very much to OOP, which of course you can somehow also do with Lisp.

                      Also, I think you are starting to overdramatize all that. Probably only an unstable development version had these bugs, or you get them if you manually combine it with a very new kernel or something; as a normal distro user you never get the combination that causes the bugs on your PC.
                      You are right, that is why Qt is used everywhere, while GTK is used almost exclusively in Gnome.
                      Everyone is crazy in this world!

                      The truth is that there is no such thing as bug-free software, and the teams that make GNOME or KDE Plasma or whatever do their best to make sure it achieves good stability.

                      Comment
