Torvalds' Comments On Linux Scheduler Woes: "Pure Garbage"


  • #61
    Originally posted by tchiwam
    A bit off topic here ...

    On the multi-CPU systems I have abused, locks were always evil and never behaved as planned.

    I always hated making the multi-week investment to work out a lockless solution. But they have always worked faster in the end. The only extra cost was in memory usage, and you can just throw money at that.
    I too have written many threaded programs (for about the last 25 years or so). Even in programs animating at 60 fps, mutexes have worked fine. I think the biggest problem is poorly architected programs with dozens of mutexes floating around, which greatly increases the likelihood that a thread will encounter a locked mutex.



    • #62
      Originally posted by gigaplex
      Stating "And be aware that the likelihood that you know what you are doing is basically nil." is basically saying "you're stupid" rather than "you're wrong". It's a message indicating they have nothing useful to contribute.
      It is really a statement of fact. That is something you basically have to admit to yourself when you want to use spinlocks in userspace. Linus goes on to explain all the reasons why there is no way to know whether what you are doing is right or wrong.

      There are only a handful of people who know enough about the Linux or Windows kernel scheduler and locking to possibly write a spinlock in userspace and get everything right so that it behaves correctly system-wide.

      You want to read it as saying the person is stupid. It's not saying that. It's saying that you have to accept that the likelihood you know what you are doing is nil, and that you will need to ask one of that handful of people whether you are doing it right or wrong. Linus even goes on to state that, with user-mode spinlocks and other locking, those people have gotten it wrong too.

      This is not about being stupid. Locking is an insanely complex thing to get exactly right for a given use case, and insanely simple to get completely wrong.



      • #63
        I saw nothing wrong with the comment from Linus. I think he was criticizing the code and the programmer's lack of understanding of where it is being used in the kernel. As another post pointed out, sometimes you have to say things like they are; if you are doing a bad job at work, then your manager is justified in telling you so.

        Given the length of time that this issue has existed in cyberspace, and a few days is a long time in that context, there was plenty of time for that programmer to carefully consider any and all feedback that they got before Linus spoke up.



        • #64
          Originally posted by gigaplex

          Stating "And be aware that the likelihood that you know what you are doing is basically nil." is basically saying "you're stupid" rather than "you're wrong". It's a message indicating they have nothing useful to contribute.
          If the programmer in question has never contributed code to kernel locking & scheduling, or been involved in the in-depth in-person & online conversations on that topic, then Linus' comment is technically correct.

          If you read "offense" into my comment or Linus' comment, then I wonder if you also find aliens behind every blinking light in the sky, boogeymen under every dark bedframe, and communists in every dark corner of Hollywood.



          • #65
            Originally posted by gigaplex

            Stating "And be aware that the likelihood that you know what you are doing is basically nil." is basically saying "you're stupid" rather than "you're wrong". It's a message indicating they have nothing useful to contribute.
            Well, you've cherry-picked this quote to hell and back.

            Originally posted by Linus Torvalds
            I repeat: do not use spinlocks in user space, unless you actually know what you're doing. And be aware that the likelihood that you know what you are doing is basically nil.
            Originally posted by Linus Torvalds
            Because you should never ever think that you're clever enough to write your own locking routines.. Because the likelihood is that you aren't (and by that "you" I very much include myself - we've tweaked all the in-kernel locking over decades, and gone through the simple test-and-set to ticket locks to cacheline-efficient queuing locks, and even people who know what they are doing tend to get it wrong several times).

            There's a reason why you can find decades of academic papers on locking. Really. It's hard.
            He's not saying "You're too stupid to lock correctly", but "locking is too hard to get right on the first few tries. Especially when you don't do locking for a living." Linus says the same thing multiple times in the same email. You literally can't read the actual article and mistake his meaning without doing it deliberately. And Linus also pointed out that anyone who does know about locking doesn't try to do spinlocks in userspace. Skarupke literally proved that he was ignorant by how he started the whole crusade, followed by doubling down on his mistake.
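
            To make the point concrete, here is a minimal C++ sketch of the kind of naive test-and-set spinlock people tend to write in user space, next to a plain std::mutex doing the same job. This is not Skarupke's code and not anything from the kernel; the names NaiveSpinlock and count_with are made up for illustration. Both locks produce the same count. The difference is what happens while waiting: the spinlock burns CPU and has no way to know whether the thread holding the lock is even scheduled, while the mutex lets the kernel put waiters to sleep.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical naive test-and-set spinlock (illustration only).
// While waiting it spins on the CPU and cannot tell whether the
// current lock holder has been descheduled by the kernel.
class NaiveSpinlock {
public:
    void lock() {
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait: the scheduler does not know we are waiting
        }
    }
    void unlock() { flag_.clear(std::memory_order_release); }

private:
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
};

// Hypothetical helper: increment a shared counter from several threads,
// protecting it with whatever lock type is passed in.
template <typename Lock>
long count_with(int threads, int iterations) {
    Lock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<Lock> guard(lock);  // works with any lock()/unlock() type
                ++counter;
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

            Calling count_with<std::mutex>(4, 10000) and count_with<NaiveSpinlock>(4, 10000) both return 40000, so correctness is not where the two differ. The spinlock version only behaves well while every waiter and the holder actually stay on a CPU, which user space cannot guarantee.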

            You're not the only person in this thread to complain about Linus' tone. But you're the last, so I'll start with you, and hopefully cover most of the hypocritical rubberneckers with it. Let's go back to the context of how this all started.

            Originally posted by Malte Skarupke
            So this all started like this: I overheard somebody at work complaining about mysterious stalls while porting Rage 2 to Stadia. The only thing those mysterious stalls had in common was that they were all using spinlocks. I was curious about that because I happened to be the person who wrote the spinlock we were using.
            So, one developer's userland spinlock code was causing problems when being ported. Rather than having had some personal growth in order to realize that his implementation wasn't perfect, he spends hours profiling things and creates a giant blog post to throw shade elsewhere and make himself feel smart.

            Originally posted by Malte Skarupke
            Really the Windows results just shows us that the Linux scheduler might take an unreasonably long time to schedule you again even if every other thread is sleeping or calls yield(). The Linux scheduler has been known to be problematic for a long time.
            The Linux scheduler isn't perfect by any means, but it shouldn't get blamed for bad code. And it shouldn't be used as a scapegoat by the kind of ass that writes 12-page blog posts titled "Measuring Mutexes, Spinlocks and how Bad the Linux Scheduler Really is". He literally started the article with a title trashing the Linux scheduler. If he didn't want to have a blunt conversation, he should have preserved some decorum in the first place. You can't throw a punch and complain when you get hit back afterwards.

            Originally posted by Malte Skarupke
            I know that we were not the only developers who had problems with the scheduler on Stadia. And Google is very aware of the problem. They care a lot about latency because latency is super important for the Stadia experience.
            Eventually they fixed it, but he still continues to be unreasonably obtuse about the fact that his spinlock didn't work.

            Originally posted by Malte Skarupke
            I have to say that I am really weirded out by the ticket_spinlock performing this badly...
            ...most mutex implementations are really good, that most spinlock implementations are pretty bad...
            The only thing those mysterious stalls had in common was that they were all using spinlocks.
            How do you even measure whether a spinlock is better than a mutex, and what makes a good spinlock?
            Even if you are in the situation where there is one software thread per hardware thread and you think you’re doing the right thing by using a spinlock, you might be making future code evolution harder.
            So this whole thing started as a highbrow "my code is perfect" slam on Linux from a game developer. He knows the Linux scheduler is bad, so when his thread is running bad code and doesn't get scheduled right because of it, he blames the kernel and tries to convince everyone else that he's right and the kernel is wrong. It doesn't matter how nicely you put your words together. You're still attacking someone else because you can't accept that you're wrong. This whole 12-page catastrophe could have been replaced with "if you're getting bad performance porting code to Linux or Stadia, make sure you rip out all the spinlocks, first. Linux doesn't like those." It's still not complete, but it's informative without being inflammatory.

            What's Linus' first response?

            Originally posted by Linus Torvalds
            The whole post seems to be just wrong, and is measuring something completely different than what the author thinks and claims it is measuring.

            First off, spinlocks can only be used if you actually know you're not being scheduled while using them. But the blog post author seems to be implementing his own spinlocks in user space with no regard for whether the lock user might be scheduled or not. And the code used for the claimed "lock not held" timing is complete garbage.
            Linus is the type of guy who'll tell you up front when you wrote something bad, but he'll painstakingly explain his point and teach the person why it's bad. It's actually a really nice thing he's doing. Most experienced people won't bother to share their experience. His entire critique is technical. It's not fluffy nice, but it's not mean, either. Just factual to an extreme with hyperbole thrown in to amuse. It's the kind of code review I'd love to have, as it'd help my skills grow. When you hit a certain level, most people just stop critiquing because they're worried about being wrong, or figure it's not their problem. Critique is extremely helpful to skill growth. And as long as the person is willing to listen, you can usually make it rather dry. But when someone spreads out fake news to make themselves look better, you've got to shut that shit down fast before people start believing it. You don't necessarily have time to coddle them at that point.

            Linus is gruff when poked, but he's got rules. He never attacks noobs, he stays out of things that don't escalate or get really visible, and he never just says mean things for the sake of it. All of his complaints are teaching moments. And for people in CS, those teaching moments are valuable. For most people they'd be more valuable one-on-one and with nicer words. And when people reach out one-on-one, he does that. Public teaching only happens when (1) someone who should know better does something that creates work for others, (2) someone doubles down on a mistake instead of fixing it, or (3) somebody brings their mistakes public and blames others for them. It's pretty easy to not have bad interactions with him. Just don't make his job harder. And when you fuck up, claim ownership and fix it.

            Originally posted by Creak
            Just by seeing how some people here consider the Google developer to "showed himself a fool", or that he "sucks", or that he "was talking mostly out of his rear", or even making a general affirmation that "Google is becoming pathetic". This explains why words are important. If Linus would have been less rude, maybe you lot would have been less rude too.

            This, to me, is the exact definition of a toxic community.

            I know other open source communities where you can be told you're wrong without telling you you're stupid.
            You're entirely missing the context where Skarupke tried to use a 12-page article titled "Measuring Mutexes, Spinlocks and how Bad the Linux Scheduler Really is" to avoid responsibility for writing bad code by blaming the Linux kernel for not reading his mind and fixing it for him. His language wasn't nice, either. More importantly, he blamed others for his mistake and tried to convince others to join in with him. That's toxic behavior to a 'T' regardless of the vocabulary, and the only way to deal with that level of conceited hubris is to shut it down hard. If Skarupke couldn't handle a blunt dismissal of his specious and aggressive claims, he should have checked his own tone first. And criticizing the direct response for its "tone" to rude shots fired is sea-lioning at its finest. If you're going to try and police people for etiquette, at least have the basic decency to start at the first breach. Don't play favorites and start in the middle. You don't call people who are assertive and defending themselves "toxic". That's just victim blaming.

            There is something to be said for people to spend less time and effort defending their celebrities. But it goes into the same bucket as people who waste their time mobbing a celebrity without reading the details just to make themselves feel like better people. So you should go fix that before telling other people what they need to fix.




            • #66
              Originally posted by Terrablit
              He's not saying "You're too stupid to lock correctly", but "locking is too hard to get right on the first few tries. Especially when you don't do locking for a living." Linus says the same thing multiple times in the same email.
              Additionally, he included himself in this view:
              Because you should never ever think that you're clever enough to write your own locking routines.. Because the likelihood is that you aren't (and by that "you" I very much include myself - we've tweaked all the in-kernel locking over decades, and gone through the simple test-and-set to ticket locks to cacheline-efficient queuing locks, and even people who know what they are doing tend to get it wrong several times).



              • #67
                Originally posted by Volta

                While I'm not native English speaker I sometimes doubt Michael knows this language better.. Google developer sucks btw.
                I have seen this "Google developer" thing in several threads. From what I understand, he is a game developer working for a studio targeting Google Stadia. Huge difference.



                • #68
                  Originally posted by timofonic

                  Oh yes, because Linux kernel development is a Special Education School. That is, one for mentally disabled people.

                  If a big company like Google isn't able to do proper Linux development, they should get better project managers and better developers. Seriously, do they hire code monkeys these days?

                  Google is becoming pathetic...
                  Like I mentioned in my previous post, he was not a Google developer. But this is how the internet works. In this thread alone, at least five people call him a Google developer.

                  To myself: Welcome to Phoronix, where people just randomly rant about corporations, systemd, etc., as if it were a therapy tool for their anger management.



                  • #69
                    Originally posted by kiffmet
                    It's 2020 and the old, grumpy Torvalds is back. Oh, how I missed his rudeness
                    1. What? Did we read the same post? He did not go off on one of his old, totally unproductive "you fucking idiot" rants.
                    2. I think it's funny that everyone claims to love rudeness until someone is rude to them, especially those who try to rationalize it.



                    • #70
                      Originally posted by bug77

                      Believe me, the alternative doesn't work. If you try to be polite, supportive and constructive, no matter how serious the problem you're trying to bring up is, it's perceived as a minor issue and action is rarely taken. If you're "lucky" enough, you also get to fix it, because you're the one who brought it up and you were so nice about it.

                      And if you don't believe me, do this exercise: read this thread and count how many posts are about what Linus said vs how he said it.
                      Maybe your communication skills just suck and you have too much of an ego to improve in that area.
