The Linux Kernel's Scheduler Apparently Causing Issues For Google Stadia Game Developers


  • #31
    Originally posted by JPFSanders View Post
    Linux on the other hand comes with a single "default" that is a good compromise, very good for the majority of servers, not always the best choice for desktops, however you can customize as you see fit.
    I really can't see what the issue with the default scheduler is. I often have a frame time graph enabled, and usually every game I run in Wine/DXVK runs as well as on Windows regarding spikes and avg values (the latter a bit lower ofc due to abstraction). This is not wishful thinking, it's what I can observe.
    The same goes for video and audio playback, having zero stutter or crackling. I wouldn't even consider using Linux if this was different.



    • #32
      Originally posted by aufkrawall View Post
      I really can't see what the issue with the default scheduler is. I often have a frame time graph enabled, and usually every game I run in Wine/DXVK runs as well as on Windows regarding spikes and avg values (the latter a bit lower ofc due to abstraction). This is not wishful thinking, it's what I can observe.
      The same goes for video and audio playback, having zero stutter or crackling. I wouldn't even consider using Linux if this was different.
      Don't worry about it, this is not a real issue. It's just Google trying to shift the blame for their embarrassing Stadia launch (i.e. a handful of games and nowhere near the performance that was promised over a 25Mbps connection).



      • #33
        Originally posted by programmerjake View Post

        Userspace syscalls exist: see https://lwn.net/Articles/604515/#vdso
        The vDSO accesses shared data so the system does not have to enter kernel mode: the kernel maps a page of data with the needed information into the app's address space. Nominally it's a system call, but is it really a system call or just a shared memory mapping?
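
A rough C++ sketch of the distinction, assuming Linux on x86-64 with glibc: the first function goes through the C library's `clock_gettime`, which on such systems is commonly serviced from the vDSO without entering the kernel, while the second forces a real kernel entry with `syscall(2)`. Whether the library call actually uses the vDSO depends on the architecture, clock source, and libc, so treat that as an assumption. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

// Convert a timespec to nanoseconds (hypothetical helper).
std::int64_t to_ns(const timespec& ts) {
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return static_cast<std::int64_t>(ts.tv_sec) * 1000000000LL + ts.tv_nsec;
}

// Read CLOCK_MONOTONIC through the C library wrapper.
// Assumption: on Linux/glibc this is usually answered from the vDSO
// mapping, i.e. without switching to kernel mode.
std::int64_t monotonic_ns_via_libc() {
    timespec ts{};
    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) return -1;
    return to_ns(ts);
}

// Read the same clock with an explicit system call, which always
// enters the kernel. Assumes the kernel's timespec layout matches
// the userspace one, as it does on x86-64.
std::int64_t monotonic_ns_via_raw_syscall() {
    timespec ts{};
    if (syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &ts) != 0) return -1;
    return to_ns(ts);
}
```

Both functions report the same clock; timing a few million calls of each in a loop is one way to see how much the kernel entry costs on a given machine.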



        • #34
          I've tried to read the blog post, and my eyes are bleeding.
          Originally posted by https://probablydance.com/2019/12/30/measuring-mutexes-spinlocks-and-how-bad-the-linux-scheduler-really-is/
          Usually you’ll want to create one software thread per hardware thread and if you only have exactly that, then there really is no benefit of ever going to sleep. (as long as you make sure to not block the other hyper-thread running on the same core) But in practice it doesn’t work out that cleanly. Maybe a third party library has its own threads that run besides your threads (doing audio processing or physics calculations or who knows what) or maybe you’ll have some work that needs to run “at some point, as long as it doesn’t get in the way” which you’ll schedule to run at low priority “in the gaps” when other threads are blocked.
          I've had a look, and on my system I have more than 40 kworker threads. And that is not even half of all kernel threads. So yes, there are always way more threads that will fight for CPUs. Not to mention all the other programs we're running on a multi-user, multi-tasking system. So even if we're alone on the system, one software thread per hardware thread is bullshit!

          So why is this terrible? The first reason is that while we’re spinning like this, we appear to the CPU and to the OS like a very busy thread and we will never be moved out of the way. So if the thread that has the lock is not currently running, we could be blocking it from running, causing it to not give up the lock.
          Yeah, as long as I'm burning the CPU, my process/thread won't be preempted... Uhhh, why? Why should the scheduler avoid preempting a running task? The whole idea of preemptive multitasking is that the scheduler stops tasks and runs others.

          This all sounds like the author doesn't really know what he's doing. He's writing about spinlocks in userspace, where they really don't make sense. Userspace can always be interrupted, and this makes spinlocks nearly useless.
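
To make the objection concrete, here is a minimal C++ sketch (not the blog author's code) of a userspace spinlock that spins only briefly and then yields, so a preempted lock holder gets a chance to run. `YieldingSpinlock`, `kSpinLimit`, and `run_contended_counter` are hypothetical names, and the limit of 64 spins is an arbitrary choice, not a tuned value.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical illustration: spin a bounded number of times, then yield.
class YieldingSpinlock {
public:
    void lock() {
        int spins = 0;
        for (;;) {
            // exchange() returns the previous value; false means we got the lock.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Wait on a plain load so we do not keep writing the cache line.
            while (locked_.load(std::memory_order_relaxed)) {
                if (++spins >= kSpinLimit) {
                    // The holder may have been preempted; offer our time slice.
                    std::this_thread::yield();
                    spins = 0;
                }
            }
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    static constexpr int kSpinLimit = 64;  // arbitrary, for illustration only
    std::atomic<bool> locked_{false};
};

// Increment a shared counter from several threads under the lock and
// return the final value.
long run_contended_counter(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    YieldingSpinlock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Calling `run_contended_counter(4, 10000)` should return 40000. Yielding is only a hint to the scheduler, so this does not remove the underlying problem; a `std::mutex`, which lets the kernel put waiters to sleep, is usually the safer default in userspace.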



          • #35
            One of the great things about Linux being open-source and Google having practically infinite money to dump onto problems is that there's a good chance they might be able to fix up the scheduler so this is no longer an issue. And if they do, who knows, maybe this will yield some performance improvements in other applications. I figure robotics applications would benefit from a more optimized scheduler.



            • #36
              Originally posted by PuckPoltergeist View Post
              I've tried to read the blog post, and my eyes are bleeding.


              I've had a look, and on my system I have more than 40 kworker threads. And that is not even half of all kernel threads. So yes, there are always way more threads that will fight for CPUs. Not to mention all the other programs we're running on a multi-user, multi-tasking system. So even if we're alone on the system, one software thread per hardware thread is bullshit!
              Nice one. I didn't even notice this. I'm sometimes compiling the kernel using eight GCC threads (i7, 4 cores and 8 threads), downloading torrents and playing Quake Live via Steam's Proton. It seems I'm violating some pre-multitasking law. However, the author of the article mentions:

              The problem was that there was a thread that spent several milliseconds trying to acquire a spinlock at a time when no other thread was holding the spinlock.
              Sounds like a bug?



              • #37
                Originally posted by schmidtbag View Post
                One of the great things about Linux being open-source and Google having practically infinite money to dump onto problems is that there's a good chance they might be able to fix up the scheduler so this is no longer an issue.
                Yes, but no! Yes, Google has money that can be thrown at problems. Sometimes this is good, sometimes it's bad. Apple was blamed for forking KHTML and not giving back in a good manner. Google has forked OpenSSL and nobody cares/everyone is alright with this. Google is reinventing the wheel all the time, and is burning so many resources this way. And no, this time it's not a bug that Google has discovered. It's a developer's misunderstanding of how things work. If this is really "fixed" within the Linux scheduler, it will make it worse.

                And if they do, who knows, maybe this will yield some performance improvements in other applications. I figure robotics applications would benefit from a more optimized scheduler.
                What are performance improvements? Throughput? Latency? Robots may benefit from RT-scheduling. This is available today.
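
For reference, a hedged C++ sketch of asking Linux for the real-time `SCHED_FIFO` policy through the POSIX scheduling API. The struct and function names are hypothetical, and the request normally fails with `EPERM` unless the process has the needed privilege (for example `CAP_SYS_NICE` or a suitable `RLIMIT_RTPRIO`), which is why the result is reported instead of assumed.

```cpp
#include <cassert>
#include <cerrno>
#include <sched.h>

// Outcome of the request (hypothetical helper type).
struct RtRequestResult {
    bool applied;   // true if the policy change succeeded
    int error;      // errno value on failure, 0 on success
    int priority;   // priority that was requested
};

// Ask the kernel to schedule the calling process with SCHED_FIFO at the
// lowest real-time priority.
RtRequestResult request_fifo_scheduling() {
    const int priority = sched_get_priority_min(SCHED_FIFO);
    // SCHED_FIFO is a valid policy on Linux, so this lookup is not expected to fail.
    assert(priority != -1);

    sched_param param{};
    param.sched_priority = priority;

    // A pid of 0 means "the calling process".
    if (sched_setscheduler(0, SCHED_FIFO, &param) == 0) {
        return {true, 0, priority};
    }
    return {false, errno, priority};
}
```

A real-time task that busy-waits can starve everything else on its CPU, so this policy is best kept for short, well-bounded work.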



                • #38
                  Originally posted by Volta View Post
                  Sounds like a bug?
                  Or the process was simply preempted by the OS. Yes, there may be other processes that are ready to run and don't care about this spinlock. The process that tries to get the lock will be put to sleep nevertheless. So yes, a design bug. Userspace spinlocks can't work!



                  • #39
                    Originally posted by Mario Junior View Post
                    "among the Linux schedulers I tested, this looks to be the best one, since we mostly care about std::mutex and spinlock, and it does best there. The only downside is that ticket_spinlock runs a bit slow"

                    edit: also interesting https://ck-hack.blogspot.com/2020/01...96932850656981
                    Last edited by halo9en; 01-02-2020, 11:52 AM.



                    • #40
                      Originally posted by halo9en View Post

                      "among the Linux schedulers I tested, this looks to be the best one, since we mostly care about std::mutex and spinlock, and it does best there. The only downside is that ticket_spinlock runs a bit slow"

                      edit: also interesting https://ck-hack.blogspot.com/2020/01...96932850656981
                      Maybe this Clear Linux patch for the kernel helps:
                      https://ck-hack.blogspot.com/2020/01...96932850656981
