The Linux Kernel's Scheduler Apparently Causing Issues For Google Stadia Game Developers

  • #11
    Originally posted by Zan Lynx

    The way glibc implements futexes on Linux uses userspace spinlocks. It's not unusual. It's pretty much the standard anywhere.

    A futex (Fast Userspace Mutex) tries to get the lock a few times before going into the kernel. Windows CriticalSection does exactly the same thing.

    Now, some game or database (cough, Postgres, cough) that has their own custom spinlock implementation is probably wrong and buggy on Linux. Replacing it with standard pthread_mutex calls is almost certain to be an improvement since those are properly tuned for Linux expectations.

    The main difference between a userspace spin lock and what glibc does is that threads put to sleep on a busy mutex don't burn CPU time while they wait for the lock to be released. With a userspace spin lock, the kernel has no way of knowing that a given thread is waiting for a lock held by a thread that isn't currently running, so it may schedule the waiter anyway, and the waiter will then sit there spinning for its entire time slice.
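
    To make the spin-a-bit-then-sleep idea concrete, here is a rough C sketch built on a plain pthread mutex. It is not what glibc does internally: SPIN_TRIES, spin_then_block_lock and run_counter_demo are made-up names, and the spin count is an arbitrary guess rather than a tuned value. glibc's non-portable PTHREAD_MUTEX_ADAPTIVE_NP mutex type does a comparable bounded spin for you.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical tuning knob for this sketch; not a glibc constant. */
#define SPIN_TRIES 100
#define MAX_DEMO_THREADS 16

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static long demo_counter;

/* Try the lock a bounded number of times without sleeping, then fall
 * back to a blocking lock so the kernel can take us off the CPU. */
int spin_then_block_lock(pthread_mutex_t *m)
{
    for (int i = 0; i < SPIN_TRIES; i++) {
        if (pthread_mutex_trylock(m) == 0)
            return 0;               /* acquired while spinning */
    }
    return pthread_mutex_lock(m);   /* still busy: sleep until woken */
}

static void *demo_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        int rc = spin_then_block_lock(&demo_lock);
        assert(rc == 0);
        (void)rc;
        demo_counter++;             /* deliberately tiny critical section */
        pthread_mutex_unlock(&demo_lock);
    }
    return NULL;
}

/* Runs up to MAX_DEMO_THREADS workers and returns the final counter. */
long run_counter_demo(int nthreads, long iterations)
{
    pthread_t tids[MAX_DEMO_THREADS];
    int created = 0;

    if (nthreads > MAX_DEMO_THREADS)
        nthreads = MAX_DEMO_THREADS;
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[created], NULL, demo_worker, &iterations) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    return demo_counter;
}
```

    Calling run_counter_demo(4, 10000) should leave the counter at 40000 no matter how the threads get scheduled. The point is that a waiter that runs out of spin attempts ends up asleep in the kernel instead of burning its time slice.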

    Also, with a proper mutex, it's theoretically possible for the scheduler to elevate the priority of the thread holding the lock since it now knows that a lot of other threads are waiting for it.
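
    POSIX does have a knob for this: a mutex can be created with the PTHREAD_PRIO_INHERIT protocol, which asks that the lock holder be boosted to the priority of its highest-priority waiter. A minimal sketch follows; make_pi_attr and init_pi_mutex are made-up helper names, and as far as I know the boost mostly matters for real-time scheduling policies rather than ordinary threads.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Prepare mutex attributes that request priority inheritance.
 * Returns 0 on success or an errno-style error code. */
int make_pi_attr(pthread_mutexattr_t *attr)
{
    int rc = pthread_mutexattr_init(attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setprotocol(attr, PTHREAD_PRIO_INHERIT);
    if (rc != 0)
        pthread_mutexattr_destroy(attr);
    return rc;
}

/* Initialise *m as a priority-inheritance mutex. This can fail
 * (for example with ENOTSUP) on systems without PI support. */
int init_pi_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc;

    assert(m != NULL);
    rc = make_pi_attr(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

    A mutex set up through init_pi_mutex is then used with the normal pthread_mutex_lock and pthread_mutex_unlock calls. A hand-rolled spin lock has no way to ask for the same treatment, since the kernel never learns who is waiting on whom.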

    The specifics of how the scheduler handles this are somewhat beyond me, but the main point is that the scheduler has ultimate control over which threads are running, and by using mutexes, you're giving it more information so that it can make more intelligent decisions.

    All that said, there are probably some scenarios where it statistically makes sense to spin rather than use a mutex - likely cases where the probability of completing the critical region within your time slice is very high, which implies extremely small critical regions. Still though, you run the risk of massive latency spikes if the thread holding the spin lock happens to be scheduled out at that very instant, which appears to be what's happening in the blog post.

    This behavior isn't really surprising, and I'm not sure how the kernel could handle it any better without adding something like a fine-grained API to provide scheduling hints to the kernel.

    Originally posted by Zan Lynx
    Replacing it with standard pthread_mutex calls is almost certain to be an improvement
    I agree with you there.


    • #12
      Or maybe the kernel could add a true userspace spin lock syscall (no irqsave though), but with hard time limits to prevent deadlocks? That would be interesting.


      • #13
        not sure how "getting a picture on a screen every 16ms" is especially hard on Linux, I'm happily running CSGO at a steady (capped) 300 fps with my old r9 390 and new Ryzen 3500X build...

        my new CPU also showed that 1600 FPS is possible in CSGO on Linux


        • #14
          Originally posted by HenryM
          not sure how "getting a picture on a screen every 16ms" is especially hard on Linux, I'm happily running CSGO at a steady (capped) 300 fps with my old r9 390 and new Ryzen 3500X build...
          How'd you get a Ryzen 3500X? Weren't those a China-only thing?


          • #15
            Another thing that would be useful is exposing Linux's yield_to to userspace, for when you are spinning, know which thread ID to yield to, and don't want to block.
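
            For context, yield_to is an in-kernel helper, and as far as I know nothing equivalent is exposed to userspace. The closest thing available today is sched_yield(), which gives up the CPU but cannot name the thread that should run next. A rough sketch of a try-lock loop that yields instead of blocking, using C11 atomics (yield_lock, yield_unlock and run_yield_demo are made-up names):

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define YL_MAX_THREADS 16

static atomic_flag yl_flag = ATOMIC_FLAG_INIT;
static long yl_counter;

/* Spin on a test-and-set flag, yielding the CPU after each failed try.
 * sched_yield() cannot target the lock holder; the scheduler simply
 * picks whichever runnable thread it likes. */
void yield_lock(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        sched_yield();
}

void yield_unlock(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

static void *yl_worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        yield_lock(&yl_flag);
        yl_counter++;               /* protected by yl_flag */
        yield_unlock(&yl_flag);
    }
    return NULL;
}

/* Runs up to YL_MAX_THREADS workers and returns the final counter. */
long run_yield_demo(int nthreads, long iterations)
{
    pthread_t tids[YL_MAX_THREADS];
    int created = 0;

    if (nthreads > YL_MAX_THREADS)
        nthreads = YL_MAX_THREADS;
    yl_counter = 0;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[created], NULL, yl_worker, &iterations) != 0)
            break;
        created++;
    }
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL);
    assert(created > 0 || nthreads <= 0);
    return yl_counter;
}
```

            With a directed yield, the call inside yield_lock could hand the CPU straight to the lock holder. With plain sched_yield() the waiter may just get scheduled again before the holder has run.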


            • #16
              Originally posted by programmerjake

              How'd you get a Ryzen 3500X? Weren't those a China-only thing?
              supposed to be, but I found some established sellers on aliexpress had them for like $125 USD shipped, so...


              • #17
                Originally posted by Unklejoe
                Or maybe the kernel could add a true userspace spin lock syscall (no irqsave though), but with hard time limits to prevent deadlocks? That would be interesting.
                A userspace syscall is a contradiction. The closest thing to that is the futex and that's already implemented. They are very efficient.


                • #18
                  Interestingly, most of the 3500X listings were sold out last I checked, but there are a few still around.

                  It's still weird to me that framerates in CSGO can be counted in kHz. On my hardware, though, it's only the odd frame that is rendered in 0.6-0.7 ms; the average framerate is more like 900 fps.


                  • #19
                    Originally posted by Zan Lynx

                    A userspace syscall is a contradiction. The closest thing to that is the futex and that's already implemented. They are very efficient.
                    Userspace syscalls exist: see https://lwn.net/Articles/604515/#vdso
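
                    A small Linux-only illustration: on common setups glibc answers clock_gettime() through the vDSO, so the call usually completes without entering the kernel. Whether that counts as a "userspace syscall" is a matter of definition. In the sketch below, vdso_is_mapped, monotonic_ns and check_clock_is_monotonic are made-up names, and the code only assumes glibc's getauxval().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/auxv.h>
#include <time.h>

/* Returns nonzero if the kernel mapped a vDSO into this process.
 * AT_SYSINFO_EHDR holds the vDSO's base address when one exists. */
int vdso_is_mapped(void)
{
    return getauxval(AT_SYSINFO_EHDR) != 0;
}

/* Reads the monotonic clock in nanoseconds, or returns -1 on error.
 * The C library decides whether this goes through the vDSO or falls
 * back to a real system call. */
int64_t monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Two back-to-back readings should never go backwards. */
void check_clock_is_monotonic(void)
{
    int64_t a = monotonic_ns();
    int64_t b = monotonic_ns();

    assert(a >= 0 && b >= a);
    (void)a;
    (void)b;
}
```

                    Running a loop of monotonic_ns() calls under strace is one way to check whether they show up as real system calls on a given machine.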


                    • #20
                      BMQ v5.4-r1 is released with the following changes 1. Adjust task boost_prio at deactivate&wake_up. This change makes task which gives up...


                      Compile kernel with BMQ and GG!
