The Linux Kernel's Scheduler Apparently Causing Issues For Google Stadia Game Developers


  • #71
    Originally posted by Almindor

    Linux literally "locks up" during heavy thread activity. It's not just spinlocks, it's the crappy scheduler. It used to be the other way around, btw: Windows was unusable under load. Even old Solaris was better than today's Linux is under load; it's pretty terribad.

    I'm glad some people are finally looking into the finer details with proper testing that can't be hand-waved away.
    That sounds rather strange. I'm running several servers at work where all 32 cores on each server are close to fully utilized, and latency on those systems is still below 1 ms (I would be out of business otherwise).

    When Linux locks up in the way that you describe, it's due either to huge stress in the I/O subsystem or to large amounts of swap constantly being pushed in and out of RAM/disk.



    • #72
      Originally posted by PuckPoltergeist

      Ubuntu 18.04 (4.15.0-72-generic), according to the comment section. And it was suspected that a kernel with CONFIG_PREEMPT_NONE was benchmarked. So no, it's not the Linux scheduler that's bad, but a distribution's kernel config that doesn't fit the needs.
      Was the config posted? Because in the Ubuntu 18.04 -generic kernel CONFIG_PREEMPT_NONE is not set.

      Code:
      f.ultra@ubuntu:~$ lsb_release -a
      No LSB modules are available.
      Distributor ID:    Ubuntu
      Description:    Ubuntu 18.04.3 LTS
      Release:    18.04
      Codename:    bionic
      f.ultra@ubuntu:~$ grep PREEMP /boot/config-4.15.0-74-generic
      CONFIG_PREEMPT_NOTIFIERS=y
      # CONFIG_PREEMPT_NONE is not set
      CONFIG_PREEMPT_VOLUNTARY=y
      # CONFIG_PREEMPT is not set
      # CONFIG_PREEMPTIRQ_EVENTS is not set
      f.ultra@ubuntu:~$
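
The grep check above can be wrapped in a small helper that names the preemption model a config file selects. This is a minimal sketch, not an Ubuntu or kernel tool: the function name `preempt_model` is made up for illustration, and it only knows the three options that appear in the output above.

```shell
# preempt_model FILE: print the preemption model selected in a kernel config.
# Hypothetical helper; handles only the three options shown in the thread.
preempt_model() {
    # grep -x matches whole lines, so CONFIG_PREEMPT_NOTIFIERS=y and
    # "# CONFIG_PREEMPT is not set" are never mistaken for CONFIG_PREEMPT=y.
    if grep -qx 'CONFIG_PREEMPT=y' "$1"; then
        echo "full"
    elif grep -qx 'CONFIG_PREEMPT_VOLUNTARY=y' "$1"; then
        echo "voluntary"
    elif grep -qx 'CONFIG_PREEMPT_NONE=y' "$1"; then
        echo "none"
    else
        echo "unknown"
    fi
}
```

Run against the config quoted above, `preempt_model /boot/config-4.15.0-74-generic` would print `voluntary`.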



      • #73
        Originally posted by F.Ultra

        Was the config posted?
        No, it wasn't. I was searching for Ubuntu's default config and didn't find anything, so thanks for sharing this.

        Because in the Ubuntu 18.04 -generic kernel CONFIG_PREEMPT_NONE is not set.

        Code:
        f.ultra@ubuntu:~$ lsb_release -a
        No LSB modules are available.
        Distributor ID: Ubuntu
        Description: Ubuntu 18.04.3 LTS
        Release: 18.04
        Codename: bionic
        f.ultra@ubuntu:~$ grep PREEMP /boot/config-4.15.0-74-generic
        CONFIG_PREEMPT_NOTIFIERS=y
        # CONFIG_PREEMPT_NONE is not set
        CONFIG_PREEMPT_VOLUNTARY=y
        # CONFIG_PREEMPT is not set
        # CONFIG_PREEMPTIRQ_EVENTS is not set
        f.ultra@ubuntu:~$
        So it's not the best config, but at least not the worst for latency. Can you share the value for CONFIG_HZ too?



        • #74
          grep HZ /boot/config-4.15.0-74-generic
          CONFIG_NO_HZ_COMMON=y
          # CONFIG_HZ_PERIODIC is not set
          CONFIG_NO_HZ_IDLE=y
          # CONFIG_NO_HZ_FULL is not set
          CONFIG_NO_HZ=y
          # CONFIG_HZ_100 is not set
          CONFIG_HZ_250=y
          # CONFIG_HZ_300 is not set
          # CONFIG_HZ_1000 is not set
          CONFIG_HZ=250
          CONFIG_MACHZ_WDT=m

          Originally posted by PuckPoltergeist
          No, it wasn't. I was searching for Ubuntu's default config and didn't find anything, so thanks for sharing this.

          So it's not the best config, but at least not the worst for latency. Can you share the value for CONFIG_HZ too?



          • #75
            Originally posted by F.Ultra
            grep HZ /boot/config-4.15.0-74-generic
            This is the file I was searching for. Is it provided by a package or generated later?

            Code:
            CONFIG_NO_HZ_COMMON=y
            # CONFIG_HZ_PERIODIC is not set
            CONFIG_NO_HZ_IDLE=y
            # CONFIG_NO_HZ_FULL is not set
            CONFIG_NO_HZ=y
            # CONFIG_HZ_100 is not set
            CONFIG_HZ_250=y
            # CONFIG_HZ_300 is not set
            # CONFIG_HZ_1000 is not set
            CONFIG_HZ=250
            CONFIG_MACHZ_WDT=m
            So we have only voluntary preemption and only a 250 Hz timer. For soft real-time work (e.g. audio) this isn't sufficient. I wonder why Ubuntu uses such a config for a desktop distribution. On desktops, latency matters, not throughput.

            Btw, thanks for this info.

            So the whole blog post is about Ubuntu, not the Linux scheduler. This developer, who wrote the fastest hashtable (I'm wondering how fast a table can be), should bench the kernel and then come back.
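
For reference, the 250 Hz figure above corresponds to a 4 ms timer tick (1000 ms / 250). A rough sketch that derives this from a config file follows. The function name `tick_period_ms` is hypothetical, and the calculation ignores tickless (NO_HZ) behaviour and high-resolution timers, which also affect real wake-up latency.

```shell
# tick_period_ms FILE: print the timer tick period in milliseconds,
# computed as 1000 / CONFIG_HZ. Hypothetical helper for illustration.
tick_period_ms() {
    # Match only the numeric "CONFIG_HZ=<n>" line, not CONFIG_HZ_250=y etc.
    hz=$(sed -n 's/^CONFIG_HZ=\([0-9][0-9]*\)$/\1/p' "$1")
    # Fail if the option is missing from the file.
    [ -n "$hz" ] || return 1
    awk -v hz="$hz" 'BEGIN { printf "%g\n", 1000 / hz }'
}
```

With `CONFIG_HZ=250`, as in the output above, this prints `4`; a kernel built with `CONFIG_HZ=1000` would give `1`.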



            • #76
              I extracted the benchmark values from the blogpost for easier side by side comparison:

              Code:
                                                         Windows                                                                       Linux    
              Type                Four longest waits                    Four longest idle times                 Four longest waits                    Four longest idle times
              std::mutex          24.1 ms, 20.1 ms, 17.4 ms, 17.0 ms    0.58 ms, 0.46 ms, 0.27 ms, 0.19 ms      2.9 ms, 2.8 ms, 1.5 ms, 1.4 ms        0.8 ms, 0.28 ms, 0.26 ms, 0.25 ms
              terrible_spinlock   32.2 ms, 30.0 ms, 27.3 ms, 26.1 ms    21.1 ms, 20.9 ms, 18.2 ms, 18.0 ms      103.5 ms, 90.6 ms, 77.1 ms, 75.7 ms   134.8 ms, 124.6 ms, 119.7 ms, 96.5 ms
              spinlock_amd        61.1 ms, 57.8 ms, 57.0 ms, 56.5 ms    17.3 ms, 6.6 ms, 5.0 ms, 4.9 ms         62.3 ms, 61.5 ms, 60.9 ms, 59.8 ms    7.0 ms, 6.9 ms, 0.54 ms, 0.45 ms
              spinlock            48.0 ms, 40.0 ms, 40.0 ms, 36.2 ms    0.16 ms, 0.14 ms, 0.14 ms, 0.14 ms      11.4 ms, 10.8 ms, 10.4 ms, 9.9 ms     1.4 ms, 1.2 ms, 0.33 ms, 0.32 ms
              ticket_spinlock     0.78 ms, 0.34 ms, 0.27 ms, 0.2 ms     0.38 ms, 0.35 ms, 0.26 ms, 0.25 ms      1.5 ms, 1.5 ms, 1.49 ms, 1.48 ms      13.0 ms, 3.3 ms, 2.6 ms, 2.4 ms
              And what strikes me is that I don't really see this "the sky is falling and Linux is terrible" in the actual numbers.

              std::mutex is unsurprisingly much better on Linux than on Windows, and on Windows it is actually better than any of the others with the exception of his experimental ticket_spinlock.
              spinlock_amd has very similar wait times on both systems (slightly longer on Linux) with much better "idle time" on Linux, and "spinlock" is better on Linux but with higher spikes in the "idle time".

              The "terrible spinlock" is much worse on Linux, but its performance on Windows makes it unfit for his usage criteria anyway (which call for sub-millisecond idle time). So instead of ranting about how bad the Linux scheduler is, it looks like the real solution is to switch to std::mutex, which would also improve performance over the "spinlock" version that I assume is the one used in the Windows-native game.
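
To put the comparison above into numbers, the sketch below divides the longest Linux wait by the longest Windows wait for each lock type. The values are the first figure in each "Four longest waits" column of the table; the function name `worst_wait_ratio` and the one-line-per-lock input format are assumptions made for this example. A ratio below 1 means the worst wait was shorter on Linux.

```shell
# worst_wait_ratio: read "name windows_ms linux_ms" lines on stdin and print
# the Linux/Windows ratio of the longest wait for each lock type.
# Hypothetical helper; the input format is invented for this example.
worst_wait_ratio() {
    awk '{ printf "%s %.2f\n", $1, $3 / $2 }'
}

# Longest wait per lock type, copied from the table above (milliseconds).
worst_wait_ratio <<'EOF'
std::mutex 24.1 2.9
terrible_spinlock 32.2 103.5
spinlock_amd 61.1 62.3
spinlock 48.0 11.4
ticket_spinlock 0.78 1.5
EOF
```

With these inputs, std::mutex comes out at roughly 0.12 and terrible_spinlock at roughly 3.21, which matches the reading that only the "terrible" variant is dramatically worse on Linux.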



              • #77
                Originally posted by PuckPoltergeist
                This is the file I was searching for. Is it provided by a package or generated later?

                So we have only voluntary preemption and only a 250 Hz timer. For soft real-time work (e.g. audio) this isn't sufficient. I wonder why Ubuntu uses such a config for a desktop distribution. On desktops, latency matters, not throughput.

                Btw, thanks for this info.

                So the whole blog post is about Ubuntu, not the Linux scheduler. This developer, who wrote the fastest hashtable (I'm wondering how fast a table can be), should bench the kernel and then come back.
                According to apt-file, the kernel config is supplied by the linux-modules-4.15.0-74-generic package. However, do look at my post one up where I've posted the actual numbers from the benchmarks on the blog page; AFAICT they don't really match the story that is being told.

                edit: the -generic kernel in Ubuntu is installed on both desktop and server editions, so I think they chose a middle ground here. They do also provide a -lowlatency kernel though.
                Last edited by F.Ultra; 03 January 2020, 04:15 PM.



                • #78
                  Originally posted by F.Ultra
                  So instead of ranting about how bad the Linux scheduler is, it looks like the real solution is to switch to std::mutex, which would also improve performance over the "spinlock" version that I assume is the one used in the Windows-native game.
                  That's the conclusion of most of the comments on the blog post. The blog post tells the story of a developer with way too big an ego and too little knowledge. So Michael, you should change the headline from "The Linux Kernel's Scheduler Apparently Causing Issues For Google Stadia Game Developers" to "A Google Stadia Game Developer can't distinguish Linux from Ubuntu and doesn't know his tools".



                  • #79
                    Originally posted by F.Ultra

                    According to apt-file, the kernel config is supplied by the linux-modules-4.15.0-74-generic package.
                    That's the package I didn't look into. It doesn't make sense to me; I expected this file in the kernel package.

                    However, do look at my post one up where I've posted the actual numbers from the benchmarks on the blog page; AFAICT they don't really match the story that is being told.
                    Yep, already replied.

                    edit: the -generic kernel in Ubuntu is installed on both desktop and server editions, so I think they chose a middle ground here. They do also provide a -lowlatency kernel though.
                    A middle ground isn't good, neither for desktop nor for server. They should provide lowlatency by default and a kernel for servers as an alternative.



                    • #80
                      Originally posted by PuckPoltergeist
                      A middle ground isn't good, neither for desktop nor for server. They should provide lowlatency by default and a kernel for servers as an alternative.
                      In the past they provided a kernel better configured for desktop. However, they later made the idiotic decision to use the same kernel for server and desktop. Totally dumb.
                      Last edited by Volta; 03 January 2020, 07:13 PM.

