Is The Linux Kernel Scheduler Worse Than People Realize?


  • #71
    Originally posted by kebabbert View Post
    How do you mean? Can you elaborate? Are you confirming that Linux is unstable under high loads?
    Linux is fine and stable. No issues there, but if you are running complex DB queries and the machine tries to page out active RAM to disk, your performance takes a huge hit. We always scale the RAM in the boxes to be high enough to prevent paging. Yes, you can turn swap off, but Oracle *requires* at least 16GB of swap, even if you never use it. Sometimes it is about meeting the vendor's expectations, not reality.
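
    For what it's worth, a process that absolutely must not be paged out can also pin its own memory; a minimal sketch (my own illustration, nothing Oracle ships) using mlockall(), which needs CAP_IPC_LOCK or a large enough RLIMIT_MEMLOCK:

    ```c
    /* Illustrative sketch only: lock this process's pages into RAM so the
     * kernel cannot page them out, even under memory pressure. */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* Lock everything mapped now and everything mapped later. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
            perror("mlockall");
            return 1;
        }
        puts("Memory locked; active pages will stay resident.");
        /* ... latency-sensitive work goes here ... */
        return 0;
    }
    ```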

    Comment


    • #72
      Originally posted by kebabbert View Post
      There were no such large Linux servers until just a couple of months ago - the HP Kraken is a redesigned 16-socket version of their 64-socket Unix server
      https://www.sgi.com/products/servers/uv/uv_3000_30.html
      up to 256 CPU sockets and 64TB of cache-coherent shared memory in a single system.
      ...
      A choice of unmodified SUSE® Linux® Enterprise Server or Red Hat® Enterprise Linux
      Compute node? Yes. Server? Debatable; it has been used for that, yes. This was released ... a year ago? RHEL does not use the stock scheduler, and I would guess SLES follows suit.

      But even as a compute node I would be relying on thread pinning & affinity rather than any OS scheduler. Too much NUMA cross-talk otherwise.

      RHEL 7 with the default kernel outperforms much newer unpatched kernels on large machines (I will be testing the wasted-cores patch on a 160-thread machine tomorrow).
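
      To make "thread pinning & affinity" concrete, here is a rough sketch (my own illustration, with an arbitrary CPU number) of pinning the calling thread to a single CPU so the scheduler's load balancing stops moving it around:

      ```c
      /* Sketch: pin the calling thread to one CPU so its placement no longer
       * depends on the OS scheduler's load balancing. CPU 0 is an arbitrary pick. */
      #define _GNU_SOURCE
      #include <pthread.h>
      #include <sched.h>
      #include <stdio.h>
      #include <string.h>

      int main(void)
      {
          cpu_set_t set;
          CPU_ZERO(&set);
          CPU_SET(0, &set);                 /* allow only CPU 0 */

          int err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
          if (err != 0) {
              fprintf(stderr, "pthread_setaffinity_np: %s\n", strerror(err));
              return 1;
          }
          printf("Pinned to CPU 0; no more cross-node migrations for this thread.\n");
          return 0;
      }
      ```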
      Last edited by ahab; 04-25-2016, 12:52 AM.

      Comment


      • #73
        Originally posted by SystemCrasher View Post
        In games you care about both performance and latency, and somehow they often work against each other. The best bulk performance is achieved if you just run code and never interrupt it, so fewer context switches happen, the cache stays hot, and so on. But guess what happens to latency? Everything else stalls for a while, and latency suffers. Trying to get better latency can reduce bulk performance a bit. As an obvious example, the default Ubuntu kernel is not preemptible. That buys a bit more throughput, sure, but it hurts latency a lot and the user experience can be crappy under load. If the kernel does something for a while and can't be interrupted, you get a really laggy system.
        Games are actually not hugely affected by latency in most situations; remember that you have a target 16ms render time, which is HUGE in computing terms. The key is making sure you don't stall out the GPU, which effectively means that the handful of threads that manage that workload get executed ASAP. Stalling out those one or two threads WILL kill performance.

        On Windows, this isn't a major concern. The developer just gives those threads higher base priority within the application, and that is enough to make sure they're running 99% of the time. Sure, the threads may jump cores from time to time, but in the grand scheme of things the GPU is always fed [within the limitations of the DX11 API, anyway]. On Linux, however, you're sharing load with other threads on your per-core runqueue, and heaven help you if your program's main rendering thread literally stops running HALF THE TIME because it didn't get load balanced to a core all by itself.

        So I'm going to repeat what I've been saying for years: for light-workload, latency-based programs, the Linux scheduler is fine. For games, it isn't even woefully insufficient, it's just WRONG. The Linux scheduler, as designed, is not suited for the type of work games demand of it.
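
        For comparison, the closest Linux analogue to "give those threads higher base priority" would be something like the sketch below (my own illustration, not from any real engine): move the render thread into a realtime scheduling class so CFS load balancing no longer decides when it runs. It needs CAP_SYS_NICE or a suitable RLIMIT_RTPRIO, and the priority value is arbitrary:

        ```c
        /* Sketch: promote the calling thread (imagine it is the render thread) to
         * SCHED_FIFO so it preempts all normal CFS threads whenever it is runnable.
         * Requires CAP_SYS_NICE or an adequate RLIMIT_RTPRIO; priority 10 is arbitrary. */
        #include <pthread.h>
        #include <sched.h>
        #include <stdio.h>
        #include <string.h>

        int main(void)
        {
            struct sched_param sp = { .sched_priority = 10 };

            int err = pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
            if (err != 0) {
                fprintf(stderr, "pthread_setschedparam: %s\n", strerror(err));
                return 1;
            }
            puts("Render thread now runs ahead of every normal (CFS) thread.");
            /* ... render loop would go here ... */
            return 0;
        }
        ```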

        Comment


        • #74
          Originally posted by F1esDgSdUTYpm0iy View Post
          The burden of proof is on you, sir. First, re-read what I wrote. Did I discount your claim? No, I did not. I just asked for citation, nothing more. For proof of your claim.

          So, the next time, before you go on a (repeated) rant-binge, please exercise the very vital skill of reading comprehension. Do not merely read the words one by one. Read the message and understand it. What does it say? What did I say? Did I say "No, you're wrong!"? No, I did not. Did I imply it? Not the point. I asked for proof, nothing more. You made a claim; the burden of proof most decidedly is on you.
          Well, there are no high-end business servers running Linux. How can I prove this? One way would be to email every computer vendor and ask them; I am not going to do that. Instead, there should be links to such high-end business servers out there, but no such links exist, and they never have. If you can find ONE single link, on the entire internet, it means I am wrong. But how can I prove that high-end business servers running Linux do not exist? I cannot. How do you prove that the Snowman does not own a bit of land in Hawaii? You cannot. Maybe he does.

          But if you look at business benchmarks, such as SAP, TPC-C, etc. - you will not find any Linux servers there. Nowhere. The high-end business market is extremely lucrative and everyone wants to go there. For instance, one single IBM P595 32-socket Unix server cost $35 million. Yes, ONE single server. Compare this to a large cluster with hundreds of CPUs, which is much cheaper. Everybody wants to get into the high-end Enterprise market because it is so lucrative, but there are NO Linux servers there. Just google and try to find ONE single link; if you can find it, I will eat my hat. But you won't. For instance, SGI, who sell the UV2000 and UV3000 clusters, explain why they will not be able to enter the high-end Enterprise market:
          http://www.realworldtech.com/sgi-interview/6/
          "...However, scientific HPC applications have very different operating characteristics from commercial business applications. Typically, much of the work in scientific code is done inside loops, whereas commercial applications, such as database or ERP software are far more branch intensive. This makes the memory hierarchy more important, particularly the latency to main memory. Whether Linux can scale well with a workload is an open question. However, there is no doubt that with each passing month, the scalability in such environments will improve. Unfortunately, SGI has no plans to move into this Enterprise market, at this point in time...."

          To enter this Enterprise market, you need to scale extremely well, to be able to tackle the largest workloads that can only run on 16- or 32-socket servers with hundreds or even thousands of threads. And we all know that Linux cannot scale to that extent. Therefore Linux cannot run large enterprise workloads, and therefore such Linux servers do not exist. SGI explain that they cannot scale well enough for the Enterprise market, because their servers are clusters.

          Comment


          • #75
            Originally posted by ahab View Post

            https://www.sgi.com/products/servers/uv/uv_3000_30.html
            Compute node? Yes. Server? Debatable; it has been used for that, yes. This was released ... a year ago? RHEL does not use the stock scheduler, and I would guess SLES follows suit.

            But even as a compute node I would be relying on thread pinning & affinity rather than any OS scheduler. Too much NUMA cross-talk otherwise.

            RHEL 7 with the default kernel outperforms much newer unpatched kernels on large machines (I will be testing the wasted-cores patch on a 160-thread machine tomorrow).
            There are no customers running business workloads on the UV3000. I haven't even checked SGI's website on this, because I know it cannot be done. I actually saw a database benchmark on an SGI UV2000 - but it used... 8 sockets. Why did they not use 256 sockets for the benchmark? Because it is a cluster, and as SGI says in the interview I quoted in a previous post, clusters cannot run business workloads. So I invite you to post ONE single link where ANY customer runs business enterprise workloads on ANY SGI server with hundreds of sockets. None exist, because Linux cannot scale, according to the SGI link I posted.

            Comment


            • #76
              ... on second thought, just leave it. Not worth it.
              Last edited by F1esDgSdUTYpm0iy; 04-26-2016, 07:32 PM.

              Comment


              • #77
                Originally posted by gamerk2 View Post

                Games are actually not hugely affected by latency in most situations; remember that you have a target 16ms render time, which is HUGE in computing terms. The key is making sure you don't stall out the GPU, which effectively means that the handful of threads that manage that workload get executed ASAP. Stalling out those one or two threads WILL kill performance.

                On Windows, this isn't a major concern. The developer just gives those threads higher base priority within the application, and that is enough to make sure they're running 99% of the time. Sure, the threads may jump cores from time to time, but in the grand scheme of things the GPU is always fed [within the limitations of the DX11 API, anyway]. On Linux, however, you're sharing load with other threads on your per-core runqueue, and heaven help you if your program's main rendering thread literally stops running HALF THE TIME because it didn't get load balanced to a core all by itself.

                So I'm going to repeat what I've been saying for years: for light-workload, latency-based programs, the Linux scheduler is fine. For games, it isn't even woefully insufficient, it's just WRONG. The Linux scheduler, as designed, is not suited for the type of work games demand of it.
                Concur. There are already games on Windows that routinely employ 6-8 threads. With the current scheduler, playing such a game on Linux would bring forth a ton of cursing and a quick reboot into Windows.

                16ms means roughly 62Hz. That's enough for slow-paced games but not for first-person shooters and other fast games. More dedicated gamers use expensive 120Hz or 144Hz monitors, or have 60Hz screens but run their games without v-sync (which works with some of the 1ms or 2ms panels). If you play something like Battlefield 4 you do not want your frames-per-second under 100-120, or you can FEEL the game becoming slightly sluggish even though it might look smooth on screen. It's perception: you see and process the screen at the same time as you manipulate the mouse and keyboard, and if the two are even slightly out of sync you can literally feel it. It's a sort of input lag, and gamers hate it sorely.

                If you want Linux to be taken seriously for gaming, you have to factor that in. If gamers are not getting at least equal-to-Windows performance out of their rigs, they are simply going to use Windows - especially for games with money on the line (ESL tournaments, for example). Even on Windows, gaming machines are constantly being tuned and tweaked to get as much performance out of them as possible.
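
                To put numbers on that, the per-frame budget shrinks quickly as the refresh rate rises; a trivial check (my own arithmetic, not from the thread):

                ```c
                /* Frame-time budget at common refresh rates: budget = 1000 ms / rate. */
                #include <stdio.h>

                int main(void)
                {
                    const double rates_hz[] = { 60.0, 120.0, 144.0 };
                    for (int i = 0; i < 3; i++)
                        printf("%5.0f Hz -> %5.2f ms per frame\n", rates_hz[i], 1000.0 / rates_hz[i]);
                    return 0;
                }
                /* Prints roughly 16.67 ms, 8.33 ms and 6.94 ms respectively. */
                ```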

                Comment


                • #78
                  Originally posted by aht0 View Post
                  games without v-sync
                  More than anything, VSync is the "issue" - not the exact FPS or the refresh rate of the monitor. At least, that's been my experience as a gamer. Things like FreeSync were created, among other reasons, precisely because synchronization is a cause of lag.

                  Comment


                  • #79
                    Here is the only mention I have seen on LKML thus far.

                    https://lkml.org/lkml/2016/4/23/135

                    Comment


                    • #80
                      Originally posted by duby229 View Post
                      Certainly SSDs have more bandwidth and lower latency than HDDs.
                      They have ORDERS OF MAGNITUDE lower latency. Random seeks grind HDDs into the ground; SSDs are fine with them.

                      That's true, but regardless they are still horribly slow - in fact the slowest component of your computer by thousands of times.
                      My RAM does ~17 GBytes/s on a quite powerful desktop; an SSD read is more like 200 MBytes/s. Just some hundreds, eh? Though it comes with worse latency, and write speed is worse still. But intense writes, where that would get noticeable, are quite rare (otherwise you'd wear the drive out really fast anyway).

                      To make things more "funny"... ever seen the "text file busy" message? When you try to overwrite a running executable, the attempt is denied with this error. The reason is that the system relies on being able to re-read pages from the file, so it can discard those pages from RAM even if there is no swap. Should the OS face severe memory pressure, even a swap-less system can get laggy due to pages being discarded and re-read from a (slow!) HDD. Somehow I'm not smartass enough to know whether there is a knob to disable this latency-hostile mechanism; I'm very curious about it, so if someone knows the answer they're really welcome. SSDs make the re-reads happen orders of magnitude faster, and if something runs away, the OOM killer kicks in within 1-2 seconds, keeping the associated lag spike short. Mechanical HDDs will give you some paging-related lag under these circumstances - nowhere near as bad as with swap, but still more than most users would prefer.
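
                      The "text file busy" behaviour is easy to see for yourself; a minimal sketch (my own illustration) that tries to open its own running binary for writing, which on Linux should fail with ETXTBSY:

                      ```c
                      /* Sketch: trigger ETXTBSY ("Text file busy") by opening the currently
                       * running executable for writing via /proc/self/exe. */
                      #include <errno.h>
                      #include <fcntl.h>
                      #include <stdio.h>
                      #include <string.h>
                      #include <unistd.h>

                      int main(void)
                      {
                          int fd = open("/proc/self/exe", O_WRONLY);
                          if (fd < 0) {
                              /* Expected outcome: errno == ETXTBSY. */
                              printf("open failed: %s (errno=%d)\n", strerror(errno), errno);
                              return 0;
                          }
                          puts("Unexpectedly allowed to open the running binary for writing.");
                          close(fd);
                          return 0;
                      }
                      ```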

                      The best way to implement a highly responsive system is to reduce bloat by removing all the things you'll never use. Make sure there is free RAM and available cores
                      Sure, it helps. And by the way, the CPU scheduler isn't bad, so even if the CPU is busy and no free cores are available, things are hardly going to get laggy unless someone has terribly screwed things up, e.g. by running CPU-heavy processes at the highest priorities (or even realtime). Unless such suicidal actions are in effect, the CPU scheduler will do the trick. RAM, on the other hand, is quite important due to the issues mentioned above.

                      Comment
