Linux 5.9 To Allow Controlling Page Lock Unfairness In Addressing Performance Regression


  • #11
    Originally posted by down1
    Can someone explain why 'fairness' could trump performance in a CS setting? Is this a buzzword change or a real engineering improvement?
    Latency vs throughput is a classic problem.

    Imagine a website that can serve up 1,000 pages per second with no delays - every page is served up immediately. Or you can tweak a setting and it can serve up 5,000 pages per second.

    Much better, right? You need 5 times fewer servers! Except that now 10% of the pages served take 30 seconds before the user sees them while the other 90% are still immediate, leading to massive complaints about the website being down and unresponsive from 10% of your users even though the server is doing a lot more work more efficiently.

    Often times a little unfairness is acceptable, but once it reaches a large enough scale it's no longer worth it. That's the basis behind real-time operating systems, FYI - they are stricter about guaranteeing that low latency/high fairness for critical applications that really need it, while your typical desktop OS is ok with not everything being perfect.
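The numbers in that example can be checked with a rough sketch. The two configurations and the 30-second stall come from the post above; the function name `summarize`, the choice of mean/median/99th percentile as summary metrics, and modelling "immediate" as 0 seconds are assumptions made for illustration.

```python
def summarize(latencies):
    """Return (mean, p50, p99) for a list of per-request latencies in seconds."""
    ordered = sorted(latencies)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile, using integer ceiling division to avoid float rounding.
        rank = max(1, -(-p * n // 100))
        return ordered[rank - 1]

    return sum(ordered) / n, percentile(50), percentile(99)


# One second of traffic at 1,000 pages/s, every page served immediately.
fair = [0.0] * 1000

# One second of traffic at 5,000 pages/s, where 10% of pages stall for 30 s.
unfair = [0.0] * 4500 + [30.0] * 500

fair_stats = summarize(fair)
unfair_stats = summarize(unfair)

print("fair   (mean, p50, p99):", fair_stats)
print("unfair (mean, p50, p99):", unfair_stats)
```

The median looks identical in both cases, which is why throughput and median latency alone hide the problem; the 99th percentile is where the unlucky 10% show up.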
    Last edited by smitty3268; 09-17-2020, 11:14 PM.



    • #12
      Originally posted by down1
      Can someone explain why 'fairness' could trump performance in a CS setting? Is this a buzzword change or a real engineering improvement?
      There was another meltdown earlier this year regarding Linux spinlocks and unfairness. Some links: And Torvalds says:

      Pretty much every time we picked an unfair - but fast - locking model in the kernel, we ended up regretting it eventually, and had to add fairness
      Last edited by GrayShade; 09-18-2020, 02:19 AM.



      • #13
        Originally posted by timrichardson

        In some 'serious' linux forums, Phoronix articles and comments were treated dismissively and even contemptuously, but I haven't seen much of that attitude in the past year or so. I think partly that's because the insights have matured, but significantly because the standard of these conversation threads has improved a lot. Subjectively, I think the level of commentary is much more informed.
        I am pleased to subscribe to help support the work. This regression was a good catch.
        There are quite a number of threads on this forum where specific individuals will post very opinionated views on lots of other stuff. So it's expected that other forums will have people with strong opinions about this site or the benchmarks used too.

        But more people knowing about this site means more people who can oppose the most opinionated posters. And that can also be seen on this site - if the subject is reasonably well known, the biggest mouths aren't allowed to be unopposed with their very subjective views.



        • #14
          Originally posted by smitty3268

          Latency vs throughput is a classic problem.

          Imagine a website that can serve up 1,000 pages per second with no delays - every page is served up immediately. Or you can tweak a setting and it can serve up 5,000 pages per second.

          Much better, right? You need 5 times fewer servers! Except that now 10% of the pages served take 30 seconds before the user sees them while the other 90% are still immediate, leading to massive complaints about the website being down and unresponsive from 10% of your users even though the server is doing a lot more work more efficiently.

          Often times a little unfairness is acceptable, but once it reaches a large enough scale it's no longer worth it. That's the basis behind real-time operating systems, FYI - they are stricter about guaranteeing that low latency/high fairness for critical applications that really need it, while your typical desktop OS is ok with not everything being perfect.
          And this is also why we have things like "nice" and "priority", which make it possible to give the OS hints about how critical it is to get time slices quickly. A batch job that needs to be done sometime during the night doesn't need the same priority as the on-the-fly CAN computation before the next screen redraw. Without hints from the user, it is hard for the OS to know whether a program that has consumed 5 hours of CPU time is an irrelevant program stuck in a busy loop or a critical research computation that needs to finish so the researcher can evaluate the outcome.

          Windows tries to use a very uniform scheduling policy (which isn't very successful for lots of program types), while Linux has multiple scheduler algorithms and lots of scheduling tweak parameters.
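As a small illustration of such a hint, a process on a Unix-like system can raise its own niceness so the scheduler favours other work. This is only a sketch: the helper name `lower_own_priority` and the increment of 5 are made up for the example, and `os.nice` and `os.getpriority` are not available on Windows.

```python
import os


def lower_own_priority(increment=5):
    """Raise this process's niceness by `increment` and return (before, after).

    A higher niceness means a lower scheduling priority. Unprivileged
    processes may raise their niceness but normally cannot lower it again.
    """
    before = os.getpriority(os.PRIO_PROCESS, 0)  # 0 means "the calling process"
    after = os.nice(increment)  # returns the new niceness value
    return before, after


# Hypothetical use: a nightly batch job that should stay out of the way.
before, after = lower_own_priority(5)
print(f"niceness before: {before}, after: {after}")
```

The same hint can be given from the shell when starting a program, for example with the `nice` command.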



          • #15
            Originally posted by timrichardson

            In some 'serious' linux forums, Phoronix articles and comments were treated dismissively and even contemptuously, but I haven't seen much of that attitude in the past year or so. I think partly that's because the insights have matured, but significantly because the standard of these conversation threads has improved a lot. Subjectively, I think the level of commentary is much more informed.
            I am pleased to subscribe to help support the work. This regression was a good catch.
            this is because most of those non-benchmark articles here are just copy-paste from official announcements, with no added value. i'd rather have something done with a quarter of the effort LWN articles are done with. writing with some insights, even on a small range of topics.



            • #16
              Thank you for your work Michael.



              • #17
                What's a "Long-term Linus"?
                SCNR



                • #18
                  Side note: I think the PLU 1000 times were supposed to be similar to the 5.8 times, which turned out to be the case for Threadripper but not for the tested EPYC. In consequence, the PLU 4/5 average is faster than 5.8 only for Threadripper.

                  Originally posted by GreenByte

                  It becomes important when you juggle lots of balls at the same time. Imagine you gotta juggle 5 balls at the same time. But well, one ball is super fast, so instead of slowing down and juggling all balls you just put the other 4 down and juggle that single ball. I mean, yeah it's faster? But you were asked to juggle 5 balls, not one. Here's the issue with the fairness. In essence, the kernel would allow only one ball to get past over and over while the other 4 would stay still. Overall, performance is greater, but the kernel was asked to juggle 5 balls, not one.
                  I think your comment about one thread having better performance is especially true for throughput tests that effectively hammer on a single critical section. They create a situation where a single thread without locking would be the fastest, and a single thread with a lock and no contention would be the second fastest. I guess unfair locking, in such tests, can create a situation that is effectively similar to a single thread without contention on the lock. Because all that matters is the speedy execution of the critical section that only allows one thread at a time anyway.

                  And if you have 250 user threads but only 32 or 64 CPUs, and an unfair lock that gets a number of threads out of the way (burying them in waiting lists), then that may not have the same negative effect as it could otherwise. In the Apache test this may be balanced by network response for 200/250 users being better with more threads active.

                  Originally posted by smitty3268

                  Latency vs throughput is a classic problem.

                  Imagine a website that can serve up 1,000 pages per second with no delays - every page is served up immediately. Or you can tweak a setting and it can serve up 5,000 pages per second.

                  Much better, right? You need 5 times fewer servers! Except that now 10% of the pages served take 30 seconds before the user sees them while the other 90% are still immediate, leading to massive complaints about the website being down and unresponsive from 10% of your users even though the server is doing a lot more work more efficiently.

                  Often times a little unfairness is acceptable, but once it reaches a large enough scale it's no longer worth it. That's the basis behind real-time operating systems, FYI - they are stricter about guaranteeing that low latency/high fairness for critical applications that really need it, while your typical desktop OS is ok with not everything being perfect.
                  A related problem is when you have a number of unique threads, instead of just a bunch doing the same thing. Then there can be a problem if a unique thread with a unique function gets stuck in one of the waiting lists because there are always others going through the same lock. To illustrate that, in a car racing simulation where each car has its own thread, one of the cars might stand still for a while. And then all other cars crash into it.
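Both effects described in this post, higher throughput from unfair locking and a thread sitting in the wait list while others go through, can be seen in a toy model. This is a deterministic sketch, not the kernel's page lock code: the cost values `cs` and `handoff`, the function name `simulate`, and the `max_steals` knob (only loosely analogous to the unfairness setting discussed in the thread) are all assumptions chosen for illustration.

```python
from collections import deque


def simulate(num_threads, total_time, cs=1, handoff=4, max_steals=0):
    """Toy model of threads hammering on one lock.

    cs         -- time spent inside the critical section (made-up unit)
    handoff    -- cost of waking a *different* thread to take the lock
    max_steals -- how often the releasing thread may retake the lock in a
                  row before it must queue up; 0 means strict FIFO fairness
    Returns (acquisitions per thread, longest time any thread sat queued).
    """
    queue = deque(range(num_threads))
    counts = [0] * num_threads
    queued_at = [0] * num_threads  # time each thread joined the wait queue
    longest_wait = 0
    clock = 0
    steals = 0
    holder = queue.popleft()
    while clock + cs <= total_time:
        clock += cs  # holder runs the critical section
        counts[holder] += 1
        if steals < max_steals:
            steals += 1  # same thread barges back in, no wake-up cost
            continue
        # Holder goes to the back of the line; the oldest waiter is woken.
        queued_at[holder] = clock
        queue.append(holder)
        holder = queue.popleft()
        clock += handoff
        longest_wait = max(longest_wait, clock - queued_at[holder])
        steals = 0
    return counts, longest_wait


fair_counts, fair_wait = simulate(5, 1000, max_steals=0)
unfair_counts, unfair_wait = simulate(5, 1000, max_steals=9)

print("fair:  ", sum(fair_counts), "acquisitions, longest wait", fair_wait)
print("unfair:", sum(unfair_counts), "acquisitions, longest wait", unfair_wait)
```

With these invented costs, the fair run completes 200 critical sections with a longest wait of 24 time units, while the unfair run completes 716 with a longest wait of 60. Raising `max_steals` further pushes both numbers up, which is the trade-off the thread is about: more total work, but each individual thread can be parked for longer.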



                  • #19
                    Originally posted by smitty3268

                    Latency vs throughput is a classic problem.
                    Thanks, that's something I do understand. It initially sounded like "I bring you fairness... and it sucks"



                    • #20
                      Originally posted by timrichardson

                      In some 'serious' linux forums, Phoronix articles and comments were treated dismissively and even contemptuously, but I haven't seen much of that attitude in the past year or so. I think partly that's because the insights have matured, but significantly because the standard of these conversation threads has improved a lot. Subjectively, I think the level of commentary is much more informed.
                      I am pleased to subscribe to help support the work. This regression was a good catch.
                      The past reputation of Phoronix, I think, was also due to younger Michael's writing style and benchmark choices. It doesn't take long browsing older articles to find "Project X just rolled out new release, still doesn't implement Y" -- there were a lot of subtle digs and editorializing about things a project WASN'T accomplishing, on what was ostensibly a news post. Also, since Michael's main product is a benchmark tool, there was a tendency to focus on benchmarks that were easy to run, not necessarily ones that were meaningful or interesting.
