Benchmarking The Ubuntu "Low-Jitter" Linux Kernel


  • #16
    Hrm, I'm surprised that the low-jitter kernel had any overall performance boost at all. I'd expect performance to be down slightly, nearly across the board. Low-jitter is nice for gaming and is also nice for GPGPU, since there is less chance of the GPU going idle while waiting on the CPU. For GPGPU it takes very little CPU work to send work to the GPU, but if the CPU-side work dispatcher thread (the one feeding the GPU) gets delayed by even a single millisecond, it can mean losing a massive number of compute cycles on the GPU and a massive drop in GPU compute performance.

    When we wrote the original Folding@home wrapper for nVidia GPUs (CUDA) on Linux, jitter was a big problem, partly because, for whatever reason, we had to poll to check whether the GPU was done with the GPGPU work it had previously been sent. The problem was that we would sleep the work-dispatcher thread to avoid wasting CPU, but it might be many times longer than the requested sleep duration before that thread got any CPU again. As a result, the GPU could go idle during that time and we'd lose massive amounts of GPU performance. I think some people later fixed up our initial work on the GPGPU wrapper by actively shortening the polling interval to compensate for jitter from miscellaneous system load. Polling more aggressively caused CPU usage to skyrocket on a thread that wasn't even doing any computation, and the time between polls was still wildly variable because of jitter. I had recommended that people use low-jitter or real-time kernels for GPGPU work, as they were clearly more efficient: we didn't have to poll so aggressively, and the thread would wake up from sleep pretty much exactly when it was supposed to. However, I think nVidia later added a driver-side solution so that polling was no longer necessary for GPGPU.

    The kernel is *VERY* good at scheduling things to get the most efficient use out of the CPU, but sometimes you don't want it scheduled that way, because it can mean starving threads that use very little CPU but are very time-critical, causing the GPU to idle for *far* too long. You can "kind of" counter that by polling very aggressively, but that makes CPU usage skyrocket and wastes a lot of CPU cycles when the CPU isn't doing other work. It also doesn't absolutely guarantee that the kernel will give you the CPU when your thread needs it most.
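
    Just to put a number on that wake-up jitter: here's a rough standalone sketch (not our actual wrapper code, just an illustration) that asks for a 1 ms sleep over and over and reports how late the wake-ups really are. Run it on an idle system and again under load and you can see why a dispatcher thread that sleeps between polls can leave the GPU sitting idle.

    Code:
    // Rough sketch: measure how late a 1 ms sleep actually wakes up.
    // Run on an idle system and again under load to compare.
    #include <chrono>
    #include <cstdio>
    #include <thread>

    int main() {
        using clock = std::chrono::steady_clock;
        const auto requested = std::chrono::microseconds(1000); // 1 ms, like a poll interval
        const int iterations = 1000;
        long long total_late_us = 0;
        long long worst_late_us = 0;

        for (int i = 0; i < iterations; ++i) {
            const auto start = clock::now();
            std::this_thread::sleep_for(requested); // the dispatcher "sleeping between polls"
            const auto slept_us = std::chrono::duration_cast<std::chrono::microseconds>(
                                      clock::now() - start).count();
            const long long late_us = slept_us - requested.count();
            total_late_us += late_us;
            if (late_us > worst_late_us) worst_late_us = late_us;
        }
        std::printf("average oversleep: %lld us, worst oversleep: %lld us\n",
                    total_late_us / iterations, worst_late_us);
        return 0;
    }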

    This is the kind of thing where I think Linux can really beat Windows at its own game (pun intended). Linux can very easily be customized for optimal gaming all the way down to the kernel level (real-time / low-jitter kernels).
    Last edited by Sidicas; 10-15-2012, 11:26 PM.

    Comment


    • #17
      Originally posted by Sidicas View Post
      Hrm, I'm surprised that the low-jitter kernel had any overall performance boost at all. I'd expect performance to be down slightly, nearly across the board. [...]
      The best solution would be if the real-time kernel were merged into the mainline kernel.

      I think in the long run it's the only way to go. If people do this right, there will be no need for a separate kernel any more.

      Comment


      • #18
        Originally posted by allquixotic View Post
        Me: I concocted this really great variety of beer that's designed to keep for long periods of time without refrigeration (it's got loads of preservatives)
        Noob. Beer is very good at preserving itself without any additives or refrigeration, which is exactly why it is not kept in a freezer at the supermarket. I know because I tried one - even an unfiltered one, where the guaranteed shelf life is shorter but still in the range of months - about two years after that date, and it was still as good as ever.
        Prerequisites:
        - bottled in a glass bottle, not an aluminium can (though that should also work), or worst of all: PET
        - not opened
        - preferably strong (the stronger the better, obviously, as fewer germs survive)
        - the brewer has to know what he's doing

        ... so in short, US beer is out
        Last edited by YoungManKlaus; 10-16-2012, 02:28 AM.

        Comment


        • #19
          Lol, very exciting. So my kernel performs as well as the standard kernel. If you recall that their argument was about throughput, that argument is pretty much invalid for their config. So now you can enable low-latency in your kernel and not worry about a performance hit (well, if you have a particular workload you still might). No more stuttering videos or games (or at least we're well on the way there).

          Read also: https://lkml.org/lkml/2012/9/16/83

          http://phoronix.com/forums/showthrea...333#post291333

          Peace Be With You.
          Last edited by Paradox Uncreated; 10-16-2012, 03:20 AM.

          Comment


          • #20
            You know, I have never encountered any stuttering or mouse problems on standard kernels. So the low jitter kernel is not needed for every gamer, that's for certain.

            Comment


            • #21
              Originally posted by 89c51 View Post
              How is jitter defined for computer kernels? I know what jitter is in electronics, but kernels?
              That's the question, isn't it? I guess it has something to do with process scheduling as far as the OS kernel is concerned, but really he seems to apply the word "jitter" to just about anything, like "browser-video jitter" (on http://paradoxuncreated.com/Blog/wordpress/?p=2268).

              I guess what he mostly means is actually high-frequency frame-time variation in the frames rendered by an application, so really much the same as the electronics definition of jitter for a clock signal.

              The thing is, he doesn't in any way demonstrate which of his kernel config changes accomplished which results in his tests, if he has actually done any repeatable testing at all.

              Now, maybe you can't get FPS variation or a standard deviation out of Doom 3, but shouldn't that be the starting point? Doom 3 is open source now, after all.

              I dunno, maybe there's something to this but the documentation and presentation leave something to be desired.

              Comment


              • #22
                Well, I commented on this before. Apparently it is not something everyone notices. On Windows you have these guys running all the services and daemons, and encouraging others to do the same, when there is a clear difference.
                On my machine, the standard kernel can't even do 30fps video without stutter.
                In these discussions there seem to be several people who understand what I am talking about, yet many who argue against it.
                And one even argues "Yes, you will have ultrasmooth videos but.." - what are you saying, you WANT your videos to chop, even if throughput is on average the same? That would be retardation.

                Peace Be With You.

                Comment


                • #23
                  Which all leads to a single conclusion - we need a test that measures the delay in kernel responses, not kernel throughput / raw performance.

                  Comment


                  • #24
                    This was entirely tuned by looking at Doom 3 jitter, which I got rid of, and then by observing some simple numbers in glxgears. Lukasz Sokol said he was going to have a look at a benchmark that weekend, so try mailing him: Lukasz Sokol <elGODesDAMNcrSPAMMERSgmail.com> (remove the secret message).

                    Comment


                    • #25
                      @Paradox

                      Please create an automatic test for this jitter measurement. We have the Doom 3 source, so it should be very easy; a rough sketch follows the list below.

                      1. Insert timing calls before and after each frame
                      2. Keep track of max, and average.
                      3. At the end of a timedemo, calculate max - avg.
                      4. Print that difference both as microseconds and as a percentage of the average frame time. "Jitter for timedemo1469 was 1500 usec, or 15%".
                      5. ???
                      6. PROFIT
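
                      Something along these lines - a standalone sketch of steps 1-4, with the "frames" simulated by short sleeps since the actual Doom 3 hook points are left to whoever does the integration:

                      Code:
                      // Rough sketch of steps 1-4: time each frame, track max and average,
                      // then report (max - avg) in microseconds and as a percent of the average.
                      // The "frames" are simulated with short sleeps; in Doom 3 you would call
                      // begin_frame()/end_frame() around the real frame instead.
                      #include <chrono>
                      #include <cstdio>
                      #include <thread>

                      struct FrameTimer {
                          using clock = std::chrono::steady_clock;
                          clock::time_point frame_start;
                          long long max_us = 0;
                          long long total_us = 0;
                          long long frames = 0;

                          void begin_frame() { frame_start = clock::now(); }

                          void end_frame() {
                              const auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                                                  clock::now() - frame_start).count();
                              if (us > max_us) max_us = us;
                              total_us += us;
                              ++frames;
                          }

                          void report(const char* name) const {
                              if (frames == 0) return;
                              const long long avg = total_us / frames;
                              const long long jitter = max_us - avg;
                              std::printf("Jitter for %s was %lld usec, or %lld%%\n",
                                          name, jitter, avg ? (100 * jitter / avg) : 0);
                          }
                      };

                      int main() {
                          FrameTimer timer;
                          for (int i = 0; i < 300; ++i) { // stand-in for a timedemo run
                              timer.begin_frame();
                              std::this_thread::sleep_for(std::chrono::milliseconds(16)); // fake frame
                              timer.end_frame();
                          }
                          timer.report("timedemo1469");
                          return 0;
                      }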

                      Comment


                      • #26
                        Please read my last post if you didn't see it. Also, Doom 3 is unnecessary for getting numbers on jitter. The best approach would actually be to tax the CPU and GPU as little as possible, displaying only something simple, purely to obtain the jitter level.

                        Peace Be With You.

                        Comment


                        • #27
                          Wrong Benchmarks

                          Phoronix Test Suite is awesome sauce, of course, but it's no good if you run the wrong tests.

                          In the future, when testing latency, these are the benchmarks that would be most relevant:
                          1. Add Cyclictest from the Realtime Linux wiki to the PTS (if not already present). This is the standard Linux latency benchmark.
                          2. For game benchmarks, report the minimum FPS, median, and standard deviation instead of the average FPS. Most importantly, you want to know the single longest frame time; the 5th percentile would be useful as well (see the sketch after this list).
                          3. Throughput benchmarks should only be included as an afterthought, if at all. They don't measure the important variable.
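
                          For point 2, here is roughly the statistic set I mean, computed over a made-up list of per-frame times (a real run would log one entry per frame from the game or from PTS):

                          Code:
                          // Sketch: latency-oriented statistics over per-frame render times.
                          // The sample data is invented; a real run would log one entry per frame.
                          #include <algorithm>
                          #include <cmath>
                          #include <cstddef>
                          #include <cstdio>
                          #include <vector>

                          int main() {
                              // Hypothetical frame times in milliseconds, with one long spike.
                              std::vector<double> frame_ms = {16.6, 16.8, 16.5, 17.0, 16.7,
                                                              42.0, 16.6, 16.9, 16.5, 16.8};

                              std::vector<double> sorted = frame_ms;
                              std::sort(sorted.begin(), sorted.end());

                              const double worst_ms  = sorted.back();              // single longest frame
                              const double median_ms = sorted[sorted.size() / 2];
                              // 5th percentile of FPS corresponds to the 95th percentile of frame time.
                              const double p95_ms = sorted[static_cast<std::size_t>(0.95 * (sorted.size() - 1))];

                              double sum = 0.0, sq = 0.0;
                              for (double t : frame_ms) { sum += t; sq += t * t; }
                              const double mean  = sum / frame_ms.size();
                              const double stdev = std::sqrt(sq / frame_ms.size() - mean * mean);

                              std::printf("min FPS (worst frame): %.1f\n", 1000.0 / worst_ms);
                              std::printf("median FPS:            %.1f\n", 1000.0 / median_ms);
                              std::printf("5th percentile FPS:    %.1f\n", 1000.0 / p95_ms);
                              std::printf("frame time std dev:    %.2f ms\n", stdev);
                              return 0;
                          }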

                          Comment


                          • #28
                            Cyclictest is nice, but measuring the signal path all the way to OpenGL is what I am most interested in.
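
                            One rough way to get at that whole path, assuming GLFW for the window and context (just a convenient stand-in here, not what any particular game uses), is to render a trivial frame, force the GPU to finish it with glFinish(), and time the round trip:

                            Code:
                            // Sketch: time the full CPU -> driver -> GPU path per frame by forcing
                            // completion with glFinish(). Needs GLFW and OpenGL (e.g. -lglfw -lGL).
                            #include <GLFW/glfw3.h>
                            #include <chrono>
                            #include <cstdio>

                            int main() {
                                if (!glfwInit()) return 1;
                                GLFWwindow* win = glfwCreateWindow(640, 480, "gl-jitter", nullptr, nullptr);
                                if (!win) { glfwTerminate(); return 1; }
                                glfwMakeContextCurrent(win);
                                glfwSwapInterval(0); // no vsync, we want raw timings

                                using clock = std::chrono::steady_clock;
                                long long total_us = 0, max_us = 0, frames = 0;

                                while (!glfwWindowShouldClose(win) && frames < 1000) {
                                    const auto start = clock::now();
                                    glClearColor(0.1f, 0.1f, 0.1f, 1.0f);
                                    glClear(GL_COLOR_BUFFER_BIT); // trivial GPU work
                                    glfwSwapBuffers(win);
                                    glFinish();                   // block until the GPU is done
                                    const auto us = std::chrono::duration_cast<std::chrono::microseconds>(
                                                        clock::now() - start).count();
                                    total_us += us;
                                    if (us > max_us) max_us = us;
                                    ++frames;
                                    glfwPollEvents();
                                }
                                std::printf("avg %lld us, worst %lld us, jitter (max - avg) %lld us\n",
                                            total_us / frames, max_us, max_us - total_us / frames);
                                glfwTerminate();
                                return 0;
                            }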

                            Peace Be With You!

                            Comment


                            • #29
                              And as I said elsewhere, if it comes down to a choice between latency/max jitter and performance, 0.2 ms (200 µs) is where I stop caring.

                              Peace Be With You.

                              Comment


                              • #30
                                Originally posted by Paradox Uncreated View Post
                                And as I said elsewhere, if it comes down to a choice between latency/max jitter and performance, 0.2 ms (200 µs) is where I stop caring.
                                So you play your games at 5000 FPS?

                                Comment
