Low Latency
Looking at the 2756 Hz figures below: on 2.6.39 there are some failures even when using high priority tasks (nhi != 0), i.e. a thread missed its wakeup by an excessive margin.
3.3.2 is a little better: fewer failures, and on the whole the oversleep amount (avr) is a lot less.
The realtime kernel (3.0.14-rt31) fares more or less in the same ballpark as the 3.3.2 kernel, but mostly with longer average wakeup latencies (avr).
Even under heavier scheduling loads and timing constraints (higher wakeup Hz), the 3.3.2 kernel is actually pretty damn good when it comes to scheduling.
I have a little scheduling latency tester (I do realtime audio, so it's something of interest).
Basically it runs either zero or N*2 looping background threads (niced), and then 1 or N measuring threads (regular or high priority/realtime) that use nanosleep to try to wake up at predefined times.
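Not the actual tester, just a minimal single-threaded sketch of the measuring loop. Assumptions: Python's time.sleep stands in for nanosleep, there are no background threads and no SCHED_RR, and an iteration counts as a fail when the oversleep exceeds one full period (the real failure criterion isn't stated here). The name measure_oversleep is made up for illustration.

```python
import time


def measure_oversleep(freq_hz, iterations):
    """Sleep until fixed absolute deadlines and record how late each wakeup is."""
    period_ns = int(1e9 / freq_hz)
    oversleeps = []
    fails = 0
    deadline = time.monotonic_ns() + period_ns
    for _ in range(iterations):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)  # stand-in for nanosleep
        late = time.monotonic_ns() - deadline
        oversleeps.append(max(late, 0))
        if late > period_ns:  # assumed failure criterion: more than one period late
            fails += 1
        deadline += period_ns  # deadlines stay on a fixed grid, so lateness does not accumulate
    return {
        "its": iterations,
        "fails": fails,
        "avr": sum(oversleeps) / len(oversleeps),  # average oversleep in ns
        "mnan": max(oversleeps),                   # maximum oversleep in ns
    }


result = measure_oversleep(2756, 200)
print(result)
```

The numbers from an interpreter at default priority will be far noisier than anything in the figures, so treat this purely as an illustration of what "oversleep" means.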
Take the numbers with a grain of salt; I've only used it for internal stuff. The testing could be better too, e.g. by using locked memory and forcing swapping during execution.
I'm aware I'm not testing response to IO events here, but it does give some interesting numbers.
These figures are for kernels with a wakeup frequency around 2756 Hz (I test lots more frequencies, but around here is where the CPU shows up some issues for the laptop it's running on):
Quick explanation of the figures:
its = iterations
freq/frq = wakeup frequency
perNan = period between wakeups in nanoseconds
perMic = period between wakeups in microseconds
perMil = period between wakeups in milliseconds
nhi = number of high priority threads (SCHED_RR)
nlo = number of regular priority threads (normal Linux scheduling)
bbac = number of background looping threads
fails = number of iterations that failed to meet the wakeup deadline
avr = average "oversleep"
mnan = maximum "oversleep" in nanoseconds
mmic = maximum "oversleep" in microseconds
mmil = maximum "oversleep" in milliseconds
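For reference, the perNan/perMic/perMil values follow directly from the wakeup frequency (period = 1 / frequency). A small sketch; the function name wakeup_period is hypothetical and only there for illustration:

```python
def wakeup_period(freq_hz):
    """Return the period between wakeups in ns, us and ms for a given frequency."""
    per_nan = 1e9 / freq_hz
    return {
        "perNan": per_nan,
        "perMic": per_nan / 1e3,
        "perMil": per_nan / 1e6,
    }


p = wakeup_period(2756)
print(f"perNan({p['perNan']:.2f}) perMic({p['perMic']:.2f}) perMil({p['perMil']:.2f})")
# prints: perNan(362844.70) perMic(362.84) perMil(0.36)
```

So at 2756 Hz each thread has roughly 363 microseconds between wakeups, which is the budget the "oversleep" figures should be read against.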
Code:
2.6.39-preempt-gentoo-r3
Frequency pass: freq(2756) perNan(362844.70) perMic( 362.84) perMil( 0.36)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(01) bbac(00) fails( 1) avr( 78600.29) mnan( 383190) mmic( 383.19) mmil( 0.38)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(01) bbac(08) fails( 0) avr( 50682.76) mnan( 228595) mmic( 228.59) mmil( 0.23)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(00) fails( 13) avr( 73451.72) mnan( 978501) mmic( 978.50) mmil( 0.98)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(08) fails( 8) avr( 54678.49) mnan( 2271642) mmic( 2271.64) mmil( 2.27)
One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(00) fails( 5) avr( 36309.15) mnan( 839371) mmic( 839.37) mmil( 0.84)
One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(08) fails( 1) avr( 4686.12) mnan( 1854080) mmic( 1854.08) mmil( 1.85)
One pass: its(2000) frq( 2756.00) nhi(02) nlo(00) bbac(00) fails( 0) avr( 22805.04) mnan( 172117) mmic( 172.12) mmil( 0.17)
One pass: its(2000) frq( 2756.00) nhi(02) nlo(00) bbac(08) fails( 0) avr( 6372.65) mnan( 63957) mmic( 63.96) mmil( 0.06)
Code:
3.3.2-preempt-c2T7400-2.16
Frequency pass: freq(2756) perNan(362844.70) perMic( 362.84) perMil( 0.36)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(01) bbac(00) fails( 0) avr( 88775.06) mnan( 211109) mmic( 211.11) mmil( 0.21)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(01) bbac(08) fails( 0) avr( 54785.60) mnan( 239303) mmic( 239.30) mmil( 0.24)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(00) fails( 0) avr( 90745.55) mnan( 341535) mmic( 341.54) mmil( 0.34)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(08) fails( 1) avr( 52708.94) mnan( 543801) mmic( 543.80) mmil( 0.54)
One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(00) fails( 0) avr( 87453.27) mnan( 154366) mmic( 154.37) mmil( 0.15)
One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(08) fails( 0) avr( 4731.63) mnan( 17300) mmic( 17.30) mmil( 0.02)
One pass: its(2000) frq( 2756.00) nhi(02) nlo(00) bbac(00) fails( 0) avr( 35838.07) mnan( 168112) mmic( 168.11) mmil( 0.17)
One pass: its(2000) frq( 2756.00) nhi(02) nlo(00) bbac(08) fails( 0) avr( 5730.70) mnan( 71996) mmic( 72.00) mmil( 0.07)
Code:
3.0.14-rt31-preempt-c2T7400-2.16
Frequency pass: freq(2756) perNan(362844.70) perMic( 362.84) perMil( 0.36)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(01) bbac(00) fails( 0) avr( 78913.17) mnan( 345092) mmic( 345.09) mmil( 0.35)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(01) bbac(08) fails( 0) avr( 52385.20) mnan( 162347) mmic( 162.35) mmil( 0.16)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(00) fails( 0) avr( 88372.10) mnan( 349752) mmic( 349.75) mmil( 0.35)
One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(08) fails( 1) avr( 55378.42) mnan( 751052) mmic( 751.05) mmil( 0.75)
One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(00) fails( 0) avr( 28074.12) mnan( 109365) mmic( 109.36) mmil( 0.11)
One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(08) fails( 0) avr( 5074.74) mnan( 18434) mmic( 18.43) mmil( 0.02)
One pass: its(2000) frq( 2756.00) nhi(02) nlo(00) bbac(00) fails( 0) avr( 65620.09) mnan( 137933) mmic( 137.93) mmil( 0.14)
One pass: its(2000) frq( 2756.00) nhi(02) nlo(00) bbac(08) fails( 0) avr( 5588.25) mnan( 27105) mmic( 27.11) mmil( 0.03)
Let's look at some heavier loads (higher wakeup Hz):
Code:
3.3.2-preempt-c2T7400-2.16
Frequency pass: freq(3000) perNan(333333.33) perMic( 333.33) perMil( 0.33)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(01) bbac(00) fails( 0) avr( 147932.56) mnan( 189177) mmic( 189.18) mmil( 0.19)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(01) bbac(08) fails( 1) avr( 54985.65) mnan( 846406) mmic( 846.41) mmil( 0.85)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(02) bbac(00) fails( 1) avr( 99805.41) mnan( 1078824) mmic( 1078.82) mmil( 1.08)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(02) bbac(08) fails( 1) avr( 50858.96) mnan( 935250) mmic( 935.25) mmil( 0.94)
One pass: its(2000) frq( 3000.00) nhi(01) nlo(00) bbac(00) fails( 0) avr( 86170.02) mnan( 109395) mmic( 109.39) mmil( 0.11)
One pass: its(2000) frq( 3000.00) nhi(01) nlo(00) bbac(08) fails( 0) avr( 4755.44) mnan( 38145) mmic( 38.15) mmil( 0.04)
One pass: its(2000) frq( 3000.00) nhi(02) nlo(00) bbac(00) fails( 0) avr( 40428.37) mnan( 142389) mmic( 142.39) mmil( 0.14)
One pass: its(2000) frq( 3000.00) nhi(02) nlo(00) bbac(08) fails( 0) avr( 5128.04) mnan( 55881) mmic( 55.88) mmil( 0.06)
Code:
3.0.14-rt31-preempt-c2T7400-2.16
Frequency pass: freq(3000) perNan(333333.33) perMic( 333.33) perMil( 0.33)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(01) bbac(00) fails( 1) avr( 85895.76) mnan( 364207) mmic( 364.21) mmil( 0.36)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(01) bbac(08) fails( 2) avr( 55065.84) mnan( 995272) mmic( 995.27) mmil( 1.00)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(02) bbac(00) fails( 2) avr( 96075.35) mnan( 654147) mmic( 654.15) mmil( 0.65)
One pass: its(2000) frq( 3000.00) nhi(00) nlo(02) bbac(08) fails( 0) avr( 64900.84) mnan( 146107) mmic( 146.11) mmil( 0.15)
One pass: its(2000) frq( 3000.00) nhi(01) nlo(00) bbac(00) fails( 0) avr( 53375.44) mnan( 134737) mmic( 134.74) mmil( 0.13)
One pass: its(2000) frq( 3000.00) nhi(01) nlo(00) bbac(08) fails( 0) avr( 5056.19) mnan( 9248) mmic( 9.25) mmil( 0.01)
One pass: its(2000) frq( 3000.00) nhi(02) nlo(00) bbac(00) fails( 0) avr( 93658.56) mnan( 140582) mmic( 140.58) mmil( 0.14)
One pass: its(2000) frq( 3000.00) nhi(02) nlo(00) bbac(08) fails( 0) avr( 5230.94) mnan( 9641) mmic( 9.64) mmil( 0.01)
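If you want to compare passes without eyeballing them, lines in the format above can be parsed with a small regex. This is a hypothetical helper, not part of the tester; parse_pass and summarize are made-up names, and the two sample lines are copied from the 2.6.39 run at 2756 Hz.

```python
import re

# Matches "name( value)" pairs such as "fails( 13)" or "mmic( 978.50)".
FIELD = re.compile(r"(\w+)\(\s*([0-9.]+)\)")

SAMPLE = [
    "One pass: its(2000) frq( 2756.00) nhi(00) nlo(02) bbac(00) fails( 13) "
    "avr( 73451.72) mnan( 978501) mmic( 978.50) mmil( 0.98)",
    "One pass: its(2000) frq( 2756.00) nhi(01) nlo(00) bbac(08) fails( 1) "
    "avr( 4686.12) mnan( 1854080) mmic( 1854.08) mmil( 1.85)",
]


def parse_pass(line):
    """Turn one 'One pass:' line into a dict of field name -> float."""
    return {name: float(value) for name, value in FIELD.findall(line)}


def summarize(lines):
    """Total the failures and find the worst maximum oversleep (in microseconds)."""
    rows = [parse_pass(line) for line in lines]
    return {
        "total_fails": int(sum(row["fails"] for row in rows)),
        "worst_mmic": max(row["mmic"] for row in rows),
    }


print(summarize(SAMPLE))
```

Feeding it all eight passes of each kernel gives a quick per-kernel failure count and worst-case oversleep to set side by side.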
I've not yet found timing constraints tight enough that the realtime kernel fares better than the stock 3.3.2.
Now for gaming it gets tricky: as other posters have mentioned upthread, you really want to be measuring the latency from, say,
mouse click -> graphical response
So in an ideal world, you create a USB device with a little light on it and, using a high speed camera, record when you click (light comes on) and when the screen displays a result. Then compute the latency from the number of frames the camera recorded in between.
I know I should release the tester (peer review, eyes over the code etc.) but for the moment it's tied to the internal code base here and I have higher priority fish to fry. I just thought some might find the figures interesting.
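The frame counting arithmetic is simple enough to sketch. The function name and the camera numbers (1000 fps, 48 frames) are hypothetical, purely to show the calculation; the resolution of the measurement is one camera frame.

```python
def latency_from_frames(frames_between, camera_fps):
    """Convert a count of camera frames into a latency in milliseconds.

    Returns (latency_ms, resolution_ms), where the resolution is the
    duration of a single camera frame.
    """
    latency_ms = frames_between * 1000.0 / camera_fps
    resolution_ms = 1000.0 / camera_fps
    return latency_ms, resolution_ms


# Hypothetical example: light comes on at frame 0, screen changes 48 frames later.
latency, resolution = latency_from_frames(48, 1000)
print(f"{latency:.1f} ms (+/- {resolution:.1f} ms)")
# prints: 48.0 ms (+/- 1.0 ms)
```

The faster the camera, the finer the resolution; at 240 fps each frame is already worth a bit over 4 ms.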
Cheers,
D