Announcement

**oiaohm** · 26 July 2022, 07:14 PM

Originally posted by arQon View Post

Not sure about the "might" there. You *never* want threads actually running at realtime, unless you do. You're sharing a finite resource, and you can absolutely break a system if you try to schedule more RT threads than you have cores (and sometimes even with just a single RT thread).

This is a question of RTOS design: how should the RTOS fail in this case? The choice is basically to stop processes dead or to go soft. Once you get to safety-critical certified hard real-time, stopping the complete system because software has allocated too much to the CPU is not allowed.

The reality is that if you can totally break the system by attempting to run too many threads, in many cases you cannot get safety-critical certification under IEC 61508.

IEC 61508 turns out to be a real pain in the butt when designing an RTOS. Stopping processes from starting can make a solution fail IEC 61508, and so can letting the system go soft.

Remember that the point of killing processes under IEC 61508 is that you go into a controlled failure, not an uncontrolled one. The same goes for letting the system go soft for a short while in order to perform a controlled crash.

Because the Linux kernel's version of real-time is designed with safety-critical use in mind, if you push too hard and have not set cgroups so that hard limits apply, you fall through to soft real-time when more threads are pushed than the CPU can handle. This is not an error on the Linux kernel side. It is an error of not understanding safety-critical hard real-time kernel behaviour: you have to declare that an overloaded CPU is a failure, otherwise failing to soft is to be presumed. Declaring hard limits means the OS should stop you from starting threads that could possibly overload the CPU.

A safety-critical hard real-time RTOS has some extra defined behaviours over a plain hard real-time OS. A hard real-time RTOS where you can break the system completely because you tried to schedule too many threads is not a safety-critical RTOS.

Red Hat and other parties want safety-critical RTOS solutions for cars and so on.

Linux kernel real-time behaviour does not make sense to some people because they do not realise that the Linux kernel is being designed for safety-critical usage long term, or that the Linux kernel has SIL 1 and SIL 2 safety-critical certifications done and is working on SIL 3.

Falling through from hard real-time to soft real-time is something that people who don't understand safety-critical requirements think is an error, when in fact it is a functional requirement. Not knowing the safety-critical requirements also leads you to think that completely crashing the system is a valid failure when you try to schedule more RT threads than you have cores to handle, since it is up to the person developing the real-time solution to avoid this problem. That is not valid under safety-critical rules because you have to try to mitigate developer errors as well: uncontrolled errors equal someone dead, so every error must, if possible, be a controlled failure. A controlled failure might still kill someone, but it is far less likely to than an uncontrolled failure.

Originally posted by arQon View Post

It's worth pointing out that the permissions around any negative use of nice kinda suck. If you're hoping that this can be used for games, you very much don't want that: it would take us back to W95 behavior with everything running as root. However, if you've got spare cores and a preempt kernel, you should end up with much better results even without any code using this directly, just because of Linux's latency issues. It's "really" for systems use though.

This has been an ongoing side effect of more real-time work being added: the overall jitter/latency of the Linux kernel is coming down.

Please note that PREEMPT_RT getting into mainline is not the end of the story either. After that comes getting more automated testing up, to make sure new patches are not breaking the real-time support.

Real-time and safety-critical happen to be different things, but where they overlap they create some horrible headaches for OS and solution designers.

**cj.wijtmans** · 26 July 2022, 07:59 PM

Originally posted by arQon View Post

It does, but

> Or are realtime threads still not realtime enough without realtime kernel fixes?

They aren't. Often they aren't even close.

> On a system you might not want all threads real time whereas you want others real time scheduled.

Not sure about the "might" there. You *never* want threads actually running at realtime, unless you do. You're sharing a finite resource, and you can absolutely break a system if you try to schedule more RT threads than you have cores (and sometimes even with just a single RT thread).

> [if] you want code executed at every 16.6ms is a pita to program and practically impossible.

That's what vsync is for.

It's worth pointing out that the permissions around any negative use of nice kinda suck. If you're hoping that this can be used for games, you very much don't want that: it would take us back to W95 behavior with everything running as root. However, if you've got spare cores and a preempt kernel, you should end up with much better results even without any code using this directly, just because of Linux's latency issues. It's "really" for systems use though.

Vsync is nice for a video thread but what if you have the game logic separated into a thread (game ticks are not the same as framerate, especially on a server)? In my case I want the thread to wait for the next tick. Impossible since sleep() will sometimes take so long it will miss the next tick. You can make a tickless game logic thread of course, but there are a few issues with that. First, there is no accurate simulation: if you want accurate simulation you need to schedule tasks and make sure those tasks are done within a tick. Secondly, it makes time dilation kinda tricky if not impossible. Sure, I could abuse vsync, but that doesn't work on a server.

**wertigon** · 27 July 2022, 12:44 AM

Originally posted by cj.wijtmans View Post

In my case I want the thread to wait for the next tick. Impossible since sleep() will sometimes take so long it will miss the next tick.

If this is important then you could perhaps mitigate by doing:

Code:

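const long long period = 1000;   // one ms, in microseconds
long long nextWake;

// now_us() is a placeholder for a microsecond monotonic clock, e.g. one built
// on clock_gettime(CLOCK_MONOTONIC); time() itself only has one-second resolution.
while (1) {
    nextWake = now_us() + period;   // deadline for the next tick
    do_game_stuff();

    do usleep(10); while (now_us() < nextWake);   // nap in short steps until the deadline
}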

This is not guaranteed to work all the time, every time, but it will pretty much guarantee your simulation runs on time unless the system is overscheduled all the time, at which point you have bigger problems anyway.

Sometimes you might get a couple of ms of stutter though.

**coder** · 27 July 2022, 04:07 AM

Originally posted by cj.wijtmans View Post

Vsync is nice for a video thread but what if you have the game logic separated into a thread

I don't know what to make of the VSync point, because it seems to me that the only way you could tie into a true hardware interrupt is in the driver. And in a hardware ISR, you usually can't do very many interesting things.

Besides, in the age of VRR displays, VSync means a lot less than it used to.

Originally posted by cj.wijtmans View Post

In my case I want the thread to wait for the next tick. Impossible since sleep() will sometimes take so long it will miss the next tick.

I haven't thought much about game engine design since my early days, but it seems to me you're wrestling with a fundamental tradeoff between low latency and high consistency. The only way I see to negotiate it, without going deep into thread scheduling & prioritization, is to give up a little latency by working ahead. I heard someone mention a certain game (I forget which) that pipelines work on up to 4 consecutive frames, concurrently.

**coder** · 27 July 2022, 04:19 AM

Originally posted by wertigon View Post

If this is important then you could perhaps mitigate by doing:

So, your assumption is that by sleeping at such a fine granularity, usleep() won't actually yield the timeslice? Because, if it does, and isn't immediately resumed, then this might be no solution at all. I think kernel sleeps use the underlying kernel clock, which is typically just around 1 kHz. It wouldn't be too hard to run this experiment and see what happens under CPU contention.

In the worst case, your solution could be counter-productive, in that by burning more cycles, the task could be more likely to lose out in cases of CPU contention.
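One way to run the experiment mentioned above is to time a short sleep against the monotonic clock and record the worst overshoot, once on an idle machine and once under load. The sketch below is only an illustration: it uses nanosleep() rather than usleep(), and the helper names `now_ns`, `measure_sleep_ns` and `worst_overshoot_ns` are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleep for requested_ns and return how long the call really took. */
static int64_t measure_sleep_ns(int64_t requested_ns) {
    assert(requested_ns >= 0);
    struct timespec req = {
        .tv_sec = (time_t)(requested_ns / 1000000000LL),
        .tv_nsec = (long)(requested_ns % 1000000000LL),
    };
    int64_t start = now_ns();
    nanosleep(&req, NULL);   /* returns early only if a signal interrupts it */
    return now_ns() - start;
}

/* Worst overshoot (actual minus requested) seen across `rounds` sleeps. */
static int64_t worst_overshoot_ns(int64_t requested_ns, int rounds) {
    int64_t worst = 0;
    for (int i = 0; i < rounds; i++) {
        int64_t over = measure_sleep_ns(requested_ns) - requested_ns;
        if (over > worst) worst = over;
    }
    return worst;
}
```

Calling `worst_overshoot_ns(10000, 1000)` while a CPU-heavy job runs on the same cores, and again on an idle system, would show how much a 10 us sleep really stretches in each case.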

**wertigon** · 27 July 2022, 08:16 AM

Originally posted by coder View Post

So, your assumption is that by sleeping at such a fine granularity, usleep() won't actually yield the timeslice? Because, if it does, and isn't immediately resumed, then this might be no solution at all. I think kernel sleeps use the underlying kernel clock, which is typically just around 1 kHz. It wouldn't be too hard to run this experiment and see what happens under CPU contention.

The basic idea here is that in the event of temporary congestion, the simulation will stutter but rapidly catch up when it falls behind.

Originally posted by coder View Post

In the worst case, your solution could be counter-productive, in that by burning more cycles, the task could be more likely to lose out in cases of CPU contention.

Sure, but you can easily change to 100 us intervals, 250 us, 500 us. Some experimentation might be in order.

**oiaohm** · 27 July 2022, 10:33 AM

Originally posted by wertigon View Post

Sure, but you can easily change to 100 us intervals, 250 us, 500 us. Some experimentation might be in order.

Time slice gets a lot more tricky with Linux.

CFS Scheduler — The Linux Kernel documentation

https://docs.kernel.org/scheduler/sched-design-CFS.html#some-features-of-cfs

Different schedulers have no concept of time slice.

Your idea of using usleep to work your way around the problem is a bad one. Think about someone hibernating or power-saving a laptop: this can mean many missed ticks in a row, and it is natural and normal on consumer systems.

High resolution timers and dynamic ticks design notes — The Linux Kernel documentation

https://docs.kernel.org/timers/highres.html#dynamic-ticks

usleep is based on hrtimers, and if the sched tick stops, so does usleep. You have to accept that if the user power-saves or the system is loaded, things will at times not be on time unless you are in real-time mode. With real-time you normally end up disabling basically all the power-saving options to get it.

Going to sleep and then checking, when you come back, that the time is what you were expecting can be a good thing, just in case you are now something like 1000 ticks into the future. There have been some creative cheats in games caused by a person being able to jump many ticks into the future.

So the code you wrote was pretty much a complete waste of CPU time, and it made it more likely that you would overload the CPU and so not be on time.

Outside real-time, under the soft general scheduler, you need to accept that, whether because of the computer power-saving or because of system load, a timer event (this includes usleep) is not always going to be 100 percent on time.

Your code ends up creating a works-for-me case where, because you check so often and waste CPU time, the skipped clocks become less visible. Setting usleep to exactly 1000 would give the system no slack. So calling usleep for 950, so that you can start processing 50 early for a one-millisecond tick, can be the solution. Of course you still need to check the time to make sure you are not some distance further into the future than expected.

usleep(3) - Linux manual page

https://man7.org/linux/man-pages/man3/usleep.3.html

It pays to read the manual as well:

The usleep() function suspends execution of the calling thread for (at least) usec microseconds. The sleep may be lengthened slightly by any system activity or by the time spent processing the call or by the granularity of system timers.

I like how the manual says the sleep can be lengthened slightly by any system activity. It doesn't define "slightly". The scary thing is that this slight lengthening could be the complete lifetime of the universe and still obey the manual. The "at least" in the usleep manual is critical. Non-real-time timers also have the "at least" property. So your application will not be notified early, but it is totally valid for the application to be notified late, and the application should check the current time to know whether it is late. Needing to check the time normally means you want to sleep for less than the target time, so that you have time to check the clock and see whether you are being messed with or not.
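The two habits described in this post, sleeping a little short of the tick and checking the clock on wake-up, come down to simple arithmetic. A rough sketch follows, with all times in microseconds; the function names and the "maximum ticks late" threshold are assumptions made for illustration, not anything from the usleep manual.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sleep length for one tick of period_us, leaving slack_us of headroom
 * (for example 1000 and 50 give the 950 suggested above). */
static int64_t sleep_with_slack_us(int64_t period_us, int64_t slack_us) {
    assert(period_us > 0 && slack_us >= 0);
    return slack_us < period_us ? period_us - slack_us : 0;
}

/* How far past the deadline we woke up; 0 if we woke on time or early. */
static int64_t lateness_us(int64_t deadline_us, int64_t now_us) {
    return now_us > deadline_us ? now_us - deadline_us : 0;
}

/* True when the wake-up landed more than max_ticks periods past the deadline,
 * e.g. after a suspend/resume, so the caller can resync instead of replaying
 * every missed tick. */
static bool jumped_ahead(int64_t deadline_us, int64_t now_us,
                         int64_t period_us, int64_t max_ticks) {
    assert(period_us > 0 && max_ticks >= 0);
    return lateness_us(deadline_us, now_us) > max_ticks * period_us;
}
```

Because a sleep only promises "at least" the requested time, `lateness_us` can be arbitrarily large; `jumped_ahead` is where the application decides how much lateness it is willing to treat as normal.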

**wertigon** · 27 July 2022, 02:14 PM

Originally posted by oiaohm View Post

Time slice gets a lot more tricky with Linux.

https://docs.kernel.org/scheduler/sc...eatures-of-cfs
Different schedulers have no concept of time slice.

Your idea of using usleep to work your way around the problem is a bad one. Think about someone hibernating or power-saving a laptop: this can mean many missed ticks in a row, and it is natural and normal on consumer systems.
https://docs.kernel.org/timers/highr...#dynamic-ticks

Easy on the keyboard there, I wasn't trying to solve every problem + world hunger here; my solution was specifically targeted at userspace for strict non-realtime requirements. I agree the proper solution is a usleepuntil() call; since no such call exists, well... you make do with what you have.

As for hibernation / sleep / (re)setting the system clock, you could resolve that by setting a maximum number of periods allowed, e.g. an amendment that looks something like this:

Code:

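const long long period = 1000;            // one ms, in microseconds
const long long maxWaitTime = 5 * period; // tolerate being at most 5 periods behind
// now_us() is a placeholder for a microsecond monotonic clock; time() only counts seconds.
long long nextWake = now_us();            // start from "now" rather than an uninitialized value

while (1) {
    nextWake += period;
    do_game_stuff();

    do { usleep(100); } while (now_us() < nextWake);
    if (now_us() > nextWake + maxWaitTime) {
        nextWake = now_us();              // too far behind (e.g. after suspend): resync
    }
}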

Again, not attempting to cover every use case or nook here; this is only for something that does not require realtime but still requires regularity of some sort. A proper sleepuntil() would eliminate 99% of the problem space here.
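For what it is worth, POSIX does provide something close to a sleepuntil(): clock_nanosleep() with the TIMER_ABSTIME flag sleeps until an absolute time on the chosen clock. The sketch below shows how a tick loop could be built on it; the wrapper names `sleep_until_ns` and `run_ticks` are invented for this example, and it still only gives the usual "at least" guarantee on a non-realtime scheduler.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Block until CLOCK_MONOTONIC reaches deadline_ns (an absolute time).
 * Returns 0 on success, otherwise an error number. */
static int sleep_until_ns(int64_t deadline_ns) {
    assert(deadline_ns >= 0);
    struct timespec ts = {
        .tv_sec = (time_t)(deadline_ns / NSEC_PER_SEC),
        .tv_nsec = (long)(deadline_ns % NSEC_PER_SEC),
    };
    int rc;
    do {
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, NULL);
    } while (rc == EINTR);   /* deadline is absolute, so simply retry */
    return rc;
}

/* Run `ticks` iterations, one per period_ns. Deadlines come from the
 * schedule rather than from the wake-up time, so lateness does not add up. */
static void run_ticks(int ticks, int64_t period_ns, void (*step)(void)) {
    assert(period_ns > 0);
    int64_t next = now_ns();
    for (int i = 0; i < ticks; i++) {
        if (step) step();        /* the per-tick work, if any */
        next += period_ns;
        sleep_until_ns(next);    /* returns at once if the deadline has passed */
    }
}
```

A deadline that is already in the past makes the call return immediately, so a loop that has fallen behind runs its next ticks back to back until it catches up.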

Thanks for pointing out the errors in my thinking though, much appreciated!

**binarybanana** · 27 July 2022, 02:24 PM

I tried the RT patches for the first time with 5.18 to see how it fares regarding latency (audio) and general desktop performance. I didn't do any exhaustive benchmarks, but the 7z bench mode actually showed improved performance, by ~7%. That was without even using real time priority. No other changes to the system in between either. Just recompiled the kernel and rebooted. Looking over qlop logs shows no obvious drop in throughput when compiling stuff, either. I'm not sure what to make of this. Maybe it's actually slower in other workloads but for me there has been no noticeable downside to it so far.

**coder** · 27 July 2022, 03:03 PM

Originally posted by wertigon View Post

The basic idea here is that in the event of temporary congestion, the simulation will stutter but rapidly catch up when it falls behind.

I don't understand that, because if you're behind, the obvious thing to do would be to process however many iterations needed before going back to sleep.

Aside from tweaking thread priorities the approach I'd take is:

1. Compute at least one iteration ahead.
2. Try to wake up a little early.

Regarding point #2, you might round down your sleep time to the nearest lower multiple of /proc/sys/kernel/sched_min_granularity_ns. I think that would reduce the chance of oversleeping.
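The catch-up idea and the "wake up a little early" rounding can both be written as small pure functions. This is only a sketch under assumptions: times are in nanoseconds, the function names are hypothetical, and the scheduler granularity is passed in as a parameter instead of being read from the system.

```c
#include <assert.h>
#include <stdint.h>

/* Whole ticks that have become due since last_tick_ns: the number of
 * iterations to process before going back to sleep. */
static int64_t ticks_due(int64_t last_tick_ns, int64_t now_ns, int64_t period_ns) {
    assert(period_ns > 0);
    if (now_ns < last_tick_ns) return 0;
    return (now_ns - last_tick_ns) / period_ns;
}

/* Round a sleep time down to a multiple of granularity_ns, so that the
 * wake-up tends to land early rather than late. */
static int64_t round_down_sleep_ns(int64_t sleep_ns, int64_t granularity_ns) {
    assert(sleep_ns >= 0 && granularity_ns > 0);
    return sleep_ns - (sleep_ns % granularity_ns);
}
```

With a 16.6 ms frame and an assumed 3 ms granularity, `round_down_sleep_ns(16600000, 3000000)` gives 15 ms, leaving about 1.6 ms to wait out or to spend working ahead.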

Originally posted by wertigon View Post

Sure, but you can easily change to 100 us intervals, 250 us, 500 us. Some experimentation might be in order.

My current thinking is that it's best if you can simply avoid sleep loops. A much better approach is to simply use a backpressure-based structure, where you're ultimately throttled by your ability to generate more frames of output. In that sense, I guess you would ultimately be limited by vsync, as arQon said.
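A backpressure-based structure can be as small as a bounded queue between the simulation and whatever consumes its output: once the queue is full, the producer cannot run further ahead until the consumer takes a frame. The sketch below is a single-threaded illustration with made-up names and an assumed capacity of 4; a real engine would add locking, or block on a condition variable, instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4   /* assumed: how many frames may be queued ahead */

typedef struct {
    int frames[QUEUE_CAPACITY];
    size_t head;    /* index of the oldest queued frame */
    size_t count;   /* number of frames currently queued */
} frame_queue;

/* Producer side: refuses the frame when the consumer has fallen behind. */
static bool queue_try_push(frame_queue *q, int frame) {
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY) return false;   /* backpressure */
    q->frames[(q->head + q->count) % QUEUE_CAPACITY] = frame;
    q->count++;
    return true;
}

/* Consumer side: takes the oldest frame, freeing a slot for the producer. */
static bool queue_try_pop(frame_queue *q, int *frame) {
    assert(q != NULL && frame != NULL);
    if (q->count == 0) return false;
    *frame = q->frames[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

The producer's pace is then set by how fast frames are drained, with no sleep loop involved.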

PREEMPT_RT Might Be Ready To Finally Land In Linux 5.20