Ubuntu's Real-Time Kernel Approaching GA Status


  • #31
    I've been running an RT kernel for a few months, just for testing purposes. Performance-wise I haven't noticed any downsides for compiling, compression, encoding or gaming (in a VM) workloads. I tested the 7z benchmark mode and it was actually a bit faster. Latency-wise, I dunno. I guess PulseAudio likes it, but even without it I can push the latency so low that the audio driver in the kernel starts to freak out.
    I guess if you already use a fully preemptible kernel, there isn't much to gain for most people.
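
    For anyone curious how that kind of latency is actually measured: below is a minimal sketch of what tools like cyclictest automate, i.e. sleep on an absolute timer and record how late each wakeup arrives. The 1 ms interval, loop count and SCHED_FIFO priority are arbitrary choices, not anything the RT kernel mandates.

    Code:
    /* Minimal scheduling-jitter probe, similar in spirit to cyclictest.
     * Build: gcc -O2 -o jitter jitter.c
     * Run as root (or with CAP_SYS_NICE) so SCHED_FIFO can be requested.
     */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <time.h>

    #define NSEC_PER_SEC 1000000000L
    #define INTERVAL_NS  1000000L   /* wake up every 1 ms (arbitrary) */
    #define LOOPS        10000

    static void add_ns(struct timespec *t, long ns)
    {
        t->tv_nsec += ns;
        while (t->tv_nsec >= NSEC_PER_SEC) {
            t->tv_nsec -= NSEC_PER_SEC;
            t->tv_sec++;
        }
    }

    int main(void)
    {
        struct sched_param sp = { .sched_priority = 80 };
        if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
            perror("sched_setscheduler (continuing as SCHED_OTHER)");

        struct timespec next, now;
        long max_lat = 0;
        clock_gettime(CLOCK_MONOTONIC, &next);

        for (int i = 0; i < LOOPS; i++) {
            add_ns(&next, INTERVAL_NS);
            clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            clock_gettime(CLOCK_MONOTONIC, &now);
            long lat = (now.tv_sec - next.tv_sec) * NSEC_PER_SEC
                     + (now.tv_nsec - next.tv_nsec);
            if (lat > max_lat)
                max_lat = lat;
        }
        printf("worst-case wakeup latency: %ld us\n", max_lat / 1000);
        return 0;
    }

    On a PREEMPT_RT kernel the worst-case number should stay bounded even under load; on a stock kernel it can spike much higher, which is the whole point of the RT patches.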

    Comment


    • #32
      Originally posted by uid313 View Post

      Can't you get the best of both worlds?
      Get great performance for non-real-time stuff like games, listening to music, watching movies, etc., but get the benefits of real-time for audio production, etc.?
      I have thought about that for some time now. Theoretically it should be possible if you have an OS that supports extensive priority management (from task priority down to RAM and I/O priority). Then you could guarantee certain resources to the real-time work while everything with lower priority runs in the remaining time slices. If you don't issue any RT-priority tasks, it behaves like a normal (non-RT) OS.
      The scheduler needs to run as RT as soon as there is any RT task (or always).

      What do more advanced programmers think of this idea?

      Edit: Turboboost and throttling will always be a challenge for RT.
      Last edited by Anux; 11 January 2023, 07:11 AM.
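
      To make the idea concrete, here is a rough sketch (mine, not anything Anux proposed) of how Linux already allows that split today: one thread is promoted to SCHED_FIFO while the rest of the process stays on the normal CFS scheduler. The priority value and thread roles are placeholders.

      Code:
      /* Sketch: one SCHED_FIFO "real-time" worker, everything else SCHED_OTHER.
       * Build: gcc -O2 -pthread -o rt_split rt_split.c   (needs CAP_SYS_NICE/root)
       */
      #define _GNU_SOURCE
      #include <pthread.h>
      #include <sched.h>
      #include <stdio.h>
      #include <string.h>

      static void *rt_worker(void *arg)
      {
          (void)arg;
          /* ... time-critical work, e.g. filling an audio buffer ... */
          return NULL;
      }

      static void *bulk_worker(void *arg)
      {
          (void)arg;
          /* ... throughput work: compiling, compressing, encoding ... */
          return NULL;
      }

      int main(void)
      {
          pthread_t rt, bulk;
          pthread_attr_t attr;
          struct sched_param sp = { .sched_priority = 70 }; /* arbitrary RT priority */

          pthread_attr_init(&attr);
          pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
          pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
          pthread_attr_setschedparam(&attr, &sp);

          int ret = pthread_create(&rt, &attr, rt_worker, NULL);
          if (ret != 0) {
              fprintf(stderr, "RT thread failed (%s), falling back to SCHED_OTHER\n",
                      strerror(ret));
              pthread_create(&rt, NULL, rt_worker, NULL);
          }
          pthread_create(&bulk, NULL, bulk_worker, NULL); /* stays SCHED_OTHER */

          pthread_join(rt, NULL);
          pthread_join(bulk, NULL);
          return 0;
      }

      From the shell, chrt -f 70 <command> does the same for a whole process. What the PREEMPT_RT patches add on top is that such a task can also preempt code running inside the kernel, not just other user-space tasks.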

      Comment


      • #33
        Originally posted by erniv2 View Post

        Preemption has been in the kernel for years now; they sorted out which kernel functions have to stay in line ahead of more critical things ages ago, right?

        And I bet every distribution nowadays is at least built with voluntary preemption, meaning kernel functions that know their own state can yield to other work for a while.

        As for the real RT stuff: I used Ubuntu Studio ages ago and it was a horrible experience, so nope, as long as you don't need it, don't use it.

        If you compile the kernel yourself you should be fine with voluntary preemption and 300 Hz jiffies; I think that's the middle ground for a desktop system, but that was probably ~3 years ago.
        If you are using NVIDIA drivers, stay away from it :-D
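
        If you want to check which model your distro kernel was actually built with, the sketch below scans the shipped config for the CONFIG_PREEMPT* and CONFIG_HZ options. The /boot/config-$(uname -r) path is an assumption that holds on Ubuntu/Debian; other distros may expose /proc/config.gz instead.

        Code:
        /* Report the preemption model and timer frequency the running kernel
         * was built with, by scanning the distro's config file.
         * Build: gcc -O2 -o preempt_check preempt_check.c
         */
        #include <stdio.h>
        #include <string.h>
        #include <sys/utsname.h>

        int main(void)
        {
            struct utsname un;
            char path[512], line[512];

            if (uname(&un) != 0) {
                perror("uname");
                return 1;
            }
            snprintf(path, sizeof(path), "/boot/config-%s", un.release);

            FILE *f = fopen(path, "r");
            if (!f) {
                perror(path);
                return 1;
            }
            while (fgets(line, sizeof(line), f)) {
                /* CONFIG_PREEMPT_NONE / _VOLUNTARY / CONFIG_PREEMPT(_RT), CONFIG_HZ */
                if (strncmp(line, "CONFIG_PREEMPT", 14) == 0 ||
                    strncmp(line, "CONFIG_HZ", 9) == 0)
                    fputs(line, stdout);
            }
            fclose(f);
            return 0;
        }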

        Comment


        • #34
          Originally posted by FPScholten View Post
          If you do not know what RT kernels do, what they can be used for, and how certain software will definitely not work on them, you should not even think about installing an RT kernel. RT kernels do not magically boost performance; on the contrary, for many applications they will even cause hangs or stuttering performance.
          If you need low latency, go ahead and use the software designed for that. If you want performance and high data throughput, by all means stay away from RT kernels.
          I have a similar experience. I was messing with an RT kernel once it became mainline.
          I did not notice any difference as a desktop user. Not only that, I noticed that one specific game became unplayable (Pillars of Eternity II: Deadfire). Its FPS fell from a steady 40 in built-up areas to a 20 FPS slideshow. So I changed it to 'no preemption', because why not give it a go? And the FPS is back in PoE 2.
          At the time I was a little addicted to Apex Legends. After the kernel changes, I did not notice any negative impact on gameplay or K/D ratio. So now I don't bother with RT kernels.
          For the record, I have: CPU Brand: AMD Ryzen 9 3950X 16-Core

          Comment


          • #35
            Originally posted by Ironmask View Post
            This wouldn't work because that core would still need access to CPU registers, so the core would still need to be managed by the scheduler, and if the scheduler isn't real-time then nothing is real-time ...
            CODESYS actually does what you describe as impossible. Take a Windows OS, install their custom driver, and it removes a CPU core from Windows and dedicates it to a real-time soft PLC with deterministic scheduling. Say you have a 4-core Celeron PC: designate 1 CPU core to CODESYS and Windows will only see and have access to 3 cores. Any process that slows down Windows does not slow down the real-time software PLC. They also have solutions for Linux; I have only worked with the Windows versions so far.

            Originally posted by Ironmask View Post
            I actually find the idea of a Ubuntu RTOS kind of laughable considering Ubuntu is pretty far removed from the robotics field in general ...
            Walking through Pack Expo, one of the largest automation trade shows, this year, I found your statement to be incorrect. I found a dynamic-picking robot arm that was being managed by an Ubuntu Docker cluster. I didn't ask whether they were using an RT kernel or a standard one; I didn't think of it at the time. Picking different alternating geometric objects in real time and placing them into a box or bag for shipping is still hard to do.
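
            For what it's worth, the Linux way to approximate what CODESYS does on Windows is to keep the general scheduler off one core (e.g. boot with isolcpus=3 nohz_full=3, or use cpusets) and then pin the deterministic loop to it. A minimal affinity sketch, with the core number purely as an assumption:

            Code:
            /* Pin the current thread to one CPU core (core 3 here, arbitrary).
             * Real isolation also needs that core excluded from the scheduler,
             * e.g. isolcpus=3 on the kernel command line or a cpuset cgroup.
             * Build: gcc -O2 -o pin pin.c
             */
            #define _GNU_SOURCE
            #include <sched.h>
            #include <stdio.h>

            int main(void)
            {
                cpu_set_t set;
                CPU_ZERO(&set);
                CPU_SET(3, &set);               /* assumed isolated core */

                if (sched_setaffinity(0, sizeof(set), &set) != 0) {
                    perror("sched_setaffinity");
                    return 1;
                }
                printf("pinned; now running on CPU %d\n", sched_getcpu());

                /* ... deterministic control loop would run here ... */
                return 0;
            }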

            Comment


            • #36
              Originally posted by dimko View Post

              I have a similar experience. I was messing with an RT kernel once it became mainline.
              I did not notice any difference as a desktop user. Not only that, I noticed that one specific game became unplayable (Pillars of Eternity II: Deadfire). Its FPS fell from a steady 40 in built-up areas to a 20 FPS slideshow. So I changed it to 'no preemption', because why not give it a go? And the FPS is back in PoE 2.
              At the time I was a little addicted to Apex Legends. After the kernel changes, I did not notice any negative impact on gameplay or K/D ratio. So now I don't bother with RT kernels.
              For the record, I have: CPU Brand: AMD Ryzen 9 3950X 16-Core
              Unfortunately, your AMD Ryzen 9 3950X is a particularly bad choice for achieving low latencies, regardless of kernel configuration.

              But don't take my word for it -- here's what the developers of the PS3 emulator RPCS3 have to say about it:



              One of the first questions one may ask after seeing the graph is how a 3800X is performing better than a 3950X even though it has twice the cores and cache? The answer to that is due to increased latency from the 3950X’s multi-chiplet design. While the 3800X only has to communicate across two 4-core CCXes, the 3950X takes it a step further, and has two chiplets each with two 4-core CCXes it has to communicate across.

              Unlike other software, RPCS3's PPU & SPU threads need to communicate constantly which results in a major bottleneck if these threads are split across multiple CCXes / chiplets. That ends up with the CPU hitting this bottleneck constantly with all the data moving around. This is why we do not recommend Ryzen CPUs unless they have a 3 or 4 core CCX design (6-8 core Ryzen CPUs, or a 4 core Ryzen APU). A 4 core CCX design is ideal as RPCS3 can fit all the PPU & SPU threads onto a single CCX, allowing users to bypass inter-CCX latency bottleneck entirely, provided the PPU & SPU threads are being scheduled properly to be placed on a single CCX.
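
              For readers who want to see that chiplet penalty themselves, the sketch below is a crude version of the usual core-to-core "ping-pong" benchmark: two pinned threads bounce a flag through a single shared cache line. The core IDs are assumptions; pick two cores on the same CCX and then on different chiplets and compare the round-trip time.

              Code:
              /* Crude core-to-core ping-pong latency probe.
               * Build: gcc -O2 -pthread -o pingpong pingpong.c
               */
              #define _GNU_SOURCE
              #include <pthread.h>
              #include <sched.h>
              #include <stdatomic.h>
              #include <stdio.h>
              #include <time.h>

              #define ROUNDS 1000000

              static _Atomic int ball;   /* the shared cache line both cores fight over */

              static void pin(int cpu)
              {
                  cpu_set_t set;
                  CPU_ZERO(&set);
                  CPU_SET(cpu, &set);
                  pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
              }

              static void *ponger(void *arg)
              {
                  pin(*(int *)arg);
                  for (int i = 0; i < ROUNDS; i++) {
                      while (atomic_load_explicit(&ball, memory_order_acquire) != 1)
                          ;                                               /* wait for ping */
                      atomic_store_explicit(&ball, 0, memory_order_release); /* pong */
                  }
                  return NULL;
              }

              int main(void)
              {
                  int cpu_a = 0, cpu_b = 8;   /* assumed: two cores on different CCDs */
                  pthread_t t;
                  struct timespec t0, t1;

                  pthread_create(&t, NULL, ponger, &cpu_b);
                  pin(cpu_a);

                  clock_gettime(CLOCK_MONOTONIC, &t0);
                  for (int i = 0; i < ROUNDS; i++) {
                      atomic_store_explicit(&ball, 1, memory_order_release);  /* ping */
                      while (atomic_load_explicit(&ball, memory_order_acquire) != 0)
                          ;                                               /* wait for pong */
                  }
                  clock_gettime(CLOCK_MONOTONIC, &t1);
                  pthread_join(t, NULL);

                  double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
                  printf("average round trip: %.0f ns\n", ns / ROUNDS);
                  return 0;
              }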

              Comment


              • #37
                Originally posted by Linuxxx View Post
                Unfortunately, your AMD Ryzen 9 3950X is a particularly bad choice for achieving low latencies, regardless of kernel configuration.
                Those are thread-communication latencies -- not task latencies. An RTOS focuses on predictable task (scheduling) latencies, which are on the scale of microseconds to milliseconds. The thread-communication latencies you're talking about are below 100 nanoseconds.


                Source: https://www.anandtech.com/show/16214...5700x-tested/5

                As the quoted text says, RPCS3 is not like normal software. In order to faithfully emulate the Cell processor, they need to do a lot more inter-thread communication than well-architected software would normally do.
                Last edited by coder; 12 January 2023, 03:21 AM.

                Comment


                • #38
                  Originally posted by Yndoendo View Post
                  install their custom driver, and it removes a CPU core from Windows and dedicates it to a real-time soft PLC with deterministic scheduling. Say you have a 4-core Celeron PC: designate 1 CPU core to CODESYS and Windows will only see and have access to 3 cores. Any process that slows down Windows does not slow down the real-time software PLC.
                  Once that core goes outside of its L2 cache, it's subject to contention with the other cores (i.e. for memory and I/O).

                  At least some newer Intel CPUs have a mechanism for classifying different threads and then enabling non-exclusive L3 cache partitioning on a per-class basis. But I'm not sure if there are analogous mechanisms for managing memory & I/O bandwidth.

                  There's also the potential for power or thermal-throttling to affect cores other than those doing the heavy-lifting.
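
                  On Linux that Intel mechanism (Cache Allocation Technology) is exposed through the resctrl filesystem. Here is a sketch of how a slice of L3 could be reserved for a real-time task, assuming resctrl is mounted at /sys/fs/resctrl and the hardware supports it; the group name, way bitmask and PID are placeholders.

                  Code:
                  /* Carve out an L3 partition via resctrl (Intel CAT and the AMD
                   * equivalent). Assumes: mount -t resctrl resctrl /sys/fs/resctrl
                   * Build: gcc -O2 -o cat_part cat_part.c   (run as root)
                   */
                  #include <stdio.h>
                  #include <sys/stat.h>

                  static void write_file(const char *path, const char *text)
                  {
                      FILE *f = fopen(path, "w");
                      if (!f) { perror(path); return; }
                      if (fputs(text, f) < 0)
                          perror(path);
                      fclose(f);
                  }

                  int main(void)
                  {
                      /* 1. A new directory under resctrl is a new resource group. */
                      if (mkdir("/sys/fs/resctrl/rt", 0755) != 0)
                          perror("mkdir /sys/fs/resctrl/rt");

                      /* 2. Give it some L3 ways on cache domain 0. The bitmask is a
                       *    placeholder; valid widths are listed in info/L3/cbm_mask. */
                      write_file("/sys/fs/resctrl/rt/schemata", "L3:0=f\n");

                      /* 3. Move the latency-sensitive task into the group (PID is
                       *    a placeholder). */
                      write_file("/sys/fs/resctrl/rt/tasks", "1234\n");
                      return 0;
                  }

                  Memory bandwidth does have a rough counterpart on Intel (Memory Bandwidth Allocation, also configured through resctrl), but I'm not aware of an equivalent generic knob for I/O bandwidth.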

                  Comment


                  • #39
                    Originally posted by erniv2 View Post

                    And to stay in the gaming setting: if you play online, 60 fps is 16.6 ms per frame. How many people have a constant 16.6 ms ping to a game server? I have a 200 Mbit line and nope, I get pings of around 20 ms. It is physically impossible no matter how you look at it, unless you have FTTH with 1 Gbit.
                    I'll bite. So tell us, how come others in blind tests can consistently say whether they are at 60 FPS or above, 100 and above, or 140 and above?
                    Specifically gamers? https://www.youtube.com/watch?v=OX31kZbAXsA

                    Comment


                    • #40
                      Originally posted by coder View Post
                      Once that core goes outside of its L2 cache, it's subject to contention with the other cores (i.e. for memory and I/O).

                      At least some newer Intel CPUs have a mechanism for classifying different threads and then enabling non-exclusive L3 cache partitioning on a per-class basis. But I'm not sure if there are analogous mechanisms for managing memory & I/O bandwidth.

                      There's also the potential for power or thermal-throttling to affect cores other than those doing the heavy-lifting.
                      I could be completely wrong here because I don't code on anything that could ever run on an RTOS, but it seems obvious to me that if you want to run an RTOS then software will need to be made machine-sympathetic. You will need to use data structures that lay out well in the cache, so they don't get evicted just because they happen to sit next to other data that is still waiting to be processed. Much like the kind of thing the LMAX libraries provide for Java. But I am just learning about RT kernels out of passing curiosity and only have my own non-RT understanding to go on.
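
                      As a rough illustration of that "machine-sympathetic" idea in C rather than Java (not taken from LMAX, just the same principle): a single-producer/single-consumer ring buffer whose head and tail indices are padded onto separate cache lines, so the producer and consumer don't invalidate each other's line on every update (false sharing). The sizes and the 64-byte line are assumptions.

                      Code:
                      /* SPSC ring buffer with indices padded to separate cache lines. */
                      #include <stdatomic.h>
                      #include <stdalign.h>
                      #include <stdbool.h>
                      #include <stdint.h>

                      #define RING_SIZE 1024                 /* power of two, arbitrary */

                      struct ring {
                          alignas(64) _Atomic uint64_t head; /* written only by producer */
                          alignas(64) _Atomic uint64_t tail; /* written only by consumer */
                          alignas(64) int64_t slots[RING_SIZE];
                      };

                      bool ring_push(struct ring *r, int64_t v)
                      {
                          uint64_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
                          uint64_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
                          if (head - tail == RING_SIZE)
                              return false;                  /* full */
                          r->slots[head & (RING_SIZE - 1)] = v;
                          atomic_store_explicit(&r->head, head + 1, memory_order_release);
                          return true;
                      }

                      bool ring_pop(struct ring *r, int64_t *out)
                      {
                          uint64_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
                          uint64_t head = atomic_load_explicit(&r->head, memory_order_acquire);
                          if (head == tail)
                              return false;                  /* empty */
                          *out = r->slots[tail & (RING_SIZE - 1)];
                          atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
                          return true;
                      }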

                      Comment
