Restartable Sequences "RSEQ" Additions Expected For Linux 6.3


  • Restartable Sequences "RSEQ" Additions Expected For Linux 6.3

    Phoronix: Restartable Sequences "RSEQ" Additions Expected For Linux 6.3

    The Restartable Sequences "RSEQ" system call was merged into the Linux kernel a few years ago and is now used by the GNU C Library and friends for faster user-space operations on per-CPU data. Some notable additions to the RSEQ support are now set to arrive next year with Linux 6.3...

    https://www.phoronix.com/news/RSEQ-I...ents-Linux-6.3

  • #2
    Can someone please explain what x86-64 instructions are used to implement this RSEQ functionality, since it apparently doesn't use atomics?

    It sounds a lot like HLE (Hardware Lock Elision), but didn't Intel effectively discontinue it? IIRC, they went as far as disabling it from newer microcode, even in CPUs where it worked correctly, since it was vulnerable to side-channel attacks.


    • #3
      Originally posted by coder
      Can someone please explain what x86-64 instructions are used to implement this RSEQ functionality, since it apparently doesn't use atomics?

      It sounds a lot like HLE (Hardware Lock Elision), but didn't Intel effectively discontinue it? IIRC, they went as far as disabling it from newer microcode, even in CPUs where it worked correctly, since it was vulnerable to side-channel attacks.
      RSEQ is for per-CPU data. Unlike HLE, concurrent access is not supported. No special hardware support is required at all.


      • #4
        Originally posted by archkde
        RSEQ is for per-CPU data. Unlike HLE, concurrent access is not supported. No special hardware support is required at all.
        How does it work, then? Is it reading some kind of timeslice counter and ensuring the value is the same between the beginning and end of the "sequence"? That would guarantee that you haven't been migrated to another CPU, but at the expense of the "sequence" taking a non-deterministic amount of thread time to complete. As long as it's for trivially short & simple operations, you could reasonably assume it'd run at most twice.

        Update: I found a manpage for it. I hadn't realized it was a feature of the kernel scheduler. So, that explains how atomicity is achieved.

        https://kib.kiev.ua/kib/rseq.pdf
        Last edited by coder; 27 December 2022, 02:59 PM.


        • #5
          It works pretty similarly to Intel's hardware RTM: RTM ensures a certain code sequence executes absolutely uninterrupted and any read data is not changed concurrently, and restarts it in case of interruption.
          RSEQ simply ensures a certain usercode sequence executes completely on the same CPU without being interrupted by any other usercode, and restarts it in case of interruption (unlike RTM it obviously cannot transact and atomically publish/rollback writes inside the critical section, so one has to be careful about what is written inside the critical section and when). It is done by having the Linux scheduler check whether the userspace thread is inside an RSEQ critical section upon scheduling any thread off the CPU for migration or preemption, or upon delivering an async signal.
          If you still have questions about how it works or what it does, just google it. There's been quite a few talks on the topic, there are plenty of articles describing it and its usage in various OSS.
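
The scheduler-side check described in this post can be modelled in plain C. This is a user-space simulation under stated assumptions, not kernel code: `model_rseq_cs` and `model_fixup_ip` are hypothetical names, and the fields only mirror the `start_ip`, `post_commit_offset` and `abort_ip` members of the kernel's `struct rseq_cs` descriptor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a critical-section descriptor. The field names
 * follow the kernel's struct rseq_cs, but this is only a simulation. */
struct model_rseq_cs {
    uint64_t start_ip;           /* address of the first instruction of the CS */
    uint64_t post_commit_offset; /* CS length in bytes, commit store included */
    uint64_t abort_ip;           /* where to resume if the CS was interrupted */
};

/* Models the decision made when a thread is resumed after preemption,
 * migration or signal delivery: if the saved instruction pointer lies in
 * [start_ip, start_ip + post_commit_offset), resume at abort_ip instead.
 * A NULL descriptor means the thread has no active critical section. */
uint64_t model_fixup_ip(const struct model_rseq_cs *cs, uint64_t ip)
{
    if (cs == NULL)
        return ip;
    /* Unsigned subtraction wraps when ip < start_ip, so a single
     * comparison covers both the lower and the upper bound. */
    if (ip - cs->start_ip < cs->post_commit_offset)
        return cs->abort_ip;
    return ip;
}
```

An instruction pointer just past the commit store is left alone, which is why the commit has to be the last instruction of the section: once it has executed, there is nothing left to restart.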


          • #6
            Originally posted by arteast
            and restarts it in case of interruption
            The manpage I linked + other info I found says it calls a second function in the event that it's interrupted. That makes sense to me, because you'd have to implement your own cleanup before restarting it.

            What's interesting about that manpage is that it references actual instruction pointers in the struct, as u64's, rather than the function pointers I'd have expected. Then again, it is just a thin wrapper around a syscall, so I guess that's fair. Plus, the scheduler needs to have some notion of the sequence extents. In some sense, it reminds me a little of setjmp()/longjmp().

            Originally posted by arteast
            There's been quite a few talks on the topic, there are plenty of articles describing it and its usage in various OSS.
            I wasted some time looking at that stuff. It doesn't help to see what people are building on it, when I was really just trying to get my head around the foundation itself.
            Last edited by coder; 27 December 2022, 03:48 PM.


            • #7
              Originally posted by coder
              The manpage I linked + other info I found says it calls a second function in the event that it's interrupted.
              True, the kernel moves the thread's IP to a separate "abort" address, and that user-provided code would usually restart the rseq critical section, or maybe recalculate something and then restart.

              Originally posted by coder
              you'd have to implement your own cleanup before restarting it.
              I think that shouldn't be the case. If you need to clean up the results of an aborted rseq CS, then you're probably doing something wrong. A CS can be preempted at any time and another thread could see the per-CPU state before the CS had a chance to clean up anything, so the per-CPU state must be consistent throughout CS execution; the abort handler cannot assume the per-CPU state hasn't been seen or hasn't changed, and the abort handler can itself be preempted at any time... So usually a CS will end with a single write that changes the (invariant-related/user-visible) per-CPU state.
              A separate "abort" routine makes it possible to write helpers for common scenarios that wrap all the nitty-gritty details of the rseq machinery; they'd just return something indicating whether the critical section succeeded or was aborted.

              Originally posted by coder
              What's interesting about that manpage is that it references actual instruction pointers in the struct, as u64's, rather than the function pointers I'd have expected.
              Well yeah, an rseq critical section is a contiguous block of machine code; it is assumed that the store/commit is the very last machine instruction in that block. This means that the actual CS has to be written in assembly language... Which is why the aforementioned helpers are kind of important for RSEQ adoption.
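
A helper of the kind described in this post might have roughly the following shape. This is a portable simulation only: a real rseq critical section is written in assembly, and `try_add`, `percpu_add`, `interrupted_fn` and `interrupt_n_times` are hypothetical names. The callback stands in for the kernel aborting the section; the point of the sketch is that nothing is written until one final store, so an abort needs no cleanup.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for "the kernel aborted the critical section".
 * Returns true if the current attempt must be abandoned. */
typedef bool (*interrupted_fn)(void *ctx);

/* One attempt: read, compute, then publish with a single store.
 * If the attempt is aborted, nothing has been written yet. */
bool try_add(int64_t *slot, int64_t delta, interrupted_fn interrupted, void *ctx)
{
    int64_t next = *slot + delta;   /* prepare: private to this attempt */
    if (interrupted(ctx))
        return false;               /* aborted before the commit */
    *slot = next;                   /* commit: the only visible write */
    return true;
}

/* Retry wrapper playing the role of the abort handler: it just starts
 * over. Returns the number of attempts that were needed. */
unsigned percpu_add(int64_t *slot, int64_t delta, interrupted_fn interrupted, void *ctx)
{
    unsigned attempts = 1;
    while (!try_add(slot, delta, interrupted, ctx))
        attempts++;
    return attempts;
}

/* Test double: ctx points to an unsigned counter; an interruption is
 * reported until that counter reaches zero. */
bool interrupt_n_times(void *ctx)
{
    unsigned *remaining = ctx;
    if (*remaining == 0)
        return false;
    (*remaining)--;
    return true;
}
```

Each retry re-reads the slot, since another thread on the same CPU may have changed it in between. A real helper would also have to re-select the slot for whichever CPU the thread is running on after the abort, which this sketch leaves out.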


              • #8
                I still don't understand what this is even useful for. It can't handle multi-threaded concurrency, so what the hell is the point of it these days?

                As far as I can see, your thread gets notified when it's preempted and the callback is called. Why would this even need special handling? Ok, a signal interrupted your thread, so what? What's the use case? I want concrete examples, not a toy example.

                I just fail to see why you would even use a lock if you aren't going for concurrency. (by lock I don't mean atomic lock, I mean like a spinlock etc)


                • #9
                  It's used for per-core buffers in LTTng, i.e. no "real" concurrency but the possibility of the scheduler interrupting.

                  What this effectively allows is something similar to disabling interrupts/the scheduler (on one core), something costly and bad outside of microcontrollers, but by just retrying until the operation runs uninterrupted.

                  You get some costly retries, but your successful operations will run with the least interference possible.
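
The per-core buffer idea can be sketched as a data layout. This is an illustration under assumptions, not LTTng's actual code: `model_cpu_buf`, `model_bufs`, `model_reserve`, and the sizes are hypothetical, and the CPU index is passed in by the caller rather than read from the rseq area. With rseq, the load of `tail`, the bounds check and the final store would form the critical section; here they are plain C.

```c
#include <stddef.h>

#define MODEL_NCPU 4        /* assumed number of CPUs for the sketch */
#define MODEL_BUF_SIZE 64   /* assumed per-CPU buffer size in bytes */

/* One buffer per CPU: no lock and no atomic operations, because only
 * code running on the owning CPU ever touches it. */
struct model_cpu_buf {
    size_t tail;                 /* offset of the next free byte */
    char data[MODEL_BUF_SIZE];
};

struct model_cpu_buf model_bufs[MODEL_NCPU];

/* Reserve len bytes in the buffer owned by `cpu`. Returns the offset of
 * the reserved region, or -1 if the CPU index is out of range or the
 * request does not fit in the remaining space. */
long model_reserve(unsigned cpu, size_t len)
{
    if (cpu >= MODEL_NCPU)
        return -1;
    struct model_cpu_buf *b = &model_bufs[cpu];
    size_t start = b->tail;
    if (len > MODEL_BUF_SIZE - start)
        return -1;
    b->tail = start + len;       /* single store publishes the reservation */
    return (long)start;
}
```

If the thread were preempted between reading `tail` and storing it, another thread on the same CPU could reserve the same region. That window is what the restart on interruption closes, without a lock and without disabling preemption.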
