**coder** · 27 December 2022, 12:21 PM

Can someone please explain what x86-64 instructions are used to implement this RSEQ functionality, since it apparently doesn't use atomics?

It sounds a lot like HLE (Hardware Lock Elision), but didn't Intel effectively discontinue it? IIRC, they went as far as disabling it from newer microcode, even in CPUs where it worked correctly, since it was vulnerable to side-channel attacks.

**archkde** · 27 December 2022, 02:13 PM

Originally posted by coder

Can someone please explain what x86-64 instructions are used to implement this RSEQ functionality, since it apparently doesn't use atomics?

It sounds a lot like HLE (Hardware Lock Elision), but didn't Intel effectively discontinue it? IIRC, they went as far as disabling it from newer microcode, even in CPUs where it worked correctly, since it was vulnerable to side-channel attacks.

RSEQ is for per-CPU data. Unlike HLE, concurrent access is not supported. No special hardware support is required at all.

**coder** · 27 December 2022, 02:37 PM

Originally posted by archkde

RSEQ is for per-CPU data. Unlike HLE, concurrent access is not supported. No special hardware support is required at all.

How does it work, then? Is it reading some kind of timeslice counter and ensuring the value is the same between the beginning and end of the "sequence"? That would guarantee that you haven't been migrated to another CPU, but at the expense of the "sequence" taking a non-deterministic amount of thread time to complete. As long as it's for trivially short & simple operations, you could reasonably assume it'd run at most twice.

Update: I found a manpage for it. I hadn't realized it was a feature of the kernel scheduler. So, that explains how atomicity is achieved.

https://kib.kiev.ua/kib/rseq.pdf

**arteast** · 27 December 2022, 03:07 PM

It works pretty similarly to Intel's hardware RTM: RTM ensures that a certain code sequence executes absolutely uninterrupted and that any data it reads is not changed concurrently, and restarts it in case of interruption.
RSEQ simply ensures that a certain userspace code sequence executes completely on the same CPU without being interrupted by any other userspace code, and restarts it in case of interruption (unlike RTM, it obviously cannot transact and atomically publish/roll back writes inside the critical section, so one has to be careful about what is written inside the critical section and when). It is done by having the Linux scheduler check whether the userspace thread is inside an RSEQ critical section upon scheduling any thread off the CPU for migration or preemption, or upon delivering an async signal.
If you still have questions about how it works or what it does, just google it. There's been quite a few talks on the topic, there are plenty of articles describing it and its usage in various OSS.
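To make the check described above concrete, here is a minimal sketch in plain C of the decision made when a thread is preempted, migrated, or receives a signal. The `cs_desc` struct and both function names are hypothetical; the three fields mirror, as far as I understand it, the `start_ip`, `post_commit_offset` and `abort_ip` fields of `struct rseq_cs` in the Linux uapi header. The real kernel path does more (signature check, clearing the descriptor pointer), so treat this as an illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of a critical-section descriptor.
 * Note that it holds raw code addresses, not function pointers. */
struct cs_desc {
    uint64_t start_ip;            /* address of the first instruction in the section */
    uint64_t post_commit_offset;  /* section length in bytes, ending right after the commit store */
    uint64_t abort_ip;            /* address to resume at if the section was interrupted */
};

/* True when ip lies in [start_ip, start_ip + post_commit_offset).
 * Unsigned wraparound makes the test fail for ip < start_ip. */
static bool in_critical_section(uint64_t ip, const struct cs_desc *cs)
{
    assert(cs != NULL);
    return ip - cs->start_ip < cs->post_commit_offset;
}

/* Instruction pointer the thread resumes at after an interruption.
 * cs == NULL models a thread that has no critical section armed. */
static uint64_t resume_ip(uint64_t ip, const struct cs_desc *cs)
{
    if (cs != NULL && in_critical_section(ip, cs))
        return cs->abort_ip;  /* interrupted before the commit: divert to the abort code */
    return ip;                /* outside the section: resume where it left off */
}
```

A thread interrupted anywhere before the end of the range is diverted to the abort address; once the commit store has executed, its instruction pointer is past the range and it resumes normally.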

**coder** · 27 December 2022, 03:42 PM

Originally posted by arteast

and restarts it in case of interruption

The manpage I linked + other info I found says it calls a second function in the event that it's interrupted. That makes sense to me, because you'd have to implement your own cleanup before restarting it.

What's interesting about that manpage is that it references actual instruction pointers in the struct, as u64's, rather than the function pointers I'd have expected. Then again, it is just a thin wrapper around a syscall, so I guess that's fair. Plus, the scheduler needs to have some notion of the sequence extents. In some sense, it reminds me a little of setjmp()/longjmp().

Originally posted by arteast

There's been quite a few talks on the topic, there are plenty of articles describing it and its usage in various OSS.

I wasted some time looking at that stuff. It doesn't help to see what people are building on it when I was really just trying to get my head around the foundation itself.

**arteast** · 27 December 2022, 05:11 PM

Originally posted by coder

The manpage I linked + other info I found says it calls a second function in the event that it's interrupted.

True, the kernel moves the thread's IP to a separate "abort" address, and that user-provided code would usually restart the rseq critical section, or maybe recalculate something and then restart.

Originally posted by coder

you'd have to implement your own cleanup before restarting it.

I think that shouldn't be the case. If you need to clean up the results of an aborted rseq CS, then you're probably doing something wrong. A CS can be preempted at any time, and another thread could see the per-CPU state before the CS had a chance to clean up anything, so the per-CPU state must be consistent throughout CS execution; the abort handler cannot assume the per-CPU state hasn't been seen or hasn't changed, and the abort handler can itself be preempted at any time... So usually a CS will end with a single write that changes the (invariant-related/user-visible) per-CPU state.
A separate "abort" routine makes it possible to write helpers for some common scenarios that wrap all the nitty-gritty details of the rseq machinery; they'd just return something indicating whether the critical section succeeded or was aborted.
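The "single commit write, helper reports success or abort" shape can be sketched in portable C. This is a simulation, not real rseq code: an actual critical section would be written in assembly and the abort would come from the kernel. Here the hypothetical `interrupted` parameter stands in for that, and `percpu_list`, `node` and `percpu_push_once` are made-up names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next;
    int value;
};

/* One instance of this would exist per CPU. */
struct percpu_list {
    struct node *head;
};

/* Hypothetical stand-in for an assembly critical section.
 * `interrupted` simulates the kernel diverting the thread to the
 * abort address before the commit store has executed. */
static bool percpu_push_once(struct percpu_list *list, struct node *n,
                             bool interrupted)
{
    assert(list != NULL && n != NULL);
    n->next = list->head;  /* preparation: touches only the private node */
    if (interrupted)
        return false;      /* aborted: list->head was never modified */
    list->head = n;        /* commit: the single store that publishes n */
    return true;
}
```

Because the only shared write is the final store, an aborted attempt leaves nothing to clean up: the list looks exactly as it did before, and the caller can simply try again.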

Originally posted by coder

What's interesting about that manpage is that it references actual instruction pointers in the struct, as u64's, rather than the function pointers I'd have expected.

Well yeah, an rseq critical section is a contiguous block of machine code; it is assumed that the store/commit is the very last machine instruction in that block. This means that the actual CS has to be written in assembly language... which is why the aforementioned helpers are kind of important for RSEQ adoption.

**Weasel** · 28 December 2022, 09:33 AM

I still don't understand what this is even useful for. It can't handle multi-threaded concurrency, so what the hell is the point of it these days?

As far as I can see, your thread gets notified when it's preempted and the callback is called. Why would this even need special handling? Ok, a signal interrupted your thread, so what? What's the use case? I want concrete examples, not a toy example.

I just fail to see why you would even use a lock if you aren't going for concurrency. (by lock I don't mean atomic lock, I mean like a spinlock etc)

**discordian** · 28 December 2022, 06:31 PM

It's used for per-core buffers in LTTng, i.e. no "real" concurrency, but the possibility of the scheduler interrupting.

What this effectively allows is something similar to disabling interrupts/the scheduler (on one core), something costly and bad outside of microcontrollers, but by just retrying until the operation runs uninterrupted.

You get some costly retries, but your successful operations will run with the least interference possible.
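The retry cost can be illustrated with a small simulation in plain C. Nothing here talks to the kernel: `attempt_fn`, `run_until_uninterrupted`, `sim_counter` and `sim_increment` are hypothetical names, and the `aborts_left` field just scripts how many attempts get "interrupted" before one runs through.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical attempt function: returns true if the section committed. */
typedef bool (*attempt_fn)(void *ctx);

/* Retry until one attempt runs uninterrupted; returns the attempt count. */
static unsigned run_until_uninterrupted(attempt_fn attempt, void *ctx)
{
    unsigned attempts = 0;
    assert(attempt != NULL);
    do {
        attempts++;
    } while (!attempt(ctx));
    return attempts;
}

/* Simulated per-CPU counter whose first `aborts_left` attempts are interrupted. */
struct sim_counter {
    long value;
    unsigned aborts_left;
};

static bool sim_increment(void *ctx)
{
    struct sim_counter *c = ctx;
    long next = c->value + 1;   /* compute the new value privately */
    if (c->aborts_left > 0) {   /* simulated preemption before the commit */
        c->aborts_left--;
        return false;
    }
    c->value = next;            /* commit store */
    return true;
}
```

In the common case the loop body runs once with no atomic instruction and no syscall; only an attempt that actually gets interrupted pays for a retry.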

Restartable Sequences "RSEQ" Additions Expected For Linux 6.3
