**Weasel** · 16 July 2018, 11:00 AM

Originally posted by ssokolow

You're missing the point. The key detail isn't "spinlock" or "loop". The answer to your "yield without a syscall" question is "do *something* that the kernel can recognize as 'waiting on a lock' and wait for the inevitable interruption from the preemptive scheduler to substitute for a syscall as the means to jump into kernel code."

Yeah, that something is called a syscall. I mean that's exactly how you tell the kernel something, even interacting with virtual filesystems (because open and write are syscalls).

**ssokolow** · 16 July 2018, 08:51 PM

Originally posted by Weasel

Yeah, that something is called a syscall. I mean that's exactly how you tell the kernel something, even interacting with virtual filesystems (because open and write are syscalls).

You have a broader definition of a syscall than we do, apparently.

As far as we're concerned, it devalues the term "syscall" to call "do something to our address space (eg. set a variable, create and enter a stack frame for a certain function, etc.) and then do something else while waiting for the preemptive scheduler to notice" by the same term as a function call that's expected to block until the kernel has seen it.

(So acquiring a futex could be technically considered a syscall if you don't require that "syscalls" mean things that use the platform's standard syscalling mechanism, but it's comprised of a spinlock and a means of signalling kernel space without an explicit syscall, so it only counts as a syscall because the spinlock satisfies the "blocks until the kernel has seen it" requirement on a technicality.)

Sure, one could argue that it's the same difference between cooperative multitasking (traditional syscalls) and preemptive multitasking (set a flag and wait to be interrupted), but, terminologically, it's closer to Reference Counting vs. Garbage Collection, where RC is technically a type of GC, but it's much more useful to use Garbage Collection as a shorthand for "forms of Garbage Collection requiring a Garbage Collector" in common conversation.

**oiaohm** · 17 July 2018, 12:02 AM

Originally posted by Weasel

**No**! We're avoiding spinlocks here! That was the WHOLE point dude.

Really, there is no point in totally avoiding spinlocks. The preemptive scheduler will automatically yield processes at roughly fixed intervals. A partial time-slice is highly inefficient for the scheduler to allocate. In fact, most schedulers drop more partial time-slices than they allocate. So you might as well spin out the remaining time-slice until the roughly fixed interval kicks in. But you don't want a spinlock that ends up consuming multiple time-slices.

I was very clear that a futex is not a lightweight mutex. A futex is a mutex designed to function in a way that is effective for the Linux schedulers.

With a normal spinlock-based mutex, a thread waiting on a lock could consume a practically unlimited number of time-slices before it gets the lock. At worst, a futex is going to consume one time-slice waiting before it gets the lock. Even if you do a syscall yield, that partial time-slice can simply be dropped by the scheduler anyhow, so you have still consumed one time-slice, except this time doing absolutely nothing because you used a syscall yield.

Totally avoiding spinlocks is bad design. There are times when the correct thing is to spinlock; just make sure it's not spinning forever. Spinning out the remaining bit of a time-slice, which is what a futex does, is not a problem, as this is the most effective use of the remainder of the time-slice.

**Weasel** · 17 July 2018, 10:43 AM

Originally posted by ssokolow

You have a broader definition of a syscall than we do, apparently.

As far as we're concerned, it devalues the term "syscall" to call "do something to our address space (eg. set a variable, create and enter a stack frame for a certain function, etc.) and then do something else while waiting for the preemptive scheduler to notice" by the same term as a function call that's expected to block until the kernel has seen it.

No, a syscall means using the syscall instruction (or int 0x80 on 32-bit, or sysenter from vdso, whatever).

How would the pre-emptive scheduler "notice" the memory change in the first place or even what to look for? In fact, if we're talking about the lock itself, it's in shared memory, so it wouldn't know which threads to switch out in the first place (one of them is running the mutex-protected code, the others are spinlocks or what?).

Originally posted by ssokolow

(So acquiring a futex could be technically considered a syscall if you don't require that "syscalls" mean things that use the platform's standard syscalling mechanism, but it's comprised of a spinlock and a means of signalling kernel space without an explicit syscall, so it only counts as a syscall because the spinlock satisfies the "blocks until the kernel has seen it" requirement on a technicality.)

Acquiring the futex uses no syscalls at all, unless the lock is contended (in which case it's the kernel's job). The point is that waiting on the lock to become free is what needs a syscall. Only the kernel can put threads to sleep. And syscalls are how you communicate with the kernel, even if it's just "put me to sleep until X happens".
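A minimal sketch of the behaviour described in this post, assuming Linux, a C11 compiler and the raw `futex` syscall documented in futex(2). The names `sketch_mutex`, `sketch_lock`, `sketch_unlock` and the `sketch_futex_syscalls` counter are hypothetical and exist only for illustration; this is not glibc's implementation. The uncontended path is a single atomic operation in user space, and the kernel is entered only to sleep or to wake a sleeper:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

static_assert(sizeof(_Atomic uint32_t) == 4, "a futex word is 32 bits");

/* Hypothetical type for this sketch.
   state: 0 = free, 1 = locked, 2 = locked and someone may be sleeping. */
typedef struct { _Atomic uint32_t state; } sketch_mutex;

/* Illustration only: counts how often we actually entered the kernel. */
static _Atomic unsigned long sketch_futex_syscalls;

static inline long sketch_futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    atomic_fetch_add(&sketch_futex_syscalls, 1);
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static inline void sketch_lock(sketch_mutex *m)
{
    uint32_t expected = 0;
    /* Fast path: one compare-and-swap in user space, no syscall. */
    if (atomic_compare_exchange_strong(&m->state, &expected, 1))
        return;
    /* Slow path: mark the lock contended, then ask the kernel to put
       this thread to sleep for as long as the word still reads 2. */
    while (atomic_exchange(&m->state, 2) != 0)
        sketch_futex(&m->state, FUTEX_WAIT_PRIVATE, 2);
}

static inline void sketch_unlock(sketch_mutex *m)
{
    /* Enter the kernel only if a thread may be sleeping on the word. */
    if (atomic_exchange(&m->state, 0) == 2)
        sketch_futex(&m->state, FUTEX_WAKE_PRIVATE, 1);
}
```

Locking and unlocking from a single thread leaves `sketch_futex_syscalls` at zero, which is the "no syscalls unless contended" case; the counter only moves once a second thread finds the word already set.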

Originally posted by oiaohm

Really, there is no point in totally avoiding spinlocks. The preemptive scheduler will automatically yield processes at roughly fixed intervals. A partial time-slice is highly inefficient for the scheduler to allocate. In fact, most schedulers drop more partial time-slices than they allocate. So you might as well spin out the remaining time-slice until the roughly fixed interval kicks in. But you don't want a spinlock that ends up consuming multiple time-slices.

Arguing with you is like talking to a wall, literally: you repeat the same stuff you said the first time around no matter what I bring up against it.

I already made it especially clear that we were not talking about a spinlock because that scenario (short locks) was not the subject. I even gave an example: you have two apps, synchronized with IPC and talk to each other via same IPC.

App A needs buffer data, which takes half a second for App B to fill.

So in this fucking scenario, you're not going to spinlock for half a second, that's just retarded. Kernel will interrupt it but it will make it run again because it's unaware it's in a spinlock (even if it was aware, how would it know when the lock becomes free?). You say you can give magical info to the kernel to provide it. Show simple code and say exactly how and what you do to tell the kernel without a syscall. How do you even communicate with the kernel without a syscall?

Some say I over-use the bold on this forum, but it's stuff like this that makes me think not even I go far enough to place emphasis it seems.
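The half-second buffer scenario above can be sketched with the raw syscall from futex(2), the man page linked later in the thread. This is a minimal illustration assuming 64-bit Linux and C11; `sketch_wait` and `sketch_wake_one` are hypothetical helper names. `FUTEX_WAIT` asks the kernel to put the caller to sleep only if the word still holds the expected value, and it takes an optional relative timeout:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

static_assert(sizeof(_Atomic uint32_t) == 4, "a futex word is 32 bits");

/* Hypothetical helper: sleep in the kernel while *word == expected,
   for at most timeout_ms milliseconds.
   Returns 0 if woken by FUTEX_WAKE, otherwise the errno value
   (EAGAIN if the word already differed, ETIMEDOUT, or EINTR). */
static inline int sketch_wait(_Atomic uint32_t *word, uint32_t expected,
                              long timeout_ms)
{
    struct timespec ts = {
        .tv_sec = timeout_ms / 1000,
        .tv_nsec = (timeout_ms % 1000) * 1000000L,
    };
    long rc = syscall(SYS_futex, word, FUTEX_WAIT, expected, &ts, NULL, 0);
    return rc == 0 ? 0 : errno;
}

/* Hypothetical helper: wake at most one thread sleeping on word.
   Returns the number of threads woken. */
static inline long sketch_wake_one(_Atomic uint32_t *word)
{
    return syscall(SYS_futex, word, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

In this sketch App A would call `sketch_wait(&ready, 0, 500)` while App B fills the buffer, and App B would store 1 to `ready` and call `sketch_wake_one(&ready)`. The waiter burns no CPU in between, and both sides go through a syscall to do it.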

Originally posted by oiaohm

I was very clear that a futex is not a lightweight mutex. A futex is a mutex designed to function in a way that is effective for the Linux schedulers.

Show some code for one, I want to see what it does when waiting on the lock.

For example (no I'm not sarcastic), maybe there's some kernel syscall that places some memory region for the scheduler to "notice" and links it to a mutex (idk, just saying it's a possibility). In this case you still need a syscall but you do it when you create the mutex, not when you "lock" it, so only once outside of the loop.

This is a possibility in theory, but I'm skeptical until you show some code (no, it doesn't have to be your code!). Who knows, maybe I'll learn something?

**oiaohm** · 17 July 2018, 09:32 PM

Originally posted by Weasel

App A needs buffer data, which takes half a second for App B to fill.

So in this fucking scenario, you're not going to spinlock for half a second, that's just retarded. Kernel will interrupt it but it will make it run again because it's unaware it's in a spinlock (even if it was aware, how would it know when the lock becomes free?). You say you can give magical info to the kernel to provide it. Show simple code and say exactly how and what you do to tell the kernel without a syscall. How do you even communicate with the kernel without a syscall?

All of this is what a futex is. Futex is implemented and provided by a kernel header file. There are variations between architectures.

Linux Futex_wait on most architectures is a kind of spinlock with a few special features. There are a few key differences. When the spinlock starts spinning while waiting for a lock, key values are placed either on the stack or in registers (this changes between platforms). When the scheduler's auto yield comes along to save the thread state, these key values allow it to work out that the thread is in a futex wait and which futex the thread is in fact waiting for. The reason a futex has to be a 32-bit counter, incremented in a particular way, is so that the scheduler, before waking up a thread, can query the futex and work out whether it is still locked or not. If it is still locked, it leaves the thread in the sleeping-on-event stack and chooses a different thread.

Basically, there are items like registers and memory areas that the kernel has to store when you perform a yield and restore when you restart a process. A Linux futex exploits those areas, so you don't need to create a special memory region, because it's reusing what already existed. You do need shared memory between your processes using a futex, but this is not to share information about which futex your program is waiting on.

There is a bug in the Linux futex design. Declaring a time was not included in the design of the Linux futex. So you have to syscall for waits in the following case.

An app needs buffer data, but if it does not have the buffer data in 30 seconds, it quits waiting. The userspace futex in Linux only knows how to wait forever. Yes, a place was created while spinning to store the futex you are waiting on to come free, but a space was not created to record a timeout. If you were designing a futex for a new OS, this is most likely something you would do differently: create two slots, one for storing the lock and one for storing how long to wait for the lock, instead of the single slot of the Linux futex being the lock.

Please note most of the code that does the Linux futex is pure assembler. Worse, it's gas assembler, so it's really not the most pleasant thing to read and work out exactly what trick it's up to, for those who cannot read that. Yes, it would involve reading the yield and restore code. So we are talking quite a few thousand lines of assembler code to understand how a futex works at code level.

Please note the Linux futex was rewritten 5 times before getting to the current form, and the current form has about 8 different platform variations. Yes, there are a few where the platform uses a cooperative scheduler, because those platforms cannot run a preemptive scheduler, and there it does a futex syscall to yield. Linux platforms with a preemptive scheduler do not syscall out for a futex, because there is no real performance gain; they just set the information on the stack or in registers so the scheduler knows what the heck is going on.

**Weasel** · 18 July 2018, 08:36 AM

Originally posted by oiaohm

All of this is what a futex is. Futex is implemented and provided by a kernel header file. There are variations between architectures.

Where's this header? Last time you linked to pthread, which is a library header, not a kernel header.

And no, as far as I'm aware, a futex is simply the code I showed you. A fast check for uncontested case, and only falling back to a syscall if the lock is contended. Plain and simple. Like I said, show some code (totally fine to not be yours, cause I know you don't code, I don't ask for yours just links and such) for informative purposes, because honestly at this point, your vague claims are way too confusing and annoying, code speaks to me best. Code should have no ambiguity and will settle this.

(assembly is fine too, or a disassembly, since most likely it will need some asm instructions, really anything works, just show how it's done ffs)

**oiaohm** · 20 July 2018, 12:56 AM

Originally posted by Weasel

Where's this header? Last time you linked to pthread, which is a library header, not a kernel header.

The pthread library links to the platform headers and uses platform-dependent code.

futex(2) - Linux manual page

http://man7.org/linux/man-pages/man2/futex.2.html

pthread_cond_signal is a noted example of the usage of the linux/futex.h header.
https://github.com/torvalds/linux/bl.../linux/futex.h

#define FUTEX_OP(op, oparg, cmp, cmparg) \
(((op & 0xf) << 28) | ((cmp & 0xf) << 24) \
| ((oparg & 0xfff) << 12) | (cmparg & 0xfff))

When you look inside pthread you will find this macro with the right options in a loop in the Linux platform section.

Do take note of the "Per-thread list head", which holds what the current thread's futex is, for the scheduler to look up.
If the idea were that it falls back to a syscall when the lock is contended, there would be no need for the "Per-thread list head". It's that bit that allows you to spinlock out the time-slice.

Really, Weasel, you have presumed a futex is done a particular way, yet you have not looked at the code. You will start seeing oddities like the Per-thread list head that don't add up for a lightweight locking system using just a normal yield.

Futex is half implemented in the kernel and half implemented in glibc, which provides pthread. So, depending on the question, I need to point either to the Linux kernel headers or to the pthread source. To understand how to implement a futex without using glibc, you need to read the pthread section of glibc so you do it right.

Yes, a lot of the UAPI from the Linux kernel is not designed to be used directly by applications, but instead to be wrapped behind libraries.

futex(2) - Linux manual page

http://man7.org/linux/man-pages/man2/futex.2.html

Do also note that glibc uses a futex in a particular way and does not expose to the end user how it does it. So to use the glibc method you have to use pthread. If you don't use the glibc method, you have to syscall futex or implement the stuff yourself.

**Weasel** · 20 July 2018, 11:04 AM

Originally posted by oiaohm

The pthread library links to the platform headers and uses platform-dependent code.

futex(2) - Linux manual page

http://man7.org/linux/man-pages/man2/futex.2.html

pthread_cond_signal is a noted example of the usage of the linux/futex.h header.
https://github.com/torvalds/linux/bl.../linux/futex.h

#define FUTEX_OP(op, oparg, cmp, cmparg) \
(((op & 0xf) << 28) | ((cmp & 0xf) << 24) \
| ((oparg & 0xfff) << 12) | (cmparg & 0xfff))

When you look inside pthread you will find this macro with the right options in a loop in the Linux platform section.

Those are just constants to encode an operation in the futex syscall.

Even the comments say it: /* Second argument to futex syscall */
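To make the encoding concrete, here is a small sketch in C that reuses the macro as quoted above (with its arguments parenthesised) and splits the resulting word back into its four fields. The `SKETCH_*` constants and the `field_*` helpers are hypothetical names for this example; the numeric values for "add" and "greater than" are the ones futex(2) lists for `FUTEX_OP_ADD` and `FUTEX_OP_CMP_GT`. Nothing in it talks to the scheduler; it only packs four small integers into one 32-bit argument:

```c
#include <assert.h>
#include <stdint.h>

/* The macro from the quoted header, with each argument parenthesised. */
#define FUTEX_OP(op, oparg, cmp, cmparg) \
    ((((op) & 0xf) << 28) | (((cmp) & 0xf) << 24) | \
     (((oparg) & 0xfff) << 12) | ((cmparg) & 0xfff))

/* Hypothetical names; values as listed in futex(2). */
enum { SKETCH_OP_SET = 0, SKETCH_OP_ADD = 1 };
enum { SKETCH_CMP_EQ = 0, SKETCH_CMP_GT = 4 };

/* Unpack the four bit fields of an encoded operation word. */
static inline uint32_t field_op(uint32_t v)     { return (v >> 28) & 0xf; }
static inline uint32_t field_cmp(uint32_t v)    { return (v >> 24) & 0xf; }
static inline uint32_t field_oparg(uint32_t v)  { return (v >> 12) & 0xfff; }
static inline uint32_t field_cmparg(uint32_t v) { return v & 0xfff; }

/* "add 1 to the word, then compare the old value > 0" packs to 0x14001000:
   op 1 in bits 28-31, cmp 4 in bits 24-27, oparg 1 in bits 12-23. */
static_assert(FUTEX_OP(SKETCH_OP_ADD, 1, SKETCH_CMP_GT, 0) == 0x14001000,
              "encoding matches the bit layout of the macro");
```

The result is just an integer that gets passed as an argument to the `futex` syscall, which is the point being made here.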

Seriously dude.

Also, from your man page:

the majority of the synchronization operations are performed in user space. A user-space program employs the futex() system call only when it is likely that the program has to block for a longer time until the condition becomes true.

Do you even read what you link to? The underlined & bolded part means it will use a syscall to block (wait).

**oiaohm** · 28 July 2018, 11:29 AM

Originally posted by Weasel

Also, from your man page

Do you even read what you link to? The underlined & bolded part means it will use a syscall to block (wait).

I also told you that you need to look inside the pthread implementation as well. Sorry, the futex call only sets up the information.

You misunderstood the word "likely". So it's not using a futex syscall to block/wait straight away. It's using a futex syscall to register the futex with the scheduler, and if it has not acquired the futex when the time-slice has ended, then it's blocked. Yes, the program syscalls, but its time-slice was not taken away.

The futex syscall is not yield/block. It's register. When the time-slice runs out and the auto yield happens, then the block/wait happens.

Really, if you could read, it would help, Weasel.

**Weasel** · 29 July 2018, 08:42 AM

Originally posted by oiaohm

It's using a futex syscall to register the futex with the scheduler, and if it has not acquired the futex when the time-slice has ended, then it's blocked.

Where's this code? That would be interesting (it was one of my theoretical assumptions but idk how efficient it is to implement).

Sorry, I'm not going to look for code that may not exist. You claimed it does, so show me where it is (line of code or just link to a mirror, i don't care).

Or at least link to the article that taught you this, I'm sure it will have references to the code. I mean, where the heck did you get this information from about registering the futex? Just link it then?

Google's Gasket Driver Framework Landing For Linux 4.19