Google's Gasket Driver Framework Landing For Linux 4.19


  • #21
    Originally posted by Weasel View Post
    But the first and last points should be excluded because they happen without FUSE too. I don't see what the point is in including them?

    So without those it's only an extra two-way communication, not four times more as before.
    With your stupid comparison it's 2 vs 0, i.e. infinitely slower, instead of 4 vs 2 (twice as slow).
    Originally posted by Weasel View Post
    Furthermore, any userspace drivers (from micro kernel) will do the exact same thing if it's implemented with syscalls.
    That is why microkernels suck.

    Comment


    • #22
      Originally posted by pal666 View Post
      With your stupid comparison it's 2 vs 0, i.e. infinitely slower, instead of 4 vs 2 (twice as slow).
      Yeah, and a program that executes in 1 clock cycle (less than 1 nanosecond) is infinitely slower than one that doesn't execute at all. Splendid way of looking at things.

      Comment


      • #23
        Originally posted by oiaohm View Post

        How bad it is depends on the type of IPC. Shared-memory IPC avoids the need to context switch except to set it up. On multi-core systems you are exploiting the MMU to copy the data.

        Next, if you call IPC bad, then Android is bad. SurfaceFlinger to the hardware composer on Android is IPC. OpenGL ES can also be implemented on top of IPC on Android devices; in fact Android binary-only graphics drivers are meant to be implemented this way, and libhybris supported using those drivers under normal Linux.

        Yes, the reason Samsung and others want to do user-mode drivers for Android is to avoid syscall overhead; the IPC overhead is already part of the Android API/ABI. With the prior goof-ups around security, those wanting to do userspace drivers for Android need a safe way to do it.

        So there is a context thing: if you have to do IPC anyhow, avoiding syscalls can have a benefit.
        Well, in my opinion, Android is horrible. But that's beside the point.

        How do you do IPC synchronization without syscalls? The only way I see with shared memory is with spinlocks. Not very efficient.

        I mean you gotta have a way to tell another process "tell me when this thing is filled up, until then I'll just wait".

        Comment


        • #24
          Originally posted by Weasel View Post
          Well, in my opinion, Android is horrible. But that's beside the point.

          How do you do IPC synchronization without syscalls? The only way I see with shared memory is with spinlocks. Not very efficient.
          IPC synchronisation is done a few different ways.
          https://www.usenix.org/legacy/public...tml/node9.html

          RCU can be implemented with nothing but shared memory in userspace. Basically, the in-kernel IPC behind the syscall is itself a shared-memory IPC, so using the kernel one just gives you a shared-memory IPC with syscall overhead on top.

          IPC does not require syscalls or locks to implement. IPC protected by syscalls/kernel space does have some security advantages and some ability to tweak the scheduler. Beyond scheduler tweaking and security, a well-implemented userspace IPC is identical to a well-implemented kernel-space IPC, and I mean possibly identical in the sense that you can use the same code in userspace and kernel space; some operating systems do exactly that.

          Locks and notifications are not effective on multi-core systems. The most effective ways to synchronise between CPU cores go through shared memory.

          Originally posted by Weasel View Post
          I mean you gotta have a way to tell another process "tell me when this thing is filled up, until then I'll just wait".
          That is not how efficient IPC works. To people writing applications against a system-provided IPC, the API will look like that is what happens. Really, what happens is that the check for whether the buffer has changed runs in your application at the point where you wrote the wait. So yes, spinlocking happens when you wait on any efficient IPC that supports multiple cores; the cost of the spinlock is far lower than disrupting every core to send out notifications. Yes, IPC waits are bad and are basically spinlocks, so coding your application to avoid waits helps.

          The attempts to get kdbus and bus1 into the Linux kernel had the same magical belief that once you were in kernel space the IPC rules changed. Part of the problem with the dbus daemon was that it was coded too heavily around wait primitives in its IPC.

          Remember, a syscall always has to have functional code behind it to perform the request. Most of what syscalls do that does not require privilege is absolutely identical in kernel space to what you can do in user space. So in most cases, if you cannot code something well in user space, transferring it to kernel space will not help; passing a task to kernel space does not magically make it faster unless there is something privileged the kernel can do to speed it up.

          The things that can make in-kernel code faster are avoiding context switches and biasing the scheduler directly, and that is about it. Of course, coding in user space you can do some of that too. Weasel, syscalls are not a magic solve-all.
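          As a rough illustration (my own sketch, not code from any project mentioned here) of what a syscall-free shared-memory data path can look like; the shared mapping setup, which does need syscalls, is omitted:

          /* One producer, one consumer, a single slot guarded by a C11 atomic
             flag. No syscall is made anywhere on this data path. */
          #include <stdatomic.h>
          #include <string.h>

          struct channel {
              _Atomic int ready;           /* 0 = slot empty, 1 = message waiting */
              char payload[256];
          };

          /* Producer: wait for the slot to drain, copy the message, publish it. */
          static void channel_send(struct channel *ch, const char *msg)
          {
              while (atomic_load_explicit(&ch->ready, memory_order_acquire))
                  ;                        /* consumer has not taken the last one yet */
              strncpy(ch->payload, msg, sizeof ch->payload - 1);
              atomic_store_explicit(&ch->ready, 1, memory_order_release);
          }

          /* Consumer: spin (or yield) until the flag flips, then read the slot. */
          static void channel_receive(struct channel *ch, char *out, size_t len)
          {
              while (!atomic_load_explicit(&ch->ready, memory_order_acquire))
                  ;                        /* no syscall in this loop either */
              memcpy(out, ch->payload, len);
              atomic_store_explicit(&ch->ready, 0, memory_order_release);
          }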

          Comment


          • #25
            Originally posted by oiaohm View Post

            IPC synchronisation is done a few different ways.
            https://www.usenix.org/legacy/public...tml/node9.html

            Spinlocks are not efficient in the slightest unless they wait for really small amounts of time (100 nanoseconds or so). They're performant, but waste a lot of energy since they peg one CPU core while it's waiting.

            Lastly, I don't understand what you mean by "IPC is done by shared memory". I never said you do it without shared memory? I'm asking about synchronization man. How does one process know that the buffer is ready or not?

            By repeatedly checking a byte in the shared memory? That's a spinlock, and isn't as "efficient" (which means power use per performance) as being put to sleep if the wait is long enough.

            The reason you need syscalls for longer wait times is that only the kernel can put threads to sleep, and obviously you should sleep while waiting rather than burn processing power repeatedly checking a byte (a spinlock) if the wait is long. Spinlocks are good for realtime systems, but not for things that just receive a signal once in a while. Shared memory doesn't matter here at all for synchronization.

            If it's 1 microsecond or more waiting time then spinlocks are a total waste of power.
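            To be concrete, here is a rough sketch (mine, not anything from this thread or the kernel) of what that kernel-assisted sleep looks like on Linux: the waiter only makes the futex() syscall when the flag really is not set, and in real use the futex word would live in a shared mapping.

            #include <linux/futex.h>
            #include <stdatomic.h>
            #include <stdint.h>
            #include <sys/syscall.h>
            #include <unistd.h>

            static _Atomic uint32_t ready;   /* would sit in shared memory in practice */

            static void wait_for_data(void)
            {
                while (atomic_load(&ready) == 0)   /* cheap userspace check first */
                    syscall(SYS_futex, &ready, FUTEX_WAIT, 0, NULL, NULL, 0);  /* sleep in the kernel */
            }

            static void signal_data(void)
            {
                atomic_store(&ready, 1);
                syscall(SYS_futex, &ready, FUTEX_WAKE, 1, NULL, NULL, 0);      /* wake one waiter */
            }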
            Last edited by Weasel; 12 July 2018, 08:37 AM.

            Comment


            • #26
              Originally posted by Weasel View Post
              Spinlocks are not efficient in the slightest unless they wait for really small amounts of time (100 nanoseconds or so). They're performant, but waste a lot of energy since they peg one CPU core while it's waiting.

              Lastly, I don't understand what you mean by "IPC is done by shared memory". I never said you do it without shared memory? I'm asking about synchronization man. How does one process know that the buffer is ready or not?

              https://www.usenix.org/legacy/public...tml/node9.html
              That is all covered in the link above. That is using sleeping locks, which are basically a variation on a spinlock: instead of spinning the CPU you just yield to the scheduler when the data is not ready.

              How one process knows that the buffer is ready is simple: when it gets time from the scheduler it checks the shared-memory status to see whether the buffer has changed and is ready. If it is ready it executes; if not it simply yields and lets the scheduler move on to the next task. The process could have waited with a sleeping lock, a spinlock, or a combination of both; it really does not make much difference, as either way you end up with the same design for userspace IPC. A real-time IPC in userspace is likely to choose the spinlock and eat the extra CPU time; a non-realtime IPC is likely to choose a sleeping lock, which is just using yield.

              Originally posted by Weasel View Post
              The reason you need syscalls for longer wait times is that only the kernel can put threads to sleep, and obviously you should sleep while waiting rather than burn processing power repeatedly checking a byte (a spinlock) if the wait is long.
              Yielding to implement a sleeping lock (https://linux.die.net/man/2/sched_yield) is not an IPC function. And yes, the in-kernel Linux IPC does the same thing on a wait: it checks the RCU data, and if nothing is ready it just yields the process.

              The only things a userspace IPC cannot do by itself are the items needing privilege, like telling the scheduler to yield. But you don't need a kernel-space IPC to tell the scheduler to yield.

              Yes, a long-waiting userspace spinlock looks horrible until you work out that scheduler preemption will kick in at some point and the task will be put to sleep anyway. Compared to stopping everything with CPU-to-CPU messages (yes, I do mean IPC done by direct CPU-to-CPU interrupts), the overhead of doing IPC with a spinlock is nothing. Of course it is better to do IPC with sleeping locks, where you yield CPU time to other tasks instead of consuming it in a spinlock.

              The reality is that this is exactly the same as what you would be doing inside kernel space hidden behind a syscall. The overhead difference between IPC done in userspace with yield/sleeping locks and IPC done in kernel space behind syscalls is a surprise: waiting on the kernel-based IPC is slightly lighter, while reading the transferred message with the userspace IPC is lighter because you are not making a syscall to access it. Overall there is bugger-all difference.

              Basically, it has been a long time since we used interrupt-based IPC. Most people have not noticed that about 30 years ago the industry started moving from interrupt-based IPC to memory-based IPC, with memory data structures holding the sync information and applications either spin-locking or sleep-locking and individually looking up the status.

              Yes, when you choose a modern IPC you have to choose between performance and efficiency, and the difference, per application, is spinlocks versus sleeping locks.
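              A tiny illustrative pair of wait loops (my sketch, with a hypothetical buffer_ready flag sitting in shared memory) showing the only real difference between the two choices: whether the loop burns the time slice or hands it back with sched_yield().

              #include <sched.h>
              #include <stdatomic.h>

              extern _Atomic int buffer_ready;     /* hypothetical flag in shared memory */

              static void wait_spinning(void)      /* real-time flavour: burn the CPU */
              {
                  while (!atomic_load(&buffer_ready))
                      ;
              }

              static void wait_yielding(void)      /* non-realtime flavour: sleeping lock */
              {
                  while (!atomic_load(&buffer_ready))
                      sched_yield();               /* give the time slice to other tasks */
              }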

              Comment


              • #27
                Originally posted by oiaohm View Post
                https://www.usenix.org/legacy/public...tml/node9.html
                That is all covered in the link above. That is using sleeping locks, which are basically a variation on a spinlock: instead of spinning the CPU you just yield to the scheduler when the data is not ready.

                How one process knows that the buffer is ready is simple: when it gets time from the scheduler it checks the shared-memory status to see whether the buffer has changed and is ready. If it is ready it executes; if not it simply yields and lets the scheduler move on to the next task. The process could have waited with a sleeping lock, a spinlock, or a combination of both; it really does not make much difference, as either way you end up with the same design for userspace IPC. A real-time IPC in userspace is likely to choose the spinlock and eat the extra CPU time; a non-realtime IPC is likely to choose a sleeping lock, which is just using yield.
                ...but yielding is a syscall? So idk what your point is. This is a syscall IPC implementation.

                It's a so-called "lightweight mutex" or semaphore or whatever (look it up, or I can provide links, but honestly who cares). The thing is that the initial check doesn't require a syscall if the buffer is already ready. Many implementations also spin for a small amount of time first, before yielding (a syscall).

                This approach isn't perfect though. To gain anything from the spinning you practically need each process to execute on its own CPU core. If your system is overloaded with other stuff it will be bad, and it should yield immediately, otherwise it's wasting CPU cycles for literally nothing.
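                Something like this, as a sketch (the SPIN_TRIES constant and the buffer_ready flag are made up for illustration): check without a syscall, spin briefly in case the wait is short, and only then start yielding.

                #include <sched.h>
                #include <stdatomic.h>

                #define SPIN_TRIES 1000            /* arbitrary short spin budget */

                extern _Atomic int buffer_ready;   /* hypothetical flag in shared memory */

                static void adaptive_wait(void)
                {
                    /* Fast path: no syscall at all if the buffer is already ready. */
                    if (atomic_load(&buffer_ready))
                        return;

                    /* Spin for a little while, hoping the wait stays very short. */
                    for (int i = 0; i < SPIN_TRIES; i++)
                        if (atomic_load(&buffer_ready))
                            return;

                    /* Long wait: fall back to yielding (a syscall) on every retry. */
                    while (!atomic_load(&buffer_ready))
                        sched_yield();
                }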

                Comment


                • #28
                  Originally posted by Weasel View Post
                  ...but yielding is a syscall? So idk what your point is. This is a syscall IPC implementation.
                  If you have done any formal study of the topic, yield is not classed as enough to make something a kernel-based IPC. It is an optimisation to a software-based IPC that could otherwise have remained a spinlock.

                  "Lightweight mutex" does not exist under Linux.

                  http://www.infradead.org/~mchehab/ke.../pi-futex.html
                  PI-futexes are another way to do a lock under Linux. They pretty much stay in userspace most of the time.

                  PI-futexes in fact avoid doing a syscall at all in the uncontended case. There are other ways to trigger a yield. Yes, there are other transparent optimisations where the scheduler can find out what userspace is up to and yield automatically. This is something people coding for Windows are not used to: particular actions can be done without a syscall, and the kernel scheduler still understands how to respond correctly without a syscall or anything extra.
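                  For illustration (my sketch, not from the linked article): the usual way userspace reaches the kernel's PI-futex support is a priority-inheritance pthread mutex, where uncontended lock/unlock stays in userspace and the futex(2) syscall only happens under contention.

                  #include <pthread.h>

                  static pthread_mutex_t lock;

                  static int init_pi_mutex(void)
                  {
                      pthread_mutexattr_t attr;

                      pthread_mutexattr_init(&attr);
                      /* PTHREAD_PRIO_INHERIT makes glibc back the mutex with a PI-futex. */
                      pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
                      return pthread_mutex_init(&lock, &attr);
                  }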

                  Comment


                  • #29
                    Originally posted by oiaohm View Post
                    If you have done any formal study of the topic, yield is not classed as enough to make something a kernel-based IPC. It is an optimisation to a software-based IPC that could otherwise have remained a spinlock.

                    "Lightweight mutex" does not exist under Linux.
                    http://www.infradead.org/~mchehab/ke.../pi-futex.html
                    PI-futexes are another way to do a lock under Linux. They pretty much stay in userspace most of the time.
                    How do they not exist when you can literally implement them yourself easily?

                    All you need is an atomic instruction (xadd, cmpxchg, or simple stuff like lock inc) to see if you even have to yield in the first place. If the lock is not contended, then you avoid any syscalls, presumably in 99.9% of cases. This has nothing to do with Linux, it's a simple trick relying only on x86 instructions.
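                    A rough sketch of that trick (C11 atomics standing in for the raw x86 instructions, sched_yield() as the contended fallback; names are illustrative):

                    #include <sched.h>
                    #include <stdatomic.h>

                    static _Atomic int lock_word;          /* 0 = free, 1 = held */

                    static void lw_lock(void)
                    {
                        int expected = 0;
                        /* Fast path: one cmpxchg, no kernel involvement when uncontended. */
                        while (!atomic_compare_exchange_weak(&lock_word, &expected, 1)) {
                            expected = 0;
                            sched_yield();                 /* contended: give up the slice */
                        }
                    }

                    static void lw_unlock(void)
                    {
                        atomic_store(&lock_word, 0);
                    }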

                    But I forgot, most "developers" these days just can't implement anything themselves, all they do is plumb a few libraries together. If it's not in a library, then it "doesn't exist" right? :roll:

                    Originally posted by oiaohm View Post
                    PI-futexes in fact avoid doing a syscall at all in the uncontended case. There are other ways to trigger a yield.
                    Like what? Genuinely curious. Yielding is only done by the kernel so I can't see a way without a syscall or a context switch (which is just as bad/same thing).

                    Note that spinlocks are very bad if you have to wait a lot. They're ok in realtime systems where the cores can't be used by anything else but even there they're wasteful (energy, not performance).

                    The reason is that a thread in a spinlock is considered "running" by the kernel, so while it burns its slice nothing else that could use that core gets to run (unless it's scheduled on a different core). So unless you have a dedicated core for each thread, a spinlock makes your system quite a bit slower (unless the wait time is less than a context switch, say in the nanosecond range rather than microseconds), since other threads won't run as much because of the spinlock and will wait for no reason. Avoiding syscalls isn't always a good idea.
                    Last edited by Weasel; 13 July 2018, 07:59 AM.

                    Comment


                    • #30
                      Originally posted by Weasel View Post
                      Like what? Genuinely curious. Yielding is only done by the kernel so I can't see a way without a syscall or a context switch (which is just as bad/same thing).
                      Yielding is done by the kernel scheduler, but you can avoid a syscall to trigger it. How scheduler preemption gets triggered can get complex. You cannot avoid the context switch that the scheduler performs when it changes between tasks.

                      Thing to remember: yes, you can implement a lot of stuff in userspace. But if the kernel recognises the particular userspace implementation being used for the task, that can avoid the need for a syscall and allow the kernel scheduler to yield intelligently based on system load.

                      "Lightweight mutex" does not exist under Linux. << I really should have written "should not exist". Just because you could implement it does not mean you should. Microsoft provides a recommended lightweight-mutex design that the Windows scheduler understands; on a different operating system you have to use what that system supports.

                      It also depends on the scheduler whether a yield is productive or not. With some schedulers a yield does not switch to another task; instead it is a do-nothing spin until the time slice is used up, because the scheduler cannot hand out partial time slices. With fixed-time-frame schedulers there is no way to do an early context switch. Which is better: sitting in a do-nothing spin wasting the complete time slice, or sitting in a userspace spinlock that might trigger and so make use of the allocated slice? Note that a spinlock will hit scheduler preemption at some point.

                      So yes, whether a userspace spinlock is good or bad depends completely on which scheduler the Linux kernel is using, something Windows developers are not used to: having 20 different schedulers to choose from.

                      Comment
