READFILE System Call Rebased For More Efficient Reading Of Small Files


  • #31
    Originally posted by rene
    Good luck porting the whole plethora of applications to io_uring and readfile. I will be surprised if your system boot-up speeds up by more than the measurement accuracy ;-) Maybe it will even come out slower with all the io_uring setup overhead ;-)
    On 5.12 and newer Linux kernels, using io_uring wherever you can is always lower overhead than going the syscall route.

    There is a lucky part here: a lot of the service-init usage on Linux comes from a small number of applications used over and over again, so altering fewer than 20 applications will give 90% of the gain there. Like using sysctl to read a stack of values.

    How many small files do shell built-ins open?
    This is really a pointless question, because all it shows is that you don't know the use case, rene. Most Linux shell script interaction with small files does not go through shell built-in features. We are talking about the packages that provide sysctl and other tools that interface with small files and are used a hell of a lot. Of course systemd itself interacts with a huge number of small files. Move into KDE/Gnome process management and you have yet another huge stack of interactions by a limited number of applications with a large number of small files in /proc. Run the ps or top command and that is another case where a huge number of small files is read.
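
    For reference, a minimal sketch of what a tool like sysctl, ps or top has to do today for every small /proc or /sys value it reads (purely illustrative; the real tools obviously add error handling and parsing on top):

    Code:
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    /* The traditional pattern: three syscalls (open, read, close) per
     * value, plus the open-file accounting that goes with them. */
    static ssize_t read_small_file(const char *path, char *buf, size_t bufsz)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0)
            return -1;
        ssize_t n = read(fd, buf, bufsz - 1);
        close(fd);
        if (n < 0)
            return -1;
        buf[n] = '\0';
        return n;
    }

    int main(void)
    {
        char buf[64];
        if (read_small_file("/proc/sys/kernel/ostype", buf, sizeof(buf)) > 0)
            printf("%s", buf);    /* prints "Linux" */
        return 0;
    }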


    Comment


    • #32
      Originally posted by oiaohm
      On 5.12 and newer Linux kernels, using io_uring wherever you can is always lower overhead than going the syscall route.

      There is a lucky part here: a lot of the service-init usage on Linux comes from a small number of applications used over and over again, so altering fewer than 20 applications will give 90% of the gain there. Like using sysctl to read a stack of values.



      This is really a pointless question, because all it shows is that you don't know the use case, rene. Most Linux shell script interaction with small files does not go through shell built-in features. We are talking about the packages that provide sysctl and other tools that interface with small files and are used a hell of a lot. Of course systemd itself interacts with a huge number of small files. Move into KDE/Gnome process management and you have yet another huge stack of interactions by a limited number of applications with a large number of small files in /proc. Run the ps or top command and that is another case where a huge number of small files is read.

      Yeah, for 20+ years I have contributed to and maintained a source distribution (t2, previously known as ROCK Linux) scripted in shell, but sure, I don't know shell. Again, these few system calls saved are still not even a percentile of the CPU cycles spent. Also, nice try throwing around large numbers like 90% when the total speedup to be gained, if any, is 0.00001%. So yeah, port some 20 applications and get a total speedup of 0.000009%. Maybe. Or maybe the additional setup overhead of liburing also contributes a 0.00001% slowdown, as again this small-file access is nothing in the grand scheme of the boot time you are talking about. Have a good day.

      Comment


      • #33
        Instead of all of this speculation, why don't you just profile and debug, and figure out where and why the bottlenecks occur, so that you can actually fix them?

        Comment


        • #34
          The two ideas are not mutually exclusive. Even if there were a vectored syscall, I would still like to use readfile() inside that vectored syscall.

          So if a utility or app wants to read thousands of sysfs values, it can issue a single vectored call with thousands of readfile() commands. That seems easier to use, avoids any mental acrobatics about passing the file descriptor around, and combines the efficiency improvements of both.
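
          Purely as a hypothetical illustration (none of these names exist anywhere; readfile() itself is still only a proposed syscall), such a batched interface might look roughly like this:

          Code:
          /* Hypothetical only: neither readfile() nor any vectored variant of
           * it is in mainline.  The struct and the readfilev() name are made
           * up to illustrate the idea of submitting many readfile() operations
           * in a single syscall. */
          struct readfile_req {
                  const char *path;    /* e.g. a sysfs attribute path        */
                  void       *buf;     /* destination buffer                 */
                  size_t      bufsz;   /* size of the destination buffer     */
                  ssize_t     result;  /* bytes read or -errno, filled in    */
          };

          /* One syscall submits the whole batch instead of
           * N * (open + read + close). */
          long readfilev(struct readfile_req *reqs, unsigned int nreqs, int flags);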

          Comment


          • #35
            Originally posted by rene
            Yeah, for 20+ years I have contributed to and maintained a source distribution (t2, previously known as ROCK Linux) scripted in shell, but sure, I don't know shell. Again, these few system calls saved are still not even a percentile of the CPU cycles spent. Also, nice try throwing around large numbers like 90% when the total speedup to be gained, if any, is 0.00001%. So yeah, port some 20 applications and get a total speedup of 0.000009%. Maybe. Or maybe the additional setup overhead of liburing also contributes a 0.00001% slowdown, as again this small-file access is nothing in the grand scheme of the boot time you are talking about. Have a good day.
            This is still you not understanding the problem. The readfile syscall does not gain just because 3 syscalls become one. Remember, a normal open syscall has to add 1 to the process's open-file count (checked against the ulimit) and close subtracts 1; the readfile syscall can skip this. Remember, that count has to be shared, so changing it can be quite performance-costly.

            Small file reads are in fact used a lot, even in a sysvinit boot. Think of PID-file checking to prevent starting a service twice. There are quite a few small-file pressure points once you go looking for them, all hidden in utility programs that are used over and over again. systemd, with its more complex cgroup process management, reads even more small files, and the effect of open-file-count changes having to be shared also gets bigger. And remember systemd is going to be used by KDE and Gnome around every started application.

            The reality here is that a straight-up vectored syscall will not deal with the file-count issue caused by open and close, because you have it performing the normal syscalls. A vectored syscall that properly handles error events is also going to require setup, so you cannot be sure io_uring will be any worse than a vectored syscall, given how heavy proper error handling is going to be.

            Yes, comparing readfile against a vectored syscall doing (open, read, close), readfile is going to be faster, because the vectored call has to allow for the possibility that it does not contain a close. So it must either update the open-file count on open and close anyway, or inspect what it has been passed to work out whether it can skip the count change.

            A vectored syscall is more flexible, but it is that flexibility that comes back and kicks you where it hurts with the open/read/close problem. Dealing with that flexibility comes with its own overhead.

            Really, I would like to see a writefile as well.

            Those asking for a readfile syscall in Linux have looked at the problem correctly. Yes, just as indepe said, if you wanted vectored syscalls, having a readfile syscall is not mutually exclusive with that. You have failed to notice all the advantages the readfile syscall gets: not just the reduction in syscalls, but also the reduction in atomic locking on the ulimit value, because it never has to increase and decrease the open-file count. Yes, that is another bit of error handling: how to handle the max-open-files limit.

            For an everything-is-a-file operating system, having open/read/close and open/write/close as single syscalls has its place.
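
            For what it's worth, a minimal sketch of how the proposed readfile() would be used from userspace. The signature matches the patch series as I read it, but since it was never merged there is no assigned syscall number, so __NR_readfile below is only a placeholder:

            Code:
            #define _GNU_SOURCE
            #include <fcntl.h>          /* AT_FDCWD */
            #include <stdio.h>
            #include <sys/syscall.h>
            #include <unistd.h>

            #ifndef __NR_readfile
            #define __NR_readfile 451   /* placeholder: NOT a real syscall number */
            #endif

            /* Proposed: open + read + close in one shot, without ever taking a
             * slot in the process's open-file table. */
            static ssize_t readfile(int dirfd, const char *path, void *buf,
                                    size_t bufsize, int flags)
            {
                return syscall(__NR_readfile, dirfd, path, buf, bufsize, flags);
            }

            int main(void)
            {
                char buf[64];
                ssize_t n = readfile(AT_FDCWD, "/proc/sys/kernel/ostype",
                                     buf, sizeof(buf) - 1, 0);
                if (n < 0)
                    return 1;
                buf[n] = '\0';
                printf("%s", buf);
                return 0;
            }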

            Comment


            • #36
              Originally posted by Volta
              That would be nice. Of course they chose a GPL-incompatible library for this.
              The point isn't about any one implementation or even GCD itself, but rather the mere fact of having kernel support for work-stealing.

              Originally posted by Volta
              Btw, isn't OpenMP an alternative to it?
              libgomp, specifically, is a piece of garbage, in my opinion. It uses userspace spinlocks by default (though some distros disable this). And the OpenMP standard just focuses on the user interface for exposing concurrency, whereas my concern is about having the necessary platform support for efficient, scalable, and well-behaved implementations of it (and other APIs, too).

              Comment


              • #37
                Originally posted by rene
                How many small files do shell built-ins open?
                Depends on which built-in. But the speed of shell scripts is typically limited by the non-builtin commands they run, which is why I'm concerned with how quickly those execute.

                Comment


                • #38
                  Originally posted by indepe
                  The two ideas are not mutually exclusive. Even if there were a vectored syscall, I would still like to use readfile() inside that vectored syscall.

                  So if a utility or app wants to read thousands of sysfs values, it can issue a single vectored call with thousands of readfile() commands. That seems easier to use, avoids any mental acrobatics about passing the file descriptor around, and combines the efficiency improvements of both.
                  Some of us have mentioned using readfile() via io_uring, which is the same basic idea and likely to be possible as soon as readfile() is merged in.

                  Comment


                  • #39
                    Originally posted by coder
                    Some of us have mentioned using readfile() via io_uring, which is the same basic idea and likely to be possible as soon as readfile() is merged in.
                    Certainly.

                    Also, in any case, regarding passing on file descriptors in io_uring:

                    If using the IOSQE_FIXED_FILE flag on IORING_OP_OPENAT, the addr2 field could be used as an "index into the files array registered with the io_uring instance". That might even generally be an easier way to initialize and/or update the registered-files array. So it would use only existing features and data structures, just by adding that flag/option to the open operation.

                    However, the application would need to maintain the indexes itself, so in our use case readfile as its own operation would be much easier.
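
                    To make that concrete, here is a rough sketch of the open -> read -> close chain using fixed-file slots with liburing. It assumes a fairly recent kernel and liburing (direct open/close want roughly 5.15+, and io_uring_register_files_sparse() wants liburing 2.2+), and it skips most error handling, so treat it as illustrative only:

                    Code:
                    #include <fcntl.h>
                    #include <liburing.h>
                    #include <stdio.h>

                    int main(void)
                    {
                        struct io_uring ring;
                        struct io_uring_sqe *sqe;
                        struct io_uring_cqe *cqe;
                        char buf[128];
                        const unsigned slot = 0;  /* index into the registered file table */

                        if (io_uring_queue_init(8, &ring, 0) < 0)
                            return 1;
                        /* Reserve one slot in the fixed-file table for the open to fill. */
                        if (io_uring_register_files_sparse(&ring, 1) < 0)
                            return 1;

                        /* open -> read -> close as one linked chain; the fd never
                         * reaches userspace, only the fixed-file slot is used. */
                        sqe = io_uring_get_sqe(&ring);
                        io_uring_prep_openat_direct(sqe, AT_FDCWD, "/proc/sys/kernel/ostype",
                                                    O_RDONLY, 0, slot);
                        io_uring_sqe_set_flags(sqe, IOSQE_IO_LINK);

                        sqe = io_uring_get_sqe(&ring);
                        io_uring_prep_read(sqe, slot, buf, sizeof(buf) - 1, 0);
                        io_uring_sqe_set_flags(sqe, IOSQE_FIXED_FILE | IOSQE_IO_LINK);

                        sqe = io_uring_get_sqe(&ring);
                        io_uring_prep_close_direct(sqe, slot);

                        io_uring_submit(&ring);

                        /* Linked requests complete in order: open, read, close. */
                        for (int i = 0; i < 3; i++) {
                            if (io_uring_wait_cqe(&ring, &cqe) < 0)
                                break;
                            if (i == 1 && cqe->res > 0) {  /* the read's completion */
                                buf[cqe->res] = '\0';
                                printf("%s", buf);
                            }
                            io_uring_cqe_seen(&ring, cqe);
                        }
                        io_uring_queue_exit(&ring);
                        return 0;
                    }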

                    Comment


                    • #40
                      Originally posted by coder
                      Depends on which built-in. But the speed of shell scripts is typically limited by the non-builtin commands they run, which is why I'm concerned with how quickly those execute.
                      I have news for you: readfile() will not affect the speed at which shell scripts, and especially non-builtin commands, execute at all.

                      Comment
