Announcement

**markg85** · 24 November 2020, 05:14 PM

I still see great value in having this. Here are some cases where I think this would be really beneficial!

1. Loading all data for your desktop! Think mainly about icons and fonts. The icons in particular are a LOT of fairly small files. And that loading goes on continuously while you're using your desktop. So the benefit isn't "just" at startup but (to a lesser degree) throughout your desktop usage.
   1. Don't forget all the icons that need to be loaded when browsing files. Or the thumbnails that need to be generated from images (the source image and the generated thumbnails can all easily be handled by this syscall).
2. Making system monitoring less taxing.
   1. We all know the irony in system monitoring: when you open the monitor, it is often the top CPU user itself. That's mainly because it needs to continuously load and parse tiny process files. READFILE will quite likely show a measurable difference there.
3. Configuration file loading of your applications. Though this should really be hidden away in the toolkit you use to give you a "magical performance boost" during startup.
4. Compilation very likely too (loooooooots of small files).

In all fairness, I think you're only going to see an actual (observable) difference in the case of a file browser with a small setting (meaning you see a lot of files in one view, like a thousand). There you'll probably be able to notice just a very tiny fraction of a speedup. But I'm still talking about a couple of hundred milliseconds at most.

But all of the above helps reduce CPU usage and lets the CPU just get things done faster. That on its own is a nice saving in terms of CPU load and therefore in power usage. If your whole desktop made proper use of it, I'd be willing to bet that there would be a measurable benefit in battery life (for a battery-powered device, that is). Do not underestimate how much IO is going on on your PC while you're seemingly doing nothing! There is much more happening in the background. The savings from this syscall might be tiny, but they add up over time!

I am apparently very much in favor of this :P

**Ironmask** · 24 November 2020, 05:25 PM

Originally posted by curfew

So they need a new non-portable "filesystem" call to hack around performance issues related to a VIRTUAL filesystem sysfs? Sounds like scary shit, poor design and bad practises. I know the UNIX philosophy is about exposing all kinds of stuff as "files", but if it leads to last-resort hacks like this on the other side, then what the hell is the point?

Linux has some pretty fancy internal APIs. I wish it would just abandon all semblance of UNIX (or whatever's left of its bloated, mangled corpse in the kernel) and just implement a clean, efficient, OS-specific API for software to use like Windows does. Both Linux and Windows have nice object management features; the difference is that Windows software actually makes use of them while Linux has to desperately pretend that it's working with files and not objects.
I really do not get UNIX/POSIX, it's a software standard that's already decades old and woefully obsolete. We stopped using COBOL, why are we using UNIX? Well, we're not, but we're desperately trying to pretend we are, which is even more embarrassing.

**AlDunsmuir** · 24 November 2020, 06:12 PM

Originally posted by Ironmask

We stopped using COBOL, why are we using UNIX?

You may not be using COBOL, but much of the software used by financial institutions is written in that language, especially on mainframes running z/OS.
IBM continues to introduce new versions with new features that actual COBOL programmers find useful.

**kreijack** · 24 November 2020, 06:37 PM

Originally posted by uid313

But if you accidentally read a file that is larger than your RAM then it is going to blow up your computer. 💥

How would this

Code:

readfile(0, "/file that is larger than your RAM", buffer, size-that-is-larger-than-your-RAM, 0);

be any different from

Code:

fd = open("/file that is larger than your RAM", O_RDONLY);
read(fd, buffer, size-that-is-larger-than-your-RAM);
close(fd);

?

**zxy_thf** · 24 November 2020, 06:50 PM

Originally posted by Ironmask

I really do not get UNIX/POSIX, it's a software standard that's already decades old and woefully obsolete. We stopped using COBOL, why are we using UNIX? Well, we're not, but we're desperately trying to pretend we are, which is even more embarrassing.

POSIX is useful when you want to port your applications to BSDs, but I do agree it's not that useful in practice.
For example, it's close to impossible to build an HTTP server with decent performance using only POSIX.

**baryluk** · 24 November 2020, 07:05 PM

Originally posted by uid313

But if you accidentally read a file that is larger than your RAM then it is going to blow up your computer. 💥

False. You need to specify the size of the buffer. If the file is bigger than that, the buffer is simply filled completely, and readfile returns how many bytes were read.

Code:

ssize_t readfile(int dirfd, const char *pathname, void *buf, size_t count, int flags);

DESCRIPTION

readfile() attempts to open the file specified by `pathname` and to read up to `count` bytes from the file into the buffer starting at `buf`.

It is to be a shortcut of doing the sequence of open() and then read() and then close() for small files that are read frequently, such as those in procfs or sysfs.

If the size of file is smaller than the value provided in `count` then the whole file will be copied into `buf`.

If the file is larger than the value provided in `count` then only `count` number of bytes will be copied into `buf`.


...



RETURN VALUE

On success, the number of bytes read is returned.

It is not an error if this number is smaller than the number of bytes
requested; this can happen if the file is smaller than the number of
bytes requested.

...

BUGS

None yet!

After all, Greg KH knows what he is doing. Detecting that the buffer is full is also easy, and you can then fall back to another code path if you wish.
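
To make the "buffer is full, fall back" idea concrete, here is a rough sketch in C. Since readfile is not something you can call on a stock kernel, the fast path is stood in for by a hypothetical helper `read_once()` (open, one read, close). The names `read_once`, `read_all` and `read_file_smart` and the 256-byte fast-path buffer are all made up for illustration. The sketch also assumes, as the quoted description does, that a return value smaller than `count` means the whole file was read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for readfile(): open, a single read(), close. */
static ssize_t read_once(const char *path, void *buf, size_t count)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    ssize_t n = read(fd, buf, count);
    int saved = errno;          /* keep read()'s errno across close() */
    close(fd);
    errno = saved;
    return n;
}

/* Slow path: read the whole file into a heap buffer that doubles as needed. */
static char *read_all(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return NULL;
    size_t cap = 4096, len = 0;
    char *data = malloc(cap);
    while (data != NULL) {
        ssize_t n = read(fd, data + len, cap - len);
        if (n < 0 && errno == EINTR)
            continue;           /* interrupted by a signal: retry */
        if (n < 0) {            /* real error: give up */
            free(data);
            data = NULL;
            break;
        }
        if (n == 0)
            break;              /* end of file */
        len += (size_t)n;
        if (len == cap) {       /* full: grow so there is always spare room */
            cap *= 2;
            char *bigger = realloc(data, cap);
            if (bigger == NULL)
                free(data);
            data = bigger;
        }
    }
    close(fd);
    if (data != NULL) {
        data[len] = '\0';       /* len < cap always holds here */
        *len_out = len;
    }
    return data;
}

/*
 * Try the one-shot path with a small buffer first. If it comes back
 * completely full, the file may be longer, so switch to the slow path.
 * Returns a NUL-terminated malloc'd buffer (caller frees) or NULL.
 */
static char *read_file_smart(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);
    char small[256];
    ssize_t n = read_once(path, small, sizeof small);
    if (n < 0)
        return NULL;
    if ((size_t)n == sizeof small)
        return read_all(path, len_out);   /* possibly truncated */
    char *copy = malloc((size_t)n + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, small, (size_t)n);
    copy[n] = '\0';
    *len_out = (size_t)n;
    return copy;
}
```

A file that happens to be exactly 256 bytes long also takes the slow path here, which costs an extra open but still returns the right data.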

**baryluk** · 24 November 2020, 07:07 PM

Originally posted by curfew

So they need a new non-portable "filesystem" call to hack around performance issues related to a VIRTUAL filesystem sysfs? Sounds like scary shit, poor design and bad practises. I know the UNIX philosophy is about exposing all kinds of stuff as "files", but if it leads to last-resort hacks like this on the other side, then what the hell is the point?

It is just performance optimisation. readfile can be emulated on any POSIX system fully in about 5 lines of code. You don't need to use it either.
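
For what it's worth, an emulation along those lines could look roughly like the sketch below. This is not the kernel patch or anything glibc ships; `readfile_emul` is a made-up name, and because the man page excerpt quoted in this thread does not say which `flags` values are valid, the sketch simply assumes only `flags == 0` is accepted.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical userspace emulation of the proposed readfile():
 * openat() + one read() + close(). Returns the number of bytes read,
 * or -1 with errno set.
 */
static ssize_t readfile_emul(int dirfd, const char *pathname,
                             void *buf, size_t count, int flags)
{
    assert(pathname != NULL && buf != NULL);
    if (flags != 0) {           /* assumption: no flags supported here */
        errno = EINVAL;
        return -1;
    }
    int fd = openat(dirfd, pathname, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;              /* errno set by openat() */
    ssize_t n = read(fd, buf, count);
    int saved = errno;          /* close() must not clobber read()'s errno */
    close(fd);
    errno = saved;
    return n;
}
```

Calling it as `readfile_emul(AT_FDCWD, "/sys/some/attribute", buf, sizeof buf, 0)` (path purely illustrative) behaves like the open/read/close sequence, just wrapped in one function. It still costs three syscalls, which is exactly what the real syscall is meant to avoid.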

There is also nothing stopping other systems implementing similar optimisations.

Also, I can assure you that if you are dealing with sysfs, then your program is already Linux-specific. You will probably still need to check whether readfile is supported on the kernel you are running if you want to support older kernels, but that is also pretty easy.

It is likely that glibc will automatically detect whether the kernel has readfile, and either use it or emulate it using open+read+close. But hiding it in the library like this is a bit so-so, because the emulation will temporarily create file descriptors, be prone to signal handlers, etc. I don't know.
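
A sketch of what such a check could look like, using the usual ENOSYS probing pattern on Linux. Big caveat: `__NR_readfile` is hypothetical. As far as I know no mainline kernel header defines it, so on a stock system the `#ifdef` branch compiles away and only the open+read+close path runs. `readfile_compat` is a made-up name, and this is not how glibc actually does anything.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Use the syscall when the headers and the running kernel provide it,
 * otherwise emulate it with openat() + read() + close(). */
static ssize_t readfile_compat(int dirfd, const char *path,
                               void *buf, size_t count)
{
    assert(path != NULL && buf != NULL);
#ifdef __NR_readfile            /* hypothetical syscall number */
    /* Not synchronized; a race here only costs one extra failed syscall. */
    static int kernel_has_readfile = 1;
    if (kernel_has_readfile) {
        long r = syscall(__NR_readfile, dirfd, path, buf, count, 0);
        if (r >= 0 || errno != ENOSYS)
            return (ssize_t)r;  /* success, or a genuine error */
        kernel_has_readfile = 0; /* old kernel: stop probing */
    }
#endif
    int fd = openat(dirfd, path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    ssize_t n;
    do {
        n = read(fd, buf, count);
    } while (n < 0 && errno == EINTR);  /* retry if a signal interrupted us */
    int saved = errno;
    close(fd);
    errno = saved;
    return n;
}
```

The fallback shows the downsides mentioned above: a file descriptor exists for a moment (hence `O_CLOEXEC`), and a signal can interrupt the read (hence the `EINTR` retry).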

**Emmanuel Deloget** · 24 November 2020, 08:52 PM

Originally posted by uid313

But if you accidentally read a file that is larger than your RAM then it is going to blow up your computer. 💥

Others pointed out that you have to specify the size of your buffer. And in any case, it cannot copy more than what read(2) can copy, and there is an implicit limit here of 2GB. See the read(2) notes for further information (not to mention that this fact is also mentioned in the man page for readfile).

So unless you actually *want* to exercise the OOM killer, you should be as safe as if you used open/read/close to do the exact same thing.

**curfew** · 24 November 2020, 11:32 PM

Originally posted by baryluk

It is just performance optimisation. readfile can be emulated on any POSIX system fully in about 5 lines of code. You don't need to use it either.

There is also nothing stopping other systems implementing similar optimisations.

The "optimization" part is also dubious, as also noted by the author of the readfile patch. While simple synthetic benchmarks show an improvement, his own conclusion was that within any real apps the "system noise" will mask any benefits anyway.

Also, the use case presented by Intel was based on a four-year-old benchmark on their Knights Landing platform, claiming that reading sysfs was "41x slower" when performed on all of the 64 cores of KL simultaneously, compared to a single file read on one core alone.

Originally posted by baryluk

Also, I can assure you that if you are dealing with sysfs, then your program is already Linux-specific. You will probably still need to check whether readfile is supported on the kernel you are running if you want to support older kernels, but that is also pretty easy.

Well, this rather sounds like some invasion technique by Microsoft. Inject a piece of code that only works on your platform, on the pretense that everyone will know to avoid it when writing "portable" apps, but of course most people either don't know or don't care, and so the vendor lock-in happens.

(Especially when, curiously enough, the function declaration was placed in the POSIX header file unistd.h...)

Originally posted by baryluk

It is likely that glibc will automatically detect whether the kernel has readfile, and either use it or emulate it using open+read+close. But hiding it in the library like this is a bit so-so, because the emulation will temporarily create file descriptors, be prone to signal handlers, etc. I don't know.

It will lead to code rot on a compiler level and also make the feature compiler-specific, not even OS-specific. It's even dumber than the original idea.

**curfew** · 25 November 2020, 12:16 AM

Originally posted by markg85

I still see great value in having this. Just some cases where I think this would be really beneficial!

1. Loading all data for your desktop! Think mainly about icons and fonts. The icons in particular are a LOT of fairly small files. And that loading goes on continuously while you're using your desktop. So the benefit isn't "just" at startup but (to a lesser degree) throughout your desktop usage.
   1. Don't forget all the icons that need to be loaded when browsing files. Or the thumbnails that need to be generated from images (the source image and the generated thumbnails can all easily be handled by this syscall).

Definitely no. You are talking about files in the region of ten or a hundred kilobytes. On this scale the file handle performance becomes completely irrelevant. The point of readfile would be to optimize access in scenarios where the files contain minuscule amounts of data. For example the sysfs virtual files often contain a single character or word. In addition there would probably have to be a requirement of polling the files constantly for changes, as otherwise it would make more sense to cache them internally.

Linux READFILE System Call Revived Now That It Might Have A User
