Linux Work Culminating On A "READFILE" Syscall For Reading Small Files Efficiently
-
Last edited by tildearrow; 25 May 2020, 12:51 AM.
- Likes 1
-
Originally posted by jacob View Post
Here is one I always wanted: be able to pass through a file descriptor. Say you want all input on fd X to be directly forwarded to output fd Y (by the kernel itself, without jumping into and back from user space). The splice() functionality kinda sorta lets you do that but it's clunky and doesn't work in all scenarios.
You're welcome
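For anyone wondering what the splice() route looks like in practice, here is a minimal sketch, assuming Linux and Python 3.10+ for `os.splice`. The helper names `forward` and `_copy_via_userspace` are made up for this post. It also shows the clunky part: splice() needs a pipe on at least one side, so the sketch falls back to an ordinary read/write copy when the kernel rejects the fd pair.

```python
import errno
import os


def _copy_via_userspace(src_fd, dst_fd, count):
    """Fallback path: one read() plus as many write() calls as needed."""
    chunk = os.read(src_fd, count)
    view = memoryview(chunk)
    while view:
        view = view[os.write(dst_fd, view):]
    return len(chunk)


def forward(src_fd, dst_fd, count, chunk_size=65536):
    """Move up to `count` bytes from src_fd to dst_fd; return bytes moved.

    Uses splice() when available so the data stays inside the kernel.
    splice() requires at least one of the two descriptors to be a pipe.
    """
    moved = 0
    while moved < count:
        want = min(chunk_size, count - moved)
        try:
            if not hasattr(os, "splice"):
                raise OSError(errno.EINVAL, "os.splice unavailable")
            n = os.splice(src_fd, dst_fd, want)
        except OSError as exc:
            # EINVAL/ENOSYS: this fd combination or platform is unsupported.
            if exc.errno not in (errno.EINVAL, errno.ENOSYS):
                raise
            n = _copy_via_userspace(src_fd, dst_fd, want)
        if n == 0:  # end of input
            break
        moved += n
    return moved
```

Note that this is still a loop driven from user space, one syscall per chunk, which is not quite the "set it up once and let the kernel forward everything" behaviour asked for in the quote.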
- Likes 5
Comment
-
-
Originally posted by markg85 View Post
- Likes 3
Comment
-
Originally posted by jacob View Post
And the reason that we do have dog-slow CPUs is precisely because they have slow to decode addressing modes and a CISC instruction set that allows generating bloated code, that only Intel and AMD can implement at a reasonable speed. RISC (one operation = one instruction) has proven useful for anything Intel didn't manage to kill with Itanium.
Comment
-
Originally posted by xpue View Post
Perhaps also should consider making system calls to work with multiple files/whatever in one call, in batches.
After initially setting up the io_uring ring buffers at application startup, you'd prepare a bunch of "instructions" in the ring buffer to open e.g. 100 files. This happens in user space and is just writing to ring buffer memory. Then you tell the kernel "hey, I've prepared 100 new work items for you in the ring buffer. Please wake me up when you've finished processing all of them/some of them/I'll do other work in the meantime and check back with you later". This is just one syscall. When the kernel has finished opening the files, you receive the FDs on the other ring buffer. Now you can put 100 read "instructions" for those FDs on the ring buffer, and possibly add 100 close "instructions" too if you no longer need the FDs after reading. Then you call the kernel again to say "please work on those new 100/200 IO instructions now", and you can again choose to sleep until all or some of the work is done, or do other work while the kernel is busy reading (and possibly closing) the files.
io_uring is really great.
Apparently they want to add a way so you can do the same task in just one syscall, where the read and close "instructions" wouldn't have to explicitly state the FD they want to read/close, but instead you could say "use the FD that was opened in this 'instruction' I've put on the ring buffer previously". So that would mean you could do everything in one submit and just wait for the kernel to fill the buffers you've supplied.
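To make the flow above concrete, here is a toy model in plain Python. It is not io_uring and makes no io_uring calls: `Op`, `submit` and `read_files` are hypothetical names, and `submit` simply runs the queued work in order with ordinary `os.open`/`os.read`/`os.close`. What it does show is the shape of the idea: queue many "instructions" in user space, hand them over in one call, and let a later instruction say "use the FD that was opened by instruction N" via the assumed `fd_from` field.

```python
import os
from dataclasses import dataclass


@dataclass
class Op:
    kind: str          # "open", "read" or "close"
    path: str = ""     # used by "open"
    fd: int = -1       # explicit descriptor for "read"/"close"
    fd_from: int = -1  # or: index of an earlier "open" in the same batch
    size: int = 4096   # maximum bytes for "read"


def submit(batch):
    """Stand-in for the single 'go' call: process all ops, one result each.

    Toy model only: ops run synchronously and strictly in order here.
    """
    results = []
    for op in batch:
        # Resolve the descriptor, either explicit or from an earlier result.
        fd = results[op.fd_from] if op.fd_from >= 0 else op.fd
        if op.kind == "open":
            results.append(os.open(op.path, os.O_RDONLY))
        elif op.kind == "read":
            results.append(os.read(fd, op.size))
        elif op.kind == "close":
            os.close(fd)
            results.append(0)
        else:
            raise ValueError(f"unknown op kind: {op.kind}")
    return results


def read_files(paths, size=4096):
    """Queue open+read+close for every path, then submit once."""
    batch = []
    for path in paths:
        i = len(batch)  # index of this file's "open" op
        batch.append(Op("open", path=path))
        batch.append(Op("read", fd_from=i, size=size))
        batch.append(Op("close", fd_from=i))
    results = submit(batch)
    return [results[i + 1] for i in range(0, len(batch), 3)]
```

With the real thing the kernel does the work behind `submit`, so the per-file syscall cost disappears, and completions can arrive out of order unless the entries are linked.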
- Likes 7
Comment
-
Originally posted by jacob View Post
No aversion, but they shouldn't be introduced at whim.
Originally posted by jacob View Post
Ideally only really new concepts should have new syscalls, new operations on existing object types should be implemented using existing generic interfaces.
Last edited by pal666; 25 May 2020, 04:57 AM.
- Likes 3
Comment
-
Originally posted by xpue View Post
Perhaps also should consider making system calls to work with multiple files/whatever in one call, in batches.
Comment
-
Originally posted by pal666 View Post
why? do you always add another switch case to same function with void pointer argument instead of introducing new function?
this is really crazy and unfounded idea. it has no benefits, but it has real costs: you are losing type information by sending garbage arguments (...). ioctl was created to allow device drivers which can't introduce system calls to still be able to provide some driver-specific functionality, i.e. they differ per fd, while subj is global
You are right that ioctl semantics depend on the particular fd, but that's the whole point: to implement ad-hoc functionality that only makes sense for certain kernel objects. When using BTRFS, for example, file and directory fds support all sorts of specific ioctls (clone, reflink, etc.). The precise ioctl mechanism is a POSIX relic that I'm not particularly fond of, and it could and should be replaced by some better interface management model. But the idea of having all objects referenced by fds, with each fd offering methods that make sense for that particular object, is IMHO sound and better than introducing more and more syscalls that only make sense in some cases and for some objects.
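A small illustration of the "each fd offers the methods that make sense for it" point, as a sketch assuming Linux. It uses two generic ioctls from Python's `termios` module rather than the BTRFS ones, since those need a BTRFS filesystem to try out. The helper names `bytes_pending` and `supports_winsize` are made up for this post.

```python
import fcntl
import os
import struct
import termios


def bytes_pending(fd):
    """Ask the object behind fd how many bytes are waiting (FIONREAD)."""
    buf = bytearray(4)  # room for one C int, filled in by the kernel
    fcntl.ioctl(fd, termios.FIONREAD, buf)
    return struct.unpack("i", buf)[0]


def supports_winsize(fd):
    """True if the object behind fd answers TIOCGWINSZ (i.e. is a terminal)."""
    try:
        fcntl.ioctl(fd, termios.TIOCGWINSZ, bytes(8))
        return True
    except OSError:
        # Typically ENOTTY: this kind of object has no such "method".
        return False


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"hello")
    print(bytes_pending(r))     # a pipe understands FIONREAD
    print(supports_winsize(r))  # but it has no window size
```

The same call number is accepted or rejected depending on what the fd refers to, which is both the flexibility being argued for and the loss of type information pal666 objects to.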
Comment