Linux Kernel To Get AIO Performance Improvements
-
Originally posted by ryao
Well, you don't have asynchronous sendfile or splice, and you don't have asynchronous stat/opendir/readdir/... either. The interface also differs depending on the fd type, and I don't think kernel AIO works on anything other than regular files.
Do you have any news about syslets?
-
IMHO asynchronous file I/O is still a mess on Linux.
As already stated in this thread, POSIX AIO is implemented in user space with a pool of blocking threads. The reason it does not use kernel AIO is kernel AIO's lack of features.
There are three reasons not to use kernel AIO:
1. It only works with the O_DIRECT flag, which means no page-cache buffering and, for many workloads, very slow I/O.
2. Kernel AIO needs file-system support in the kernel, and which file systems are supported is a bit of a mystery. Do CIFS or NFS support it?
3. Once you finally use it, it only runs on Linux, not on the BSDs.
I come from a Windows background, and I have to admit that I/O completion ports (IOCP) are a much superior API compared with what Linux has to offer. They have worked since Windows 2000 and support sockets and ALL kinds of file handles.
Kernel AIO is moving in the right direction, but progress is very slow and hardly anybody uses it.
I wrote a high-performance server on Windows and wonder how I would have achieved the asynchronous file I/O on Linux.
-
Originally posted by curaga
Why would you need AIO specifically - ie, why would it be better than say epoll + threads?
Having blocking threads perform the read/write operations does not scale.
An asynchronous reactor or proactor pattern is a much better approach.
-
Originally posted by curaga
It's also much more complex and so bug-prone.
I have worked on big projects that used blocking I/O and on ones that used a completely asynchronous approach. In the end, the asynchronous design was accepted by far more developers because they saw the benefit.
IMHO it is not hard to develop programs that use an asynchronous I/O approach. The main reason many people think it is more complex and bug-prone is that pretty much ALL examples of I/O (sockets or files) are blocking, and it is hard to find good examples of the async way. That's why everybody starts with blocking I/O until they find out that it is not the best solution.
I encourage everybody interested in the topic to look at the boost::asio library, which is an excellent platform-independent API for asynchronous I/O (but offers no async file I/O on Linux).
Originally posted by curaga
Okay, so say you're doing a lot of writing of files to a slow HDD. What do you gain from AIO when compared to regular O_NONBLOCK?
-
Originally posted by jwilliams
Do you have a lot of RAM?
It took me a long time to figure out, but I noticed on my system (32 GB RAM) that the settings for vm.dirty_ratio and vm.dirty_background_ratio are not nearly low enough -- even 1 is not low enough. What seems to happen during heavy write activity is that the system caches writes until it has hundreds of MiB or more in the cache, then decides to write it all to storage, and that seems to block other processes and can nearly freeze the system for seconds at a time.
My partial solution is to use dirty_bytes and dirty_background_bytes instead. I currently set them to 96 MiB and 32 MiB:
echo "100663296" > /proc/sys/vm/dirty_bytes
echo "33554432" > /proc/sys/vm/dirty_background_bytes
That much can be written to storage in less than a second, and it seems to help: the multiple-second freezes are mostly gone.
-
Originally posted by RomuloP
A better solution may be to use a multi-queue scheduler like BFQ and stay with the insanely sized cache... Unfortunately, a small cache heavily affects some apps, like OBS and Chrome, even with BFQ if, for example, any file transfer is happening. I suspect it comes down to synchronous functions, and I suspect AIO would help here.
Thinking back, at the time I wrote my original post I was running an i7-2700K desktop with 8-16 GB of RAM and a 500 GB spinning-rust SATA drive (which was later upgraded to an SSD). This was also before BFQ even existed in the mainline kernel, so there's that.
Things are much better these days, and I credit the move to SSDs for the large majority of that improvement. That said, my current work laptop is practically equivalent in performance to the desktop it replaced (Haswell quad-core i7, 16 GB RAM, 500 GB SSD). It's up for replacement soon, but the main improvements I expect will come from jumping to an NVMe SSD, plus better battery life from 5-6 years of laptop design improvements.
-
Oops, sorry - sort of the result of old problems that are still with us and a lack of attention.
SSDs and multi-queue schedulers helped, but the problems still exist. A global dirty-page limit is terrible for slow devices like pendrives: even today, Linux memory management washes its hands and hands the whole dirty-page budget to caching an X GB ISO headed for a 20 MB/s device. To make things worse, the queueing overhead makes the transfer even slower, not to mention that no file manager can estimate the transfer progress correctly.
It is hard to point to a single culprit or solution. With smaller dirty limits, which bring some sanity to external-device transfers, aggressively synced programs like OBS and Chrome (which lack a user-space cache for some things) easily lag or drop frames when something like an aggressive file transfer dominates. In my opinion, I/O-intensive programs doing things like file transfers could also offer PSI-aware techniques, to avoid over-pressuring the system or device so much and to do a better job with slow devices. The kernel could also deal with dirty pages more intelligently; allowing per-device or per-process limits would help a lot. I'm just waiting to see which solution comes first.