Windows 11 vs. Linux Performance For Intel Core i9 12900K In Mid-2022
-
Originally posted by NobodyXu: I think this is still better than frequent faulting, which is very expensive and blocks the computation nonetheless.
Originally posted by NobodyXu: It might also be better than simply reading them into userspace if the file is large enough, since that would require a lot of copying between kernel space and user space.
Let's say you read 1 MiB of data from an NVMe drive. The syscall overhead is somewhere around 0.6 microseconds[1], the fastest PCIe 4.0 NVMe drives[2] would copy the data in 143 microseconds, the access latency is between 60 and 80 microseconds, and copying the data on a PC with dual-channel DDR4-3200 takes 20-39 microseconds (since it fits in cache). Worst case would be more like 59 microseconds.
However, the real kicker is that even if you eliminate the kernel -> userspace copy, you're still going to hit most of that 20-39 microseconds, because devices typically copy straight to memory, and then whether it's the kernel -> userspace copy or just userspace accessing it directly, you still have to fetch the data into the cache hierarchy. Anyway, we're talking about 10% to 19% (29%, at the worst), though much of that you can't avoid even by cutting out the copy. And that's one of the fastest PCIe 4.0 NVMe SSDs. If we're talking about commodity SSDs or even hard drives, then the % of time spent copying would drop by a couple orders of magnitude.
The point is that for a system with a single SSD, kernel <-> userspace copies just aren't a big bottleneck. They're not immeasurable, but definitely not dominant. If we're talking about GPUs, then the story is a little different, but I think graphics APIs are already designed to minimize the number of such copies. The main time you care about zero-copy is for many-core servers with lots of SSDs and high-bandwidth networking. Memory is often a significant bottleneck in those machines.
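For anyone who wants to check the percentages, here is a back-of-the-envelope sketch in Python. It assumes the quoted figures are the copy time relative to the rest of the read (syscall + drive transfer + access latency) and that the 60 microsecond end of the latency range is used; that reading is an assumption, but it is the one that reproduces the numbers above. The names are made up for illustration.

```python
# Figures quoted in the post, all in microseconds for a 1 MiB read.
SYSCALL_US = 0.6     # syscall overhead
TRANSFER_US = 143.0  # transfer time on the fastest PCIe 4.0 NVMe drive
LATENCY_US = 60.0    # assumption: lower end of the 60-80 us latency range


def copy_share(copy_us: float, latency_us: float = LATENCY_US) -> float:
    """Copy time as a fraction of the rest of the read (syscall + transfer + latency)."""
    return copy_us / (SYSCALL_US + TRANSFER_US + latency_us)


if __name__ == "__main__":
    # 20 and 39 us bound the in-cache copy; 59 us is the quoted worst case.
    for label, copy_us in (("low", 20.0), ("high", 39.0), ("worst", 59.0)):
        print(f"{label:5s} copy of {copy_us:.0f} us -> {copy_share(copy_us):.0%}")
```

With these inputs the three cases come out at 10%, 19% and 29%. Passing `latency_us=80.0` instead nudges each figure down slightly.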
Sources:
[1] A PTS benchmark I'm too lazy to look up.
[2] https://www.anandtech.com/show/16505...0-ssd-review/3
-
Originally posted by coder: Okay, then it's definitely not a win for this use case, because you don't get to overlap that I/O with any computation, which we know will happen if mmap'd memory is subject to read-ahead. You'd only do it if you planned to do lots of random access to a file -- enough that the up-front cost of pre-loading it would tend to be much less than all the page faults you'd expect.
It might also be better than simply reading them into userspace if the file is large enough, since that would require a lot of copying between kernel space and user space.
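To make the read-ahead alternative from the quote concrete, here is a rough Python sketch that streams through a mapped file and only hints at sequential access, so the kernel can keep fetching pages while earlier ones are being processed. Note that `madvise` with `MADV_SEQUENTIAL` is a different mechanism from `MAP_POPULATE` and is only a hint; `checksum_mapped` is a hypothetical helper and CRC32 just stands in for the real computation.

```python
import mmap
import os
import tempfile
import zlib


def checksum_mapped(path: str, chunk: int = 1 << 20) -> int:
    """CRC32 a non-empty file through a read-only mapping, chunk by chunk."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The method and constant are platform- and version-dependent,
            # so guard both; the hint is optional either way.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
                m.madvise(mmap.MADV_SEQUENTIAL)
            crc = 0
            for off in range(0, len(m), chunk):
                # Slicing copies the chunk into a bytes object; pages are
                # faulted in (or already read ahead) as they are touched.
                crc = zlib.crc32(m[off:off + chunk], crc)
            return crc


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(4 << 20))
    print(hex(checksum_mapped(tmp.name)))
    os.unlink(tmp.name)
```

Whether this beats a plain `read()` loop depends on the file size and the storage, so it would need measuring rather than assuming.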
-
Originally posted by NobodyXu: The man page of MAP_POPULATE says that it will block the mmap syscall.
-
What I was getting at is the priority boost that Windows gives focused windows. (ugh, that's an ugly sentence). It even has a GUI for it and is changeable at runtime, which I think is fair to say beats your "most convenient way imaginable" claim of *editing the command line in grub* by several miles, to put it mildly.
From a user perspective it results in the same "fundamental change to behavior" in that it gives you a responsive browser / etc while still torrenting pr0n in the background - and it did it back in the single-core era. That responsiveness is something that desktop Linux has struggled with since before some of the commenters here were born.
Like I say, it's great that you've found something that works for you. But it's something that's not even a year old, and requires unreasonable arcana to achieve at all. What I'm trying to get at is that you should balance that enthusiasm with an understanding that there's more than one way to skin a cat, and that there are people in the world who are not only Not You but also want, need, and deserve to have computers that "just work". Not because they're stupid, but because their lives orbit around something other than IT. It's on us to make that happen, not them, and we should be doing a better job of it than we are.
-
Originally posted by coder: It'd be interesting to know what this does with files approximately the same size as the machine's RAM or larger. In a pathological case of a program that can't use the data as fast as it's read, you could have the old blocks of the file getting evicted before they could be used, leading to nearly 2x the I/O.
So I guess this will at the very least do no harm, and it might reduce page faults and IO.
-
Originally posted by birdie:
1. The vast majority of people nowadays are on SSDs. 2. Microsoft has recently started to require SSDs for laptops sold with Windows 11. 3. For the second time now: people normally start their computers just once a day, barely anyone cares about boot speed.
Windows 10 from 2017 onward and APFS are two examples of this.
2. That's a result of the aforementioned point.
3. Not true in areas where power outages are common and laptops aren't.
Originally posted by birdie: BUT YEAH LINUX IS SO MUCH BETTER THAN WINDOWS IN TERMS OF BOOT SPEED EXCEPT NO ONE HERE HAS TESTED IT.
Originally posted by birdie: God damn it. I really really really hate when people try hard to prove that something is bad but they conveniently forget to provide the data that the opposite is actually good.
-
Originally posted by NobodyXu: It seems that using `MAP_POPULATE` is actually a good idea for de/compression since it tries to populate the entire file if possible, which is definitely more efficient than faulting.
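For reference, a minimal sketch of what mapping a file with `MAP_POPULATE` looks like, written in Python to keep it short. It assumes Linux and Python 3.10 or newer, where the `mmap` module exposes the flag; elsewhere the sketch falls back to an ordinary lazily-faulted mapping. `map_prefaulted` is a hypothetical helper name, not something from the thread.

```python
import mmap
import os
import tempfile

# MAP_POPULATE is Linux-only and exposed by Python 3.10+. Fall back to 0
# (no pre-faulting) elsewhere so the sketch still runs.
MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def map_prefaulted(path: str) -> mmap.mmap:
    """Map a whole non-empty file read-only, asking the kernel to pre-fault it."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Length 0 means "map the entire file". With MAP_POPULATE the call
        # does the page-in work up front instead of on first access.
        return mmap.mmap(fd, 0, flags=mmap.MAP_PRIVATE | MAP_POPULATE,
                         prot=mmap.PROT_READ)
    finally:
        os.close(fd)  # the mmap object keeps its own duplicate descriptor


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(1 << 20))
    with map_prefaulted(tmp.name) as m:
        print(len(m), "bytes mapped")
    os.unlink(tmp.name)
```

The trade-off discussed in this thread still applies: the mapping call itself takes longer, in exchange for fewer page faults later.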
-
Originally posted by Linuxxx:
Simple, on an interactive desktop/workstation preempt=full should always be the default, unless you actually enjoy your computer jerking around instead of you doing the same.
Also, you've got me curious there:
Which other OS allows changing the kernel-level preemption model without recompiling?
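As a small illustration of the boot parameter being discussed, this Python sketch checks whether a kernel command line carries a `preempt=` argument. As far as I know the parameter only takes effect on kernels built with dynamic preemption support (CONFIG_PREEMPT_DYNAMIC) and accepts `none`, `voluntary` or `full`. `preempt_mode` is a hypothetical helper, and it only reports what was requested at boot, not what the kernel is actually doing.

```python
from typing import Optional


def preempt_mode(cmdline: str) -> Optional[str]:
    """Return the value of the last preempt= argument on a command line, or None."""
    mode = None
    for arg in cmdline.split():
        if arg.startswith("preempt="):
            # A later occurrence overrides an earlier one in this sketch.
            mode = arg.split("=", 1)[1]
    return mode


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("preempt =", preempt_mode(f.read()))
    except OSError:
        print("no /proc/cmdline on this system")
```

If it prints `None`, the kernel is running with whatever preemption model it was built or defaulted with.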