Axboe Achieves 8M IOPS Per-Core With Newest Linux Optimization Patches


  • #21
    Originally posted by blackshard View Post
    I mean: I could take a 5900X and do 8M IOPS. Then I overclock the 5900X to some higher, stellar frequency and do 9M IOPS, and so I reach a new record; but what does it matter? The API/algorithm underneath isn't any better; it's just throwing out a bigger, useless number.
    In this case the API/algorithm is better, so your example is plain stupid. The link below describes why IOPS is very important:

    Solution 1: This is because sequential throughput is not how most I/O activity occurs. Random read/write operations are more representative of normal system activity.
    Last edited by Volta; 17 October 2021, 11:54 AM.



    • #22
      Originally posted by sdack View Post
      These gains are great, but this is just a precursor to a more fundamental shift in design that is coming. Samsung is already developing memory chips that combine DDR with FLASH technologies, and other manufacturers will follow with their own innovations. It is now only a matter of time until main memory becomes persistent, software no longer has to load and store data explicitly on storage devices, and all data becomes available at all times.

      I know some game designers are already desperately waiting for such a change, where data no longer has to be streamed from a drive into main memory and a game world no longer has to be cut up into sections just to fit into main memory.

      Of course, some people will hold on to the classic design because of their worries and "old school" thinking, but when people's workflow changes and it is no longer a "load, work, save" process, when people can jump straight to the "work" part, it will cause a shift in designs. Old schoolers will still want to load and save their documents, and count the files on a drive as if they were eggs in a basket.
      Cost, latency, and cache design would like to interject some words to the contrary...
      Persistent memory is unlikely on the desktop anytime soon. For certain types of specialized data processors, maybe.

      Also, a world where a memory leak can fill a drive with junk?

      And if you lose power (and hence the program counter and register values), how do you find the data you want later? (Answer: you still need a file system, key-value map, or similar structure to organize things into discrete, name-referenced bags of bits, i.e. files.) Game programmers aren't going to start programming to direct memory addresses again. At best it would be a post-install optimization pass.

      Also, even then, the virtual memory the program sees is a lie. Virtual memory is mapped to physical memory in chunks which are dynamically allocated.

      So no, this flat, static view of memory is the old school, and there are good reasons nobody wants to do it for general-purpose computing.

      What they want instead is to avoid unnecessary data copying: instead of disk -> memory -> GPU, they want to do disk -> GPU. Doing this DMA off persistent memory is simpler, but it in no way reduces the conceptual need to differentiate between working memory and storage.
      Last edited by WorBlux; 17 October 2021, 11:55 AM.



      • #23
        Originally posted by blackshard View Post
        I understand, but it is not the idea of optimizing the API that I'm criticizing, but the numbers!
        As long as there is not a serious benchmark with consistent variables, all those numbers (7M, 7.4M, 8M IOPS...) are just trash...
        I mean: I could take a 5900X and do 8M IOPS. Then I overclock the 5900X to some higher, stellar frequency and do 9M IOPS, and so I reach a new record; but what does it matter? The API/algorithm underneath isn't any better; it's just throwing out a bigger, useless number.
        You want to be careful with your choice of words and not shit on this effort just because you do not get what you are looking for.

        We started with switches and punch cards, then magnetic tape, until we arrived at spinning disks with a magnetic coating. All of these have a downside compared to transistor- and capacitor-based storage: they are very slow to access. Even though one can transfer hundreds of megabytes per second from a single spinning disk these days, it still takes milliseconds before a head is positioned over the right track. This makes traditional storage devices very sensitive to random access compared to sequential access.

        The block layer of the Linux kernel, like that of practically all operating systems, is designed with this bottleneck in mind. It never mattered much how fast the first byte gets accessed as long as the overall throughput stayed high, because the huge access times made optimizing for it pointless. Access latency sometimes even gets traded away for higher throughput on purpose.

        Now things are changing, and we have storage systems that tear down this access-time bottleneck. Storage devices are no longer moving mechanical contraptions, but are manufactured with lithography processes similar to those used for main memory, and we see the technologies overlap and merge. Operating systems need to catch up and adjust. This is what is happening here, and it is only the beginning.

        So calling these numbers "trash" and "useless", just because you are looking for traditional benchmark numbers you can compare to a hard drive or SSD, is narrow-minded and insulting. What matters is that the block layer is opening up to allow very fast access speeds.

        Being able to do 8 million I/O operations per second means one can transfer 8 million random blocks of, say, 512 bytes into main memory at effectively 4GB/sec, while main memory itself, the designated "random access memory", has a peak rate of only about 25GB/sec (i.e. DDR4-3200). And we are using software (OS, block layer, file system) to perform this transfer. It should make you think and let you appreciate the work, not stoop to insults.
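
        Spelled out, the back-of-the-envelope arithmetic: 8,000,000 IOPS x 512 bytes comes to about 4.1GB/sec of payload, while a single DDR4-3200 channel peaks at 3200 MT/s x 8 bytes = 25.6GB/sec. These random 512-byte transfers alone therefore consume roughly a sixth of one memory channel's theoretical bandwidth.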



        • #24
          Originally posted by WorBlux View Post
          Cost, latency, and cache design would like to interject some words to the contrary...
          Those are again just the worries of the old schoolers. We once had a time when every PC needed to have MS Office installed, and it then had to preload itself at boot-up just so people could work on their documents faster. Now Office is a service on the Internet. You do not even know where your "files" are being saved, if they are saved at all, or whether they are actually files or perhaps records in a database. Much of the "old school" thinking is dying, and we are seeing it on mobile phones, too, where many apps do not really quit but persist in the background.

          And since you have mentioned it: yes, we now have three levels of cache, and yet you want to hold on to the separation of storage from main memory as if it were somehow vitally important. It is not important to users, who just want to get their work done, ideally without having to worry about where and how something gets stored.

          So you may feel triggered by the idea of the "old school" models falling away, because of your worries and perhaps an inability to imagine new ways, but others will imagine them, and it is happening already.
          Last edited by sdack; 17 October 2021, 12:46 PM.



          • #25
            Originally posted by George99 View Post
            Not only the same guy but also the same hardware. Otherwise it would be pointless.
            Actually not... he upgraded his hardware not long ago, and that was part of getting to millions of IOPS. That said, his work would probably make the original system faster as well. He upgraded, I think, because his CPU couldn't keep up with Optane, etc.



            • #26
              Originally posted by MastaG View Post
              But could this also have a positive effect for a regular desktop user running a web browser and playing some games on Steam, for example?
              Let's be honest: no, probably not.

              If some of these optimizations aren't specific to io_uring, then the potential exists. However, what he's essentially doing is shaving already-tiny overheads here and there, and the only way you're going to see the effect is when doing some extremely IOPS-intensive work. The I/O that games do is going to be optimized to be more sequential and less IOPS-intensive, specifically so they run well on machines with far lower-performance SSDs and without oodles of RAM for lots of caching.

              And web browsers mostly do synchronous I/O when updating their cache, history, and the persistent state of web apps. The main hit I think you see from them is the overhead of updating indexing structures, be they at the filesystem metadata level or within file-level databases.



              • #27
                Originally posted by Yttrium View Post
                It needs to be said that IO is inherently DISK LIMITED.
                No, not if the data is in cache.

                For his tests, he's likely using O_DIRECT to force I/O to bypass the page cache. That makes the benchmark relevant for accessing databases too big to fit in memory. So, if that's what you're doing, then the limiting factor for non-exotic storage devices is going to be the storage device itself.

                However, it's certainly possible for someone to use io_uring on slower storage, with an access pattern that exhibits a high cache hit-rate. That's a case we could actually see in things like Samba, which was one of the first to trial an io_uring backend.
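
                To make that concrete, here's a minimal sketch of an O_DIRECT random-read loop with liburing. The device path, queue depth, and block size are illustrative placeholders, not details of his setup, and his record runs use additional features this sketch leaves out (registered buffers and files, polled I/O); error handling is kept to a minimum.

                #define _GNU_SOURCE          /* for O_DIRECT */
                #include <fcntl.h>
                #include <liburing.h>
                #include <stdlib.h>
                #include <unistd.h>

                #define QD 32                /* queue depth (illustrative) */
                #define BS 512               /* block size in bytes */

                int main(void)
                {
                    struct io_uring ring;
                    void *buf;

                    /* Hypothetical device node; O_DIRECT bypasses the page cache. */
                    int fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
                    if (fd < 0 || io_uring_queue_init(QD, &ring, 0) < 0)
                        return 1;

                    /* O_DIRECT needs an aligned buffer; 4 KiB alignment is a safe choice.
                       A real benchmark would use one buffer per in-flight request. */
                    if (posix_memalign(&buf, 4096, BS))
                        return 1;

                    /* Queue QD random 512-byte reads at block-aligned offsets... */
                    for (int i = 0; i < QD; i++) {
                        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
                        io_uring_prep_read(sqe, fd, buf, BS,
                                           (__u64)(rand() % (1 << 20)) * BS);
                    }
                    io_uring_submit(&ring);

                    /* ...then reap the completions. */
                    for (int i = 0; i < QD; i++) {
                        struct io_uring_cqe *cqe;
                        io_uring_wait_cqe(&ring, &cqe);
                        io_uring_cqe_seen(&ring, cqe);
                    }

                    io_uring_queue_exit(&ring);
                    close(fd);
                    return 0;
                }

                Build with something like: gcc rand_read.c -o rand_read -luring (assuming liburing is installed). Point it at slower storage with a cache-friendly access pattern and you get the Samba-like case described above instead of a device-limited one.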



                • #28
                  Originally posted by sdack View Post
                  Those are again just the worries of the old schoolers.
                  These are the worries of the people who provide you with the services you want to use. These are the worries of OS developers; of database developers; of browser developers; of gateway developers; of VM developers; of nearly every developer who wants his or her code to be fast and useful.

                  Originally posted by sdack View Post
                  We once had a time when every PC needed to have MS Office installed, and it then had to preload itself at boot-up just so people could work on their documents faster. Now Office is a service on the Internet. You do not even know where your "files" are being saved, if they are saved at all, or whether they are actually files or perhaps records in a database.
                  And for this to work correctly, you are using software written by people who (at least) tried to do their best. If you think again about what you just said, you will see that you implied that the following developers worked to make you happy: OS developers, browser developers, web site developers, web server developers, database developers. That's a ton of "old schoolers" who worried about cost, latency, and cache design.

                  Originally posted by sdack View Post
                  And since you have mentioned it: yes, we now have three levels of cache, and yet you want to hold on to the separation of storage from main memory as if it were somehow vitally important. It is not important to users, who just want to get their work done, ideally without having to worry about where and how something gets stored.
                  It *is* important to end users; they just don't know it.

                  Originally posted by sdack View Post
                  So you may feel triggered by the idea of the "old school" models falling away, because of your worries and perhaps an inability to imagine new ways, but others will imagine them, and it is happening already.
                  Good for them. I don't know what might have triggered the person you're answering; what triggers me is that in a world where there is no question that our climate is changing due to human activity, where there is no question that datacenters are energy-hungry, and where computing facilities are prevalent and vital to a large part of the population, there are still people who think that trying to optimize software is at best a waste of time and at worst some kind of offense to people like you.

                  Reading your prose might even make me think that I'm glad I'm an "old schooler".



                  • #29
                    Originally posted by sdack View Post
                    These gains are great, but this is just a precursor to a more fundamental shift in design that is coming. Samsung is already developing memory chips that combine DDR with FLASH technologies, and other manufacturers will follow with their own innovations. It is now only a matter of time until main memory becomes persistent, software no longer has to load and store data explicitly on storage devices, and all data becomes available at all times.
                    Aside from the points others have made, write endurance remains a key roadblock to this vision. So far, Optane has far better write endurance than NAND flash, and it's still nowhere near the level of DRAM.

                    Originally posted by sdack View Post
                    I know some game designers are already desperately waiting for such a change, where data no longer has to be streamed from a drive into main memory and a game world no longer has to be cut up into sections just to fit into main memory.
                    Also, programming models around persistent memory need to be different. If your game crashes today, you can always just restart it and pick up from the last checkpoint. If all your game state is persistent, then every update to in-memory state needs to be done in a transactional fashion, and that's going to add some overhead all its own.

                    I wonder how deeply these game designers have contemplated these issues and their implications.
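
                    To illustrate what "transactional" means here, below is a hedged sketch of one common approach: write the new state into a spare slot, flush it to persistence, and only then flip a small "valid" index. The struct names and the flush helper are made up for the example, and it assumes an aligned 64-bit store followed by a flush is failure-atomic, which is the usual guarantee on persistent-memory hardware.

                    #include <stdatomic.h>
                    #include <stddef.h>
                    #include <stdint.h>
                    #include <string.h>

                    /* Illustrative game state; in reality this would be the engine's data. */
                    struct game_state { int level; int hp; };

                    struct pmem_root {
                        struct game_state slot[2];   /* two copies of the state */
                        _Atomic uint64_t valid;      /* index of the last committed slot */
                    };

                    /* Placeholder flush: a real build would use clwb+sfence, msync(), or
                       pmem_persist() here, depending on the platform. */
                    static void flush_to_persistence(const void *addr, size_t len)
                    {
                        (void)addr; (void)len;
                        atomic_thread_fence(memory_order_seq_cst);
                    }

                    static void commit_state(struct pmem_root *root, const struct game_state *next)
                    {
                        uint64_t cur = atomic_load(&root->valid);
                        uint64_t spare = 1 - cur;

                        /* 1. Write the new state into the spare slot and make it durable. */
                        memcpy(&root->slot[spare], next, sizeof *next);
                        flush_to_persistence(&root->slot[spare], sizeof root->slot[spare]);

                        /* 2. Only then flip the valid index. A crash before this point leaves
                              the old state intact; a crash after it leaves the new one. */
                        atomic_store(&root->valid, spare);
                        flush_to_persistence(&root->valid, sizeof root->valid);
                    }

                    int main(void)
                    {
                        static struct pmem_root root;   /* stand-in for a mapped pmem region */
                        struct game_state next = { .level = 2, .hp = 87 };
                        commit_state(&root, &next);
                        return 0;
                    }

                    The extra copy and the two flushes on every update are exactly the kind of overhead referred to above.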



                    • #30
                      Originally posted by Emmanuel Deloget View Post
                      Reading your prose might even make me think that I'm glad I'm an "old schooler".
                      You are just not seeing the forest for all the trees in it. UNIX/Linux systems have always dominated the server market because of their persistence. No other OS could deliver the reliability, and thus the uptimes, that UNIX/Linux could. Even though all data in main memory is lost on power-off, we have used every trick to achieve persistence, to keep servers up and running for as long as possible, and to provide people with near-100% reliability and 24/7 operation. Now the industry is developing memory systems that hold their data until it is overwritten. You think of it as a problem because of the way we currently do things. It is not. Persistence has always been the goal, and the hardware and software are adjusting to it.
                      Last edited by sdack; 17 October 2021, 01:38 PM.

