Axboe Achieves 8M IOPS Per-Core With Newest Linux Optimization Patches


  • coder
    replied
    Originally posted by WorBlux View Post
    And background processes? Please, multitasking has been a thing in computing for a long time. Every *nix does it, as well as every version of Windows. The early design of Android and iOS was an intentional design to save power.
    sdack is referring to the way they're unloaded by the OS and then resumed on demand. That's not something every *nix or any non-recent version of Windows did. And I don't consider it equivalent to swapping, because it's more sophisticated than that.

    Anyway, the whole debate is happening at a silly level of abstraction. What would make it productive is if we had a specific technology or API with specific performance and functional tradeoffs vs. conventional methods. Without anything concrete, you can debate design philosophy and usage models interminably.



  • coder
    replied
    Originally posted by sdack View Post
    we have tried, with UNIX/Linux at the centre of it, with its design of making everything into a file and using a virtual memory model rather than a plain physical one, sometimes to the point of agony (if you have ever used a diskless SUN workstation swapping over the network then you know what I mean), just to be free of the limitations set by main memory.
    Virtual memory doesn't serve only one purpose. It also underpins the security model. It happens to be a convenient way to do memory-mapped I/O, IPC via shared-memory, and system hibernation, as well.
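
    To make that concrete, here is a minimal sketch of two of those uses, memory-mapped file I/O and shared-memory IPC, both built on the virtual memory system; the file name "data.bin" is just an example:

    /* Hedged sketch: memory-mapped file I/O and shared-memory IPC, both
     * built on the virtual memory system. "data.bin" is a made-up path. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        /* Memory-mapped I/O: pages of the file are faulted in on demand. */
        int fd = open("data.bin", O_RDONLY);
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);
        char *file = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (file == MAP_FAILED) { perror("mmap"); return 1; }
        printf("first byte: %d\n", file[0]);

        /* Shared-memory IPC: an anonymous shared mapping that a forked
         * child sees through the same physical pages. */
        char *shared = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                            MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if (shared == MAP_FAILED) { perror("mmap"); return 1; }
        if (fork() == 0) {
            strcpy(shared, "hello from the child");
            _exit(0);
        }
        wait(NULL);
        printf("%s\n", shared);

        munmap(file, st.st_size);
        munmap(shared, 4096);
        close(fd);
        return 0;
    }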



  • sdack
    replied
    Originally posted by coder View Post
    You're comparing the speed of a ...
    No, I was not. I was mentioning the numbers to give you an idea of where we are right now.

    DDR4-3200 only reaches its 25GB/sec peak when you transfer at least 64 bytes at a time, iirc, meaning a cache line. One does not normally compare the speed of caches with the speed of RAM, but the point should be clear: you do not think about the specific sizes of all your caches, because of how they work. That is the beauty of their design. But you are forced to think about the size of your RAM, or perhaps use a swap device to avoid that. The controversy is really that we have been using this concept for so long that you cannot think of doing it any other way. And yet we have tried, with UNIX/Linux at the centre of it, with its design of making everything into a file and using a virtual memory model rather than a plain physical one, sometimes to the point of agony (if you have ever used a diskless SUN workstation swapping over the network then you know what I mean), just to be free of the limitations set by main memory. And we have been doing it on desktops, laptops and now mobiles, too, where we now often suspend the hardware rather than turn it off, because we want persistency.
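
    To put rough numbers on the comparison (peak figures, ignoring protocol and controller overhead):

        8,000,000 IOPS x 512 bytes                ≈ 4.1 GB/sec of random transfers
        DDR4-3200: 3,200 MT/s x 8 bytes/transfer  ≈ 25.6 GB/sec per channel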

    It is ok if you think nothing will change, even while you are probably telling users that their files are no longer stored on their computer or on a hard drive, but in a "cloud". But I am sure of it: we will see long-standing concepts fall, only to be replaced by better ones.
    Last edited by sdack; 17 October 2021, 02:17 PM.



  • WorBlux
    replied
    Originally posted by sdack View Post
    Those are just again the worries of the old schoolers. We once had a time where every PC needed to have MS Office installed and it then had to preload itself at boot up, just so people could work on their documents faster. Now Office is a service on the Internet. You do not even know where your "files" are being saved, if they are saved, if they are actually files or perhaps records in a database. Much of the "old school"-thinking is dying and we are seeing it on mobile phones, too, where many apps do not really quit but persist in the background.
    A file was never a physical thing. It's just a bag of bits. In order to find that bag you have to give it a name and place it somewhere, whether that is "Myfile, disk 3, inode 2034" or "Myfile, Database USA North, record 679832".

    Sure, there are ways to do that which decrease cognitive load, but they come with severe limitations and tradeoffs. The biggest of these: how do you get work done without an internet connection, and how do you make programs cooperate? And the cloud is just other people's computers, which an individual can only access under very lopsided user service agreements that generally allow the cloud provider to delete or limit your data for any reason, unless you have a few million dollars of business to offer and can negotiate service, uptime, and export guarantees into the contract.

    Sure, it's damn convenient, but it is far from a superior process.

    And background processes? Please, multitasking has been a thing in computing for a long time. Every *nix does it, as well as every version of Windows. The early design of Android and iOS was an intentional design to save power. User-installed daemons are nothing new.

    Originally posted by sdack View Post
    And since you have mentioned it, yes, we do now have three levels of cache, and yet you want to hold on to the separation of storage from main memory as if it were somehow majorly important. It is not important to users, who just want to get their work done, ideally without having to worry about where and how something gets stored.
    But performance (especially latency) is important to users, and using the caches efficiently is a big part of that (show me a persistent tech that can access and flip bits as fast as SRAM, or even DRAM). Knowing what needs to be on deck and what is likely to be accessed sparingly is huge.
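
    For a rough sense of scale, these are order-of-magnitude access latencies only; actual figures vary widely by part and workload:

        L1 cache hit            ~1 ns
        DRAM access             ~100 ns
        Optane DIMM read        ~a few hundred ns
        NAND flash (NVMe) read  ~tens to hundreds of µs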

    And the lower levels of abstraction are essential to efficient systems programming. Even if hardware allowed you to physically merge storage and memory, you would 100% keep the abstraction on a general-purpose machine.

    Also, the sort-files-by-app method just forces a sort of truncated tree (bush?) organization on the user, with files sorted by application + name or date. It's a step up from the "put everything on the desktop" method I see a lot of people using (and let's be honest, that is your typical computer user), but it's far less flexible and powerful than the full tree abstraction. Also, each application needs to be heavier in order to let the user share or copy files, and each application turns into its own full stack rather than part of a system of many small parts acting cooperatively.

    Organization by metadata (think hashtag+search) might be more useful still, but it takes significant time and effort to curate that.


    Originally posted by sdack View Post

    So you may feel triggered by the idea of the "old school"-models falling away, because of your worries and perhaps an inability to imagine new ways, but others will and it is happening already.
    Hardly. A general-purpose computer is a great tool, and 60+ years of design experiments have given us many abstractions at appropriate levels, and systems that are well understood and documented.

    And many of these "new school" changes you talk of decrease the generality of the computer, while increasing the profit margins of giant multi-national corporations. Maybe not a bad tradeoff to allow more people to more effectively leverage a limited subset of what computers can do. But please don't pretend it's an advance in computer technology when there are clear drawbacks and tradeoffs.



  • tildearrow
    replied
    Originally posted by blackshard View Post
    I'm still puzzled how this matter can be news...
    I mean, measuring the throughput of an algorithm/api/whatever is sensible only when all the variables around it stay the same.

    Here I see: Guy 1 achieves xxxx IOPS, Guy 2 achieves yyyy IOPS, Guy 3 achieves zzzz IOPS... ok but who cares?
    Everyone cares. Performance optimizations are welcome.



  • coder
    replied
    Originally posted by sdack View Post
    Now Office is a service on the Internet. You do not even know where your "files" are being saved, if they are saved, if they are actually files or perhaps records in a database.
    The app itself is still saving the data in a discrete operation. What's being persisted is not the raw, in-memory representation.

    Originally posted by sdack View Post
    Much of the "old school"-thinking is dying and we are seeing it on mobile phones, too, where many apps do not really quit but persist in the background.
    On phones, they use OS hooks to load and unload themselves. Again, it's not as if the OS is swapping out the entire process and then just swapping it back in.

    What you're seizing on is not a change in the way programs are being written, so much as changes in usage models.

    And call me oldschool if you will, but I hate not having a filesystem for my data. I don't like being walled in by apps that decide and completely control how I interact with it. That's one complaint against Apple devices that really resonates with me (though I can't speak from experience).



  • coder
    replied
    Originally posted by sdack View Post
    Being able to do 8 million I/O operations per second means one can transfer 8 million random blocks of e.g. 512 bytes into main memory at effectively 4GB/sec, while main memory, the designated "random access memory", has a peak rate of just 25GB/sec (i.e. DDR4-3200).
    You're comparing the speed of a 2-device Optane PCIe 4.0 x4 RAID-0 with a single channel of DDR4. Not a very relevant comparison, for most purposes. If you really want to compare the native performance of Optane vs. DDR4, then look at their raw performance in DIMM sockets.

    Originally posted by sdack View Post
    And we are using software (OS, block layer, file system) to perform this transfer.
    This raises an interesting question. For his specific benchmark, is the filesystem involved in every I/O? Or does the code path through the kernel side just traverse the block layer?
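
    For reference, here is a minimal sketch of issuing random reads straight against a raw block device via liburing, with no filesystem in the path. The device path, block size, and queue depth are made-up example values, not the settings of Axboe's actual benchmark, and it omits the io_uring features (e.g. polled I/O, registered buffers and files) that the record runs rely on.

    /* Hedged sketch: random O_DIRECT reads from a raw block device via
     * io_uring, using liburing. "/dev/nvme0n1", BLOCK and DEPTH are just
     * example values; running this needs root and `-luring` to link. */
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BLOCK 512   /* assumes a 512-byte logical sector size */
    #define DEPTH 32

    int main(void)
    {
        int fd = open("/dev/nvme0n1", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        struct io_uring ring;
        if (io_uring_queue_init(DEPTH, &ring, 0) < 0) return 1;

        /* One shared buffer is enough here: the data itself is discarded,
         * only the completion matters, as in an IOPS benchmark. */
        void *buf;
        if (posix_memalign(&buf, 4096, BLOCK)) return 1;

        for (int i = 0; i < DEPTH; i++) {
            struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
            off_t off = ((off_t)rand() % (1 << 21)) * BLOCK; /* within ~1 GiB */
            io_uring_prep_read(sqe, fd, buf, BLOCK, off);
        }
        io_uring_submit(&ring);                 /* one syscall, DEPTH reads */

        for (int i = 0; i < DEPTH; i++) {
            struct io_uring_cqe *cqe;
            io_uring_wait_cqe(&ring, &cqe);
            if (cqe->res < 0)
                fprintf(stderr, "read failed: %d\n", cqe->res);
            io_uring_cqe_seen(&ring, cqe);
        }

        io_uring_queue_exit(&ring);
        free(buf);
        close(fd);
        return 0;
    }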



  • sdack
    replied
    Originally posted by Emmanuel Deloget View Post
    Reading your prose might even make me think that I'm glad I'm an "old schooler".
    You are just not seeing the forest because of all the trees in it. UNIX/Linux systems have always dominated the server market because of their persistency. No other OS could deliver the reliability, and thus the uptimes, that UNIX/Linux could. Despite the fact that all data in main memory is lost when a machine goes down, we used every trick to achieve persistency, to keep the server up and running for as long as possible, and to provide people with near-100% reliability and 24/7 operations. Now the industry is developing memory systems that hold their data until it is overwritten. You think of it as a problem because of the way we currently do things. It is not. Persistency has always been the goal, and the hardware and software are adjusting to it.
    Last edited by sdack; 17 October 2021, 01:38 PM.



  • coder
    replied
    Originally posted by sdack View Post
    These gains are great, but this is just a precursor to a more fundamental shift in design that is coming. Samsung is already developing memory chips that combine DDR with FLASH technologies, and other manufacturers will follow with their own innovations. It is now only a matter of time until main memory becomes persistent, software no longer has to load and store data explicitly on storage devices, and all data becomes available at all times.
    Aside from the points others have made, write-endurance remains a key roadblock to this vision. So far, Optane has way better write endurance than NAND flash, and it's still nowhere near the level of DRAM.

    Originally posted by sdack View Post
    I know some game designers are already desperately waiting for such a change, where data no longer has to be streamed from a drive into main memory and a game world no longer has to be cut up into sections just to fit into main memory.
    Also, programming models around persistent memory need to be different. If your game crashes, you can always just restart it and pick up from the last checkpoint. If all your game state is persistent, then every update to in-memory state needs to be done in a transactional fashion, and that is going to add some overhead all its own.

    I wonder how deeply these game designers have contemplated these issues and their implications.
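
    To illustrate what "transactional" means here, a toy sketch of an undo-log update, not any particular persistent-memory library's API; pm_flush, undo_log and game_state are invented for illustration:

    /* Toy sketch: log the old value durably before overwriting it, so a
     * crash between the two steps can be rolled back on restart. All the
     * names here (pm_flush, undo_log, game_state) are made up. */
    #include <stdint.h>
    #include <string.h>

    struct undo_entry {
        void    *addr;       /* where the old bytes came from */
        size_t   len;
        uint8_t  old[64];    /* saved copy of the old contents */
        int      valid;      /* set before the update, cleared on commit */
    };

    /* Placeholder for "make these bytes durable" on real hardware
     * (cache-line writeback plus a fence, or a library call). */
    static void pm_flush(const void *addr, size_t len)
    {
        (void)addr; (void)len;
        __sync_synchronize();
    }

    static struct undo_entry undo_log;   /* would itself live in pmem */

    /* Assumes len <= sizeof(undo_log.old). */
    static void tx_store(void *dst, const void *src, size_t len)
    {
        /* 1. record the old value and make the log entry durable */
        undo_log.addr = dst;
        undo_log.len  = len;
        memcpy(undo_log.old, dst, len);
        pm_flush(&undo_log, sizeof(undo_log));
        undo_log.valid = 1;
        pm_flush(&undo_log.valid, sizeof(undo_log.valid));

        /* 2. do the in-place update and make it durable */
        memcpy(dst, src, len);
        pm_flush(dst, len);

        /* 3. commit: invalidate the log entry */
        undo_log.valid = 0;
        pm_flush(&undo_log.valid, sizeof(undo_log.valid));
    }

    /* On restart, recovery would check undo_log.valid and, if set, copy
     * undo_log.old back to undo_log.addr before the game continues. */
    struct game_state { int level; int health; };

    int main(void)
    {
        static struct game_state state = { 1, 100 };  /* imagine this in pmem */
        struct game_state next = { 2, 100 };
        tx_store(&state, &next, sizeof(state));
        return 0;
    }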



  • Emmanuel Deloget
    replied
    Originally posted by sdack View Post
    Those are just again the worries of the old schoolers.
    These are the worries of the people who provide you with the services you want to use. These are the worries of OS developers; these are the worries of database developers; these are the worries of browser developers; these are the worries of gateway developers; these are the worries of VM developers; these are the worries of nearly every developer who wants his or her code to be fast and useful.

    Originally posted by sdack View Post
    We once had a time where every PC needed to have MS Office installed and it then had to preload itself at boot up, just so people could work on their documents faster. Now Office is a service on the Internet. You do not even know where your "files" are being saved, if they are saved, if they are actually files or perhaps records in a database.
    And for this to work correctly, you are using software written by people who (at least) tried to do their best. If you think again about what you just said, you implied that the following developers worked to make you happy: OS developers, browser developers, web site developers, web server developers, database developers. That's a ton of "old schoolers" who worried about cost, latency and cache design.

    Originally posted by sdack View Post
    And since you have mentioned it, yes, we do now have three levels of cache, and yet you want to hold on to the separation of storage from main memory as if it were somehow majorly important. It is not important to users, who just want to get their work done, ideally without having to worry about where and how something gets stored.
    It *is* important to end users; they just don't know it.

    Originally posted by sdack View Post
    So you may feel triggered by the idea of the "old school"-models falling away, because of your worries and perhaps an inability to imagine new ways, but others will and it is happening already.
    Good for them. I don't know what would have triggered the person you're answering; what triggers me is that in a world where there is no question that our climate is changing due to human activity; in a world where there is no question that datacenters are greedy energy-wise; in a world where computing facilities are prevalent and vital to a large part of the population; in this very world where we all live, there are still people who think that trying to optimize software is at best a waste of time, at worst some kind of offense to people like you.

    Reading your prose might even make me think that I'm glad I'm an "old schooler".

