Originally posted by L_A_G
People who work on big datasets doing design and visualization also don't just load a big lump in at the start of the workday and then work on it for the rest of the day. They're usually too big to fit into memory all at once anyway, so the application will be swapping data between memory and the disc. Add to that the fact that they're processing this data, so they'll also be writing additional data onto the disc as part of that work. For some tasks this may even be more data than the original dataset.
Maybe, but that's not the case until you start getting into the territory of datasets so big they don't even fit on a single consumer-grade disc, i.e. when you start moving into the realm of literally multiple terabytes of data. In video production you usually keep an archive of the footage on the server and then make local copies of that as you're editing a new project. Fast discs on the server mean you can pull those even faster local copies sooner.
Once again: pointing out clear factual errors and correcting misunderstandings when they're used as arguments isn't nitpicking.
You remind me of those people who comment against a layman's explanation of physics, like "planets orbit the sun because they want to move away but gravity pulls them in" and then there's you saying "AKSHULLY, planets don't want to move anywhere as they have no ability to desire anything, and not all planets exist in our solar system".
On topic: no, a 25 GB database is not big by today's standards, and neither is the access rate you described. I'm running something fairly similar on what is essentially the lowest-end cloud instance that Google currently provides and it's almost overkill for the job.
The point is, past a certain dataset size it doesn't matter whether the table is 25 GB or 2500 GB: you're not going to fit it all into RAM, and if you handle the data appropriately you don't need to fit it all into RAM, and you don't need blazing fast access either.
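To make that concrete, here's a rough sketch of what "handle the data appropriately" looks like: iterate over the table in bounded chunks instead of pulling everything into memory. The table, column and data here are made up stand-ins; a real one would be the 25 GB (or 2500 GB) table sitting on disc.

```python
import sqlite3

# Toy stand-in: in reality this would be a huge on-disc table;
# here we build a small one so the sketch runs end to end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (value REAL)")
conn.executemany("INSERT INTO readings VALUES (?)",
                 ((float(i % 97),) for i in range(100_000)))

# The part that matters: fetch bounded chunks instead of fetchall(),
# so memory use stays flat no matter how big the table is.
cur = conn.execute("SELECT value FROM readings")
total = 0.0
count = 0
while True:
    rows = cur.fetchmany(10_000)
    if not rows:
        break
    for (value,) in rows:
        total += value
        count += 1

print(f"mean of {count:,} rows: {total / count:.3f}")
conn.close()
```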
What? Is basic arithmetic too much to ask now?
Hypothetical examples? All of the ones I pointed out are very much real. The fact that whatever office you support doesn't do that kind of work doesn't mean that there aren't a lot of those kinds of use cases. The 250GB dataset that I mentioned? That's a LIDAR scan of the Helsinki area tram network commissioned by the city planning department and has been used for planning out the maintenance, improvement and expansion of the Helsinki tram network.
So yeah, I still say you're speaking hypothetically.
The main point of btrfs is anything but compression. Its forte is stuff like file integrity, avoidance of fragmentation and auto-defragmentation, dynamic volume re-sizing and online load balancing. Its compression is primarily native support for things like zlib, which has been pretty much ubiquitous in games for the last 15 years. The devkits of the Playstation 3, Xbox 360 and Wii all had it included as a standard library from the beginning.
To put it as simply as I can: your primary example is a game from before the use of zlib became standard, and your solution is to use what's already an industry standard.
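For illustration, this is roughly what that kind of zlib packing looks like in an asset pipeline; the buffer here is a toy stand-in for a real asset blob (part structured data, part already-incompressible data), not anything from an actual engine.

```python
import os
import zlib

# Toy stand-in for an asset blob: half repetitive structured data,
# half already-random data, since real game assets are a mix of both.
raw = (b"VERTEX" * 100_000) + os.urandom(200_000)

# Done once, at build time.
packed = zlib.compress(raw, level=9)
print(f"{len(raw)} -> {len(packed)} bytes "
      f"({100 * len(packed) / len(raw):.1f}% of original)")

# At load time the engine just inflates the blob back into memory.
assert zlib.decompress(packed) == raw
```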
A. The Playstation 5 and Xbox Series consoles all have a fast PCIe SSD as standard, so you can almost guarantee that people have one.
B. It's something Sony's early PC ports of Playstation 5 games have already begun to do. Housemarque's Returnal has 32 GB of RAM as a requirement.
C. Zlib has been an industry standard for over 15 years so the kind of gains you're thinking of don't exist beyond stuff like generated assets.
Oh and before you start going on about the potential of generated assets and stuff like this 177k on-disc demo, I'll have to point out that they always spend more time generating those assets than it takes to read them off a half-decent disc.
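If you want to check that trade-off yourself, here's a toy benchmark along those lines. The "procedural generation" is just a seeded PRNG filling a buffer byte by byte as a stand-in for real per-element work, and the file name is made up, so treat the numbers as an illustration of the shape of the comparison, not a measurement of any actual engine.

```python
import os
import random
import time

SIZE = 8 * 1024 * 1024  # 8 MiB toy "asset"

# Pre-bake the asset once, as a build step would, and write it to disc.
with open("baked_asset.bin", "wb") as f:
    f.write(os.urandom(SIZE))

# Option 1: just read the pre-baked asset off the disc.
t0 = time.perf_counter()
with open("baked_asset.bin", "rb") as f:
    loaded = f.read()
t_read = time.perf_counter() - t0

# Option 2: regenerate it at load time from a tiny seed, byte by byte,
# standing in for real per-element procedural computation.
t0 = time.perf_counter()
rng = random.Random(42)
generated = bytes(rng.getrandbits(8) for _ in range(SIZE))
t_gen = time.perf_counter() - t0

print(f"read from disc: {t_read:.3f} s, regenerate: {t_gen:.3f} s")
os.remove("baked_asset.bin")
```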
B. See my last comment. Regardless of that, using zlib isn't going to be as effective as purpose-built compressors. There are many ways to compress textures and audio, and it seems to me devs could be taking advantage of this.
C. Who said anything about industry standards? No wonder you think there's no room for more compression...
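As a crude illustration of why format-aware compression beats throwing zlib at raw bytes, here's a sketch that delta-codes a synthetic 16-bit audio-like signal before deflating it. The signal is made up, but the small residuals compress far better than the raw samples; real audio and texture codecs go much further than this.

```python
import math
import zlib
from array import array

# Synthetic 16-bit mono "audio": a slow sine sweep, 200k samples.
samples = [int(12000 * math.sin(0.01 * i + 1e-7 * i * i))
           for i in range(200_000)]
raw = array("h", samples).tobytes()

# Naive: compress the raw PCM bytes directly.
plain = zlib.compress(raw, 9)

# Format-aware: delta-code neighbouring samples first (crudely, what
# audio codecs do), then compress the much smaller residuals.
deltas = [samples[0]] + [samples[i] - samples[i - 1]
                         for i in range(1, len(samples))]
smart = zlib.compress(array("h", deltas).tobytes(), 9)

print(f"raw: {len(raw)} B, zlib: {len(plain)} B, delta + zlib: {len(smart)} B")
```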
EDIT:
Didn't even think about the last thing.