EXT4 Getting Faster Case-Insensitive Performance

oiaohm replied

01 July 2019, 12:19 AM
Originally posted by xfcemint View Post

Of course, and we won't ever need any new applications. Those existing applications are all we need for the next thousand years. It's just Wine that needs this feature, when that's fixed it's all done.

Wine needs both features: case-sensitive and case-insensitive.

https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1110097-ext4-getting-faster-case-insensitive-performance?p=1110518#post1110518

Phoronix: EXT4 Getting Faster Case-Insensitive Performance The Linux 5.2 kernel brings optional per-directory case-insensitive filenames/folders while with the Linux 5.3 kernel that new EXT4 feature will see better performance... http://www.phoronix.com/scan.php?page=news_item&px=EXT4-Case-Insensitive-Faster

Read my prior post. There are Windows applications out there that expect case sensitivity. That can be kind of important when you are putting things like letter-encoded checksums in filenames.
anarki2 replied

30 June 2019, 06:34 PM
Originally posted by carewolf View Post

How would losing the difference between the two be good?

How is losing the ability to do stupid things good? Well, it's good, because it's less stupid? How else can I explain this?

Originally posted by carewolf View Post

and if you are actually storing the case, then IT IS CASE SENSITIVE, what you have are just different APIs acting in different and confusing ways, a collision API acting contrary to the naming API.

You're confusing case-sensitive with case-preserving, HTH.
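To make the distinction concrete, here is a minimal sketch of a case-insensitive but case-preserving directory. It is a toy in-memory model, not how ext4 or any real filesystem implements it; the class name `CasePreservingDir` and its methods are hypothetical, and Python's `str.casefold()` stands in for whatever folding rules a real filesystem would use.

```python
class CasePreservingDir:
    """Toy directory: lookups ignore case, stored names keep their original case."""

    def __init__(self):
        # Maps the folded name to (name exactly as created, file contents).
        self._entries = {}

    def create(self, name, data):
        key = name.casefold()
        if key in self._entries:
            # A name differing only by case counts as the same file.
            raise FileExistsError(self._entries[key][0])
        self._entries[key] = (name, data)

    def read(self, name):
        # Any spelling of the name finds the entry.
        return self._entries[name.casefold()][1]

    def listdir(self):
        # Listings show the case the creator originally typed.
        return [stored for stored, _ in self._entries.values()]
```

With this model, creating `FoO.TXt` and then reading `foo.txt` returns the same data, while a listing still shows `FoO.TXt`. A case-sensitive directory would instead key on the name unchanged and treat the two spellings as different files.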

Originally posted by carewolf View Post

Btw. Have you ever had a database or emails on your computer? You will notice the files generated are often saved with a hash-name with random characters... It is very nice that those different files with random names stay separate files and are not randomly treated as the same file by a broken file system.

Oh I'm running databases, both on Linux and Windows, as I'm a sysadmin, and I'm sorry but I'm not following you. How will a database have any problems with a case insensitive filesystem? No database on Earth will generate case sensitive random files on the system. You're basically saying PostgreSQL, Oracle, MySQL et al are all broken on Windows. No one would make a random file name generator that uses both lower and uppercase patterns at the same time.

You know why? Because anyone doing so would be considered a complete idiot. It's called common sense, it's really not that hard. Same goes for hashes. A hash is either uppercase or lowercase, there's no such thing as mixed case hash, because it'd be DAMN STUPID.

If you made two files, foo.txt and FoO.TXt, and you said oh yeah, the info is in foo.txt! Literally everyone would ask, okay, BUT WHICH ONE? That's exactly why you never do that. Because it makes no sense, is counter-intuitive, redundant, annoying, and confusing, amongst other things.

Name just ONE person on Earth who actually likes to type 'foo' in the terminal, press tab tab, then realize oh sh*t it's 'Foobar', not 'foobar', delete all the stuff she already typed, then start over, then doesn't let go of Shift in time, so now it's 'FOo', so let's do it AGAIN, and eventually manage to do it with proper case. It's a time-consuming useless piece of cr@p, nothing else.

Originally posted by carewolf View Post

Also really nice that you don't need 22Mbyte ICU database

You're extrapolating from ext4's implementation, but that doesn't make case sensitivity any less of an idiotic and useless thing in any case (pun intended).

Originally posted by carewolf View Post

that has a tendency to have security holes (macOS finds a new one in the unicode case-handling crap every other year) just to do simple file operations

ICU is used by many OS components (on Linux too) so it needs to be patched and kept up-to-date anyway. At least now it'll be more tested and polished. The "it may contain secholes" argument can be applied to anything beyond the complexity of HelloWorld(), so it's not an argument per se.

You're trying really hard to explain how it may lead to _problems_ to introduce case insensitivity to Linux (none of them were real problems, as I already explained), but you're dodging my actual original question: how is case sensitivity useful?
starshipeleven replied

30 June 2019, 05:02 PM
Originally posted by xfcemint View Post

It can be done by a library at the program side, but usually it is not fine.

Yes it is. Why would you need to write files with randomized case from your own application? The only use case is to go find user-created files.

No, you don't.

Yes you do. What the fuck is your program doing that it needs to disregard the case of letters?

Correct analogy. A word using a different case can sometimes have a different meaning.

Examples? Besides person, place or organization names that are just names and have no "meaning" per-se?

It is arbitrary in the sense that there is no one holding a gun and threatening. There is no law or whatever to force anybody to do it one way or the other.

It is arbitrary in the sense that you are free to decide the standards.

In this specific case I don't see why "case-insensitive" is so much fucking better at all. It's better in some specific cases at best.

It doesn't matter whether file names are "designations", or whatever. What matters is that people naturally sort all words, including names, titles and "designations", by usual lexicographic ordering. It is better that way because most people can easily remember letters, but cannot remember the case of each letter. People expect and it is easier for them if all words, including filenames, beginning with letter "a" appear next to each other, no matter the case.

I'm not sure how this even matters for a machine.
If you play the "but muh humans" card you also need to acknowledge that humans won't generate SO MUCH stuff that a modern machine can choke on unless in very specific circumstances.

I already said that all noteworthy user-facing applications can do case-insensitive searching already, and unless you want to work on multiple terabytes of small text files (which is kind of a niche use) you aren't going to see any performance issue by just doing that in userspace through a library.
oiaohm replied

30 June 2019, 01:16 PM
Originally posted by starshipeleven View Post

I have always placed acronyms and personal names in the same category, they are designations to uniquely identify a specific thing, not a word with a meaning in and of itself.

A lot of people think the way you do, that names don't have defined language meanings. Unfortunately, in particular fields of study they do have meanings.
https://en.wikipedia.org/wiki/Peter_(given_name)
I cannot remember the few names where the name and the English word don't have matching meanings. Different personal names do have a religious/historical meaning; this is why they appear in dictionaries for religious and historical study.

Not all personal names have meanings, but the ones that do normally have some meaning as a trait the parents of that child wished their child would have. This can be important with historical nicknames.

So if you are keeping notes in one of those fields with names and you have overlap, having case insensitive could be a disaster. So whether you want case-insensitive or case-sensitive lookup depends on your use case. But one thing is universal: no matter the lookup, everyone wants case preserving.
starshipeleven replied

30 June 2019, 12:26 PM
Originally posted by oiaohm View Post

The words commonly left out of dictionaries you have to remember like names and acronyms. Pat for a person name and pat for the action. Please note some of your old historic dictionaries have a names section. There are a large number of words with a different meaning with upper case and lower case as well.

I have always placed acronyms and personal names in the same category, they are designations to uniquely identify a specific thing, not a word with a meaning in and of itself.

So they are the same as file names, unique identifiers, you choose the rules for them.
microcode replied

30 June 2019, 11:43 AM
Originally posted by xfcemint View Post

It would be ridiculous to do that in every application.

Since I claimed that such a functionality is impossible to do without assistance from the driver, I guess your comment proves me wrong. When I was considering the possibilities of implementing case-insensitive file open, I failed to consider the inefficient and cumbersome method you are suggesting here. Ok, you are right, I was wrong.

Yeah, fair enough. Wine does this and it is as slow as you think it is. My main point was that it is feasible (and sometimes practical), not that you should want to need to do it.
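For illustration, one way a userspace case-insensitive open can be done on a case-sensitive filesystem is to scan the directory and compare folded names. This is a rough sketch under that assumption, not necessarily the method discussed earlier in the thread or the one Wine actually uses; the function name `open_case_insensitive` is hypothetical.

```python
import os


def open_case_insensitive(directory, name, mode="r"):
    """Open `name` inside `directory`, ignoring case, without filesystem help."""
    exact = os.path.join(directory, name)
    if os.path.exists(exact):
        # Fast path: the caller got the case right.
        return open(exact, mode)
    wanted = name.casefold()
    # Slow path: every miss costs a scan of the whole directory.
    for entry in os.listdir(directory):
        if entry.casefold() == wanted:
            return open(os.path.join(directory, entry), mode)
    raise FileNotFoundError(exact)
```

The slow path is where the cost comes from: each lookup that misses the exact name reads every entry in the directory, and a multi-component path repeats that for each component. It is also racy, since the directory can change between the scan and the open.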

Last edited by microcode; 30 June 2019, 11:46 AM.
oiaohm replied

30 June 2019, 11:22 AM
Originally posted by starshipeleven View Post

False analogy. Dictionaries explain the meaning of words. A word with uppercase or lowercase letters has the same meaning (usually)

The words commonly left out of dictionaries you have to remember like names and acronyms. Pat for a person name and pat for the action. Please note some of your old historic dictionaries have a names section. There are a large number of words with a different meaning with upper case and lower case as well.

There are cases where being case sensitive gets critical even when working with English. I will grant that in the majority of cases, for English and languages like it, case preserving with case insensitive works, but there are cases where it does not. You have to remember that early computer operating systems did not have lower case; you typed everything in upper case, which is why the ASCII standard gives uppercase lower values than lowercase. Lower case was added because a single case turned out not to work in every data processing case with English back then. English has not magically improved in this regard.
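The ASCII ordering is easy to see with a small sketch. The file names below are made up for illustration; the code compares plain code-point order, where every uppercase letter sorts before every lowercase one, with a case-folded sort.

```python
# Hypothetical file names, chosen only to show the ordering difference.
names = ["banana", "Apple", "apple", "Banana"]

# Plain ordering compares code points: 'A' is 65 and 'a' is 97,
# so all uppercase-initial names come first.
byte_order = sorted(names)

# Folding case in the sort key groups names by letter regardless of case.
# Python's sort is stable, so names that fold equal keep their input order.
folded_order = sorted(names, key=str.casefold)

print(byte_order)    # ['Apple', 'Banana', 'apple', 'banana']
print(folded_order)  # ['Apple', 'apple', 'banana', 'Banana']
```

Note that the folded sort only changes the presentation. The names themselves are untouched, which is the case-preserving part.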

Both case sensitive and case insensitive are horrible in different use cases. In fact equally horrible. So you need to support both.

Those who don't learn from history are doomed to repeat it a lot. Those demanding everything be case insensitive are lining up to repeat some of the mistakes from when computer output only had uppercase, which led to the introduction of lowercase.
oiaohm replied

30 June 2019, 11:07 AM
Originally posted by starshipeleven View Post

This is a filesystem feature that is needed by one of the possible usecases. Like Samba shares, or Wine application folders.

Just to be horrible there are applications that need case sensitive for Samba and Windows drives.

How to enable NTFS support to treat folders as case sensitive on Windows 10

https://www.windowscentral.com/how-enable-ntfs-treat-folders-case-sensitive-windows-10

Windows 10 now handles case sensitive files, just like Linux, and here are the steps to enable the features.

Yes, that one is Windows 10.
https://docs.microsoft.com/en-us/pre...783185(v=ws.10)
But you have use cases back on Windows 2003 and other places where you also need to do case sensitive to allow Wine to emulate NFS storage to the application.

Windows is not always case insensitive. Also, it is possible to nuke the casefold data out of an NTFS drive. There is a table written into an NTFS partition containing the drive's casefold map. With no casefold map, NTFS is now case sensitive drive-wide. If you do that to a Windows install, watch lots of third-party applications explode. But items like MS Office and other programs from Microsoft keep working perfectly because this is a test case they have to perform in development.

Windows client (Samba) and application (Wine) support is one of the use cases that creates the requirement to support both case insensitive and case sensitive at the file system level. Yes, per folder is required.
starshipeleven replied

30 June 2019, 10:01 AM
Originally posted by xfcemint View Post

Abstractions are generally regarded as desirable stuff, for managing complexity.

Abstraction can be good or bad depending on where it is employed. Too much abstraction is bad as it limits flexibility or performance.

Case-insensitivity is more about simplifying use and avoiding errors. It's not about reducing complexity, it is about doing what is implicitly expected.

This can be done by a library in the program side too and it's going to be fine.

If you need case-insensitive filesystems to "avoid errors" you have bigger problems.

Like: there is no dictionary, ever produced in the entire history, which puts all lower-case terms before upper case terms (thereby splitting the dictionary in two parts, the first one containing only words with initial lower case).

False analogy. Dictionaries explain the meaning of words. A word with uppercase or lowercase letters has the same meaning (usually).

File names are just designations, pointers to some addresses on a memory device. How you define their standard is arbitrary.
starshipeleven replied

30 June 2019, 09:53 AM
Originally posted by vegabook View Post

I have a horrible feeling that Linux is merging with Windows. It's coming from both directions. Case sensitivity was a badge of rigour. It speaks to the underlying ascii code. It's a different damn character. 'a' != 'A' just as 97 != 65. But now we have some awful translation going on that makes it so. We're letting sloppy Windows-style workflows creep in. I blame the WSL which has coopted Linux.

Honestly, I'm getting a kind of Python3 feeling about Linux. A good, clean, idea, hijacked by a committee which wanted to "improve" it and killed what made it special in the first place. Sure, might make it more "mainstream" or whatever, but it's no longer something special. Personally the seed is being planted to look at new OSs. Linux is losing its mojo.

Another guy seeing things that are only in his own mind.

This is a filesystem feature that is needed by one of the possible usecases. Like Samba shares, or Wine application folders.

It's not on by default and it's specifically designed to be enabled on a folder basis.