EXT4 Getting Faster Case-Insensitive Performance


  • #31
    Originally posted by starshipeleven View Post
    False analogy. Dictionaries explain the meaning of words. A word with uppercase or lowercase letters has the same meaning (usually)
    Remember the words commonly left out of dictionaries, like names and acronyms: Pat as a person's name versus pat as the action. Note that some old historic dictionaries do have a names section. There are also a large number of words whose meaning differs between upper case and lower case.

    There are cases where being case sensitive gets critical even when working with English. I will grant that in the majority of cases, for English and languages like it, case preserving with case insensitive works, but there are cases where it does not. Remember that early computer systems had no lower case; you typed everything in upper case, which is why the ASCII standard puts uppercase at lower values than lowercase. Lower case was added because a single case turned out not to work for every English data-processing case back then. English has not magically improved in this regard.

    Both case sensitive and case insensitive are horrible in different use cases. In fact equally horrible. So you need to support both.

    Those who don't learn from history are doomed to repeat it. Those demanding that everything be case insensitive are lining up to repeat some of the mistakes from when computer output only had uppercase, which led to the introduction of lowercase.





    • #32
      Originally posted by xfcemint View Post

      It would be ridiculous to do that in every application.

      Since I claimed that such a functionality is impossible to do without assistance from the driver, I guess your comment proves me wrong. When I was considering the possibilities of implementing case-insensitive file open, I failed to consider the inefficient and cumbersome method you are suggesting here. Ok, you are right, I was wrong.
      Yeah, fair enough. Wine does this and it is as slow as you think it is. My main point was that it is feasible (and sometimes practical), not that you should want to need to do it.
      Last edited by microcode; 30 June 2019, 11:46 AM.
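
A minimal sketch of the userspace approach discussed in this post: resolve a name case-insensitively by scanning the directory and comparing folded names. The helper name `find_case_insensitive` is made up for illustration, and this is not Wine's actual code; it only shows why the method costs a full directory scan on every miss.

```python
import os
from typing import Optional


def find_case_insensitive(directory: str, name: str) -> Optional[str]:
    """Return the path of the entry in `directory` whose name matches `name`
    ignoring case, or None if there is no such entry.

    Hypothetical helper for illustration only. An exact match wins; otherwise
    the first entry with the same case-folded name is returned.
    """
    exact = os.path.join(directory, name)
    if os.path.lexists(exact):
        return exact  # fast path: the case already matches

    wanted = name.casefold()
    # Slow path: every miss costs a scan of the whole directory.
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.name.casefold() == wanted:
                return entry.path
    return None
```

Calling `find_case_insensitive(".", "readme.TXT")` would return the path of a file stored as `README.txt`, if one exists. If two entries differ only in case, which one is returned depends on directory order, which is one reason doing this in every application is unattractive.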



      • #33
        Originally posted by oiaohm View Post
        Remember the words commonly left out of dictionaries, like names and acronyms: Pat as a person's name versus pat as the action. Note that some old historic dictionaries do have a names section. There are also a large number of words whose meaning differs between upper case and lower case.
        I have always placed acronyms and personal names in the same category, they are designations to uniquely identify a specific thing, not a word with a meaning in and of itself.

        So they are the same as file names, unique identifiers, you choose the rules for them.



        • #34
          Originally posted by starshipeleven View Post
          I have always placed acronyms and personal names in the same category, they are designations to uniquely identify a specific thing, not a word with a meaning in and of itself.
          A lot of people think the way you do, that names don't have defined language meanings. Unfortunately, in particular fields of study they do have meanings.
          https://en.wikipedia.org/wiki/Peter_(given_name)
          I cannot remember the few names where the name and the English word don't have matching meanings. Various personal names do have a religious or historic meaning, which is why they appear in dictionaries for religious and historic study.

          Not all personal names have meanings, but the ones that do normally denote a trait the parents wished their child would have. This can be important with historic nicknames.

          So if you are keeping notes in one of those fields and names overlap with ordinary words, case-insensitive lookup could be a disaster. Whether you want case-insensitive or case-sensitive lookup depends on your use case. But one thing is universal no matter the lookup: everyone wants case preserving.



          • #35
            Originally posted by xfcemint View Post
            It can be done by a library at the program side, but usually it is not fine.
            Yes it is. Why would you need to write files with randomized case from your own application? The only use case is finding user-created files.

            No, you don't.
            Yes you do. What the fuck is your program doing that it needs to disregard the case of letters?

            Correct analogy. A word using a different case can sometimes have a different meaning.
            Examples? Besides person, place or organization names that are just names and have no "meaning" per-se?

            It is arbitrary in the sense that there is no one holding a gun and threatening. There is no law or whatever to force anybody to do it one way or the other.
            It is arbitrary in the sense that you are free to decide the standards.

            In this specific case I don't see why "case-insensitive" is so much fucking better at all. It's better in some specific cases at best.

            It doesn't matter whether file names are "designations", or whatever. What matters is that people naturally sort all words, including names, titles and "designations", by usual lexicographic ordering. It is better that way because most people can easily remember letters, but cannot remember the case of each letter. People expect and it is easier for them if all words, including filenames, beginning with letter "a" appear next to each other, no matter the case.
            I'm not sure how this even matters for a machine.
            If you play the "but muh humans" card you also need to acknowledge that humans won't generate SO MUCH stuff that a modern machine can choke on unless in very specific circumstances.

            I already said that all noteworthy user-facing applications can already do case-insensitive searching, and unless you want to work on multiple terabytes of small text files (which is kind of a niche use) you aren't going to see any performance issue from doing that in userspace through a library.



            • #36
              Originally posted by carewolf View Post
              How would losing the difference between the two be good?
              How is losing the ability to do stupid things good? Well, it's good, because it's less stupid? How else can I explain this?

              Originally posted by carewolf View Post
              and if you are actually storing the case, then IT IS CASE SENSITIVE, what you have are just different APIs acting in different and confusing ways, a collision API acting contrary to the naming API.
              You're confusing case-sensitive with case-preserving, HTH.
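
To make the distinction concrete, here is a toy in-memory model, not how ext4 or any real filesystem stores names. The class name `CasePreservingDir` and its methods are invented for illustration. Lookups and collisions use a folded key, while the spelling the user typed is what gets stored and listed.

```python
from typing import Dict, List


class CasePreservingDir:
    """Toy directory: case-insensitive lookup, case-preserving names."""

    def __init__(self) -> None:
        # folded name -> name exactly as it was first created
        self._names: Dict[str, str] = {}

    def create(self, name: str) -> None:
        key = name.casefold()
        if key in self._names:
            raise FileExistsError(
                f"{name!r} collides with existing {self._names[key]!r}"
            )
        self._names[key] = name

    def lookup(self, name: str) -> str:
        """Return the stored spelling for any casing of `name`."""
        return self._names[name.casefold()]

    def listing(self) -> List[str]:
        """Names as originally typed, sorted without regard to case."""
        return sorted(self._names.values(), key=str.casefold)
```

After `create("Foobar")`, both `lookup("foobar")` and `lookup("FOOBAR")` return `"Foobar"`, and `create("fooBAR")` raises `FileExistsError`. A case-sensitive directory would instead key on the name itself and accept both spellings as separate entries.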

              Originally posted by carewolf View Post
              Btw. Have you ever had a database or emails on your computer? You will notice the files generated are often saved with a hash-name with random characters... It is very nice that those different files with random names stay separate files and are not randomly treated as the same file by a broken file system.
              Oh I'm running databases, both on Linux and Windows, as I'm a sysadmin, and I'm sorry but I'm not following you. How will a database have any problems with a case insensitive filesystem? No database on Earth will generate case sensitive random files on the system. You're basically saying PostgreSQL, Oracle, MySQL et al are all broken on Windows. No one would make a random file name generator that uses both lower and uppercase patterns at the same time.

              You know why? Because anyone doing so would be considered a complete idiot. It's called common sense, it's really not that hard. Same goes for hashes. A hash is either uppercase or lowercase, there's no such thing as mixed case hash, because it'd be DAMN STUPID.

              If you made two files, foo.txt and FoO.TXt, and you said oh yeah, the info is in foo.txt! Literally everyone would ask, okay, BUT WHICH ONE? That's exactly why you never do that. Because it makes no sense, is counter-intuitive, redundant, annoying, and confusing, amongst other things.

              Name just ONE person on Earth who actually likes to type 'foo' in the terminal, press tab tab, then realize oh sh*t it's 'Foobar', not 'foobar', delete all the stuff she already typed, then start over, then fail to let go of Shift in time, so now it's 'FOo', so let's do it AGAIN, and eventually manage to do it with proper case. It's a time-consuming, useless piece of cr@p, nothing else.

              Originally posted by carewolf View Post
              Also really nice that you don't need 22Mbyte ICU database
              You're extrapolating from ext4's implementation, but that doesn't make case sensitivity any less of an idiotic and useless thing in any case (pun intended).

              Originally posted by carewolf View Post
              that has a tendency to have security holes (macOS finds a new one in the unicode case-handling crap every other year) just to do simple file operations
              ICU is used by many OS components (on Linux too) so it needs to be patched and kept up-to-date anyway. At least now it'll be more tested and polished. The "it may contain secholes" argument can be applied to anything beyond the complexity of HelloWorld(), so it's not an argument per se.

              You're trying really hard to explain how it may lead to _problems_ to introduce case insensitivity to Linux (none of them were real problems, as I already explained), but you're dodging my actual original question: how is case sensitivity useful?



              • #37
                Originally posted by xfcemint View Post
                Of course, and we won't ever need any new applications. Those existing applications are all we need for the next thousand years. It's just Wine that needs this feature, when that's fixed it's all done.
                Wine needs both features: case sensitive and case insensitive.
                Phoronix: EXT4 Getting Faster Case-Insensitive Performance The Linux 5.2 kernel brings optional per-directory case-insensitive filenames/folders while with the Linux 5.3 kernel that new EXT4 feature will see better performance... http://www.phoronix.com/scan.php?page=news_item&px=EXT4-Case-Insensitive-Faster


                Read my prior post. There are Windows applications out there that expect case sensitivity. That can be kind of important when you are putting things like letter-encoded checksums in filenames.
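
A small illustration of the checksum point. The post does not say which encoding is meant, so base64 is assumed here as a common mixed-case one. The two byte strings are picked by hand so that their encodings differ only in letter case; the helper name `b64_name` is made up.

```python
import base64


def b64_name(data: bytes) -> str:
    """Encode bytes as a URL- and filename-safe base64 string."""
    return base64.urlsafe_b64encode(data).decode("ascii")


# Two different 3-byte values, chosen so the encodings differ only in case.
name_a = b64_name(bytes([0x00, 0x00, 0x00]))  # "AAAA"
name_b = b64_name(bytes([0x69, 0xA6, 0x9A]))  # "aaaa"

# Distinct names on a case-sensitive filesystem...
distinct_when_case_sensitive = name_a != name_b
# ...but the same name once case is folded.
collide_when_case_insensitive = name_a.casefold() == name_b.casefold()
```

Both flags come out `True`: the data differs, yet a case-insensitive directory would treat the two names as one file. Hex digests written in a single case do not have this problem, which is why both sides of the argument can point to real software.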



                • #38
                  Originally posted by anarki2 View Post
                  If you made two files, foo.txt and FoO.TXt, and you said oh yeah, the info is in foo.txt! Literally everyone would ask, okay, BUT WHICH ONE? That's exactly why you never do that. Because it makes no sense, is counter-intuitive, redundant, annoying, and confusing, amongst other things.
                  This depends on the language. As English speakers we cannot see it, but there are languages where capitals have different phonemic sounds, so when they say the info is in foo.txt or FoO.TXt there is no confusion at all.

                  Originally posted by anarki2 View Post
                  You're trying really hard to explain how it may lead to _problems_ to introduce case insensitivity to Linux (none of them were real problems, as I already explained), but you're dodging my actual original question: how is case sensitivity useful?
                  Windows has support for full case sensitivity, which for us English speakers mostly exists to support software. For example, when you are encoding checksums to text and then using that text as file names.

                  Windows has the means to disable case insensitivity on install if your language is set to one of those where case insensitivity does not work. Windows does this by not writing any entries in the NTFS partition's case conversion table.

                  Basically, not all languages are that straightforward, and software requirements are not straightforward either. You need both case sensitive and case insensitive, and the ability to control where each is active. The folder option is a good one.
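
With the behaviour selectable per folder, a program may want to check which mode a given directory is in. The sketch below is a generic probe and not an ext4-specific API; the function name `is_case_insensitive` is made up. On ext4 the per-directory flag is, to my knowledge, set with `chattr +F` on an empty directory of a filesystem created with the casefold feature, but the probe does not rely on that.

```python
import os
import tempfile


def is_case_insensitive(directory: str) -> bool:
    """Probe whether name lookups in `directory` ignore case.

    Creates a temporary file with a lowercase name and checks whether the
    uppercased name resolves to it. The probe file is removed afterwards.
    """
    fd, path = tempfile.mkstemp(prefix="caseprobe_", suffix=".tmp", dir=directory)
    os.close(fd)
    try:
        head, tail = os.path.split(path)
        # On a case-sensitive directory the uppercased name does not exist.
        return os.path.exists(os.path.join(head, tail.upper()))
    finally:
        os.remove(path)
```

The probe needs write permission in the directory, and it reports the behaviour of that one directory only, which matches the per-folder model: a parent and its child can legitimately give different answers.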



                  • #39
                    Originally posted by xfcemint View Post
                    It is not fine with regards to efficiency and with regards to extra work needed by application and library developers.
                    case-insensitive is inefficient too, you are just moving the efficiency issue to someone else (everyone) for what is just your specific usecase.

                    It needs to open a file given a user-specified name. Common use-case. I'm arguing that this doesn't work as expected on case-sensitive filesystems.
                    I can do that with a couple lines of shell script, if you have any issue in doing the same with a compiled program I assume it's your own problem.

                    Maybe also besides filenames, which are just names and have no "meaning" per-se?
                    yes of course. Names are names, and your argument is weak.

                    We are not arguing what's free, we are arguing what is better.
                    and your usecase is always better. Who cares about everyone else that don't need that. Let's hit everyone.

                    Maybe you don't see why it is better, but a phone book publisher certainly does.
                    This has no bearing on case-sensitive filesystems.
                    It's something that goes inside a database.

                    Irrelevant argument.
                    I am playing "but muh humans" card. That is relevant.
                    I'm playing the same card, and humans aren't going to generate enough files that you will see a performance impact for your usecase.


                    Of course, and we won't ever need any new applications.
                    Look ma! A strawman argument! I never said that.



                    • #40
                      Originally posted by starshipeleven View Post
                      case-insensitive is inefficient too, you are just moving the efficiency issue to someone else (everyone) for what is just your specific usecase.
                      Invalid point. Nobody forces you to use it or enable it on the directory.

                      You want a simple argument for it? Wine performance.

                      The end.
