EXT4 Case Insensitive Support Sent In For The Linux 5.2 Kernel


  • #21
    Originally posted by rene
    By "keep things stupid simple" (in kernel space) I mean not trying to solve this problem at all.
    Please remember that this did not happen overnight. It has been put off from kernel space for over a decade now. You have just suggested another option that has already been tried and led to issues with Android, Wine, and network file systems (Samba, NFS, and other network sharing file systems, such as cluster file systems, that have Windows clients).

    https://lwn.net/Articles/255519/
    I am not kidding about a decade. This is the 2007 attempt and it was put off back then. Procrastination never makes the problem go away.

    Basically, you still have not suggested anything that has not been tried.



    • #22
      Originally posted by rene
      Oh, wait, did you say micro kernel? ;-) https://www.youtube.com/watch?v=g85yri1kfJo
      don't forget to smash that like button and subscribe, click also the bell icon to be notified of any new content.



      • #23
        Another interesting use case is cross-compiling Windows programs using Clang.
        Most SDKs targeting Windows development have issues with casing. These can partly be worked around using a VFS in Clang itself, but you will still run into issues when linking (the linker doesn't support the VFS).



        • #24
          Originally posted by starshipeleven
          don't forget to smash that like button and subscribe, click also the bell icon to be notified of any new content.
          thanks!



          • #25
            Originally posted by Orphis
            Another interesting use case is cross-compiling Windows programs using Clang.
            Most SDKs targeting Windows development have issues with casing. These can partly be worked around using a VFS in Clang itself, but you will still run into issues when linking (the linker doesn't support the VFS).
            The VFS inside Clang suffers from the same problem as Wine and Samba: what to do when Text.h and tEXT.h both actually exist and the source says #include "text.h". So this is another half fix that does not fix everything.

            A casefold file system fixes both the linker and the compiler, and prevents you from ending up in that unsolvable conflict.
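
A minimal sketch of the Text.h / tEXT.h conflict described above, in Python. It assumes Python's built-in `str.casefold()` as a stand-in for the file system's folding rules (ext4's actual tables and normalization may differ), and `casefold_collisions` is a hypothetical helper name, not part of any SDK or tool. It groups names that would become indistinguishable under a case-insensitive lookup:

```python
from collections import defaultdict


def casefold_collisions(names):
    """Group names that compare equal after Unicode case folding.

    Hypothetical helper for illustration; str.casefold() only approximates
    whatever folding a real file system applies.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name.casefold()].append(name)
    # Keep only keys that more than one distinct name folds to.
    return {key: group for key, group in groups.items() if len(group) > 1}


headers = ["Text.h", "tEXT.h", "stdio.h"]
collisions = casefold_collisions(headers)
print(collisions)  # {'text.h': ['Text.h', 'tEXT.h']}
```

On a case sensitive directory both header names can exist, so a tool resolving `#include "text.h"` has to pick one arbitrarily; on a casefolded directory the second name could not have been created in the first place.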



            • #26
              That's not going to happen with any Windows SDK as that isn't allowed on case insensitive filesystems where they were created.
              If you had two files with conflicting names in different folders, then the same rules that apply to case sensitive filesystems apply. Best case, it just won't build as it would on a proper case insensitive file system. So there is absolutely no conflict there to worry about.

              It's about importing data from a case insensitive file system onto a potentially sensitive one and keeping compatibility after changes, not the other way around (that's already a solved problem).



              • #27
                Originally posted by Orphis
                That's not going to happen with any Windows SDK as that isn't allowed on case insensitive filesystems where they were created.
                Do not underestimate how badly SDK update patches can go wrong. I remember this from doing this kind of work under Wine on a case sensitive file system: it has happened before with bad SDK update patches, where a patch designed for a case insensitive file system did not turn out right when applied on a case sensitive one.



                • #28
                  Originally posted by oiaohm
                  The feature is casefold, so it's not just plain case insensitivity. Casefold also includes making characters from different keyboard layouts that look basically the same match successfully, instead of being treated as unique Unicode values.
                  Are you talking about Unicode homoglyphs? Because casefolding does not address those.
                  Also I am pretty sure that there is no full casefolding going on here, because that would cause more issues like the well known 'ß' example (cf. https://www.w3.org/TR/charmod-norm/#example-5).




                  • #29
                    Originally posted by chithanh
                    Are you talking about Unicode homoglyphs? Because casefolding does not address those.
                    Also I am pretty sure that there is no full casefolding going on here, because that would cause more issues like the well known 'ß' example (cf. https://www.w3.org/TR/charmod-norm/#example-5).
                    This is something interesting. This is a different application of casefold.

                    https://linuxplumbersconf.org/event/...ve-lookups.pdf
                    Do look at page 3.

                    You have to remember that the file system uses a name-preserving form of the Unicode casefold. So yes, it does go into homoglyphs.

                    "Because some applications cannot allocate additional storage when performing a case fold operation"
                    This problem does not arise. Name-preserving means that whatever string the application requested the file with stays exactly the same, so there is no requirement to alter buffers on the application side. In the ext4 case, casefold affects how the file system decides whether file name string X matches file Y on the file system. There is no requirement for a file system to have a one-to-one match between file names and files: think of historic hardlinks, where you could have multiple file names with the same file contents. Ext4 casefold is really like automatic hard-linking: it makes no effective difference to applications whether the 4 files on page 3 of the PDF were all in fact hardlinks to the same file or whether ext4 casefold makes them one file.

                    So yes, since ext4 casefold is used in the file lookup process and not on the application's strings, it can be a full Unicode casefold without causing any direct issues. There are plans to allow language-specific settings in the directory information.
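
The name-preserving lookup described above can be sketched roughly as follows. This is a toy in-memory model, not ext4's implementation: `CasefoldDir` and `fold_key` are hypothetical names, and using NFD normalization plus Python's `str.casefold()` as the comparison key is an assumption made for illustration.

```python
import unicodedata


def fold_key(name):
    """Comparison key: normalize, casefold, normalize again.

    Assumption for this sketch; a real file system ships its own tables.
    """
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


class CasefoldDir:
    """Toy directory: names are stored as given, matched by folded key."""

    def __init__(self):
        self._entries = {}  # folded key -> (stored name, data)

    def create(self, name, data):
        key = fold_key(name)
        if key in self._entries:
            stored, _ = self._entries[key]
            raise FileExistsError(stored)
        self._entries[key] = (name, data)

    def lookup(self, name):
        # The caller's string is never modified; only the key is folded.
        return self._entries[fold_key(name)]

    def listdir(self):
        return [stored for stored, _ in self._entries.values()]


d = CasefoldDir()
d.create("Report.TXT", b"hello")
print(d.lookup("report.txt"))  # ('Report.TXT', b'hello')
print(d.listdir())             # ['Report.TXT']
```

The application passes any spelling it likes and gets the same entry back, while the directory listing still shows the name exactly as it was created.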

                    "well known 'ß' example" is not in fact a problem in a name-preserving file system level casefold.




                    • #30
                      Originally posted by oiaohm
                      This is something interesting. This is a different application of casefold.
                      I think it is very similar. The case folding there is used for string matching (e.g. user enters search string and expects to match regardless of case). And the proposed mechanisms are also similar to what the Linux Plumbers conference PDF proposes for matching in encrypted directories (normalize and then match).

                      Originally posted by oiaohm
                      You have to remember that the file system uses a name-preserving form of the Unicode casefold. So yes, it does go into homoglyphs.
                      No, homoglyphs cannot be folded in filesystems in principle.
                      Take for example Ρ (U+03A1 "GREEK CAPITAL LETTER RHO") which is a homoglyph to P (U+0050 "LATIN CAPITAL LETTER P") in uppercase, but when you apply lowercase mappings to them, they become ρ (U+03C1) and p (U+0070) which are no longer homoglyphs. As an added difficulty, there is ϱ (U+03F1) which has an uppercase mapping of U+03A1.

                      So all three lowercase versions must be able to coexist in the same directory, while in a theoretical homoglyph-folding filesystem the uppercase versions cannot. But what happens if you now open a file using one of the uppercase characters? By which criteria are you going to decide which one to open?
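
The rho example above can be checked with a short Python sketch. It relies only on the standard `unicodedata` module and the built-in string case methods; `describe` is a hypothetical helper name, and Python's mappings follow the Unicode data shipped with the interpreter, which is not necessarily what a given file system uses.

```python
import unicodedata


def describe(ch):
    """Return (code point, Unicode name, uppercase, lowercase) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), ch.upper(), ch.lower())


# Greek capital rho, Latin capital P, their lowercase forms, and the rho symbol.
for ch in ("\u03a1", "\u0050", "\u03c1", "\u0070", "\u03f1"):
    code, name, upper, lower = describe(ch)
    print(f"{code} {name}: upper={upper!r} lower={lower!r} casefold={ch.casefold()!r}")

# The two capitals look alike but are distinct code points.
print("\u03a1" == "\u0050")  # False
```

The printed case mappings stay within each script: nothing in upper(), lower(), or casefold() maps the Greek letter onto the Latin one, which is the sense in which case folding does not address homoglyphs.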

                      Originally posted by oiaohm
                      "well known 'ß' example" is not in fact a problem in a name-preserving file system level casefold.
                      ß (U+00DF) is actually another problem due to having no simple uppercase mapping, but ẞ (U+1E9E) having a lowercase mapping of U+00DF.
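
A small Python sketch of the asymmetry just described. Note the assumption: Python's `str.upper()` and `str.casefold()` apply the full Unicode mappings, so they show what full case conversion does to ß, not what any particular file system implements.

```python
small_sharp_s = "\u00df"    # ß LATIN SMALL LETTER SHARP S
capital_sharp_s = "\u1e9e"  # ẞ LATIN CAPITAL LETTER SHARP S

# The capital letter lowercases to the small one.
print(capital_sharp_s.lower() == small_sharp_s)  # True

# The small letter does not uppercase back to U+1E9E: the full
# mapping expands it to two characters instead.
print(small_sharp_s.upper())                     # SS
print(small_sharp_s.upper() == capital_sharp_s)  # False

# Full case folding changes the string length as well.
print(small_sharp_s.casefold())                  # ss
print(len(small_sharp_s), len(small_sharp_s.casefold()))  # 1 2

# Printed without an expected value, for comparison with the line above.
print(capital_sharp_s.casefold())
```

The length change is why full case folding can require extra storage, and the one-way lower/upper relationship is why a round trip through uppercase does not recover the original name.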
