Linux Fixes Regression That Broke File Names With ❤️ & Other Special Characters

Written by Michael Larabel in Linux Storage on 11 December 2024 at 09:11 PM EST.
Linus Torvalds tonight reverted code in the mainline Linux kernel that had inadvertently broken support for filenames containing ❤️ and other special Unicode characters on file-systems with case-folding (optional case-insensitive file/folder name) support.

Merged to the Linux kernel last month was a change to the kernel's Unicode handling to no longer special-case ignorable code points. That commit, which stripped around 3k lines of kernel code, left the ignorable code points to decompose/casefold themselves. Unfortunately, this ended up breaking things for file-systems that use Unicode case-folding for case-insensitive file/folder handling, like F2FS. In turn, those running new Linux kernels were no longer able to read files whose names contain special characters, such as the ❤️ emoji.
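To make "ignorable code points" concrete, here is a minimal Python sketch that lists the code points behind the ❤️ emoji. The `IGNORABLE` set and the `describe` helper are hypothetical names for this illustration; the set is a tiny hand-picked subset of Unicode's Default_Ignorable_Code_Point property, not the kernel's actual table.

```python
import unicodedata

# Assumption: a deliberately tiny, hand-picked subset of Unicode's
# Default_Ignorable_Code_Point set. The kernel's real tables are far larger.
IGNORABLE = {0x00AD, 0x200B, 0x200C, 0x200D, 0xFE0F}


def describe(name):
    """Return (code point, Unicode name, is_ignorable) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), ord(ch) in IGNORABLE)
        for ch in name
    ]


# The red heart emoji is two code points: a visible heart followed by an
# invisible variation selector that requests emoji presentation.
heart = "\u2764\ufe0f"

if __name__ == "__main__":
    for row in describe(heart):
        print(row)
```

The second code point, U+FE0F, is the ignorable one, so whether case-folding keeps or drops it decides what the folded filename looks like.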

A kernel bug report raised the issue of certain files no longer being found on an F2FS file-system after that Unicode change.


With that Unicode change clearly causing problems and breaking user-space access to existing files, of all things, Linus Torvalds immediately reverted the problematic code.

Linus Torvalds commented in the revert:
"It turns out that we can't do this, because while the old behavior of ignoring ignorable code points was most definitely wrong, we have case-folding filesystems with on-disk hash values with that wrong behavior.

So now you can't look up those names, because they hash to something different.

Of course, it's also entirely possible that in the meantime people have created *new* files with the new ("more correct") case folding logic, and reverting will just make other things break.

The correct solution is to not do case folding in filesystems, but sadly, people seem to never really understand that. People still see it as a feature, not a bug."
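The hash mismatch described in the revert message can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the `IGNORABLE` subset, the two fold functions, the use of SHA-256 as a stand-in for a filesystem's name hash, and the dictionary standing in for an on-disk directory index. None of it is the kernel's or F2FS's actual code.

```python
import hashlib
import unicodedata

# Assumption: tiny illustrative subset of default-ignorable code points.
IGNORABLE = {0x00AD, 0x200B, 0x200C, 0x200D, 0xFE0F}


def fold_dropping_ignorables(name):
    """Toy model of the old behavior: ignorable code points are discarded."""
    kept = "".join(ch for ch in name if ord(ch) not in IGNORABLE)
    return unicodedata.normalize("NFD", kept).casefold()


def fold_keeping_ignorables(name):
    """Toy model of the changed behavior: ignorable code points are kept."""
    return unicodedata.normalize("NFD", name).casefold()


def name_hash(folded):
    """Stand-in hash; real filesystems use their own hash functions."""
    return hashlib.sha256(folded.encode("utf-8")).hexdigest()[:8]


def lookup(index, name, fold):
    """Find a stored name by hashing the folded form of the requested name."""
    return index.get(name_hash(fold(name)))


HEART_FILE = "\u2764\ufe0f.txt"

# Directory index written while the old folding was in effect.
on_disk = {}
for stored in ["README.txt", HEART_FILE]:
    on_disk[name_hash(fold_dropping_ignorables(stored))] = stored

if __name__ == "__main__":
    # Plain names fold the same either way, so they are still found.
    print(lookup(on_disk, "readme.TXT", fold_keeping_ignorables))
    # The heart name now hashes differently, so the lookup misses.
    print(lookup(on_disk, HEART_FILE, fold_keeping_ignorables))
    # Restoring the old folding finds the file again.
    print(lookup(on_disk, HEART_FILE, fold_dropping_ignorables))
```

In this model the file is never lost; only the key used to find it changes. The same model also shows the risk noted in the quote: a name stored while the newer folding was active would hash with the variation selector included, and would in turn be missed once the old folding is restored.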

If you don't use case-folding on a supported file-system, or aren't running a very recent kernel, you have nothing to worry about, especially if you don't typically put special characters in your filenames. In any case, one more interesting/unique Linux kernel regression is now resolved.
