In the Phoronix Forums discussion about the EXT4 corruption bug hitting the Linux 3.4/3.5/3.6 kernels, Ted Ts'o, the EXT4 file-system maintainer, ultimately jumped in to respond to the numerous and polarized opinions of Phoronix readers.
One Phoronix reader suggested, "Maybe consider freezing all changes to Ext4 and make a new Ext5 for new ideas." Ted's response to this was:
We have considered this. Right now new features get added under experimental feature flags or mount options. One of the users who ran into problems was using experimental new features that are not enabled by default. We can't stop users from trying out new features that aren't enabled by default, just as we can't stop them from deciding to use ext5 instead of ext4 on production servers. Things like metadata checksums are not enabled by default specifically because they aren't ready yet. Brave users who try them out are invaluable, and I am grateful to people who help us do our testing, since that's the only way we can shake out the last bugs that aren't found in developer environments or via regression tests. But you make your choices, and take your chances when you turn on such experimental features.
And there are some real costs with forking the code base and creating a potential new "ext5". We already have had problems where bugs are fixed in ext4, and they aren't propagated to ext3. Just today I found a minor bug which was fixed in ext3, but not in ext2. And sometimes bugs are fixed in extN, but someone forgets to forward port the bug fix to extN+1. If we were to add an "ext5" it would make this problem much worse, since it would triple the potential ways that bug fixes might fail to get propagated to where they are needed.
Speaking of bug fixes, you can't freeze all changes, because we are still finding bugs. Heck, as part of deploying ext4 on thousands and thousands of machines in Google data centers, we found a bug that only showed up because we had deployed ext4 in great numbers. When we found the bug fix, I checked and found that the exact same bug existed in ext3, where it had not been found despite ten years of testing in enterprise Linux releases, by companies such as IBM and Red Hat. (It had probably triggered a couple of times, but it was so rare that testers probably chalked it up to random hardware failure or cosmic rays; it was only because I was correlating failure reports --- and most were caused by hardware failures, not by software bugs --- across a very large number of machines that I could discern the pattern and find this particular bug.)
The problem is that sometimes bug fixes introduce other bugs. In this particular case, it was a bug fix which was backported to a stable kernel which apparently made this failure mode happen more often. If you really mean "freeze all changes", as opposed to just being full of snark, then that would also mean not taking any bug fixes. And if you want to stay on an older version of Linux, feel free... that's what people who are using RHEL 5, or RHEL 4, or even in some cases RHAS 2.1 have chosen.
It's unlikely that we will see any EXT5 file-system for Linux anytime soon. Ted has also made some other file-system comments in this forum thread.