Linux Pipe Code Again Sees Patch To Restore Buggy/Improper User-Space Behavior


  • yump
    replied
    Originally posted by skeevy420 View Post
    Not breaking userspace seems like too rigid of a protocol to be sustainable for another 30 years. For another hundred and thirty years. What if this is done wrong, a new language feature comes out, etc? Will that lead to a pipe3? A pipe8?
    yeschad.tif

    You may not like it, but that is what 1,000 year software looks like.

The Python 2-to-3 transition was an enormous clusterfuck, and an example that should not be repeated.



  • NobodyXu
    replied
    Originally posted by skeevy420 View Post

    How many of those exist in the kernel? Both an old and a new way.

    I don't understand why both have to stick around after major version changes. Not breaking userspace seems like too rigid of a protocol to be sustainable for another 30 years. For another hundred and thirty years. What if this is done wrong, a new language feature comes out, etc? Will that lead to a pipe3? A pipe8?

    And that begs the question of what's the point of Extended Long Term Support (and Long Term Support to an extent)? Why can't the people who have to stick with the old use that? Why can't ELTS be used as Linux Legacy?

    And if they're not allowed to break userspace, how come they can break user hardware? How come they can drop hardware support? At some point following that protocol becomes silly when you look at it in that regard.

    I don't consider the answer of "just add more on top" to be a good answer when there are means in place to facilitate the making of what could be called a "next-gen" Linux kernel, the handling of long term and/or legacy users, and dropping kernel syscalls and code that need dropping.

    And I'm just using 6.0 because the kernel is in the 5.X series. 6 is just the next major revision number and the major revision number is where most other projects do breaking changes...just easier to use as a talking point/rhetorical example.

IMHO in this case it’s because removal of a driver only happens if:
- the hardware has almost disappeared from the market
- nobody is willing to maintain it (likely because they no longer have the hardware either)

And the same probably applies to Linux syscalls.
Once all the software that relied on one has moved on, it can likely be removed.

Second, if you treat LTS as legacy, then all the software we use is legacy.
Once a technology is rolled out to the public, it is already legacy: by the time it matures, it probably depends on technology that has itself released a newer version.

Java 8 and Java 11 are a good example.
The latest version is Java 16, released this year, but again, not many projects have switched to it yet.

Many projects are still sticking with Java 11; some even stick with Java 8.

Same for other languages, such as C++.
C++17 has been out for three years, but adoption is slow.

And this is not limited to programming languages; it also applies to the software and frameworks written on top of them.

I think you misunderstand how industry works.
You think they are willing to adopt new technology and throw away the old as long as the new one is mature and supported.

THIS IS SIMPLY NOT THE CASE.

The industry doesn’t throw away usable software.
They only do that when it’s absolutely necessary due to a serious bug or security problem, or when adding so many new features that the existing tech cannot cope anymore.

The ones you see adopting new technology ASAP are a small fraction, limited to certain use cases.
Most won’t adopt something as soon as it’s available, and won’t throw away their existing implementation until decades later.

    That’s why COBOL is still a thing.



  • Ironmask
    replied
    Originally posted by skeevy420 View Post
    Python updated and all the scripts adapt.
Not a great analogue, since Python scripts did not adapt; instead there are simply two simultaneous versions of Python, and there always will be, because it's a popular programming language, just like COBOL. Python is used as an example to language designers of what not to do when upgrading a language.



  • stargeizer
    replied
You people forget that Linus Torvalds is the guy who runs the kernel, and for him the only unbreakable rule is "do not break userspace". All the other rules he can (and has) forgone in the past when necessary (albeit reluctantly, and with colorful language). He has also said many times that kernel versioning has nothing to do with MAJOR.MINOR semantics: he just lets the minor version run up to around 20 and then bumps the major version by one.

You can argue advantages and disadvantages of versioned APIs, but for reference, the OS that everyone here hates (except Birdie, who loves it) has the same restriction. The Win32 userspace interface is unchanged since Windows 2000. (And yes, you can run software written for Windows 2000 on Windows 10, except for software using undocumented APIs, creative kernelspace anti-piracy protection schemes, or games that abused DirectX or OpenGL in creative ways; for those there's the compatibility troubleshooter.) They have tried versioned APIs in parallel since Windows 8, but failed to impress programmers and external developers with them. (Remember .NET and how it was rejected in the beginning? It wasn't until version 4 of the API that coders accepted it, and M$ still had to support the old Win32 API, and will do so even in Windows 11.)

Will this change? Probably only when Torvalds leaves kernel development and the next leader can be convinced to clean up the userspace API. Knowing the devs, the probability is basically nil.



  • skeevy420
    replied
    Originally posted by NobodyXu View Post

They don’t have to announce a 6.0 for such a change… Just add another new syscall; problem fixed.

    Or, add a new bitflag to pipe2 (not pipe) syscall https://man7.org/linux/man-pages/man2/pipe.2.html
    How many of those exist in the kernel? Both an old and a new way.

    I don't understand why both have to stick around after major version changes. Not breaking userspace seems like too rigid of a protocol to be sustainable for another 30 years. For another hundred and thirty years. What if this is done wrong, a new language feature comes out, etc? Will that lead to a pipe3? A pipe8?

    And that begs the question of what's the point of Extended Long Term Support (and Long Term Support to an extent)? Why can't the people who have to stick with the old use that? Why can't ELTS be used as Linux Legacy?

    And if they're not allowed to break userspace, how come they can break user hardware? How come they can drop hardware support? At some point following that protocol becomes silly when you look at it in that regard.

    I don't consider the answer of "just add more on top" to be a good answer when there are means in place to facilitate the making of what could be called a "next-gen" Linux kernel, the handling of long term and/or legacy users, and dropping kernel syscalls and code that need dropping.

    And I'm just using 6.0 because the kernel is in the 5.X series. 6 is just the next major revision number and the major revision number is where most other projects do breaking changes...just easier to use as a talking point/rhetorical example.



  • NobodyXu
    replied
    Originally posted by skeevy420 View Post
    NobodyXu

    You're unintentionally confirming what I said with your argument -- both that the kernel could use a way to break itself during an anticipated update and that if you don't pay for maintainers the community is expected to step up and become maintainers.

You basically listed example after example of why anticipated updates with breaks/changes are necessary. Even gave an example of a build system that copes with all the versions and changes over the years and decades...and gave a better name than grace period. There is no reason the kernel can't do the same with a major version change. Turn the last 5.X into ELTS and let 6.0 drop any fixes, duplicated efforts, etc because of things like this (not like I have a running tally here).

    While potentially a pain, I can't think of another way for the kernel to truly advance itself...maybe only doing that once every 10 or 20 years to reduce the burden.
They don’t have to announce a 6.0 for such a change… Just add another new syscall; problem fixed.

    Or, add a new bitflag to pipe2 (not pipe) syscall https://man7.org/linux/man-pages/man2/pipe.2.html



  • skeevy420
    replied
    NobodyXu

    You're unintentionally confirming what I said with your argument -- both that the kernel could use a way to break itself during an anticipated update and that if you don't pay for maintainers the community is expected to step up and become maintainers.

You basically listed example after example of why anticipated updates with breaks/changes are necessary. Even gave an example of a build system that copes with all the versions and changes over the years and decades...and gave a better name than grace period. There is no reason the kernel can't do the same with a major version change. Turn the last 5.X into ELTS and let 6.0 drop any fixes, duplicated efforts, etc because of things like this (not like I have a running tally here).

    While potentially a pain, I can't think of another way for the kernel to truly advance itself...maybe only doing that once every 10 or 20 years to reduce the burden.



  • NobodyXu
    replied
    Originally posted by skeevy420 View Post

    Kind of mixed up my day one rant and now




    That's what makes this such a grey area. Working but incorrect. Normally we'd do out with the old, in with the new (just humans in general), but protocol makes it so it's keep the old, in with the new. I don't think it should necessarily be that way....maybe keep it until 6.0 and drop it.



    I'm aware of the differences, just seems a bit odd to me that all the rest of the Linux software stack is allowed to "break" each other while the kernel isn't allowed to even if it is to fix bad code and increase performance. I still think a compromise of a grace period, like when 6.0 comes out, would be a good way for Linux to "break" userspace. If software can't have major changes between major versions then y'all need to come up with a new way of doing things.

    It just seems dumb that we can drop IDE and floppy support entirely but we can't fix pipe. The worst part is the broken userspace code has been fixed....so we're debating about leaving in a bad implementation in the kernel even though the userspace side that the kernel fix broke has been fixed.



    Which is exactly how it works now.

    GTK/Qt updated and all the GUI programs adapt.
    GCC/LLVM updated and all the programs adapt.
    PulseAudio starts to be favored over ALSA and all the programs adapt.
    Python updated and all the scripts adapt.
    A physics library updates and all the game engines adapt.
    X updates and Y related things adapt.

    There's a reason that IBM Hat can push what they want. They pay the ever-growing army of maintainers.
    There's a reason that Ubuntu stands out from the crowd. They pay an ever-growing army of maintainers.
    There's a reason that Arch and Gentoo are so popular with the community. Because we're expected to be the ever-growing army of maintainers.
GTK/Qt also don’t break backward compatibility unless necessary; they only break it on major version updates, and every major update takes a long time.
For example, Qt 6 has been out for a while, but because it changed so many APIs and some functionality is still missing, many programs are still using Qt 5.

Changes to the GCC/LLVM command-line interface are so minor across major updates that most programmers won’t notice, and for those who need to cope with them, there are build systems like CMake that enable flags only when the compiler supports them.

BTW, CMake is also versioned, so to enable LTO in CMake you first need to assert a minimum CMake version.

Python updates also mostly just add new APIs; the one really breaking change was the upgrade from Python 2 to Python 3, and it was a pain.
Even now, there is plenty of software out there still supporting Python 2, and some of it only supports Python 2.

Even big companies like Red Hat and Ubuntu need to maintain backward compatibility.

Red Hat used to give CentOS 10 years of free support. Why? Because they didn’t want to break anything.
This year they broke that convention and suddenly announced an early EOL for CentOS 8 in favor of their new product line, CentOS Stream.
Guess what? People set up their own distributions, like Rocky Linux.

Ubuntu and Debian are the same.
Debian guarantees at least 5 years of support.
Ubuntu LTS is the same (they usually release after Debian), and they also provide an extended 10-year support service.

You think industry likes frequently updating to new versions that break their implementations so they can spend money rewriting them?
Hell no.

There’s a reason rolling distributions like Arch or Gentoo are almost never used in server environments.

Almost every Docker (container) image uses Debian/Ubuntu/CentOS/Alpine (which also do point releases).

    Nobody wants their softwares to suddenly break because of an unanticipated update.

    AND, BTW, GENTOO IS CALLING FOR MORE MAINTAINERS.



  • skeevy420
    replied
    Originally posted by tomas View Post

    Nothing is being held back. A new version of the syscall with a more optimal behavior is being made available. New versions of libraries that wish to use the new more optimal syscall with slightly changed behavior will be released. All is well.
    Kind of mixed up my day one rant and now


    This is the problem right here, definition of "fixed" and "wrong". This is not about fixing a bug in the syscall implementation that would cause a kernel panic or lead to a security issue.
    This is about changing the behavior of an existing syscall in such a way that existing users of said syscall break. It does not matter if the behavior was undocumented or "whose fault it is". If you can optimize a syscall implementation without changing ANY externally visible behavior, then that is fine and I'm sure it has been done on many occasions in the kernel. But you do not break existing users of a sys call unless you really really have to, for example due to security reasons. Linus understands this, as do the developers of other operating systems. I ask again, can anyone point to other operating systems which have a different policy when it comes to keeping backwards compatibility for sys calls?
    That's what makes this such a grey area. Working but incorrect. Normally we'd do out with the old, in with the new (just humans in general), but protocol makes it so it's keep the old, in with the new. I don't think it should necessarily be that way....maybe keep it until 6.0 and drop it.

    EDIT:


    That is vastly different, it's like night and day.
    One is about changing an ABI causing existing binaries (libraries and/or programs) to break. The other one is about requiring source code changes to use a new compiler. The difference in impact between the two examples are that the first one potentially effects end users while the second one only affects developers or users that compile from sources with a compiler that is yet to be supported for building the library or program.
    I'm aware of the differences, just seems a bit odd to me that all the rest of the Linux software stack is allowed to "break" each other while the kernel isn't allowed to even if it is to fix bad code and increase performance. I still think a compromise of a grace period, like when 6.0 comes out, would be a good way for Linux to "break" userspace. If software can't have major changes between major versions then y'all need to come up with a new way of doing things.

    It just seems dumb that we can drop IDE and floppy support entirely but we can't fix pipe. The worst part is the broken userspace code has been fixed....so we're debating about leaving in a bad implementation in the kernel even though the userspace side that the kernel fix broke has been fixed.

    Originally posted by yump View Post

    This philosophy makes it impossible for software to ever be finished. If there is continuous churn in underlying libraries and interfaces that requires ongoing maintenance, the only way to have an ever-growing library of useful software is to have an ever-growing army of maintainers.
    Which is exactly how it works now.

    GTK/Qt updated and all the GUI programs adapt.
    GCC/LLVM updated and all the programs adapt.
    PulseAudio starts to be favored over ALSA and all the programs adapt.
    Python updated and all the scripts adapt.
    A physics library updates and all the game engines adapt.
    X updates and Y related things adapt.

    There's a reason that IBM Hat can push what they want. They pay the ever-growing army of maintainers.
    There's a reason that Ubuntu stands out from the crowd. They pay an ever-growing army of maintainers.
    There's a reason that Arch and Gentoo are so popular with the community. Because we're expected to be the ever-growing army of maintainers.



  • arQon
    replied
    Originally posted by bachchain View Post
    I can just imagine a few years from now someone trying to do a rewrite of epoll that makes it 10x faster, but being rejected because it breaks this one buggy version of this one library that nobody uses anymore
    And that's still the right way to deal with it. As others have pointed out, we already have -Ex or -2 etc versions of a TON of syscalls specifically for breaking changes.

    This naive mentality of "but everybody should just fix all the code ever written" isn't just naive: it's a shitty, narcissistic way of operating that makes *everybody else* responsible for *your* mistakes.

    To take a simple example that might be more familiar to you, consider Gnome Shell. Every release, a bunch of extensions break. (IDK how large the number is each time, but it's enough that people here have been complaining about it constantly for years). The users aren't at fault, but they suffer for it, because now something they use doesn't work any more. The extension developers aren't at fault either, but they suffer even more as they have to deal with both the breakage and the users complaining. Now imagine that instead of some toy widget on a desktop, the impact is thousands of programs running on millions of servers.

    Alternatively, try thinking of the cumulative impact of having to constantly keep doing those repairs. Instead of working on the "real" parts of your project, you're always wasting time dealing with the fallout of *someone else's* bugs, design failures, and other mistakes. Over time, you either get tired of cleaning up their mess and walk away by choice, or the productivity loss means you're forced to.

    If your "One True Kernel" ever happened, it would make Hurd look popular by comparison - which is probably not the end result you were hoping for.

