**jacob** · 18 June 2021, 05:44 AM

Originally posted by ultimA

These are exactly the C++ vs Rust myths that I was talking about. First of all, if anything, I was talking about unique_ptr and not shared_ptr. With shared_ptr you can easily have circular dependencies and leaks, unclear object ownership is basically guaranteed, not to mention it has non-zero overhead (if that matters to somebody). So no wonder Rust's borrowing system seems a lot superior, but only because people *wrongly* associate modern C++ memory-management with shared_ptr. But even if you use unique_ptr, that's not the only thing I am talking about. When I say you have to make proper use of the library and language features in C++ and then you are memory-safe, I mean not only unique_ptr, but proper and consistent use of std::string_view, containers such as std::array and std::vector, range-based loops or otherwise iterators, native references, adhering to parameter passing rules and so on. If you stick to these in C++, you will notice there is nothing that CAN go wrong, because proper and valid memory access is guaranteed by the interfaces. And suddenly your memory access patterns become provably correct like in Rust, and the compiler is checking it for you just like in Rust (basically by means of the interface contracts). These are not promises made by the programmer, as you claim, but strong guarantees that are easily provable in your code, given by the C++ spec and its standard library.

But unique_ptr provides no guarantee, let alone a strong guarantee. For example:

Code:

unique_ptr<int> a[100];
*a[3] = 10;

-> segmentation fault
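
The crash follows from what the array actually holds. A minimal sketch, assuming C++11 or later (the function name is made up for illustration): every element of a default-constructed array of unique_ptr is an empty owner, so the assignment above dereferences a null pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical helper: reports whether a default-constructed element is null.
bool default_constructed_element_is_null() {
    std::unique_ptr<int> a[100];   // 100 empty owners, no int was ever allocated
    assert(a[0] == nullptr);       // holds for every element, not just a[3]
    // *a[3] = 10;                 // compiles, but dereferences nullptr (undefined behaviour)
    return a[3] == nullptr;
}
```

The compiler accepts the commented-out line as written; nothing in the type of `a[3]` records that it is empty.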

Or how about this one:

Code:

unique_ptr<int> foo()
{
  int a;
  return unique_ptr<int>(&a);
}

There you have a unique_ptr that outlives the pointed object, just like that.

Originally posted by ultimA

Correct. And using arbitrary addresses is exactly what makes memory access and management unsafe in general in C and C++.

Almost no-one ever assigns a pointer to an arbitrary address in C++ or in Rust, unless they are doing very low level code. The thing that makes memory management unsafe in C++ is that any C++ compiler accepts and compiles stuff like the above. You can trivially have situations where unique_ptrs are not really unique, where several sets of shared_ptrs independently point to the same object, you can have uninitialised pointers galore, null pointers that are never checked and anything you like. In fact unique_ptr and friends don't make C++ any safer than the old C++ or even plain C, they are just a bit of syntactic sugar over the same old model. In Rust on the other hand, none of that would compile.
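
To make the "not really unique" case concrete, here is a rough sketch. CountingDeleter and the function name are invented for illustration; the deleter only counts calls instead of freeing anything, so the example is safe to run. With the default deleter the same pattern would release the object twice.

```cpp
#include <cassert>
#include <memory>

// Stand-in deleter (an assumption for this demo): counts calls, frees nothing.
struct CountingDeleter {
    int* calls;
    void operator()(int*) const { ++*calls; }
};

// Returns how many "unique" owners tried to release the same object.
int count_releases_of_one_object() {
    int calls = 0;
    int value = 42;
    {
        // Both owners are built from the same raw pointer; the compiler accepts it.
        std::unique_ptr<int, CountingDeleter> first(&value, CountingDeleter{&calls});
        std::unique_ptr<int, CountingDeleter> second(&value, CountingDeleter{&calls});
        assert(first.get() == second.get());
    }   // both owners go out of scope here and each runs its deleter
    return calls;
}
```

Each owner believes it has exclusive ownership, so the deleter runs once per owner.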

Originally posted by ultimA

This is why I was talking about an "opt-in" with C++. Your C++ code can be both easily and provably memory safe, but as a programmer, you need to be consistent about the use of the features I mentioned above. If you go without using the standard library or write code as it was taught 10 years ago, then yes you will likely introduce some memory-related bugs in any non-trivial program. In Rust I said safety-features are "opt-out", because you explicitly have to introduce unsafe contexts to your code to be able to do bad things, but once you are granted arbitrary pointer manipulation and access, you really can do almost anything (including stuff you shouldn't) like in C.

There is this belief among C++ programmers who don't understand Rust (but believe that they do) that if they only stick to the "modern" C++ then it's safe, but you've just seen a demonstration that it isn't. In fact trying to prove memory management in C++, even using exclusively the "modern" features, is tantamount to the Turing Machine halting problem and thus is not provable. Plus there is a LOT of C++ that doesn't even use that, for example the iostream classes still take void* pointers....

There was a study published by Google showing that about 50% of their memory-related bugs in C++ were less than 1 year old. That doesn't mean that their new code is somehow dramatically worse than old code, it means that as older bugs get fixed, new ones are unstoppably coming in.

Among the other reasons why C++ can't be as safe as Rust:

* Rust's type system is strong, provably sound and uses Hindley-Milner inference, C++'s is weak (albeit very slightly stronger than C's)
* Rust has well defined move semantics, move in C++ is basically undefined behaviour
* (consequence of the above) in Rust all objects have a clear lifetime, functions can be data sinks and it is possible to enforce that an object cannot ever be used after a certain operation. In C++ this does not exist.
* In Rust the compiler ensures that all possible values in match{} (switch in C++) are covered, in C++ it doesn't
* Error management in Rust is entirely under the scope of the type system and borrow checker, is itself memory safe and thread safe, and provably handles all cases. In C++ you sometimes have potentially null pointers (which is not possible in Rust), sometimes exceptions that involve very strange and often unpredictable pointer lifetime and aliasing errors and can actually themselves crash your code
* Rust is safe from data races, which is proven at compile time with no need for runtime support.

Originally posted by ultimA

I know that some threading checks in Rust are done at compile time, but to be able to do that requires information about various threads and entry points that in a kernel-context cannot always be known or available to the compiler at compile-time. So unless shown otherwise by a Rust implementation in the kernel, I will stay skeptical about how viable (and thus how useful) this is in kernel-programming. The other threading-related features, that are available in Rust's standard library, can of course be ported to the kernel, but you could do that for any other language too that has a threading library, so that's really not a unique "advantage" to Rust. Obviously, I was talking about the static checks, not about library features.

You clearly don't understand how threading safety works in Rust. You can create a threading library for any language, but unless that language has a type system like Rust's (or, in a different approach, Haskell's), which C++ doesn't, then you won't ever be able to prove that your code is sound. Then you are back to the same place as with the memory: either you insert runtime checks and incur a performance penalty (that's what Go and Swift do), or you take someone's word for it and wait for the program to crash, which it will.

**moltonel** · 18 June 2021, 06:11 AM

Originally posted by Alexmitter

I think this is a non issue, because the gcc front end will rather quickly become the standard way to compile rust.

Sorry, but this is delusional. People eager to improve the Rust language will not want to work on a compiler implemented in C++. The whole language design workflow is organized around Rustc's Github, and is a well-tuned system. Both Rustc and LLVM have more developers than Gcc as a whole (1, 2). The GNU community is not bad, but it's much less welcoming than the Rust community. It'll be a challenge for gccrs to stay relevant, and not repeat the failures of Gcc's Java, D, and Go frontends despite trying to tackle a more complex and fast-moving language.

If you want to compile Rust code with Gcc (many people do), you'll be better served by rustc_codegen_gcc.

**jacob** · 18 June 2021, 06:43 AM

Originally posted by moltonel

Why use a language that's twice as good when you can use a language that's 4 times as good? Any new tech has to provide a good ratio between the expected benefits and the cost of change. The latter is similarly huge for "the kernel's 2nd language" whether that's Rust, C++, or D.

Well you said it yourself, proving beats testing. But besides that, the D ecosystem is nowhere near as mature and well managed as Rust's despite being older, and the language never seems to know where it really wants to go (gc vs nogc vs nogc-but-we-still-actually-want-a-gc but we don't really want to have to use it). Everything is permanently unfinished, no clear decisions are made. As a result D never got any solid backing and it looks like a perpetual hobby project for Walter and Andrei. I'm not saying it couldn't be used for the Linux kernel technically speaking, but it's not something I would bet Linux's future on if I was a kernel developer.

**ultimA** · 18 June 2021, 06:44 AM

Originally posted by jacob

Almost no-one ever assigns a pointer to an arbitrary address in C++ or in Rust, unless they are doing very low level code.

Almost no-one ever assigns a pointer to an arbitrary address in C++ or in Rust, but that doesn't invalidate what I said about arbitrary pointer accesses being the problem. It doesn't matter how the invalid pointers come to life. Off-by-one indexing? Already freed memory? Invalid alignment? It doesn't have to be by direct assignment. The problem in the end is that pointers can have arbitrary values.

Originally posted by jacob

Code:

unique_ptr<int> a[100];
*a[3] = 10;

-> segmentation fault

Or how about this one:

Code:

unique_ptr<int> foo()
{
  int a;
  return unique_ptr<int>(&a);
}

...

There is this belief among C++ programmers who don't understand Rust (but believe that they do) that if they only stick to the "modern" C++ then it's safe, but you've just seen a demonstration that it isn't. In fact trying to prove memory management in C++, even using exclusively the "modern" features, is tantamount to the Turing Machine halting problem and thus is not provable. Plus there is a LOT of C++ that doesn't even use that, for example the iostream classes still take void* pointers....

The only thing you've demonstrated here is that if you want to purposefully compile bad code in C++, you can. Both are great examples of errors that, while theoretically accepted by the language, are irrelevant for any comparison or real-life examples, as any mainstream compiler (GCC, clang, MSVC, possibly others too) will warn you about these during compilation (or depending on compiler flags even error out). You can of course choose to neglect the compiler messages, but with that attitude as a programmer people are going to have a hard time compiling Rust code too.

Originally posted by jacob

There was a study published by Google showing that about 50% of their memory-related bugs in C++ were less than 1 year old. That doesn't mean that their new code is somehow dramatically worse than old code, it means that as older bugs get fixed, new ones are unstoppably coming in.

That neatly ignores the fact that bugs being "less than 1 year old" doesn't mean they were introduced in code adhering to the new style. The basic assumption when I claim C++ is memory safe is that you are making proper use of the modern standards. Most code today cannot do that, because the old code base that has to be maintained imposes restrictions. And even if a code base finally manages to migrate to a new toolset, the old code that already existed still needs to be rewritten.

Originally posted by jacob

Among the other reasons why C++ can't be as safe as Rust:

* Rust's type system is strong, provably sound and uses Hindley-Milner inference, C++'s is weak (albeit very slightly stronger than C's)
* Rust has well defined move semantics, move in C++ is basically undefined behaviour
* (consequence of the above) in Rust all objects have a clear lifetime, functions can be data sinks and it is possible to enforce that an object cannot ever be used after a certain operation. In C++ this does not exist.
* In Rust the compiler ensures that all possible values in match{} (switch in C++) are covered, in C++ it doesn't
* Error management in Rust is entirely under the scope of the type system and borrow checker, is itself memory safe and thread safe, and provably handles all cases. In C++ you sometimes have potentially null pointers (which is not possible in Rust), sometimes exceptions that involve very strange and often unpredictable pointer lifetime and aliasing errors and can actually themselves crash your code
* Rust is safe from data races, which is proven at compile time with no need for runtime support.

There are many things wrong with your list. Move semantics being UB is false, object lifetimes in C++ not being clear or well-defined is complete BS, the complete coverage of all options in a switch is checked by basically all compilers today, error management does not at all involve unpredictable pointers, and aliasing issues are again checked by compilers, and pretty often tricks with aliasing are even explicitly wanted by the programmer and could be well defined. As far as data races are concerned, preventing data races was never a hard problem. What makes them non-trivial is to avoid them efficiently in a contended scenario that scales well. Show me the implementation of a high-performance thread-safe lockless multi-writer multi-consumer queue in Rust. Or a cache implementation under similar circumstances. It won't be any easier to develop or understand than in C++. Those are the things that make multi-threading a complex topic, simply preventing data races never was.

But as I said earlier, I do agree Rust has much better "defaults" and is thus easier to teach and learn when you're starting out.

**moltonel** · 18 June 2021, 06:50 AM

Originally posted by jacob

As a result D never got any solid backing and it looks like a perpetual hobby project for Walter and Andrei. I'm not saying it couldn't be used for the Linux kernel technically speaking, but it's not something I would bet Linux's future on if I was a kernel developer.

Agreed. If it wasn't clear, in my mind the "4x as good" language is Rust, not D or C++.

**mvniekerk86** · 18 June 2021, 06:57 AM

Folks - if big corporations decide to support, fund and enable something as important globally (and to humans as a species) as getting Rust into the Linux kernel, at least give it space and time.
This is not a fad - if billions of dollars' worth of servers, mobile devices and PCs can be spared from exploited code paths by using Rust, at least allow the people who are doing the actual work to continue doing it.
Tooling takes time - as steward of the kernel, I do think Linus Torvalds has the experience and sobriety not to include rubbish in his kernel.

**oleid** · 18 June 2021, 07:21 AM

Originally posted by ultimA

There are many things wrong with your list.

* Move semantics being UB is false

I think they are referring to the state of the object which was moved. You can still access it and that is UB.

Originally posted by ultimA

* object lifetimes in C++ not being clear or well-defined is complete BS

What was originally meant is the following:

1. You create an object in a scope.
2. You move it into a function.
3. The function "forgets" the object.

In C++ you can move an object, but you can still access it in the original scope. Some compilers warn about that, others do not.
The usage for the aforementioned procedure is the TypeState pattern.
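
The three steps above can be sketched in C++ as follows. The names consume and moved_from_name_is_still_accessible are invented for illustration. unique_ptr is used because its moved-from state is specified (it holds nullptr); for many other types the moved-from state is only "valid but unspecified".

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical sink: takes ownership by value and "forgets" (destroys) the object.
void consume(std::unique_ptr<int> p) {
    assert(p != nullptr);
}   // p is destroyed here, releasing the int

// Returns true if the caller can still name the moved-from object afterwards.
bool moved_from_name_is_still_accessible() {
    auto token = std::make_unique<int>(7);   // 1. create an object in a scope
    consume(std::move(token));               // 2. move it into a function, 3. which forgets it
    // Still compiles: `token` remains in scope and now holds nullptr.
    // Writing *token here would also compile, and would dereference a null pointer.
    return token == nullptr;
}
```

In Rust the equivalent use of `token` after the move is rejected at compile time, which is what makes the sink usable for the TypeState pattern.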

Originally posted by ultimA

the complete coverage of all options in a switch is checked by basically all compilers today

True, sadly it is only a warning, not a hard compile error. While you can make warnings a hard compile error, this is usually not done in real life, as it will break everything once you use a newer compiler. Luckily my employer tries hard to have a warning free code base.
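
One way to get a hard error for a forgotten case without turning on -Werror, assuming C++17, is to model the alternatives as a std::variant and dispatch with std::visit: leaving out a handler for any alternative fails to compile. A rough sketch; State, its alternatives and describe are invented for illustration, and overloaded is the commonly used helper idiom, not a standard library type.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical alternatives, standing in for the cases of an enum.
struct Idle {};
struct Running { int pid; };
struct Stopped { int exit_code; };
using State = std::variant<Idle, Running, Stopped>;

// Common C++17 idiom: merge several lambdas into one overload set.
template <class... Ts> struct overloaded : Ts... { using Ts::operator()...; };
template <class... Ts> overloaded(Ts...) -> overloaded<Ts...>;

// Removing any one of the three lambdas makes this function fail to compile.
std::string describe(const State& s) {
    return std::visit(overloaded{
        [](const Idle&) { return std::string("idle"); },
        [](const Running& r) { return "running " + std::to_string(r.pid); },
        [](const Stopped& st) { return "stopped " + std::to_string(st.exit_code); }
    }, s);
}
```

This is opt-in and more verbose than a plain switch over an enum, which still only produces a warning when a case is missing.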

Originally posted by ultimA

and aliasing issues are again checked by compilers

Huh? How can a C++ compiler know when pointer addresses are generated at run time (i.e. via malloc under the hood)?

Originally posted by ultimA

As far as data races are concerned, preventing data races was never a hard problem. What makes them non-trivial is to avoid them efficiently in a contended scenario that scales well.

Really?

The problem is not implementing a queue; the problem is finding the places where this construct needs to be inserted, and keeping the code free of data races even after the n-th refactoring.

**ZeroPointEnergy** · 18 June 2021, 07:34 AM

Originally posted by jacob

You are looking at it from a typical C point of view, but realise that the only reason why C developers are so hostile to dependencies is because managing them in C projects is a nightmare. In Rust, cargo makes that seamless. Now if your project needs solving 50 sub-problems that are not relevant to your main task, it is better to reuse code and pull in dependencies that are well known, well documented and tested in practice, rather than re-implement them yourself. Not only it takes unnecessarily more effort to create your project that way and makes your code more bloated, but it's almost always done badly in a quick & dirty way.

No, I'm looking at it from a user and maintainer perspective. It may all be convenient to the programmer of the software, but for a user or package maintainer it is a huge nightmare of dependencies that can't be separated in a clean way. Hell, sometimes cargo even pulls multiple versions of a crate into the dependencies. But this isn't a problem unique to Rust. A lot of modern applications just go the way of bundling everything together. I really just don't like that and try to stay away from such software as much as possible.

As for the programming side of things. I would love something like Rust, but with actual dynamic libraries like C has. I don't care if you think that is outdated. It's in my opinion an incredibly more robust approach to develop software that lasts and gets maintained for a long time. There is a reason that like 90% of the base system of a given Linux is still C.

**oleid** · 18 June 2021, 07:41 AM

Originally posted by ZeroPointEnergy

I would love something like Rust, but with actual dynamic libraries like C has.

Well, nobody stops you from using dylib crates. Then all your dependencies will be linked dynamically.

EDIT: This is what the systemd developers are discussing, should they adopt Rust in their code base.

**ssokolow** · 18 June 2021, 08:28 AM

Originally posted by ultimA

but proper and consistent use of std::string_view

As Modern C++ Won't Save Us by Alex Gaynor demonstrates, that and other things are easier said than done under C++'s opt-in, retrofitted safety model.

Originally posted by ultimA

If you stick to these in C++, you will notice there is nothing that CAN go wrong, because proper and valid memory access is guaranteed by the interfaces.

It also points out things like how you can dereference a std::optional<T> that's empty and the compiler will let you get away with it.
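
A minimal sketch of the difference, assuming C++17 (the function name is made up for illustration): the unchecked `*` access compiles but is undefined behaviour on an empty optional, so it is left commented out, while the checked `.value()` accessor throws std::bad_optional_access.

```cpp
#include <cassert>
#include <optional>

// Returns true if the checked accessor reports the empty optional.
bool checked_access_throws_on_empty() {
    std::optional<int> empty;
    assert(!empty.has_value());
    // int bad = *empty;   // accepted by the compiler, undefined behaviour at run time
    try {
        (void)empty.value();   // checked access: throws when no value is present
    } catch (const std::bad_optional_access&) {
        return true;
    }
    return false;
}
```

The safe form exists, but it is the longer spelling and nothing forces callers to choose it.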

Originally posted by ultimA

In Rust I said safety-features are "opt-out", because you explicitly have to introduce unsafe contexts to your code to be able to do bad things, but once you are granted arbitrary pointer manipulation and access, you really can do almost anything (including stuff you shouldn't) like in C.

What makes Rust different is that the whole API ecosystem is built around a philosophy that unsafe is to be used to construct abstractions like Arc<T> which enforce safety at compile time through their API designs... thus ensuring that auditing for memory safety need only occur locally, not globally.

Originally posted by moltonel

Google and Microsoft (who can be assumed to have best-in-class programmers, practices, and tooling) stats have shown that this kind of "good programmer" is a myth (and I say that as a good programmer).

Not just Google and Microsoft. Alex Gaynor also did a blog post where he tallied up citations for half a dozen different sources that support that point.

What science can tell us about C and C++'s security

Originally posted by ultimA

Move semantics being UB is false

The argument is that it's too easy to accidentally invoke undefined behaviour with C++'s non-destructive moves.

See Move semantics in C++ and Rust: The case for destructive moves by Radek Vit.

Originally posted by ZeroPointEnergy

No, I'm looking at it from a user and maintainer perspective. It may all be convenient to the programmer of the software, but for a user or package maintainer it is a huge nightmare of dependencies that can't be separated in a clean way.

If you think C and C++ are any better, you haven't looked into it. See the "Gotta go deeper" section of Let's Be Real About Dependencies, which touches on the elephant in the room:

Header-only libraries and hand-rolled re-implementations that it's not feasible for package maintainers to split back out again.

Originally posted by ZeroPointEnergy

Hell, sometimes cargo even pulls multiple versions of a crate into the dependencies.

You'd rather the status quo, of having people up and down the stack vendoring patched versions or holding back updates to deal with symbol collisions while they wait for their dependencies to get their transitive dependencies back in sync?

Originally posted by ZeroPointEnergy

As for the programming side of things. I would love something like Rust, but with actual dynamic libraries like C has. I don't care if you think that is outdated. It's in my opinion an incredibly more robust approach to develop software that lasts and gets maintained for a long time. There is a reason that like 90% of the base system of a given Linux is still C.

So would a lot of people. The problem is, it's a Hard Problem and an area of active study. Monomorphization (i.e. templates/generics) doesn't play well with dynamic linking, and it's been argued that the C ABI is the highest stable one we had before Swift came along because it's right at the edge of what we know how to make stable without the abstraction tricks Swift uses.

See The impact of C++ templates on library ABI by Michał Górny from Gentoo.

But this isn't a problem unique to Rust. A lot of modern applications just go the way of bundling everything together.

...because they have real problems that need to be solved and, instead of trying to help solve them, distro packaging systems are trying to convince them that their priorities are wrong.

Distro packaging was designed around the needs of C and C++. That's visible in how it doesn't attempt to address things like bespoke implementations of functionality that gets reinvented differently by everyone that needs it for want of a more modern dependency system.

Also, Rust is suitable for current approaches to library sharing... using the same techniques as C++.

Manually declare C ABIs to be built as shared libraries where it makes sense and, for the stuff where it doesn't make sense to do that (eg. monomorphized generics that no other packages need), link it statically and rely on the tooling's ability to query which dependencies are being used to determine when you need to rebuild what.

There's work in progress to design a specification and tooling for embedding that dependency information in the built binaries and, as for shared libraries for other purposes, Drew Devault did a blog post showing some actual statistics on how claimed benefits work out in actual practice... and his blog post is talking about C and C++.

Google Wants To See Rust Code In The Linux Kernel, Contracts The Main Developer
