Rust Developers Move Ahead With Preparing To Upstream More Code Into The Linux Kernel


  • #31
    Originally posted by cj.wijtmans
    No, I stand by what I said. The C++ standard is only syntax, the STL, and some minor things such as main(), RTTI, and globals. All of the underlying work is up to the tooling. C++ CAN have memory safety; it's just never been worked on by the tooling implementations. It even took Google to make ASAN. C++ is also getting modules, and they're taking a long time to implement. I'm not holding my breath for memory safety to be standardized, though, because it's completely out of the scope of the C++ syntax. Unlike Rust, which has centralized tooling.
    Don't hold your breath. C & C++ just can't be "fixed" for memory safety. Modern C++ and tooling advances are really nice but many companies including Google say Rust is the way to go for writing future code that is safer.



    • #32
      Originally posted by ryao

      You would need a static analyzer that can catch all runtime errors (and then manually fix all of them) before execution to get similar memory safety. This one claims to be able to do that:

      https://www.absint.com/astree/index.htm

      The pricing is not public, so it is likely exorbitantly expensive, which is why most C++ developers have likely never heard of it.
      There are a couple of them, but the issue is that C and C++ don't provide enough information for the compiler or static analyzer to know what the code is supposed to do. Most of them will have a set of macros, but they're not standardized, they're pretty ugly, and you need to make sure to use them.



      • #33
        Originally posted by cj.wijtmans
        Modern C++ compilers have ASAN. I don't know if that would make it "memory safe", but it's pretty much similar to Rust in that it can be turned off and on. Again, the issue is not with C++; it's with the implementations and tooling. Rust's syntax makes me cringe, but its centralized tooling and lack of standards are scary as well. You can see why big corporations are pushing it.
        IMHO the "lack of standards" and "centralised tooling" criticisms miss the main point: standards matter when and only when the implementations are proprietary. Rust, including its compiler, library and all the tools, is open source. It can't lead to a lock-in situation and it can't leave you stranded if a proprietary compiler supplier disappears or decides to end support.

        In fact, in a FOSS world a standards-driven programming language can be a hindrance, not a benefit. Take C++: virtually every compiler in existence implements standards incompletely, but at the same time adds its own extensions. In other instances, two compilers will behave differently even though both are technically compliant with the standard. The fact is that every nontrivial project needs to be developed with a specific compiler in mind. Case in point: Linux itself. It took clang a while before it was able to build the kernel (and it took considerable effort both on clang's part and on the kernel's part), and no compiler other than gcc and clang can reliably build it. It's much easier to develop software when you know that it will always use the One True Compiler rather than deal with the minefield of various partially compatible but subtly different implementations.



        • #34
          Originally posted by baryluk
          Which architectures will be supported? If we see, for example, some PCIe device drivers or file systems in Rust in the future, I hope some niche architectures are not left without support. I know there is a GCC Rust compiler in pretty good shape that will officially be part of GCC 13 (so probably May 2023), and it supports quite a lot of platforms, but I'm not sure what the kernel will use by default. I had various issues with the standard rustc compiler not playing nicely with some tools.
          That Rust GCC front-end is completely irrelevant here, and only exists to appease people who insist on having multiple implementations of a language, with the results described in the post above mine.

          The project you want to follow is rustc_codegen_gcc. It uses the rustc front-end (and standard library, of course), with GCC as a backend, for code generation. This means that rustc can emit code for almost any target supported by GCC, without forking the language.



          • #35
            Originally posted by cj.wijtmans
            No, I stand by what I said. The C++ standard is only syntax, the STL, and some minor things such as main(), RTTI, and globals. All of the underlying work is up to the tooling. C++ CAN have memory safety; it's just never been worked on by the tooling implementations. It even took Google to make ASAN. C++ is also getting modules, and they're taking a long time to implement. I'm not holding my breath for memory safety to be standardized, though, because it's completely out of the scope of the C++ syntax. Unlike Rust, which has centralized tooling.
            The language needs to work together with the tooling to make it efficient. You can put a lot of things into tooling for C++, but they won't be required by the language itself. If they aren't required, a lot of projects simply won't use them. As you said, "C++ CAN have memory safety", but it's only "can". It isn't required, and it never will be, because of backwards compatibility. In Rust it's not "can". It's simply there and you can't just not use it. It's a core part of the language, and the language doesn't exist without it, unlike C++. That's why, even though C++ can have many things, it won't be better than a language that is actually built around these things.



            • #36
              Originally posted by dragon321

              The language needs to work together with the tooling to make it efficient. You can put a lot of things into tooling for C++, but they won't be required by the language itself. If they aren't required, a lot of projects simply won't use them. As you said, "C++ CAN have memory safety", but it's only "can". It isn't required, and it never will be, because of backwards compatibility. In Rust it's not "can". It's simply there and you can't just not use it. It's a core part of the language, and the language doesn't exist without it, unlike C++. That's why, even though C++ can have many things, it won't be better than a language that is actually built around these things.
              C++ is efficient so what are you talking about?



              • #37
                Originally posted by jacob

                IMHO the "lack of standards" and "centralised tooling" criticisms miss the main point: standards matter when and only when the implementations are proprietary. Rust, including its compiler, library and all the tools, is open source. It can't lead to a lock-in situation and it can't leave you stranded if a proprietary compiler supplier disappears or decides to end support.

                In fact, in a FOSS world a standards-driven programming language can be a hindrance, not a benefit. Take C++: virtually every compiler in existence implements standards incompletely, but at the same time adds its own extensions. In other instances, two compilers will behave differently even though both are technically compliant with the standard. The fact is that every nontrivial project needs to be developed with a specific compiler in mind. Case in point: Linux itself. It took clang a while before it was able to build the kernel (and it took considerable effort both on clang's part and on the kernel's part), and no compiler other than gcc and clang can reliably build it. It's much easier to develop software when you know that it will always use the One True Compiler rather than deal with the minefield of various partially compatible but subtly different implementations.
                You can turn on strict C++ so no extensions are used. Compilers behaving differently is a good thing as long as they follow the C++ standards. What is the issue?



                • #38
                  Originally posted by GrayShade

                  There are a couple of them, but the issue is that C and C++ don't provide enough information for the compiler or static analyzer to know what the code is supposed to do. Most of them will have a set of macros, but they're not standardized, they're pretty ugly, and you need to make sure to use them.
                  Again, not a C++ issue. A tooling issue. While debugging there is plenty of information; sometimes even the code itself is inside the debug information. The issue is that it's third party and not provided by the toolchain itself. The information not being there in a release build is a good thing. There's memory safety, and then there are other security issues with too much information in a binary.



                  • #39
                    Originally posted by cj.wijtmans

                    You can turn on strict C++ so no extensions are used. Compilers behaving differently is a good thing as long as they follow the C++ standards. What is the issue?
                    The issue is that "following the C++ standards" is irrelevant; what is relevant is to produce correct working machine code. If two compilers behave differently, you must assume one of those two behaviours, and thus having another incompatible compiler that behaves differently becomes a liability and a source of problems. That it is also "standards compliant" is a meager consolation if the plain fact is that it creates a bug in your program.

                    Or you resign yourself to coding to the lowest common denominator and/or filling your code with #ifdefs. Development requires considerably more effort, maintainability suffers, and for what? The users of the software couldn't care less that it was written in "standards compliant C++"; all they will see is software where bugs take longer to get fixed and new features take much longer to get implemented.

                    In short, trying to develop software without assuming which compiler it will be built with and with which options is a major PITA. It was a necessary evil (which means it was first and foremost an evil) in the era of proprietary, expensive and closed source compilers and programming languages. In a world where languages, compilers and tools are both free speech and free beer, it's a solution looking for a problem.



                    • #40
                      Originally posted by cj.wijtmans

                      Again, not a C++ issue. A tooling issue. While debugging there is plenty of information; sometimes even the code itself is inside the debug information. The issue is that it's third party and not provided by the toolchain itself. The information not being there in a release build is a good thing. There's memory safety, and then there are other security issues with too much information in a binary.
                      Sorry, but I don't think you understand the divide between what a C program is allowed to do and the guarantees the Rust compiler gives you. C doesn't even have the language to describe those guarantees. It's a Whorfian divide.

                      Take some basic C function:

                      Code:
                      char * frob(char *, char *, int, int);
                      If you're lucky, you might get some parameter names. If you've got a bit of programming experience, you might have some ideas about what this function might do. It looks like it takes two pointers and two lengths, so they look like buffers, however:

                      - there's nothing in the code to actually guarantee that
                      - you might not even have the code (it's probably from some ancient library)
                      - you don't know if either of the pointers can be NULL
                      - you don't know if these are bytes or characters, and if embedded 0s will have a special meaning or not
                      - you don't know, in principle, which buffer has which length
                      - you don't even know if they are actually buffers, as the code might cast the pointers to something else
                      - lengths are normally size_t, who knows why these are int?
                      - there's no const there. This might be intentional, or just old code, or a programmer who doesn't trust in language restrictions, but you don't know what the function reads and where it writes
                      - there's nothing to say what the return value is -- part of the first buffer, part of the second one, either, something else?
                      - you don't know what the function does. Is it a strcat, a strcpy, a string searching function, some mathematical (vector) op?
                      - you don't know if the function stashes away one of the input pointers, or if it pulls the return value out of a global or a static
                      - finally, regardless of what you answered to these questions, the answers might change when you modify the function, since they're not reflected in the signature

                      Once again, C and C++ don't have the vocabulary to answer these. The compiler doesn't know the answers. The debug info doesn't have the answers. Even if you're looking at the source code, it's not trivial to answer them.

                      So here you start adding annotations for a static analyzer. These look like:

                      Code:
                      _Outptr_result_buffer_(np) char * frob(_Inout_updates_(np) char *, _In_reads_(nq) char *, int np, int nq);


                      This is just one of the possible interpretations above. The annotation syntax is SAL, available in the Microsoft compiler (any mistakes mine). No mechanical process will be able to add these automatically for non-trivial code.

                      But let's assume you can spend a year annotating all of your code. What will you do with the libraries you're using? Do you expect your vendors to annotate theirs? What will you do with the OS APIs? Should every project redo this work? What happens if you switch compilers or static analyzers? And remember that code without annotations (or with the wrong annotations) is still correct C and needs to compile.

                      So how does Rust help here?

                      Code:
                      fn frob<'a, 'b>(p: &'a mut [u8], q: &'b [u8]) -> &'a mut [u8]
                      This is different in some important ways:

                      - Rust buffers (slices) aren't disembodied pointers and lengths, they exist at a language level. Any slice will know its length.
                      - the function declaration clearly says it's modifying the first buffer. If you forget to say mut, the code doesn't compile. If you say mut, you've reserved the right to change the buffer in the future, even if you don't do it today.
                      - the return value is a slice itself, and it borrows from p: it can't outlive the buffer behind p, and that buffer can't be accessed through anything else while the returned slice is still in use
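To make the list above a bit more concrete, here is a minimal sketch of one possible body for frob. The behaviour (copy as many bytes of q as fit into p, then return the part of p that was written) is purely an assumption for illustration; the signature doesn't dictate it, it only constrains what any body is allowed to do.

```rust
/// Hypothetical body for `frob`, assumed for illustration only:
/// copies as many bytes of `q` as fit into `p` and returns the
/// prefix of `p` that was overwritten.
fn frob<'a, 'b>(p: &'a mut [u8], q: &'b [u8]) -> &'a mut [u8] {
    // Slices carry their own length, so no separate `int` parameters.
    let n = p.len().min(q.len());

    // Writing through `p` is allowed only because it is `&mut`.
    // The same line with `q` on the left would not compile.
    p[..n].copy_from_slice(&q[..n]);

    // The result must borrow from `p` (lifetime 'a).
    // Returning `&mut q[..n]` or a reference to a local would be rejected.
    &mut p[..n]
}
```

A caller would pass something like `frob(&mut buf, &src)`. While the returned slice is alive, the compiler refuses any other access to `buf`, which is the part no C annotation scheme I know of can express.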

                      C and C++, and every static analyzer I've seen, don't even have the terms to describe a temporal relationship between pointers or references. You can't say "this value must, in every invocation in the program, outlive this other value". You also can't say "this value has thread affinity", which is a story for another day. In Rust you have to describe these complex relationships.
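As a rough sketch of what such a temporal relationship looks like in practice, the following uses two hypothetical helpers, before and outlives_demo (neither comes from any library). The signature of before says the result borrows from the first argument only, so the second argument is free to go away earlier.

```rust
/// Hypothetical helper: returns the part of `haystack` that precedes
/// the first occurrence of `needle`, or all of `haystack` if there is
/// no match. The result is tied to 'a only, never to 'b.
fn before<'a, 'b>(haystack: &'a str, needle: &'b str) -> &'a str {
    match haystack.find(needle) {
        Some(i) => &haystack[..i],
        None => haystack,
    }
}

/// Hypothetical demo: the needle is dropped before the result is used.
fn outlives_demo() -> String {
    let haystack = String::from("key=value");
    let prefix;
    {
        let needle = String::from("=");
        prefix = before(&haystack, &needle);
    } // `needle` is dropped here; accepted, because `prefix` borrows
      // from `haystack` only.

    // Dropping `haystack` before this line would be a compile error,
    // since `prefix` still points into it.
    prefix.to_string()
}
```

If the return type were changed to &'b str, the same caller would stop compiling, so the relationship is checked at every call site rather than documented in a comment.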

                      Does Rust still need unsafe code? Yes, but it's going to be limited in scope if you write idiomatic Rust. Can it express every kind of complex relationship? No. Other languages will be more expressive. Is it the language to end all languages? Probably not.
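As an illustration of "limited in scope", here is a small sketch of the usual pattern: a few lines of unsafe wrapped in a function whose signature is safe and whose checks uphold the invariants. pair_mut is a hypothetical helper made up for this example, not a standard library function.

```rust
/// Hypothetical helper: returns mutable references to two distinct
/// elements of `buf`, or `None` if the indices are equal or out of
/// bounds. Callers never need to write `unsafe` themselves.
fn pair_mut(buf: &mut [u8], i: usize, j: usize) -> Option<(&mut u8, &mut u8)> {
    if i == j || i >= buf.len() || j >= buf.len() {
        return None;
    }
    let ptr = buf.as_mut_ptr();
    // SAFETY: both indices were checked to be in bounds, and i != j,
    // so the two mutable references never alias each other.
    unsafe { Some((&mut *ptr.add(i), &mut *ptr.add(j))) }
}
```

The borrow checker can't prove on its own that two indices are distinct, which is why the unsafe block is needed, but the part a reviewer has to audit is four lines long and everything that calls it stays in safe code.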

                      But it solves problems which exist today. Mozilla, Microsoft, Google and Apple all say that 70% or more of their severe bugs are caused by memory- and thread-safety issues, which would not exist in Rust. And that proportion hasn't gone down in the last decade, despite the various security pushes in each of these companies. This isn't because they don't have good tooling. Microsoft created SAL and made static analysis mandatory in some code bases, IIRC. The reason is that nobody can write a non-trivial amount of correct C or C++ code, and the "sufficiently smart" compiler you believe in will never exist.
                      Last edited by GrayShade; 16 November 2022, 12:30 PM.

