Cloudflare Ditches Nginx For In-House, Rust-Written Pingora


  • #91
    Originally posted by darkonix View Post
    That would be the point. Rust compiler verifies that it is valid to use some value if it was assigned to something else previously. For C++ it would require extra work for a partial mitigation since the language doesn't enforce enough to be sure the compiler can catch all the permutations in code. 'Modern' C++ improves on the situation but it is not perfect and developers may just overlook they made a mistake or forgot to use only allowed features. Programmers are humans unfortunately.
    Well in this particular case, all the compiler needs is information that a move ctor/assign op invalidates the source object, at least for a warning, even if it still compiles without error. A compiler attribute can do the trick, or it could be made the default or #pragma for any user-defined move ctor/assignment op, with possible attribute to override this and ignore.

    Originally posted by darkonix View Post
    Executing memory free may be part of the library but Rust does a very good job of helping preventing use-after-free at compile time. It is much more difficult to do that in languages like C or C++ that are not designed to prevent memory issues. One could say that they seem created to encourage them ;-)
    Ok, I guess I wasn't clear enough. I have compiler experience so to me the "language" is just part of the compiler.

    How does Rust know that the function you execute, which is the "free" function, actually frees the memory? Most functions are black boxes, unless the compiler treats them specially as builtins or with attributes. Since "free" is a builtin Rust function, it knows that it frees memory allocated by the respective allocator, and so it can give errors on a use-after-free.

    But it doesn't know this for arbitrary memory allocation functions, such as from an external library. HeapFree for example. Or some_new_library_free you just wrote. How does it deal with those? It can't. The behavior of the "free" is hardcoded into the compiler.

    Note that a lot of builtins for C/C++ are also hardcoded and the compiler emits warnings on misuse! It only does this because it knows the behavior of those functions, they're baked into the compiler proper. It wouldn't know this for external, "black box" APIs.

    The simple fix here is easy, and it already exists. Add a compiler attribute to mark such functions. We have __attribute__((__malloc__(...))) for this.

    This is obviously not a Rust-specific solution, it works equally everywhere. The only difference is that on C/C++ it might give a warning instead of a flat out error. Not an issue in practice, your code should be clear of warnings to begin with, and if a warning is a false positive, just mask it out selectively where necessary.

    Btw, you can now use memory allocations in constexpr code in C++, i.e. 100% at compile time, so you can clearly see that this is not a C++ limitation. The compiler has to special-case the allocators since it has to do the entire calculation and evaluation at compile time, including memory allocation and free.

    And yes, it's an actual error in constexpr to do use-after-free or to not even free the allocated memory (a memory leak is a flat out error). This goes to show you that, yes, you can detect this stuff at compile time in C++ as well. All it needs is for the compiler to know which functions allocate and which deallocate.

    Originally posted by darkonix View Post
    I believe that the usual approach is to encapsulate those system calls into unsafe wrappers so the rest of the application is isolated from issues there. So in that regard I agree with you, but with the caveat that unsafe sections of code need to be annotated explicitly with unsafe, unlike C/C++ where anything can be unsafe by default. I would give some credit to Rust because of that.
    See above. The issue isn't so much the unsafe block itself, but rather that the Rust compiler will have no clue what the function does. It doesn't know it frees memory. It's not like it can read the documentation, parse the English and understand what it does. It needs to be informed of this, via some attribute. Which can be done in C/C++ as well, so it's not Rust specific. And it already exists.

    Originally posted by darkonix View Post
    That's actually the main 'selling' point of Rust; being safe by default. C and C++ aren't safe by default. You can potentially use only 'modern' C++ features to achieve a higher level of safeness but you can not have the same guarantees that Rust is designed to achieve. Other nicer features of Rust are more related to being a much modern language, however they would not be a reason strong enough to encourage a migration at this time.
    C/C++ are not really unsafe by default, you can just do unsafe stuff without an "unsafe" block. But, for example with C++ casts, they're ugly by design, and stand out and are easy to grep for, just like "unsafe" blocks in Rust. C casts aren't, unfortunately, and a lot of people still use them where not needed.

    I like C don't get me wrong, but the casts are definitely one of the most dangerous points about it, mostly because of the syntax not being very obvious when reading how dangerous it really can be. I'm not saying unsafe code is bad, don't get me wrong, I love hacks that squeeze out performance in critical areas, but at least make them easy to spot when you debug. And C casts don't have that.

    Anyway you can write "safe by default" C++ code in most cases, except for the "move" ctor/assignment op which, imo, should be fixed with an attribute. I still don't understand why GCC or Clang didn't add one yet, while they have one for malloc.



    • #92
      Originally posted by Weasel View Post
      How does Rust know that the function you execute, which is the "free" function, actually frees the memory? Most functions are black boxes, unless the compiler treats them specially as builtins or with attributes.
      Rust does not need any special builtin or compiler attributes to know that.
      It uses borrow checker for that.

      When implementing a new resource type such as your customized RAII unique_ptr, you do something like this:

      Code:
      /// It is actually better to use std::ptr::NonNull, but for simplicity I avoid it here
      struct UniquePtr<T>(*mut T);
      
      /// Similar to C++'s operator *
      impl<T> std::ops::Deref for UniquePtr<T> {
          type Target = T;
      
          /// Here, the borrow checker kicks in.
          /// Rust has a lifetime elision rule: when a method takes `&self` and returns a reference,
          /// the returned reference is inferred to have the same lifetime (scope) as `self`.
          ///
          /// So the returned &T is considered to have the same lifetime as self.
          /// If self is dropped, then any &T obtained from it is considered to be invalid
          fn deref(&self) -> &Self::Target {
              unsafe { &*self.0 }
          }
      }
      
      /// Same as Deref, but for mutable reference (alias)
      impl<T> std::ops::DerefMut for UniquePtr<T> {
          /// Same rule here
          fn deref_mut(&mut self) -> &mut Self::Target {
              unsafe { &mut *self.0 }
          }
      }
      See, Rust does not need any special compiler built-in or attribute like you describe to detect use-after-free and reject it.
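To make the point above concrete, here is a minimal sketch (not the poster's code) that completes the `UniquePtr<T>` idea with a constructor and a `Drop` impl. It assumes the pointer is backed by `Box`, and the helper name `read_then_free` is made up for illustration. The commented-out line shows the kind of use the borrow checker refuses.

```rust
/// Toy owning pointer, modelled on the `UniquePtr<T>` above.
/// Assumption: the allocation comes from `Box`, so std pairs alloc and free.
struct UniquePtr<T>(*mut T);

impl<T> UniquePtr<T> {
    fn new(value: T) -> Self {
        // Box::into_raw hands ownership of the heap allocation to this type.
        UniquePtr(Box::into_raw(Box::new(value)))
    }
}

impl<T> std::ops::Deref for UniquePtr<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // Sound here because `self.0` came from Box::into_raw
        // and is released only in `drop`.
        unsafe { &*self.0 }
    }
}

impl<T> Drop for UniquePtr<T> {
    fn drop(&mut self) {
        // Rebuild the Box so the value is dropped and the memory freed once.
        unsafe { drop(Box::from_raw(self.0)) }
    }
}

/// Hypothetical helper: reads through the pointer, then frees it.
fn read_then_free() -> i32 {
    let p = UniquePtr::new(41);
    let r: &i32 = &p; // deref coercion: `r` borrows from `p`
    let value = *r + 1;
    drop(p); // accepted: `r` is not used after this point
    // println!("{}", r); // rejected: error[E0505], `p` was moved while still borrowed
    value
}
```

Nothing in this sketch tells the compiler that `drop` releases memory. The rejection comes purely from `r` borrowing `p` and `drop(p)` taking `p` by value.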



      • #93
        Originally posted by NobodyXu View Post

        Rust does not need any special builtin or compiler attributes to know that.
        It uses borrow checker for that.

        When implementing a new resource type such as your customized RAII unique_ptr, you do something like this:

        Code:
        /// It is actually better to use std::ptr::NonNull, but for simplicity I avoid it here
        struct UniquePtr<T>(*mut T);

        /// Similar to C++'s operator *
        impl<T> std::ops::Deref for UniquePtr<T> {
            type Target = T;

            /// Here, the borrow checker kicks in.
            /// Rust has a lifetime elision rule: when a method takes `&self` and returns a reference,
            /// the returned reference is inferred to have the same lifetime (scope) as `self`.
            ///
            /// So the returned &T is considered to have the same lifetime as self.
            /// If self is dropped, then any &T obtained from it is considered to be invalid
            fn deref(&self) -> &Self::Target {
                unsafe { &*self.0 }
            }
        }

        /// Same as Deref, but for mutable reference (alias)
        impl<T> std::ops::DerefMut for UniquePtr<T> {
            /// Same rule here
            fn deref_mut(&mut self) -> &mut Self::Target {
                unsafe { &mut *self.0 }
            }
        }
        See, Rust does not need any special compiler built-in or attribute like you describe to detect use-after-free and reject it.
        That's not the freeing of memory, though, it's just the "move" thing of Rust that invalidates the source (still a good thing, of course, but not what I was talking about). There's no guarantee here that you actually free the memory correctly (or with the correct function, even!).

        Imagine e.g. allocating with HeapAlloc and freeing with VirtualFree, Rust wouldn't complain.



        • #94
          Originally posted by Weasel View Post
          That's not the freeing of memory, though, it's just the "move" thing of Rust that invalidates the source (still a good thing, of course, but not what I was talking about). There's no guarantee here that you actually free the memory correctly (or with the correct function, even!).

          Imagine e.g. allocating with HeapAlloc and freeing with VirtualFree, Rust wouldn't complain.
          That is indeed an issue Rust cannot guarantee to catch, but I think the "move" part or the drop (destroy) part is more likely to go wrong than allocating and freeing with mismatched mechanisms.

          The allocation and deallocation are typically done by the associated functions of the new type, written once, checked by devs and peers, and tested multiple times before being deployed.
          These types of code are unsafe, meaning extra caution is required.

          Whereas the error I mentioned is much more common, and it is rejected by the compiler unless you deliberately work around it by using pointers and unsafe code to dereference them, but that can easily be found during review.
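As a rough illustration of "allocation and deallocation live in the associated functions of one type", here is a sketch. The type `RawBuf` and the helper `fill_and_sum` are hypothetical names, and `std::alloc::alloc`/`dealloc` stand in for the `HeapAlloc`/`VirtualFree` pair mentioned earlier so the code is not Windows-specific. The compiler does not verify that the two calls match; the pairing is simply written once, next to each other, inside `unsafe`.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};

/// Hypothetical buffer type: owns `layout.size()` bytes from the global allocator.
struct RawBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl RawBuf {
    fn new(len: usize) -> Self {
        assert!(len > 0, "zero-sized allocations are not handled in this sketch");
        let layout = Layout::array::<u8>(len).expect("length overflows a Layout");
        // Unsafe: `alloc` requires a layout with non-zero size (checked above).
        let ptr = unsafe { alloc(layout) };
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        // Zero the bytes so that later reads see initialized memory.
        unsafe { ptr.write_bytes(0, len) };
        RawBuf { ptr, layout }
    }

    fn as_mut_slice(&mut self) -> &mut [u8] {
        // Sound because `ptr` points to `layout.size()` initialized bytes we own.
        unsafe { std::slice::from_raw_parts_mut(self.ptr, self.layout.size()) }
    }
}

impl Drop for RawBuf {
    fn drop(&mut self) {
        // The only place the memory is released, using the same allocator
        // and the same layout as `new`. A mismatch here would be a bug that
        // review and tests have to catch, not the compiler.
        unsafe { dealloc(self.ptr, self.layout) }
    }
}

/// Hypothetical safe caller: never sees a raw pointer or a free call.
fn fill_and_sum(len: usize) -> u32 {
    let mut buf = RawBuf::new(len);
    for (i, byte) in buf.as_mut_slice().iter_mut().enumerate() {
        *byte = i as u8;
    }
    buf.as_mut_slice().iter().map(|&b| u32::from(b)).sum()
}
```

Code that uses `RawBuf` stays entirely in safe Rust, so the surface that needs careful review is limited to the three `unsafe` blocks.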



          • #95
            Originally posted by Weasel View Post
            That's not the freeing of memory, though, it's just the "move" thing of Rust that invalidates the source (still a good thing, of course, but not what I was talking about). There's no guarantee here that you actually free the memory correctly (or with the correct function, even!).

            Imagine e.g. allocating with HeapAlloc and freeing with VirtualFree, Rust wouldn't complain.
            Freeing or allocating memory is not done by an explicit function call. Rust does it on its own (e.g. if an object goes out of scope, it gets freed; if an object loses its owner, it also gets freed, etc.). It is done purely statically, so Rust always knows the exact moment when something gets freed. This enforces a different style of programming, but one in which you don't have such issues.
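A small sketch of the scope-based freeing described above. The type `Noisy` and the functions `consume` and `drop_order` are invented for illustration; the type records the moment it is dropped so the order can be inspected.

```rust
use std::cell::RefCell;

/// Hypothetical type that writes to a shared log when it is dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Takes ownership; the value is dropped when this function returns.
fn consume(_n: Noisy<'_>) {}

/// Returns the sequence of events in the order they happened.
fn drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _a = Noisy { name: "a", log: &log };
        {
            let _b = Noisy { name: "b", log: &log };
            log.borrow_mut().push("inner scope ends".to_string());
        } // `_b` goes out of scope and is dropped here
        let c = Noisy { name: "c", log: &log };
        consume(c); // ownership moves into `consume`, which drops it
        log.borrow_mut().push("outer scope ends".to_string());
    } // `_a` goes out of scope and is dropped here
    log.into_inner()
}
```

There is no explicit free call anywhere in `drop_order`; each drop point is fixed at compile time by scope and ownership.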

            If you call alloc and then free from an OS API, technically you are correct that Rust wouldn't complain, but there is another issue... without unsafe, Rust also wouldn't allow you to use that memory... which means that Rust, as far as the safe part of the language goes, remains safe.
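A minimal sketch of that last point, with `Box::into_raw` standing in for an OS allocation call (an assumption made so the example runs anywhere; `raw_pointer_demo` is a made-up name). Obtaining and passing around a raw pointer is safe code, but reading through it is not.

```rust
/// Hypothetical demo: a raw pointer can be created in safe code,
/// but it can only be dereferenced inside `unsafe`.
fn raw_pointer_demo() -> i32 {
    // Stand-in for memory obtained from an external allocator.
    let raw: *mut i32 = Box::into_raw(Box::new(7));

    // let v = *raw; // rejected: error[E0133], dereference of raw pointer requires unsafe
    let v = unsafe { *raw };

    // Give the allocation back to Box so it is freed exactly once.
    unsafe { drop(Box::from_raw(raw)) };
    v
}
```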

            Anyway, a lot of your questions would be answered if you read a few chapters of The Rust Programming Language book (the official book for learning Rust), especially chapter 4 (Understanding Ownership). It won't take you long (probably less than 30 mins). In general, Ownership, Mutability and Borrowing are concepts that work together to make sure that memory issues do not exist. Generally speaking, you have 3 types of memory management:
            - explicit, like C/C++, where every memory allocation/free is done by the programmer quite explicitly (even if it is hidden by some abstraction),
            - dynamic, for languages with GC, where memory allocation/free is done by the virtual machine/garbage collector and is totally out of sight of the programmer,
            - (Rust) by the concept of ownership/scope, where memory allocations/frees are known at compile time but are not done by the programmer

            You cannot claim Rust is memory unsafe if you approach it the way you would C/C++, because that memory will be unusable.

            Sidenote: Rust (as far as the safe part of the language goes) guarantees memory safety, but doesn't guarantee freedom from memory leaks. Chapter 15.6 of the Rust book covers exactly that type of issue. https://doc.rust-lang.org/book/ch15-...ce-cycles.html

            Still, it is way harder to leak memory in Rust compared to C/C++.
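For reference, a sketch of the kind of leak the sidenote refers to: two `Rc` nodes that point at each other. The `Node` type and the function names are invented for illustration, and this is only one simple way to build such a cycle in safe code.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

/// Hypothetical list node that may hold a strong reference to another node.
struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

/// Builds a -> b -> a and returns a weak handle to `a`.
/// Each node keeps the other alive, so neither is freed when the locals go away.
fn leak_cycle() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    Rc::downgrade(&a)
}

/// Same shape without the back edge: everything is freed on return.
fn no_cycle() -> Weak<Node> {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let _b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    Rc::downgrade(&a)
}
```

After `leak_cycle()` returns, the weak handle can still be upgraded because `b` holds the remaining strong reference to `a`. After `no_cycle()` returns, upgrading yields `None`. The leak is memory safe, it just never gets reclaimed.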

            Also, the concept of ownership and the borrow checker is the reason why many beginner programmers actually dislike Rust, because the compiler will often complain about things that technically aren't unsafe, and will not let you compile until you write the code in a way that makes it clear to the compiler that it is safe.
            Last edited by piotrj3; 21 September 2022, 10:28 AM.



            • #96
              It would be very interesting to see a comparison between Pingora and HAProxy someday.



              • #97
                Originally posted by Mahboi View Post
                It would be very interesting to see a comparison between Pingora and HAProxy someday.
                People still use HAProxy these days? Wow



                • #98
                  Originally posted by darkonix View Post

                  People still use HAProxy these days? Wow
                  What? It's still as high throughput as it ever was. What are you talking about?



                  • #99
                    Originally posted by Mahboi View Post

                    What? It's still as high throughput as it ever was. What are you talking about?
                    Last time I used HAProxy was 10+ years ago. I honestly assumed it had long been replaced by nginx.



                    • Originally posted by darkonix View Post

                      Last time I used HAProxy was 10+ years ago. I honestly assumed it had long been replaced by nginx.
                      It has, but its performance hasn't been replaced. The entire point of HAProxy is to be a load balancer/reverse proxy written in C with super high throughput. It's just that, apart from Cloudflare, people were happy enough with Nginx doing all of the work, even with its bigger overhead.

                      https://www.haproxy.org is still there with its age old look and 1990s design and performance.
