Fish Shell Outlines Their Successes & Challenges Going From C++ To Rust

  • Weasel
    Senior Member
    • Feb 2017
    • 4534

    Originally posted by ssokolow View Post
    My point is that you're drawing an arbitrary line between "of course we should have that abstraction" and "that abstraction is unnecessary". There's nothing special about the line C chose to draw.
    That wasn't my point. You brought malloc up as a counter argument to what I said before, but malloc is not special in C, so it was a bad analogy. That's all. I didn't say abstractions are always bad.

    But IMO abstractions should just be syntactic sugar. When abstractions interfere with how you want to write the feature and dictate it, it starts smelling and gimping proper devs.

    Originally posted by ssokolow View Post
    I could say the same thing about C and typecasting, given that machine code has no conception of assigning "data types" to variables instead of to opcodes.

    Typecasting is something that's perfectly legitimate when necessary, but gratuitous use of it is a code smell. Same for Rust and unsafe.
    Well the huge difference is that the typed system doesn't force you to write your code/feature in a specific way. It doesn't dictate how you have to do it. On the stack, type casting is usually pointless since compiler already optimizes the storage, so unless you want to reinterpret those bits...

    For pointed-to memory, though, granted C by default does suck here because it lacks the "may_alias" attribute, so you can't do it legally by the standard (union type punning is also non-standard). However that is offered by GCC (and I think LLVM) so it's fine in the real world. Once you have that you can just have two different pointers to/from void* without any type casting.

    Originally posted by ssokolow View Post
    My point is that, even within a single platform, there are multiple calling conventions. C abstracts away following a calling convention the same way Rust abstracts away freeing memory.
    Except as a HLL it is necessary to abstract the calling convention, since otherwise your code wouldn't be portable across different architectures (even 32-bit and 64-bit of the same x86 btw). That's a requirement for HLLs. And C can't do anything about this by itself without compiler features, since it's not tied to a specific arch.

    Originally posted by ssokolow View Post
    Because some parts of BASIC don't map well to the hardware. My point was that parts of pre-structured BASIC that are relevant to this discussion map better to the hardware's conception of procedures/functions than C does.

    I should have just stuck with the mention of FORTRAN 57 or ALGOL 58 rather than giving you the opportunity to ignore the point I was making in favour of jumping on some irrelevant side detail.
    Sorry, I don't know those languages (other than hearing about them), not sure what you want me to tell you here...?

    Comment

    • Weasel
      Senior Member
      • Feb 2017
      • 4534

      Originally posted by darkonix View Post
      C is considered a low level language in the present.
      By who?

      Low level usually means they work only on a specific CPU architecture since it's tied to it. That's not the case with C.

      Comment

      • DumbFsck
        Senior Member
        • Dec 2023
        • 346

        Originally posted by Weasel View Post
        What?

        Code:
        // some 0..63 type where it's initialized
        if((unsigned)input > 63)
            throw error;
        
        // more crap
        return value_based_on_input;
        Anyone who sees this knows for a fact that this function's return value is 0...63. Same crap with NULL.

        If we're talking black boxes, aka different programs or closed source libs or whatever, then like I said, you can annotate them as you wish, but that doesn't mean they'll be respected and this applies to every language, even Rust.

        It's not like it's enforced. If you use Rust, and the library is written in C or unsafe Rust or whatever, it can violate whatever Rust expects just fine. Not like it enforces it since you don't see the code.

        If you know the library is "safe" then that's the same shit as knowing its source code. If you don't know its source code, how do you know this? You don't. You trust what it claims and that's no different than trusting its API documentation or comments or a "typedef" that restricts a type to a range (even if it doesn't actually restrict it btw, it's for annotation/clarity).

        People like you who say references can't be NULL... you can pass NULL reference from a black box just fine. It's not enforced since you don't have the source code. In fact it's impossible to enforce it without source code.
        I'm a layman, so take this question as is, and if it is a dumb question, that should be understandable, since I'm a layman.

        Aren't these kinds of things easier to test when coming straight from a library? I heard people like Rust's testing infra. Is it easier to test these kinds of returns from libs in Rust than in some other languages? Or is it more bothersome or uncomfortable?


        Also, does Rust have some sort of "assert"? I know it is considered a code smell in general (or am I mistaken?), but at least for a debug build it should help narrow down whether black boxes are "correct", right?

        Comment

        • darkonix
          Senior Member
          • Sep 2021
          • 395

          Originally posted by Weasel View Post
          By who?

          Low level usually means they work only on a specific CPU architecture since it's tied to it. That's not the case with C.
          It has a lower level of abstraction from the hardware compared to languages like Java or Python, hence low level.

          Comment

          • ssokolow
            Senior Member
            • Nov 2013
            • 5137

            Originally posted by Weasel View Post
            That wasn't my point. You brought malloc up as a counter argument to what I said before, but malloc is not special in C, so it was a bad analogy. That's all. I didn't say abstractions are always bad.
            malloc not being special in C is the problem... especially when you see compilers trying to make it special with warnings about misusing it that can't trigger reliably.

            Originally posted by Weasel View Post
            But IMO abstractions should just be syntactic sugar. When abstractions interfere with how you want to write the feature and dictate it, it starts smelling and gimping proper devs.
            Rust's abstractions are syntactic sugar... they're just syntactic sugar where you need to apply a special marker to their un-annotated counterparts so people auditing your code can grep for it.

            Originally posted by Weasel View Post
            Well the huge difference is that the typed system doesn't force you to write your code/feature in a specific way. It doesn't dictate how you have to do it.
            My bad. I'd forgotten that C was worse than Python.

            Code:
            void main(void) {
              int foo = 5;
              foo = foo + "six";
            }
            Code:
            % gcc test.c
            test.c: In function ‘main’:
            test.c:3:7: warning: assignment to ‘int’ from ‘char *’ makes integer from pointer without a cast [-Wint-conversion]
                3 |   foo = foo + "six";
                  |
            Code:
            % python3
            Python 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0] on linux
            Type "help", "copyright", "credits" or "license" for more information.
            >>> foo = 5
            >>> foo = foo + "six"
            Traceback (most recent call last):
              File "<stdin>", line 1, in <module>
            TypeError: unsupported operand type(s) for +: 'int' and 'str'
            That warning won't trigger reliably if you accidentally obfuscate things hard enough.

            Originally posted by Weasel View Post
            Except as a HLL it is necessary to abstract the calling convention, since otherwise your code wouldn't be portable across different architectures (even 32-bit and 64-bit of the same x86 btw). That's a requirement for HLLs. And C can't do anything about this by itself without compiler features, since it's not tied to a specific arch.
            Early high-level languages like FORTRAN 57 and ALGOL 58 disagree with you, supporting BASIC GOSUB-style "build your own calling convention" program structuring while also being portable between CPUs. (If you don't remember, GOSUB in compiled dialects of BASIC is basically just a portable wrapper around CALL/RET or equivalent with no provisions made for argument passing or value-returning. You're expected to do something like using globals to pass things in and out.)
            Last edited by ssokolow; 12 January 2025, 04:36 PM.

            Comment

            • ssokolow
              Senior Member
              • Nov 2013
              • 5137

              Originally posted by DumbFsck View Post
              I'm a layman, so take this question as is, and if it is a dumb question, that should be understandable, since I'm a layman.
              Anyone who says dumb questions exist is dumb. We all have to start somewhere.

              Originally posted by DumbFsck View Post
              Aren't these kinds of things easier to test when coming straight from a library? I heard people like Rust's testing infra. Is it easier to test these kinds of returns from libs in Rust than in some other languages? Or is it more bothersome or uncomfortable?
              Well, first, you don't need to write as many tests in Rust since the type system can catch more problems. For example, only the raw pointer type for C FFI is nullable without being wrapped in Option<T>, so you don't need to test passing null for arguments because it just won't compile.

              (And, because Rust has proper sum type support (first-class compiler-supported tagged unions... something that was conceived at the same time product types (structs) were), you "make invalid states unrepresentable" more often... for example, none of that Go "both/neither of the result and the error return fields have something in them" nonsense. Fallible Rust functions return a Result<T, E> and you have to specify what to do in both cases to get access to their contents... even if "what to do if Err(E)" is "assert failure" (.unwrap()) or "return failure and let the caller handle it" (the ? operator).)
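
A minimal sketch of the Result<T, E> handling described above. The function names (parse_port, port_plus_one) are hypothetical and only serve the illustration; the parsing itself uses the standard library's str::parse.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a port number, reporting failure as a value.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

// `?` returns early with the Err, so the caller decides how to handle it.
fn port_plus_one(text: &str) -> Result<u32, ParseIntError> {
    let port = parse_port(text)?;
    Ok(u32::from(port) + 1)
}

fn main() {
    // Both cases must be handled before the contents are accessible.
    match port_plus_one("8080") {
        Ok(n) => println!("next port: {n}"),
        Err(e) => println!("bad input: {e}"),
    }

    // .unwrap() is the "assert it succeeded" option; it panics on Err.
    let port = parse_port("443").unwrap();
    println!("port: {port}");
}
```

There is no way to read the u16 out of the Result without going through match, ?, .unwrap(), or a similar method that states what happens on Err.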

              (There are API-unstable "only the standard library or nightly builds of the compiler may use this" annotations that teach the compiler about "niche optimization" so that, if you put something non-nullable like a reference into an Option<T> or Result<T, E> when using Rust's default ABI, the compiler will repurpose that bit pattern to be the tag for the union, and it's smart enough to search recursively for cases like Result<Option<T>, E> as used in things like Rusqlite where the "get one row" convenience querying API may error out or return zero rows and you might want to ? the Result up to the caller and handle the Option locally.)
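
The effect of the niche optimization can be observed from ordinary stable code by comparing type sizes. The sketch below only asserts the cases the standard library documents as guaranteed; the nested Result<Option<T>, E> size is printed rather than asserted, since its exact layout is left to the compiler.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

fn main() {
    // A reference is never null, so None can reuse the null bit pattern.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<Box<u8>>>(), size_of::<Box<u8>>());

    // NonZeroU32 can never be 0, so 0 is free to represent None.
    assert_eq!(size_of::<Option<NonZeroU32>>(), size_of::<u32>());

    // A plain u32 has no spare bit pattern, so Option needs room for a tag.
    assert!(size_of::<Option<u32>>() > size_of::<u32>());

    // Nested case: printed, not asserted, because the layout is not guaranteed.
    println!(
        "Result<Option<&u8>, ()> is {} bytes, &u8 is {} bytes",
        size_of::<Result<Option<&u8>, ()>>(),
        size_of::<&u8>()
    );
}
```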

              That aside, Rust's built-in testing has three options, all of which get run when you type cargo test:

              First, for unit testing, you can slap this at the bottom of any source code file:

              Code:
              #[cfg(test)]
              mod test {
                  use super::*;
              
                  #[test]
                  fn my_thing_works() {
                      assert!(thing() == "Yahoo!");
                  }
              }
              (The cfg, mod, and use are so the dead code warning doesn't fire off for non-test builds. #[cfg(...)] is equivalent to #ifdef. Having it in the same file grants it access to private members of the code under test.)

              Second, for integration testing, you can create files under YOUR_PROJECT_FOLDER/tests and it will call the #[test]-marked functions in them. (In C, you'd probably use something like #pragma test for #[test].)

              Third, the testing is integrated with the API documentation generator for functionality akin to Python's doctests, where, unless you opt out of it, cargo test will compile and run the code samples in your API documentation to ensure they haven't fallen out of sync with the APIs they're documenting. (Yes, the apidoc tool has a syntax for hiding lines in a Markdown code block so you don't have to clutter up every example in your API docs with boilerplate.)

              Originally posted by DumbFsck View Post
              Also, does Rust have some sort of "assert"? I know it is considered a code smell in general (or am I mistaken?), but at least for a debug build it should help narrow down whether black boxes are "correct", right?
              Rust has two asserts: debug_assert!(), which behaves like C's assert and is omitted from release builds, and assert!(), which is included in release builds and can be used as a hint to the optimizer for things like bounds check elimination. (i.e. assert the length of the array before engaging in your too-complex-for-an-iterator iteration.)

              There are also *_eq and *_ne variants of both of them, similar to what various testing libraries do, which pretty-print their arguments for more convenient debugging.
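
A small sketch of the assert family in use. sum_first is a hypothetical function, and whether the optimizer actually removes the per-index bounds checks after the assert depends on the compiler version and optimization level, so treat that part as a possibility rather than a promise.

```rust
// Hypothetical function: sums the first `n` elements of a slice.
fn sum_first(values: &[u32], n: usize) -> u32 {
    // Checked in every build; it also tells the optimizer that every
    // index below `n` is in bounds for the loop that follows.
    assert!(n <= values.len(), "n = {} exceeds length {}", n, values.len());

    let mut total = 0;
    for i in 0..n {
        total += values[i];
    }

    // Only checked in debug builds; on failure it prints both values.
    debug_assert_eq!(total, values[..n].iter().sum::<u32>());
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_first(&data, 3), 6);
    println!("sum of first 3: {}", sum_first(&data, 3));
}
```

The expensive cross-check sits behind debug_assert_eq!() so release builds do not pay for it, while the cheap precondition stays in assert!().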

              You just want to be careful when you use them because it's very rude for your library to kill someone's program for a "programmer error" that's triggered by unexpected input instead of reporting failure so they can handle it.
              Last edited by ssokolow; 12 January 2025, 04:42 PM.

              Comment

              • Weasel
                Senior Member
                • Feb 2017
                • 4534

                Originally posted by ssokolow View Post
                malloc not being special in C is the problem... especially when you see compilers trying to make it special with warnings about misusing it that can't trigger reliably.
                Why should malloc be special in the language? What about other memory allocators from other libraries? Compilers can add attributes to them that's totally fine. What's not fine is to coerce them into the core language rules somehow, especially when you can allocate memory with other functions.

                Originally posted by ssokolow View Post
                My bad. I'd forgotten that C was worse than Python.

                Code:
                void main(void) {
                  int foo = 5;
                  foo = foo + "six";
                }
                Code:
                % gcc test.c
                test.c: In function ‘main’:
                test.c:3:7: warning: assignment to ‘int’ from ‘char *’ makes integer from pointer without a cast [-Wint-conversion]
                    3 |   foo = foo + "six";
                      |
                Code:
                % python3
                Python 3.10.12 (main, Nov 6 2024, 20:22:13) [GCC 11.4.0] on linux
                Type "help", "copyright", "credits" or "license" for more information.
                >>> foo = 5
                >>> foo = foo + "six"
                Traceback (most recent call last):
                  File "<stdin>", line 1, in <module>
                TypeError: unsupported operand type(s) for +: 'int' and 'str'
                That warning won't trigger reliably if you accidentally obfuscate things hard enough.
                Wait, what am I missing? What's your point here? Just a contrived test? What I meant is that C doesn't dictate how you design your algorithm and logic in your code, unlike what the ownership/borrow checker in Rust does. Creating a new variable with the proper type is not redesigning your algorithm or code.

                Originally posted by ssokolow View Post
                Early high-level languages like FORTRAN 57 and ALGOL 58 disagree with you, supporting BASIC GOSUB-style "build your own calling convention" program structuring while also being portable between CPUs. (If you don't remember, GOSUB in compiled dialects of BASIC is basically just a portable wrapper around CALL/RET or equivalent with no provisions made for argument passing or value-returning. You're expected to do something like using globals to pass things in and out.)
                That's what I meant. They don't use the stack frame of the CPU, except return address (in which case they had to use frame pointer, since you could be arbitrarily deep, some CPUs had max nested limits). Inefficient and can be done in C if you want to, with GOTO, it's nothing special. It's not using special features of the CPU, at least with x86.

                You can't separate it into its own function, but then BASIC functions were labels anyway, so it fits. GCC allows inner/nested functions in C (C++ allows it in the standard if you use a local class), which would allow you to do this more BASIC-like in syntax I guess, but using functions instead of labels.

                Perhaps the closest "modern" equivalent that I use is AutoHotkey, which supports both BASIC GoSub style and C-like functions, which are obviously more efficient and can hold local variables (their own stack frame, although it's interpreted so it's not the real stack of the CPU directly).
                Last edited by Weasel; 13 January 2025, 10:58 AM.

                Comment

                • ssokolow
                  Senior Member
                  • Nov 2013
                  • 5137

                  Originally posted by Weasel View Post
                  Why should malloc be special in the language? What about other memory allocators from other libraries? Compilers can add attributes to them that's totally fine. What's not fine is to coerce them into the core language rules somehow, especially when you can allocate memory with other functions.
                  Funny enough, Rust demonstrates that they don't need to be as long as the language's ability to abstract is powerful enough... but you do want some kind of language "construct" on top of them to reliably detect mismatching of allocation and free operations.

                  That was sort of the lesson Rust learned in the two years before v1.0 when they were replacing all sorts of built-ins with standard library constructs.

                  Basically, if C++ had unsafe and malloc and free were unsafe in C++ to discourage bypassing std::unique_ptr and friends, and the same had been done for things like the array/vector indexing APIs that perform no bounds checking, that'd be sufficient.

                  My point is that, if the commonplace interface used to interact with the allocator is just a pair of ordinary functions, it's not sufficient.
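
A minimal sketch of what such a construct can look like when it is built as a library type on top of ordinary allocator functions. ZeroedBytes is a hypothetical type written for this illustration (the standard library's Box and Vec play this role in practice); alloc_zeroed, dealloc, and Layout are the real std::alloc API.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

// Hypothetical owner type: `new` is the only way to obtain the allocation
// and `Drop` is the only way it is released, so the two stay paired.
struct ZeroedBytes {
    ptr: *mut u8,
    layout: Layout,
}

impl ZeroedBytes {
    fn new(len: usize) -> Option<Self> {
        if len == 0 {
            return None; // the global allocator must not be asked for 0 bytes
        }
        let layout = Layout::array::<u8>(len).ok()?;
        // SAFETY: `layout` has a non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) };
        if ptr.is_null() {
            None
        } else {
            Some(Self { ptr, layout })
        }
    }

    fn as_slice(&self) -> &[u8] {
        // SAFETY: `ptr` points to `layout.size()` zero-initialized bytes
        // that stay alive for as long as `self` does.
        unsafe { std::slice::from_raw_parts(self.ptr, self.layout.size()) }
    }
}

impl Drop for ZeroedBytes {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `alloc_zeroed` with exactly this layout.
        unsafe { dealloc(self.ptr, self.layout) };
    }
}

fn main() {
    let buf = ZeroedBytes::new(8).expect("allocation failed");
    println!("{:?}", buf.as_slice());
} // `buf` is dropped here, which frees the memory exactly once
```

Code using ZeroedBytes never writes unsafe itself; the three unsafe blocks are the places an auditor would look first.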

                  Originally posted by Weasel View Post
                  Wait, what am I missing? What's your point here? Just a contrived test? What I meant is that C doesn't dictate how you design your algorithm and logic in your code, unlike what the ownership/borrow checker in Rust does. Creating a new variable with the proper type is not redesigning your algorithm or code.
                  I'd forgotten that it's only a warning in C if you try to do something like adding an integer and a string literal without using a typecast to say "I meant to do that"... a warning that I remember will sometimes fail to trigger if the code is in certain shapes.

                  Originally posted by Weasel View Post
                  That's what I meant. They don't use the stack frame of the CPU, except return address (in which case they had to use frame pointer, since you could be arbitrarily deep, some CPUs had max nested limits). Inefficient and can be done in C if you want to, with GOTO, it's nothing special. It's not using special features of the CPU, at least with x86.

                  You can't separate it into its own function, but then BASIC functions were labels anyway, so it fits. GCC allows inner/nested functions in C (C++ allows it in the standard if you use a local class), which would allow you to do this more BASIC-like in syntax I guess, but using functions instead of labels.

                  Perhaps the closest "modern" equivalent that I use is AutoHotkey, which supports both BASIC GoSub style and C-like functions, which are obviously more efficient and can hold local variables (their own stack frame, although it's interpreted so it's not the real stack of the CPU directly).
                  My point is that, no, the amount of structure C forces on the design of a function is not intrinsically part of being a portable language.

                  ...and I say it in support of my prior statement that, back when structured programming started to come in, people who were wizards at using spaghetti code to squeeze every last iota of performance out of the hardware were bemoaning the limitations languages like C place on GOTO... and that they had to drop into inline assembly to bypass them rather than just using unsafe and raw pointers.
                  Last edited by ssokolow; 14 January 2025, 04:55 AM.

                  Comment

                  • Weasel
                    Senior Member
                    • Feb 2017
                    • 4534

                    Originally posted by ssokolow View Post
                    Funny enough, Rust demonstrates that they don't need to be as long as the language's ability to abstract is powerful enough... but you do want some kind of language "construct" on top of them to reliably detect mismatching of allocation and free operations.

                    That was sort of the lesson Rust learned in the two years before v1.0 when they were replacing all sorts of built-ins with standard library constructs.

                    Basically, if C++ had unsafe and malloc and free were unsafe in C++ to discourage bypassing std::unique_ptr and friends, and the same had been done for things like the array/vector indexing APIs that perform no bounds checking, that'd be sufficient.

                    My point is that, if the commonplace interface used to interact with the allocator is just a pair of ordinary functions, it's not sufficient.
                    But you can still call malloc in Rust as a normal function can't you? I'm talking about the function from libc or whatever shared library (jemalloc for instance). If not, that's huge yikes.

                    Let's make it simpler. How do you even deal with HeapAlloc in Rust then? Are you telling me Rust can't even use some of the Windows API without unsafe? This literally makes me double down on my stance. Do you see where I'm getting with "malloc is not special" now and how Rust can cripple you if you just wanted to use such a function from a random library? Not the standard library, I'm sick of "magic".

                    The point is that any language that doesn't allow you to call an API of a library easily simply sucks.

                    Originally posted by ssokolow View Post
                    My point is that, no, the amount of structure C forces on the design of a function is not intrinsically part of being a portable language.

                    ...and I say it in support of my prior statement that, back when structured programming started to come in, people who were wizards at using spaghetti code to squeeze every last iota of performance out of the hardware were bemoaning the limitations languages like C place on GOTO... and that they had to drop into inline assembly to bypass them rather than just using unsafe and raw pointers.
                    I know what you mean here. And IMO you need assembly for that, because you need to control the stack completely. In the spirit of this discussion, there could be a lower level language than C where the stack pointer was not abstracted… but the problem with that is that some architectures have too weird stacks. Some have alignment requirements and so on. So it wouldn't be very portable.

                    It also wouldn't allow the compiler to optimize the stack usage but that's another topic.

                    Comment

                    • ssokolow
                      Senior Member
                      • Nov 2013
                      • 5137

                      Originally posted by Weasel View Post
                      But you can still call malloc in Rust as a normal function can't you? I'm talking about the function from libc or whatever shared library (jemalloc for instance). If not, that's huge yikes.

                      Let's make it simpler. How do you even deal with HeapAlloc in Rust then? Are you telling me Rust can't even use some of the Windows API without unsafe? This literally makes me double down on my stance. Do you see where I'm getting with "malloc is not special" now and how Rust can cripple you if you just wanted to use such a function from a random library? Not the standard library, I'm sick of "magic".

                      The point is that any language that doesn't allow you to call an API of a library easily simply sucks.
                      Again, you seem to be wilfully misunderstanding how Rust is meant to be used.

                      Yes, calling malloc requires unsafe, because calling any C function through FFI requires unsafe. unsafe means "Stuff the compiler has to trust the human's auditing ability on", and C lacks the concept of explicit lifetimes, so the compiler has nothing to check the call against. Think of it as related to how C++ requires you to use extern "C" to export un-mangled function names.

                      ...and it is "easily simply". You just have to wrap your call to the foreign function in unsafe { ... } so people know where to start looking if a memory bug is discovered.
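
A sketch of what that looks like for malloc and free, assuming a platform where the C library is linked by default and size_t matches usize (true of mainstream targets). roundtrip_through_malloc is a hypothetical function for illustration. The `unsafe extern` spelling needs Rust 1.82 or later; older toolchains write the block as plain `extern "C"`.

```rust
use std::ffi::c_void;

// Declarations for the C allocator. The compiler cannot verify these
// signatures against the C library; they are the programmer's promise.
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

// Hypothetical safe wrapper: stores a value in C heap memory, reads it
// back, and frees the memory before returning.
fn roundtrip_through_malloc(value: u32) -> Option<u32> {
    // SAFETY: malloc returns either null or memory that is large enough
    // and suitably aligned for a u32; the pointer is checked before use
    // and passed to free exactly once.
    unsafe {
        let p = malloc(std::mem::size_of::<u32>()) as *mut u32;
        if p.is_null() {
            return None;
        }
        p.write(value);
        let out = p.read();
        free(p as *mut c_void);
        Some(out)
    }
}

fn main() {
    println!("{:?}", roundtrip_through_malloc(42));
}
```

Callers of roundtrip_through_malloc need no unsafe of their own. The same pattern applies to any other foreign allocator: declare it, then confine the calls to a small audited wrapper.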

                      The only mainstream language that FFIs with C more easily than Rust is C++, and that's because, until... C99, I believe... diverged things, it was literally a superset of it where the original compiler (Cfront) translated it to C.

                      Originally posted by Weasel View Post
                      I know what you mean here. And IMO you need assembly for that, because you need to control the stack completely. In the spirit of this discussion, there could be a lower level language than C where the stack pointer was not abstracted… but the problem with that is that some architectures have too weird stacks. Some have alignment requirements and so on. So it wouldn't be very portable.

                      It also wouldn't allow the compiler to optimize the stack usage but that's another topic.
                      I literally just said that you don't need assembly for function calling more primitive than C's and it is portable, because that's what pre-C, pre-structured programming languages like FORTRAN 57 and ALGOL 58 did.

                      Again, you're claiming that C occupies some special "the closest you can reasonably get to portable assembly" niche when it does not and is, in fact, the winner of a previous round of tantrum-throwing over restricting what programmers can easily do.

                      Comment
