Autocheck To Check If Your C++ Code Is Safe For Automobiles & Safety Critical Systems


  • #91
    Originally posted by mdedetrich View Post
    Yeah math scary, big problem for dem little programmer boys
    I'd rather say it's more like a mindfuck than scary.



    • #92
      Originally posted by Weasel View Post
      Thanks for the link. But I was talking in general, binary trees (not B-Trees but still) were just an example. What I meant was that adding custom things like that is just unnecessarily difficult.
And what I meant was that writing it yourself is probably going to be inferior to reusing a component written and refined by people who have more domain expertise than you do. Hell, writing in anything higher-level than a macro assembler is the same principle applied at the language level.

      As for "unnecessarily difficult", I don't find it to be so... what I find is that, if you fuzz the Rust solution and the C or C++ solution and/or run them with sanitizers, you tend to get crashes, corruption, and/or UB from the C++ one until you've incrementally turned it into what Rust forced you to write in the first place.

      It's not that writing Rust is unnecessarily difficult, it's that we underestimate how difficult it is to write a correct data structure in the presence of things Rust cares about, like threading. (Which I think you'll agree to be important when my new Ryzen 5 7600 is over 20 times as fast as my old Athlon II X2 270 from 2011... but only if you can saturate all six cores. It's only a little over 3 times as fast for single-core performance.)
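To make that concrete, here's a minimal sketch (my own toy example, not from any particular codebase) of the kind of thing Rust refuses to let you get wrong:

Code:
// Sharing a counter across threads: handing a bare `&mut` to several
// threads simply won't compile, so you're pushed to Arc + Mutex (or
// atomics) up front instead of after the bug report.
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let count = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..6)
        .map(|_| {
            let count = Arc::clone(&count);
            thread::spawn(move || *count.lock().unwrap() += 1)
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    assert_eq!(*count.lock().unwrap(), 6);
}
The equivalent C++ with a plain int compiles without complaint and races.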

      Originally posted by Weasel View Post
      Right, that's basically an optimization. You can add bounds checks yourself and they can be removed by optimization.
      Things like iterating in reverse will also do it. The point is to make the first thing the compiler sees be an access to the highest index you care about, so that it'll strip away the others as unnecessary... and that's assuming you can't just use one of the APIs that are considered more idiomatic anyway, such as Iterator or one of the split_at family of methods, which will internally use well-audited unsafe to bypass bounds checking.
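A minimal sketch of both approaches (function names made up for illustration; whether the checks actually disappear is up to the optimizer, so check the assembly if it matters to you):

Code:
// Touch the highest index first: one potential panic up front...
fn sum_first_four(xs: &[u32]) -> u32 {
    let last = xs[3];
    // ...and the optimizer can now prove these are in bounds.
    xs[0] + xs[1] + xs[2] + last
}

// More idiomatic: split_at does one check, then the halves need none.
fn sum_first_four_idiomatic(xs: &[u32]) -> u32 {
    let (head, _rest) = xs.split_at(4);
    head.iter().sum()
}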

      Originally posted by Weasel View Post
      The problem is that sometimes they can't be proven, and the code can be correct, so they're there plaguing the binary. For instance if it's an external argument or parameter it can't inspect. Even if it comes from user input, maybe it was already validated up the chain, so there's no point checking for it again (anyway, "validation" needs to have proper error checking and feedback so it still has to be done regardless, an ASSERT is just trash, really should only exist for debugging).
      That's what unsafe methods like String::from_utf8_unchecked and Vec::get_unchecked are for... just don't be reckless about using unsafe to bypass checks if you want to keep your reputation as a supplier of libraries intact. (Basically, if you're using it for performance, include benchmarks using something like criterion-rs to demonstrate the need and, if you're using it for non-FFI non-performance stuff, include a definition file for something like cargo-fuzz and a report on the testing methodology you use to demonstrate that you're doing due diligence.)
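As a hedged sketch of what "not reckless" looks like (hot_path is a made-up name, not anyone's API):

Code:
/// # Safety
/// Callers must have already validated `idx < xs.len()` upstream,
/// i.e. the "already validated up the chain" case.
unsafe fn hot_path(xs: &[u8], idx: usize) -> u8 {
    debug_assert!(idx < xs.len());
    // Skips the bounds check in release builds.
    unsafe { *xs.get_unchecked(idx) }
}
...shipped alongside the benchmark showing that the plain, checked xs[idx] was actually your bottleneck.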

      That's why cargo-geiger exists. To make it easy for people to identify which dependencies contain unsafe-ium so they can check whether their authors are following proper handling practices. (Hell, that's why people using Rust prefer it to C++. In C++, they can't get a signal because everything is full of alpha-emitting unsafe-ium isotope contamination.)
      Last edited by ssokolow; 24 January 2024, 10:56 PM.



      • #93
        Originally posted by ssokolow View Post
And what I meant was that writing it yourself is probably going to be inferior to reusing a component written and refined by people who have more domain expertise than you do. Hell, writing in anything higher-level than a macro assembler is the same principle applied at the language level.
        I hate this argument so much. No. Just no.

Yeah, I'm sure printf implementations are the most optimal, and even better, C++ has absolutely zero-cost abstractions; it doesn't even need to parse format strings, right?

        I mean everyone says it!!! Zero cost abstractions guys!!!

        We are at a time where almost nobody innovates new stuff anymore because, well, they're instructed to just plumb library calls together and make the millionth copy of the same software. Sad days of library plumbers instead of actual programmers.



        • #94
          Originally posted by Weasel View Post
          I hate this argument so much. No. Just no.

Yeah, I'm sure printf implementations are the most optimal, and even better, C++ has absolutely zero-cost abstractions; it doesn't even need to parse format strings, right?

          I mean everyone says it!!! Zero cost abstractions guys!!!

          We are at a time where almost nobody innovates new stuff anymore because, well, they're instructed to just plumb library calls together and make the millionth copy of the same software. Sad days of library plumbers instead of actual programmers.
That's an interesting bug analysis, for sure. But the main issue seems to be in the macOS implementation rather than the library.

However, if you don't trust any library, standard or third-party, then I'm curious what your workflow is. Do you compile without standard libraries? That's not the most usual pattern, except maybe for low-level development, embedded, or operating systems. Other developers rely on the standard libraries without too many issues.
          Last edited by darkonix; 25 January 2024, 06:17 PM.



          • #95
            Originally posted by Weasel View Post
            I hate this argument so much. No. Just no.

Yeah, I'm sure printf implementations are the most optimal, and even better, C++ has absolutely zero-cost abstractions; it doesn't even need to parse format strings, right?

            I mean everyone says it!!! Zero cost abstractions guys!!!

            We are at a time where almost nobody innovates new stuff anymore because, well, they're instructed to just plumb library calls together and make the millionth copy of the same software. Sad days of library plumbers instead of actual programmers.
            1. "Data is not the plural of ancedote" applies here because, no matter how rigorous the analysis is, it's talking about a single instance, so it can't be used to draw general conclusions about other APIs. I could just as easily point to how well-optimized a single function or type from the Rust standard library is and claim that's proof that all Rust functions/types are better than what you can write.
2. unsafe exists if you really can't do what you need with the safe APIs... but the existence of unsafe doesn't mean the safe APIs have no value, any more than the existence of void pointers means it's worthless to have other types in C and C++.
            In other words...

            There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.
            -- Donald Knuth



            • #96
              Originally posted by darkonix View Post
That's an interesting bug analysis, for sure. But the main issue seems to be in the macOS implementation rather than the library.

However, if you don't trust any library, standard or third-party, then I'm curious what your workflow is. Do you compile without standard libraries? That's not the most usual pattern, except maybe for low-level development, embedded, or operating systems. Other developers rely on the standard libraries without too many issues.
              If you care about performance or code quality, you compile without standard libraries, or at the very least test them out (glibc is pretty good for instance). NEVER assume code written by someone else, no matter how ubiquitous it is or how popular it is, is in the "best shape possible" or "optimized to the limits". In fact, quite the opposite, most devs don't give a fuck about it and prefer their stupid maintainability and laziness over the millions of things it is used for.

Obviously not every program you write has to be squeezed for perf or unbloated or anything of the sort.

              And btw of course it's the implementation? What does that mean? A library is not something abstract. If you mean the interface, sure, sometimes the interfaces can be a performance killer, but no, it's usually the implementation. Other people usually suck. I hate this appeal to authority.



              • #97
                Originally posted by ssokolow View Post
                There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified.
                -- Donald Knuth
                I fucking hate this scapegoat argument so much. First of all I don't care what Knuth said, he's not special or anything.

                But that's not the point. YOU guys keep making claims like how other software, especially ubiquitous ones, are always "better" optimized than what you personally can write. STOP MAKING BULLSHIT ARGUMENTS IF YOU'RE GOING TO LATER CLAIM IT "DOESN'T MATTER".

                I don't necessarily have an issue with people admitting that their stupid standard libs aren't in the best shape, but that "they don't have to be". That's a different thing.

                However using this as an excuse after you FAILED YOUR ARGUMENT that they're "always better than what you can write" is beyond cringe to me. You fucking lost it so at least admit it. The argument was that the libs aren't always the most optimized, not even close, so there's always a reason to write your own if you have to or want that. Nobody cares if you don't want that, because that's not for you. So WTF is the point of the quote here?


                Simple Analogy:

                A: "Stop trying to clean your Windows PC of junk files you won't do better than Microsoft" - some clown.
                B: "But I just did and cleaned 1 MB..."
                A: "Dude but 1 MB doesn't matter..."

                WTF? He already fucking lost the argument, who the fuck asked him if it matters or not?



                • #98
                  Originally posted by Weasel View Post
                  I fucking hate this scapegoat argument so much. First of all I don't care what Knuth said, he's not special or anything.

                  But that's not the point. YOU guys keep making claims like how other software, especially ubiquitous ones, are always "better" optimized than what you personally can write. STOP MAKING BULLSHIT ARGUMENTS IF YOU'RE GOING TO LATER CLAIM IT "DOESN'T MATTER".
                  That's not what I said and that's not what Knuth said either. The point is "Don't burn effort until you've confirmed that it'll actually make a difference".

                  In Knuth's case, he's effectively saying "Don't spend half the time before the deadline on a project optimizing the CPU consumption of a tool if it spends 99.999% of its time waiting on I/O unless you've profiled it and found that there's a spot where optimizing CPU will produce a noticeable improvement".

                  Originally posted by Weasel View Post
                  I don't necessarily have an issue with people admitting that their stupid standard libs aren't in the best shape, but that "they don't have to be". That's a different thing.

                  However using this as an excuse after you FAILED YOUR ARGUMENT that they're "always better than what you can write" is beyond cringe to me. You fucking lost it so at least admit it. The argument was that the libs aren't always the most optimized, not even close, so there's always a reason to write your own if you have to or want that. Nobody cares if you don't want that, because that's not for you. So WTF is the point of the quote here?
                  I never said "always better than what you can write". I said that odds are, if you're reinventing all your wheels, they won't be as good as shared designs that multiple people have had time to test and contribute to. Yes, some of your wheels may be better, but nobody is an expert at everything, so test the wheels and use yours only in the cases where they are better... and pure statistics says that what you write will be worse than what's available ready-made from someone else in the majority of cases.

                  Originally posted by Weasel View Post
                  Simple Analogy:

                  A: "Stop trying to clean your Windows PC of junk files you won't do better than Microsoft" - some clown.
                  B: "But I just did and cleaned 1 MB..."
                  A: "Dude but 1 MB doesn't matter..."

                  WTF? He already fucking lost the argument, who the fuck asked him if it matters or not?
                  Again, no. My argument is "Don't reinvent the tools that came with your OS until you've confirmed that it's worthwhile, because every tool you rewrite instead of using someone else's is another tool you have to be responsible for maintenance and security updates on."

                  Does 1MB matter? It depends on the context. On my system? Not unless I can free it and many more megabytes in a highly automated fashion, because I've got roughly 9TB of data on 16TB of total storage and a fraction of a millionth of either number is not worth the time it takes to free by hand.



                  • #99
                    Originally posted by ssokolow View Post
                    That's not what I said and that's not what Knuth said either. The point is "Don't burn effort until you've confirmed that it'll actually make a difference".

                    In Knuth's case, he's effectively saying "Don't spend half the time before the deadline on a project optimizing the CPU consumption of a tool if it spends 99.999% of its time waiting on I/O unless you've profiled it and found that there's a spot where optimizing CPU will produce a noticeable improvement".
                    Ok. I'm not saying I disagree but... legit what does that have to do with the argument?

                    The argument was that you shouldn't need to write your own data structure or whatever because the standard library provides it, aka it can never be "better". Which is wrong.

                    Originally posted by ssokolow View Post
                    I never said "always better than what you can write". I said that odds are, if you're reinventing all your wheels, they won't be as good as shared designs that multiple people have had time to test and contribute to. Yes, some of your wheels may be better, but nobody is an expert at everything, so test the wheels and use yours only in the cases where they are better... and pure statistics says that what you write will be worse than what's available ready-made from someone else in the majority of cases.
                    I see, but that was just an example. It wasn't an example applicable to everyone. It was just one example where it may be more painful to use it, obviously there's far more similar ones.

                    Originally posted by ssokolow View Post
                    Again, no. My argument is "Don't reinvent the tools that came with your OS until you've confirmed that it's worthwhile, because every tool you rewrite instead of using someone else's is another tool you have to be responsible for maintenance and security updates on."

                    Does 1MB matter? It depends on the context. On my system? Not unless I can free it and many more megabytes in a highly automated fashion, because I've got roughly 9TB of data on 16TB of total storage and a fraction of a millionth of either number is not worth the time it takes to free by hand.
                    Again, the point isn't whether it matters or not. That's subjective, and it's not the topic. The topic (and analogy) was that the guy made a claim which has one meaning. That meaning is that Microsoft (in the analogy) are always better than what you can do, which is provably wrong. False appeal to authority. It's a logical fallacy and proven wrong in this case.

                    Note that I also mocked C++'s "zero cost abstractions" in the standard library. Do you know why? I'm not saying to reinvent cout: most of the time it doesn't matter and in fact it's probably smaller if it imports the C++ runtime library since it's shared.

                    No, the point was that the argument "zero cost abstractions" is PROVABLY WRONG. I never said it's "significant cost abstractions". But "zero cost" has one specific meaning. "Zero" is not subjective, nor is "zero cost". "Insignificant" is subjective. It's not a fact.

                    So whatever the C++ people always babble about their zero cost abstractions is pure bullshit. Provably bullshit.

                    Here's an example of an actual zero cost "abstraction":
                    Code:
void func(int x);

int main(void) {
    int i = 5;

    /* ... something that never touches i ... */

    func(i);
}
The compiler will literally replace i with 5 since it was never touched, as if you had written func(5). The variable doesn't exist, etc., unless you're debugging.

                    That's ACTUALLY zero cost. Most C++ abstractions are not, and neither are the clowns making that claim.

                    Again, legit nobody asked their opinion if it's significant or not (and I'm not saying to not use them, either). All I'm saying is that "zero cost" has one factual meaning. And it's actually useful in practice to denote stuff that's truly zero cost (and compiles to the same thing due to optimization). They need to stop butchering actually useful words with subjective bullshit.



                    • Originally posted by Weasel View Post
                      Ok. I'm not saying I disagree but... legit what does that have to do with the argument?

                      The argument was that you shouldn't need to write your own data structure or whatever because the standard library provides it, aka it can never be "better". Which is wrong.
                      That wasn't quite the argument I was intending to make. My argument was that:
                      1. The trade-off in making data structures more involved to write in exchange for gains elsewhere is reasonable because, in a language with a solid, standard package manager like Rust, someone is almost always going to have written something better than what you'll write unless you waste a ton of time reinventing the wheel. (That has always been Rust's value proposition. Make the stuff that doesn't make sense to constantly reinvent a bit more complicated to write in exchange for significant gains on the stuff that's project-specific.)
                      2. When you do the work to evaluate your implementations, you almost invariably find that Rust's complaining was protecting you from lurking edge-case bugs in what you wrote in the other language.
                      Originally posted by Weasel View Post
                      Again, the point isn't whether it matters or not. That's subjective, and it's not the topic. The topic (and analogy) was that the guy made a claim which has one meaning. That meaning is that Microsoft (in the analogy) are always better than what you can do, which is provably wrong. False appeal to authority. It's a logical fallacy and proven wrong in this case.
That's fair... but it's not the claim I'm making. I'm saying that, in a language with a good package manager, you can almost always rely on either the standard library or the greater ecosystem to provide something better than what you can write in the time you're willing to spend on stuff that isn't part of your project's secret sauce/core competency/whatever.

                      ...so not C++'s standard library, but C++'s standard library or Boost or Qt or whatever else you can think of.

                      Originally posted by Weasel View Post
                      Note that I also mocked C++'s "zero cost abstractions" in the standard library. Do you know why? I'm not saying to reinvent cout: most of the time it doesn't matter and in fact it's probably smaller if it imports the C++ runtime library since it's shared.

                      No, the point was that the argument "zero cost abstractions" is PROVABLY WRONG. I never said it's "significant cost abstractions". But "zero cost" has one specific meaning. "Zero" is not subjective, nor is "zero cost". "Insignificant" is subjective. It's not a fact.

                      So whatever the C++ people always babble about their zero cost abstractions is pure bullshit. Provably bullshit.

                      Here's an example of an actual zero cost "abstraction":
                      Code:
void func(int x);

int main(void) {
    int i = 5;

    /* ... something that never touches i ... */

    func(i);
}
The compiler will literally replace i with 5 since it was never touched, as if you had written func(5). The variable doesn't exist, etc., unless you're debugging.

                      That's ACTUALLY zero cost. Most C++ abstractions are not, and neither are the clowns making that claim.
                      No argument. See The Day The Standard Library Died.

So far, Rust has managed to dodge that bullet with a combination of leaving as much functionality in external packages as possible (where it's not tied to the compiler's update cycle), a type system that allows stronger encapsulation, and a policy of "if you use unsafe to depend on something we don't consider API and your code breaks, we're not going to turn that into de facto API for you".

                      Some examples of things the Rust compiler and standard library have successfully changed to preserve the zero-overhead abstraction principle in the face of better alternatives becoming available:
• They imported an implementation of Google's SwissTable design named hashbrown and replaced the guts of HashMap<K, V> with it.
• They imported a better channel implementation named crossbeam-channel and replaced the guts of std::sync::mpsc with it. (Though using crossbeam-channel directly still has advantages because it's a multi-producer, multi-consumer queue while std::sync::mpsc is a multi-producer, single-consumer API; see the sketch after this list. Likewise, if you want async, Flume has it while the former two APIs don't. std::sync::mpsc is sort of a "we didn't manage to boot this from the standard library before the v1.0 freeze" thing where they'll improve the implementation but not extend the API.)
• They rewrote their mutex implementation to catch up with the parking_lot mutex crate. From what I remember, parking_lot still looks better in benchmarks, but only in exchange for worse long-tail latencies, making it less "better" and more "for different use-cases".
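Here's the single-consumer restriction from that second bullet in action (a sketch using only std, since crossbeam-channel is a third-party dependency; its Receiver is Clone, std's is not):

Code:
use std::sync::mpsc;
use std::thread;

fn main() {
    let (tx, rx) = mpsc::channel::<u32>();
    let tx2 = tx.clone(); // multi-producer: fine
    thread::spawn(move || tx.send(1).unwrap());
    thread::spawn(move || tx2.send(2).unwrap());
    // let rx2 = rx.clone(); // won't compile: single-consumer by design
    let sum: u32 = rx.iter().take(2).sum();
    assert_eq!(sum, 3);
}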
                      There's actually a compiler option to randomize data structure layouts at build time to catch and discourage hidden dependencies on memory layout... it's not on by default because they consider reproducible builds more important.

                      They also retrofitted automatic structure packing along the way, thanks to their focus on "If you used unsafe to tell the compiler you knew what you were doing and didn't read and follow the docs on what is and isn't API-stable, it's your fault".
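You can watch the packing happen with mem::size_of (a sketch; the sizes for the default layout are whatever current rustc happens to produce, not a guarantee, while repr(C) is pinned down):

Code:
// Default (repr(Rust)) layout is free to reorder fields to cut padding;
// repr(C) keeps declaration order and eats the padding instead.
struct Auto {
    a: u8,
    b: u32,
    c: u8,
}

#[repr(C)]
struct CLayout {
    a: u8,
    b: u32,
    c: u8,
}

fn main() {
    println!("{}", std::mem::size_of::<Auto>()); // 8 on current rustc
    println!("{}", std::mem::size_of::<CLayout>()); // 12: 1 + 3 pad, 4, 1 + 3 pad
}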

The TL;DR for what goes in the standard library is basically "interfaces, things language syntax depends on, and a few things that aren't changing and which would be pulled in as dependencies by 99% of projects otherwise". For example:
• Why are Option and Result in the standard library? Because the ? operator (originally the try! macro) needs to know about them and the ecosystem is intended to standardize on them. (See the sketch after this list.)
                      • Why is Iterator in the standard library? Because for needs to know about it.
• Why did std::cell::OnceCell (and its thread-safe counterpart std::sync::OnceLock) recently get added to the standard library? Because every project of non-trivial complexity is likely to depend on lazy_static or once_cell or both, and there was no sign that something was going to come along and improve on once_cell's API the way once_cell did for lazy_static.
                      • etc. etc. etc.
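To see why that first coupling matters, here's a trivial sketch of ? relying on Result:

Code:
use std::num::ParseIntError;

// `?` desugars (through the Try machinery) into a match on Ok/Err with
// an early return, which is why Result has to live in the standard library.
fn double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.parse()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(double("21"), Ok(42));
    assert!(double("nope").is_err());
}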
                      As an example of the alternative, the only way the http crate could be more standard is if it were in the standard library, but it doesn't need to be. It can be the standard set of interface types for talking about HTTP requests and responses perfectly well without being in the standard library because the standard library contains no HTTP functionality.

                      Likewise, why are rand and regex not part of the standard library? Because they don't need to be, and being outside it has allowed them to iterate on new API versions without forcing people to upgrade off the branches for the older APIs.

                      Originally posted by Weasel View Post
                      Again, legit nobody asked their opinion if it's significant or not (and I'm not saying to not use them, either). All I'm saying is that "zero cost" has one factual meaning. And it's actually useful in practice to denote stuff that's truly zero cost (and compiles to the same thing due to optimization). They need to stop butchering actually useful words with subjective bullshit.
                      No argument that the C++ people failed at their goals.
                      Last edited by ssokolow; 26 January 2024, 10:50 PM.
