It's Past Time To Stop Using egrep & fgrep Commands, Per GNU grep 3.8


  • #31
    "Because we want you to revisit old scripts." Why? "Because."



    • #32
      Originally posted by coder View Post
      No, it's not. If something made sense to be a shell script, in the first place, this really doesn't change that calculus.

      Shell scripts excel at:
      • executing programs
      • collecting & acting on their return codes
      • stream management
      • job management
      • file manipulation

      Yes, you can do those things in any language, but only with considerably more complexity and verbosity. If you appreciate using the right tool for the job, then something like egrep/fgrep deprecation would barely be a rounding error in the calculation of which to use for what.

      About the only language I've seen that even comes close to shell scripts, for those things, is Perl. And that owes to the fact that it was explicitly designed to supersede shell scripts. I don't particularly like Perl, but I respect its strengths.

      As an aside, I think one of Perl's biggest liabilities is also the reason it caught on so quickly: how much it borrowed from the tools that preceded it. That meant people coming from the background of writing csh scripts with lots of awk and sed could take to it like a fish to water, but those from any other background would have a big learning curve to climb. Python won out, due to its approachability, portability, and focus on core language features ...but it sucks, if what you really wanted was a shell script.
      I would suggest awk as a healthy middle ground between sh and Perl. It has its own warts but it's much harder to shoot yourself in the foot. Or at least to blow it off entirely. It is easy to do most of the things shell is good at and the best part is it's a good old POSIX tool. It is available even on all but the tiniest Linux-based embedded systems.



      • #33
        PCRE2 brings us from 1997 to 2015 and it's the best hammer in search of a nail we have.



        • #34
          Originally posted by TheCycoONE View Post

          Sounds like a fine use for an alias. Maybe shorten it to "eg" while you're at it.
          An alias will work, but only in the current profile/shell.

          The other alternative is removing /bin/egrep and creating a /bin/egrep script that executes grep -E with the passed arguments.
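The wrapper approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `bindir` is a hypothetical directory (a temporary one here, so nothing under /bin is touched), and it has to come first on PATH for the wrapper to be picked up.

```shell
#!/bin/sh
# Sketch: install a personal egrep wrapper instead of replacing /bin/egrep.
# bindir is a hypothetical location; a temp dir keeps this demo harmless.
bindir=$(mktemp -d)

# The wrapper simply forwards every argument to grep -E.
cat > "$bindir/egrep" <<'EOF'
#!/bin/sh
exec grep -E "$@"
EOF
chmod +x "$bindir/egrep"

# Put the wrapper first on PATH for this session and try it.
PATH="$bindir:$PATH"
matches=$(printf '%s\n' foo bar baz | egrep 'ba(r|z)')
echo "$matches"

# Clean up the demo directory.
rm -rf "$bindir"
```

Unlike an alias, a wrapper script is found by any program that looks up egrep on PATH, including non-interactive scripts.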



          • #35
            Originally posted by ssokolow View Post
            Things like how easy it is to wind up with the wrong value in $? because you didn't sleep well.
            Yeah, the automatic variables thing can be a source of trouble. It's something Perl does, as well.
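To make the $? pitfall concrete, here is a minimal sketch. `failing_cmd` and `log` are hypothetical stand-ins, not commands from the thread; the point is only that $? always reflects the most recently executed command.

```shell
#!/bin/sh
# Hypothetical stand-ins: a command that exits with status 3, and a logger.
failing_cmd() { return 3; }
log() { echo "$*" >/dev/null; }

# Pitfall: by the time $? is read, it holds the status of log, not failing_cmd.
failing_cmd || log "failing_cmd failed"
clobbered=$?

# Safer: copy $? into a named variable before running anything else.
saved=0
failing_cmd || saved=$?
log "failing_cmd failed with status $saved"

echo "clobbered=$clobbered saved=$saved"
```

Run as written, this should print `clobbered=0 saved=3`.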

            What I meant was the ability to do things like:
            Code:
            if cmd1 && cmd2
            then
                do_something
            else
                do_otherwise
            fi
            Or the more concise form:
            Code:
            cmd1 && cmd2 && do_something || do_otherwise
            Of course, they're not exactly equivalent, since failure of do_something can trigger do_otherwise. So, if do_something can fail and you don't want do_otherwise to happen if it does, then you'd have to guard it.

            Anyway, A && B || C is one idiom I tend to use a fair amount.
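The difference between the two forms, and one way to guard the chained one, can be sketched like this. `cond`, `do_something`, and `do_otherwise` are hypothetical stand-ins; `do_something` is made to fail on purpose.

```shell
#!/bin/sh
# Hypothetical stand-ins: the condition succeeds, but do_something fails.
cond()         { return 0; }
do_something() { return 1; }
do_otherwise() { ran_otherwise=yes; }

# Chained form: do_something's failure falls through to do_otherwise.
ran_otherwise=no
cond && do_something || do_otherwise
chained=$ran_otherwise

# if/else form: do_otherwise runs only when the condition itself fails.
ran_otherwise=no
if cond; then
    do_something || something_failed=yes   # handle the failure separately
else
    do_otherwise
fi
branched=$ran_otherwise

# Guarded chain: group do_something so its failure is swallowed.
ran_otherwise=no
cond && { do_something || true; } || do_otherwise
guarded=$ran_otherwise

echo "chained=$chained branched=$branched guarded=$guarded"
```

This should print `chained=yes branched=no guarded=no`: only the unguarded chain runs `do_otherwise` after `do_something` fails.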

            Originally posted by ssokolow View Post
            You can from subprocess import call or even from subprocess import check_call as c to make it concise.
            Again, if you're worried about verbosity, you can't beat shell at its own game. Another useful construct:
            Code:
            VAR=$(cmd1 arg arg $(cmd2 eyepatch | grep parrot))
            Executes cmd2 and filters it with grep, to supply the remaining args of cmd1, the output of which goes into VAR.
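A runnable version of that construct, assuming hypothetical definitions for `cmd1` and `cmd2` (the thread does not say what they are), might look like this:

```shell
#!/bin/sh
# Hypothetical stand-ins for cmd2 and cmd1 from the example above.
cmd2() { printf '%s\n' "cat" "parrot" "dog"; }      # ignores its argument
cmd1() { echo "cmd1 got $# args: $*"; }

# The inner $(...) runs first; its grep-filtered output is word-split
# and becomes the trailing argument(s) of cmd1.
VAR=$(cmd1 arg arg $(cmd2 eyepatch | grep parrot))
echo "$VAR"
```

This should print `cmd1 got 3 args: arg arg parrot`. Because the inner substitution is unquoted, a matching line containing spaces would be split into several arguments.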

            That's the kind of concise notation you get from using a language that's properly adapted to the problem domain. ...which is not to turn this into a referendum on bash. It has its warts, but the idea that you're better off using a general purpose language for traditional shell scripting tasks is laughable at best, and dangerous at worst.

            Originally posted by ssokolow View Post
            in Rust. (The ? does a checked exception-esque early return on "failed to exec" and the .success() is a guaranteed-to-be-portable abstraction over how the platform expects programs to signal success/failure on exit.)
            You don't always need portability. In such cases, it's often not worth the tradeoffs of going to something like Rust.

            Originally posted by ssokolow View Post
            You can use std::process::Command; or even write a quick little macro (hygienic and token-based) to condense things further.
            I think I get how programming works. I've been doing it for a little while. Building up your own abstractions takes time, bloats code, and creates more opportunities for bugs. Often, it's better to simply use a tool that's already adapted for the task at hand.

            Originally posted by ssokolow View Post
            Fair. One can argue over logical vs. physical lines and I'd be OK saying 10-20.
            It depends a lot on what you're doing and really why it's long. If it has some nontrivial data structures or control flow, that's where shell scripts tend to hit growing pains.

            Originally posted by ssokolow View Post
            I have my Vim configured to run Flake8, ...
            That's right ...I knew about Flake8. I think we had that set to run during code reviews, and I got annoyed at how many arbitrary style points it decided to treat as warnings.

            Originally posted by ssokolow View Post
            With Rust, once a dependency is on Crates.io, that version can't change
            That's terrible for security. With executable programs and shared libraries, if there's a bug or limitation in some program or library, you can just install a newer version with the fix. Or fix it yourself!

            Originally posted by ssokolow View Post
            "if it built today, it'll build tomorrow".
            So, in other words: "If it crashes or has limitations or exploits today, it'll have them tomorrow". Unless the upstream developer decides to address them, which is fine because we know all developers are instantly responsive and sensitive to the needs and concerns of all users.


            Or, maybe there's some wisdom in weak/late binding, in the way that it preserves freedom and flexibility for users? It generally works quite well.



            • #36
              Well, compared to glibc DT_HASH deprecation, at least this one has a sane deprecation process.



              • #37
                Originally posted by rclark View Post
                One of the things I did in Windows land for my company was convert all the batch files calling VB scripts in Excel and such to Python. It greatly simplified the processes, as everything is in one place, and it's 'much' more maintainable (and, it turns out, more reliable too).
                Treating DOS/Windows Batch files as equivalent to something like bash...


                If you have a decent number of scripts which are all doing a limited set of things, then yeah, it's easy to build up a small Python module to streamline them.

                Shell scripts are more of an all-purpose Swiss army knife for tackling a varied set of automation and administrative tasks. In a UNIX-type environment, the prevalence of filesystem bindings and line-oriented, text-based formats truly empowers the shell. That's good for interactive use, and the magic of shell scripts is that anything you can do interactively can be captured in a script and vice versa.



                • #38
                  Originally posted by coder View Post
                  Yeah, the automatic variables thing can be a source of trouble. It's something Perl does, as well.

                  What I meant was the ability to do things like:
                  Code:
                  if cmd1 && cmd2
                  then
                      do_something
                  else
                      do_otherwise
                  fi
                  Or the more concise form:
                  Code:
                  cmd1 && cmd2 && do_something || do_otherwise
                  Of course, they're not exactly equivalent, since failure of do_something can trigger do_otherwise. So, if do_something can fail and you don't want do_otherwise to happen if it does, then you'd have to guard it.

                  Anyway, A && B || C is one idiom I tend to use a fair amount.
                  In interactive use, I use stuff like that. In scripts, I prefer a touch more verbosity in exchange for more readability. (As the Rust devs say, "code is read more than it's written")

                  Heck, part of the reason I use zsh is that it has its own equivalent to omitting the braces in a single-line C flow control block:

                  Code:
                  for X in *.foo; echo "$X"
                  ...doesn't change the fact that shell script semantics for choosing when to split or not split on whitespace make simple interactive use easy at the expense of making more complex stuff difficult and error-prone. For automation, I much prefer opt-in splitting like shlex.split(something).
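The implicit splitting being criticized can be illustrated with a small POSIX sh sketch. `count_args` and the filename are hypothetical; zsh, notably, does not split unquoted parameter expansions by default.

```shell
#!/bin/sh
# Count how many arguments a command actually receives.
count_args() { echo "$#"; }

file="my report.txt"   # hypothetical filename containing a space

unquoted=$(count_args $file)     # split on whitespace: 2 arguments
quoted=$(count_args "$file")     # kept intact: 1 argument

echo "unquoted=$unquoted quoted=$quoted"
```

Under sh or bash this should print `unquoted=2 quoted=1`, which is the opt-out behavior that an explicit call such as shlex.split(something) turns into opt-in.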

                  Originally posted by coder View Post
                  Again, if you're worried about verbosity, you can't beat shell at its own game. Another useful construct:
                  Code:
                  VAR=$(cmd1 arg arg $(cmd2 eyepatch | grep parrot))
                  My intent wasn't to compete with shell script on concision, but to show that it's not as bad as it first appears in Python or Rust.

                  Originally posted by coder View Post
                  That's the kind of concise notation you get from using a language that's properly adapted to the problem domain. ...which is not to turn this into a referendum on bash. It has its warts, but the idea that you're better off using a general purpose language for traditional shell scripting tasks is laughable at best, and dangerous at worst.
                  I'll have to disagree. My perspective is that the idea that shell scripting is suited for more than simple one-offs and "Set/unset these variables and then exec a program" wrappers is dangerous. This is something that's typically used for tasks with outsized risk, that are a hassle to set up automated tests for, like filesystem manipulation. I want as few footguns as possible.

                  Originally posted by coder View Post
                  You don't always need portability. In such cases, it's often not worth the tradeoffs of going to something like Rust.
                  I was just giving context for why it exists. I chose it because, for the desired task, it's actually more concise.

                  Originally posted by coder View Post
                  I think I get how programming works. I've been doing it for a little while. Building up your own abstractions takes time, bloats code, and creates more opportunities for bugs. Often, it's better to simply use a tool that's already adapted for the task at hand.
                  I didn't intend to cast doubt on your expertise. I just have no idea how familiar you are with Rust... and I am using the tool most adapted to the task... when the task is not being woken up at 3AM to fix surprise in-production bugs. "How much can we catch at compile time without becoming Haskell?" is sort of Rust's raison d'etre.

                  Originally posted by coder View Post
                  It depends a lot on what you're doing and really why it's long. If it has some nontrivial data structures or control flow, that's where shell scripts tend to hit growing pains.
                  To me, shell hits growing pains as soon as you need something fancier than or die or ON ERROR RESUME NEXT semantics for dealing with error returns.
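For reference, the two error-handling styles mentioned here look roughly like this in sh. `die` is a hypothetical helper modeled on Perl's "or die", not a shell builtin, and the subshells exist only so the demo can observe each outcome.

```shell
#!/bin/sh
# Hypothetical helper in the spirit of Perl's "or die".
die() { echo "error: $*" >&2; exit 1; }

# "or die": stop at the first failure. The subshell lets this demo
# observe the exit status instead of terminating the whole script.
or_die_status=0
( false || die "step failed"; echo "not reached" ) 2>/dev/null || or_die_status=$?

# "ON ERROR RESUME NEXT": with errexit off (the default), the failure
# of `false` is ignored and execution simply continues.
resume_status=0
( set +e; false; true ) || resume_status=$?

echo "or_die=$or_die_status resume=$resume_status"
```

This should print `or_die=1 resume=0`. Anything finer-grained, such as retrying or handling different exit codes differently, needs explicit status bookkeeping on top of these two.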

                  Originally posted by coder View Post
                  That's right ...I knew about Flake8. I think we had that set to run during code reviews, and I got annoyed at how many arbitrary style points it decided to treat as warnings.
                  Not a surprise. It's a combination of the PyFlakes static analyzer and the PEP8 style linter. I addressed the majority of its complaints by setting autopep8 to run on save.

                  Originally posted by coder View Post
                  That's terrible for security. With executable programs and shared libraries, if there's a bug or limitation in some program or library, you can just install a newer version with the fix. Or fix it yourself!
                  1. cargo install ignores Cargo.lock by default (so some projects instruct users to install with the version of the command which selects the locked versions, even if they're stale/insecure. Humans.). cargo build only considers your Cargo.lock (i.e. the top-level one), not those of your dependencies. Updating all dependencies, direct and transitive, to a new minor version is as simple as cargo update, and Cargo.toml versions are interpreted in accordance with semver compatibility rules. Updating direct (but not transitive) dependencies to a new major version is as simple as using cargo upgrade from the cargo-edit package (which is effectively a nursery for new built-in commands).
                  2. The part about Crates.io is no different than how many build manifests store hashes of external dependencies so that, if something like the Boost 1.76 tarball has its contents change, it fails the build.
                  3. It's part of their "fearless upgrades" strategy to get people comfortable with not over-eagerly reaching for vendoring and other tools to force the issue. In line with that, Cargo.lock also stores and checks SHA256 hashes for all dependencies, direct or transitive, to ensure that, if Crates.io does get compromised, it can't slip an exploit in at a point when you're not paying attention. (Crates.io complements that by not letting an on-Crates.io package depend on something outside of it, similar to how GreasyFork has a whitelist of trusted JavaScript CDNs that @require is restricted to for scripts they host.)

                  Originally posted by coder View Post
                  So, in other words: "If it crashes or has limitations or exploits today, it'll have them tomorrow". Unless the upstream developer decides to address them, which is fine because we know all developers are instantly responsive and sensitive to the needs and concerns of all users.
                  I don't follow. How is it any different than any other system which uses stored hashes to force upstream to bump the version when making changes?

                  Again, "fearless upgrades" philosophy. Humans gonna human, and Rust's approach is to make developers feel comfortable that they're in control of when changes will happen, so they can trust the tools and upgrade more readily, rather than reaching for things like vendoring, which just bury the problem. See also https://wiki.alopex.li/LetsBeRealAbo...otta-go-deeper

                  Rust has things like the RUSTSEC database, which GitHub's DependaBot and the cargo audit tool query, so you've got an automation-friendly way to get notified of when you need to upgrade (and, if you're using DependaBot, then it can automatically generate a PR to update the Cargo.lock'd versions and kick off a test suite run on it), and there's an in-progress effort to standardize a way to embed a complete manifest of all dependency versions in Rust executables so you can check if they're vulnerable even if they've become separated from the APT/RPM/Flatpak/etc. package metadata that's currently used for that sort of thing.

                  Originally posted by coder View Post
                  Or, maybe there's some wisdom in weak/late binding, in the way that it preserves freedom and flexibility for users? It generally works quite well.
                  Again, that's a developer decision, akin to how tools like Bandit (a security linter for Python) will complain if you invoke subprocesses via the PATH rather than specifying absolute paths. Something I # nosec away in regular code and obey on the occasions I need to run something as root.

                  I lean into Rust's static linking as part of a larger strategy of making distro upgrades less painful in the hope that I can un-train myself from waiting out the 5-year support window on Kubuntu/Lubuntu LTSes before upgrading. (i.e. If I can trust that the stuff unique to my system has minimal chance of breaking, I'm more willing to upgrade to the next LTS after two years rather than five.)

                  Heck, for my "shell scripts" in Python, I tended to lean into using the standard library, even when it was inferior and deprecated, purely because I trusted it more than third-party packages or subprocesses to survive a distro upgrade without breakages.
                  Last edited by ssokolow; 05 September 2022, 02:46 AM.



                  • #39
                    Originally posted by jabl View Post

                    Really? At least here on Ubuntu 22.04:

                    Code:
                    $ cat /usr/bin/fgrep
                    #!/bin/sh
                    exec grep -F "$@"
                    (and same for egrep). Looking at the upstream git repository this is not a Ubuntu specific patch, it's straight from upstream.
                    Hadn't noticed that. I straightaway made one for igrep, as I use grep -i quite a lot.
                    Last edited by DRanged; 05 September 2022, 05:33 AM. Reason: typo



                    • #40
                      Originally posted by Mr.Elendig View Post

                      This is a non-issue. If you want to use bash, call bash and not sh.
                      sh and ksh are still heavily used on Solaris, AIX, HP-UX, Tru64 and some others.

