If you’re writing shell scripts you should have https://www.shellcheck.net/ in your editor and pre-commit hooks to catch common footguns.
Even then, my threshold for “this should be Python” has shrunk over the years: I used to say “greater than one screen of code” but now it’s more like “>1 branch point or any non-trivial scalar variable”.
I keep posting this, but my favorite rule of thumb came from a Google dev infra engineer who said that "every Python script over 100 lines should be rewritten in bash, because at least then you won't fool yourself into thinking it's production quality"
"If you are writing a script that is more than 100 lines long, you should probably be writing it in Python instead. Bear in mind that scripts grow. Rewrite your script in another language early to avoid a time-consuming rewrite at a later date."
Stuff like this does a huge amount of unintentional damage. Every crappy company in the world thinks they're Google and tries to copy them. Google-caliber people might be able to be reasonable about this, but ordinary employees come across something like this and interpret it to mean "bash is evil" and ostracize everyone who tries to write a bash script for anything, no matter how sensible.
At work lately, there's been a spate of contorted "just why?" Python scripts that could've been accomplished elegantly in a handful of lines of shell. While no one would select shell languages as the ideal for a lot of complex logic, data parsing, etc., there's no competition for a shell when you need to do what shells are meant to do: chaining invocations, gluing and piping output, and so on.
> Google-caliber people might be able to be reasonable about this...
I'm not sure about that. Google certainly has some very talented people among their developers, but most of them are so bad at programming that they had to invent their own heavily watered down programming language.
Shell is brilliant until you need to keep a filename in a variable. Or, worse, read filenames from the output of a command. Since basically everything except '/' and '\0' is a valid filename character, the potential for mishaps or exploits is huge.
That's where we came in with the need to use "$@" vs the unsafe-by-default bare $@.
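A minimal sketch of the difference (the filenames here are made up; `count_args` is just a helper for the demo):

```shell
#!/usr/bin/env bash
# Two "filenames" containing spaces, stored as positional parameters.
set -- "a file.txt" "another file.txt"

count_args() { echo "$#"; }

count_args "$@"   # prints 2: each filename stays a single argument
count_args $@     # prints 4: unquoted $@ word-splits every name
```

The unquoted form silently turns two files into four nonsense arguments, which is exactly the kind of bug that only shows up when a filename finally contains a space.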
I mostly agree, but it's sometimes a necessary evil. There's a reason I praised shell languages in general rather than bash specifically; fish shell, for instance, is much more comfortable.
In most languages, you don't need to know/understand this much minutiae about string handling; as an example, I've been doing open source dev for years now in Bash and I never knew the distinction OP posted.
Related, the longer your shell script gets, the greater potential for accumulated foot-guns. When you finally do the rewrite, will you be bug-compatible? And once discovered, will you be confident on which side the bug lies?
That way, it will never be rewritten again; because nobody will be able to read it.
Edit: Getting some downvotes, so to clarify: I am indeed saying that converting something from Python to Perl has a high likelihood of making it far less maintainable. I get that people can write good Perl. But as someone who has had to maintain Perl in the past, the fact is that it's far more common for the end result to be horrible Perl. I have some issues with Python, but it is FAR more maintainable than Perl.
Let me drag this to something concrete: if you are interviewing, it is almost impossible to pass an amateur "Perl screen".
Every single person who doesn't use Perl every day professionally knows a different subset of Perl. This is one of the few times where being more knowledgeable than the interviewer is also a problem. You will write something, and the interviewer will question you on it because he has never seen it.
I finally solved this problem by bringing one of my personal programs written in Perl and also rewritten in Python so we could talk about it. Now, the interviewer is in MY subset of Perl AND can't argue because I have working Perl code in front of him. This was back in the 1990's when everybody in VLSI design expected you to know Perl.
Because of this "different subset" issue, if you want to maintain a Perl script, you have to basically know the entire language. This is what makes maintaining Perl scripts so difficult.
This "different subset" problem is the whole reason I left the Perl ecosystem back in 1996(!) at the height of Perl's popularity and never looked back.
> This is one of the few times where being more knowledgeable than the interviewer is also a problem
I have actually found that is pretty much always a problem. If you pull out something the interviewer is unfamiliar with they will often assume you are full of it. I have had such people refuse perfectly good explanations for things they haven't heard of because they assume incompetence before that possibility.
Indeed. I've "failed" interview screens because the reviewer's Python installation was broken, and rather than reading the traceback and realizing this, they just assumed the candidate's code was broken.
Last I knew, that position was still open almost a year after the fact.
Dunno. I finally left VLSI design because it was effectively a career dead end. Sense a trend? :)
While I still regard myself as a vastly better VLSI designer than programmer, my ability to wrangle software the whole way from assembly language on a chip to just shy of the top of a full web stack pays far better than my ability to wrangle transistors. And, in my opinion, attacks far more interesting problems.
That's a cute heuristic, but I think the better practice is to distrust scripts without tests as they can quickly diverge from the rest of the codebase.
It's still better to script in Python or Ruby than Bash. Nobody understands Bash. It's even more mysterious than Perl.
I’d rather write bash for orchestration than nearly anything else: bash is designed to make coordinating processes easy, something very few programming languages have managed to do.
The thing that gets me about all the new shells and shell scripting languages popping up these days is they loosely seem to fall into 2 categories:
1. more emphasis on traditional programming paradigms (be it JS, LISP, Python, whatever), which leaves a platform that is arguably a better-designed language but a poorer REPL environment for it. Bash works because it's terse, and terseness is actually preferable for "write many, read once" style environments like an interactive command prompt.
2. or they spend so much effort supporting POSIX/Bash -- including their warts -- that they end up poorer scripting languages.
I think what we really need isn't to rewrite all our shell scripts in Python but rather better shells. Ones which work with existing muscle memory but aren't afraid to break compatibility for the sake of eliminating a few footguns. Shells that can straddle both of the aforementioned objectives without sacrificing either. But there don't seem to be many people trying this (I can only think of a couple off hand).
I've been writing a lot of Powershell lately, and my only real gripe with it is that it seems suspiciously like not being Posix compliant in any way was a design goal.
I agree with the idea of breaking backwards compatibility, but Powershell honestly has enough core design issues that it itself is starting to feel like it needs a major backwards-incompatible update.
It's also subject to many of the same footguns as Bash so I'd put that into the 2nd camp (re my previous post).
Not that I'm taking anything away from zsh. It is a nice shell. But I think we can do even better considering how dependent we still are on shells for day to day stuff.
> Zsh had arrays from the start, and its author opted for a saner language design at the expense of backward compatibility. In zsh (under the default expansion rules) $var does not perform word splitting; if you want to store a list of words in a variable, you are meant to use an array; and if you really want word splitting, you can write $=var.
I realize that it has similar footguns, however reading through their info pages, I was surprised by how many they just decided to fix, unless you explicitly turn on compatibility mode.
Go is actually superb at this IMO. Channels and goroutines provide the actual primitives you need to handle streaming data, and the integration of the context library makes starting up and shutting down everything a breeze.
Best of all though, it's absolutely compatible wherever you need to run it.
The issue I’ve had with Go is that it’s batch compiled: my preferred workflow is to REPL something together in an interactive shell and then generalize it, and Go doesn’t support that.
Here's one thing that I find difficult to remember how to do in shell: run a command, have stdout go to a file (or /dev/null or whatever), but pipe stderr to more or grep or some other program to process.
I mean, I would expect the syntax to be something like:
make >/dev/null 2| more
but no ... it's some incomprehensible mess of redirection arcana to get that to work.
Note that the order of redirections is significant. For example, the command

    ls > dirlist 2>&1

directs both standard output and standard error to the file dirlist, while the command

    ls 2>&1 > dirlist

directs only the standard output to file dirlist, because the standard error was duplicated from the standard output before the standard output was redirected to dirlist.
Does it look a little arcane? Yes, especially until one memorizes it.
Can it be memorized? Yes, because it is just a single 'incantation': "2>&1". Just put that redirection operator before the redirection of the standard output to the file, and the result is stdout goes to the file, stderr goes to the pipe.
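Applied to the original question, it looks like this (with a hypothetical `emit_both` function standing in for a command like `make` that writes to both streams):

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for a command that writes to both streams.
emit_both() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Order matters: 2>&1 first points stderr at the current stdout (the
# pipe), then >/dev/null repoints stdout only. Result: stderr flows
# down the pipe, stdout is discarded.
emit_both 2>&1 >/dev/null | grep stderr
# prints: to stderr
```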
Just in case someone's confused by this, it becomes clear if you know how pipes work. `2>&1` means "redirect stderr to whatever stdout is currently pointed to" (redirection by copy, not by reference). A more recent version of the bash manual[0] has less confusing wording in the same paragraph, IMO:
Note that the order of redirections is significant. For example, the command
ls > dirlist 2>&1
directs both standard output (file descriptor 1) and standard error (file descriptor 2) to the file dirlist, while the command
ls 2>&1 > dirlist
directs only the standard output to file dirlist, because the standard error was made a copy of the standard output before the standard output was redirected to dirlist.
It's actually a pretty simple model. Every process has an array of open file objects. File descriptors are indexes into that array. "Redirection" copies the underlying entries, and redirections are processed left-to-right.
"Redirecting streams" winds up being confusing - it's all just `dup2`.
Okay, but can you pipe stdout to one string of commands, and stderr to a different string of commands? That's something I feel should be possible but how Unix shells handle redirection is just ... alien to my way of thinking.
Yes, I know that underneath it's all calls to `pipe()` and `dup2()`, which I can do (and have done) in a language other than shell. It's the shell redirection syntax (for anything more complex than simple redirection or a pipe) that just doesn't make sense to me.
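Yes, it's possible. A sketch of the sort of command the walkthrough below describes, using bash process substitution (`emit_both` is a hypothetical stand-in for the real command, and the file name is made up):

```shell
#!/usr/bin/env bash
# Hypothetical command that writes to both streams.
emit_both() {
  echo "to stdout"
  echo "to stderr" >&2
}

# stdout flows into the process substitution, stderr down the pipe.
# Note the trailing redirect inside >(): without it, the substituted
# process would inherit the pipe as its own stdout.
emit_both 2>&1 > >(cat | cat > stdout.log) | cat | sed 's/^/ERR: /'
# prints: ERR: to stderr   (and stdout.log gets "to stdout")
```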
Note, the extra 'cats' are just to show a "string of commands".
How to read this:
When Bash sets up the file descriptors for 'command' it initially makes descriptor 1 refer to the pipe, and descriptor 2 refer to the terminal.
So, the dup operator (2>&1) copies the fd in descriptor 1 (which is the pipe) into descriptor 2 (so after the dup operator is processed both stderr and stdout for command reference the pipe).
Next descriptor 1 is replaced (the > operator) by a reference to a fifo created by the process substitution operator (the >(...) operator).
So now command's descriptor 1 refers to the fifo created by >() (which itself contains a "string of commands").
Then, because stderr for command was made to be a copy of what was previously stdout (before modifying stdout) it continues to refer to the pipe Bash setup (the | operator), and stderr now flows out over the pipes to another "string of commands".
I want to disagree with this so badly because I have tried to do stuff that's just a bit too complex for bash and ended up scrapping the idea... but I can't. Hacky python is even worse.
Another rule of thumb would be: "How many years from now do you still want this to work?"
If I run a shell script I wrote ten years ago, it works.
If I run a Python script I wrote ten years ago, it's quite likely to fail with a SyntaxError.
This I say as someone who loves Python and uses it as my primary language both privately and at work. But I have to admit, Python scripts do not really age well.
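A concrete instance of the kind of rot meant here (assumes `python3` is on PATH): the Python 2 print statement, common in decade-old scripts, is a hard SyntaxError on any Python 3 interpreter.

```shell
# Python 2 syntax that a ten-year-old script might contain:
python3 -c 'print "hello"'
# fails with a SyntaxError (exact wording varies by 3.x version)
```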
I haven't really found that to be a problem, with some code I still use daily which was started before Python 3.0 was released. The py3 transition was real but it was also something which took seconds to run through futurize/modernize.
Over the years I've spent way more time going into inconsistencies among the various platform tools (e.g. GNU vs. BSD implementations of sed, grep, find, etc., even before you get to the aliens-with-broken-translators realm of AIX) — which using Python avoided needing to care about — or library / file naming changes. Similarly, a few years back there was way more time used when various HTTPS improvements flushed out code using old versions of OpenSSL or, worse, gnutls.
(Can't remember the details, but it may have been something about some extra hoop you'd have to jump through to trust self signed certs that broke our testing infrastructure)
Totally agree - I just don’t see that as a distinguishing factor since we had a bunch of shell scripts break, too, as various CLI tools stopped behaving consistently.
I guess this is a YMMV thing. My experience is that Python rots more than shell scripts, but of course I understand that this may not be universally true, and depends a lot on what your scripts actually do (and, to a large extent, on how they're written, of course).
Before that, well, it's been a while. But 'with' and 'yield' were added as keywords (breaking any code using them as variables). The xreadlines module disappeared at some point (I'm sure a few other modules have died along the way). At some point you couldn't raise anything you wanted as an exception any more. Oh, and the source encoding declaration. That's what I remember off the top of my head.
Minor things that are handled in things that are maintained, sure. But many things are also built once and then expected to run for a long time - especially in the role shell scripts usually fill.
Note, that I don't argue that any of these changes were bad. They were all good changes that made the language better. But, they resulted in old code breaking. One extra headache to deal with for the poor schmuck that was responsible for upgrading this server or that to the latest OS release.
I had one like that, years ago, in a 2.3 to 2.7 environment update. Had to just update a few lines. But we knew to watch for it in test, 'cause we knew of the update. We test our shell scripts too, when the env changes. Anecdotally, the Python is slightly more touchy than Bash.
I don't know, I often miss the pipe operator in python, which makes certain things much easier.
I once created a small class to simulate this a bit, it allows you to use + as pipe operator. Wouldn't use it in production though, was just fun to write it. https://pastebin.com/eQacwLj7
Why not use | as the pipe operator? Just use __or__ instead of __add__
I have done something similar too once in a shell-like Python CLI I worked on. Can also use __ror__ to be able to pass primitives into your commands like `[1, 2, 3] | SumCmd() | PrintCmd()`
I do agree that typing something like that, especially when working in a shell, is much nicer than having to go back and forth all the time to type `print(sum([1,2,3]))`
Doesn’t Python have channels? The pipe is essentially a channel combined with a close message (a null object, I guess). Otherwise you can do something similar with iterators and function composition. I guess the syntax isn’t quite as easy to read if you’re not used to function composition? (Which seems surprisingly difficult for some beginners.)
This is exactly my rule of thumb for this as well, and it's funny that that's exactly the feature that tells me to start using Python. Yes, there are arrays in Bash, but every time I start to look up how to write them (again), I think, "Ennnhhhhh...maybe not" and rewrite the darn thing in Python.
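For reference, the array syntax in question is not that bad once written down; a minimal sketch (names are made up):

```shell
#!/usr/bin/env bash
# Declare, append to, and safely iterate a bash array.
files=("a file.txt" "b.txt")
files+=("c.txt")                 # append one element

echo "${#files[@]}"              # prints 3: the array length
echo "${files[1]}"               # prints b.txt: indexing is zero-based

# "${files[@]}" expands each element as its own word, spaces and all:
for f in "${files[@]}"; do
  echo "got: $f"
done
```

The catch, of course, is that every one of those constructs needs its exact mix of braces, quotes, and sigils, which is precisely why people keep having to look it up.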
Uh.. you just reminded me of a python syntax hack that might come in handy... I'll give it a look again today at work, maybe I'll write some PoC and link it back here :)
What about some sort of complexity check more precise than lines of code? What about trying, per unit of time, to write one thing you would normally write in shell in something else?
I would simply say if you're planning on making changes to the script you probably shouldn't write it in bash. Or to be a little more rigorous, don't use bash if you couldn't rewrite the entire script, correctly, from scratch, in an hour.
Although I think all of these things are just complicated ways of saying "please seriously reconsider writing it in bash."
But I write bash scripts all the time, I just try to keep them as short and simple as possible.
I don't think this is like the 80-character line length, which is about screen size. This 100-line limit is framed as a quick and dirty heuristic for script complexity.
A common problem that happened to a coworker: he made a quick bash script for something simple, then kept adding "just one more" thing, falling for the sunk cost fallacy rather than taking the time to rewrite it. Eventually the monstrosity he created was too difficult to debug and had to be rewritten in a different language.
I’ve seen the same thing happen with any language. Generally tends to happen when a dev hasn’t thought through the scope of what they are doing beforehand. I’ve written some ugly python in my earlier days due to this as well.
My point here is it is less to do with the language and more to do with the mindset when solving a problem.
The main issue I see with more inexperienced devs with bash is that they tend to think it’s okay to be lazy with the code because it’s just “bash”. If you would write safety checks and comments in your python you should be doing the same in bash really.
^-- SC2148: Tips depend on target shell and yours is unknown. Add a shebang.
Being new to shellcheck and not familiar with its options or what it does, I hastily and erroneously typed:
shellcheck -shell=bash script
Note I learned UNIX via NetBSD. I prefer and use their version of ash for both interactive and scripting use.[1] I never got used to "--" GNU-style long options. I sometimes type a single "-" out of habit. Anyway, here is the output I got from shellcheck:
Unknown shell: hell=bash
I agree with shellcheck.
Although there may be some irony in the fact that it cannot sort out its own argument parsing.
[1] I do not use other scripting languages such as Python, Perl, Ruby, etc. That means, e.g., for quick and dirty one-offs and prototyping, I can omit the shebang. Debian's "dash" scripting shell is derived from NetBSD's ash, the one I choose for interactive use.
> SC2148: Tips depend on target shell and yours is unknown. Add a shebang.
If you google what a shebang is, the top link for me is a Wikipedia article on the subject [0]. A shebang is basically just a line (always the first line) in a file which tells the operating system what program to invoke to execute the script. There are different shells beyond just bash, so shellcheck wants to know which flavor the shell is written for and uses the shebang to figure it out.
I always have the top of my shell scripts with a shebang, even if the script isn't intended to be directly executed.
Pick the user's bash from PATH environment:
#!/usr/bin/env bash
Or specify a specific bash:
#!/bin/bash
Or use whatever plain-shell is installed:
#!/bin/sh
Or maybe it's a Python script:
#!/usr/bin/env python3
Or it's a text file:
#!/usr/bin/env vi
If you're not using shebangs then you're probably writing your scripts wrongly.
Second this. We recently added it to a project at my company (https://github.com/homedepot/spingo) as part of a GitHub action and it is awesome. A quick search for the specific code in the shellcheck wiki reveals the problem and a solution. I've had no real issues with it yet.
To be honest, for most of my scripting needs I usually start with Python directly and just go mass os.system() or commands.getoutput() calls. Later on I refactor into subprocess.Popen as needed.