Hacker News
Whitespaces and Strings in Bash (2018) (indradhanush.github.io)
27 points by luu on July 30, 2021 | 14 comments


> Bash cleans up extra spaces when writing to stdout unless you use variable quoting.

Not really:

  echo $result
Will evaluate to:

  echo node-0    Ready     <none>    3d        v1.11.1
Which is the following argv array in C:

  const char *argv[] = { "echo", "node-0", "Ready", "<none>", "3d", "v1.11.1" };
Then `echo` is a builtin command that prints argv[1:] separated by spaces.

Then:

  echo "$result"
Evaluates to:

  echo "node-0    Ready     <none>    3d        v1.11.1"
Which is the following argv array in C:

  const char *argv[] = { "echo", "node-0    Ready     <none>    3d        v1.11.1" };
The spaces get cleaned up when parsing the command line, not when writing to stdout.
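
A quick way to check this from a shell, as a minimal sketch: count the arguments a command actually receives. The sample value and the `count_args` helper are made up for illustration.

```shell
# Hypothetical sample value; the runs of spaces mimic column-aligned output.
result='node-0    Ready     <none>    3d        v1.11.1'

# count_args is a hypothetical helper that prints how many arguments it received.
count_args() { echo "$#"; }

unquoted_count=$(count_args $result)     # word splitting yields separate arguments
quoted_count=$(count_args "$result")     # a single argument, spaces intact

echo "unquoted: $unquoted_count, quoted: $quoted_count"
```

The unquoted call sees five arguments and the quoted call sees one, before anything is written to stdout.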


Came here just to say that. It's such a beautiful case where a developer notices an issue in their mental model of how the tools work, changes their mental model to account for that issue, but as a result introduces a completely different bug in their mental model.


Someone needs to add a comment to the page so the misconception does not spread.

I would, but Disqus does not like me.


Yes, and also glob expansion:

  msg="Configured search pattern: /usr/*"
  echo $msg
  # Configured search pattern: /usr/bin /usr/games /usr/lib /usr/lib64 ...
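
To reproduce this without depending on what is in /usr, here is a small sketch that uses a throwaway directory (the file names `a` and `b` are arbitrary):

```shell
# Work in a temporary directory so the glob matches are predictable.
tmpdir=$(mktemp -d)
touch "$tmpdir/a" "$tmpdir/b"

msg="pattern: $tmpdir/*"
unquoted=$(echo $msg)      # the glob expands to the matching file names
quoted=$(echo "$msg")      # the asterisk stays literal

echo "$unquoted"
echo "$quoted"
rm -r "$tmpdir"
```

Only the quoted form prints the message as written.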


The author's script was (unknowingly) relying on word splitting to convert the string to a cut-compatible format.

Shellcheck suggested: "Double quote to prevent globbing and word splitting."

Author implemented the suggestion, which (as advertised) prevented word splitting.

The script stopped working.
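
A rough reconstruction of the failure, assuming the script piped the value into `cut` with a single-space delimiter (the sample value and the field number are my guesses, not taken from the post):

```shell
# Hypothetical sample value with column-aligned spacing.
result='node-0    Ready     <none>    3d        v1.11.1'

# Unquoted: word splitting collapses the spaces, so field 2 is the status.
status_unquoted=$(echo $result | cut -d ' ' -f 2)

# Quoted: every space is a delimiter, so field 2 is the empty string
# between the first and second space.
status_quoted=$(echo "$result" | cut -d ' ' -f 2)

# One possible fix: squeeze repeated spaces explicitly before cutting.
status_fixed=$(echo "$result" | tr -s ' ' | cut -d ' ' -f 2)

echo "unquoted=[$status_unquoted] quoted=[$status_quoted] fixed=[$status_fixed]"
```

Squeezing the spaces with `tr -s` makes the dependency explicit instead of leaving it to word splitting.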

---

Related: every once in a while I talk to people who don't know how the Unix argv vector works, or why "system" takes a space-separated string that needs quoting while "subprocess.run"/"execv" take a list of strings that do not need quoting.

Does anyone know a good blog post with a concise explanation of this concept that I can send them? All I am finding is either manpage-level documents that assume the reader already knows the concept, or huge books that also cover many unrelated things.


How about this...

All programs take a list of arguments when they are executed.

Shell code, while it can be as simple as a program and list of arguments, is actually a small programming language.

In shell code

a b | tee a.log

Pipe is not an argument, it's code that the shell interprets.

system() takes shell code.
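
A small sketch of the difference, using `sh -c` as a stand-in for system() (which hands its string to `sh -c`). The commands are arbitrary examples.

```shell
# Handed to a shell as code, the | is pipeline syntax.
as_code=$(sh -c 'echo a b | tr a-z A-Z')

# Quoted, the same character is just another argument to echo.
as_arg=$(echo a b '|' tr a-z A-Z)

echo "$as_code"
echo "$as_arg"
```

The first prints the upper-cased output of a pipeline; the second prints the pipe character and the words after it, because echo received them all as arguments.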


shellcheck's advice on double quoting has been plainly wrong a few times for me. I ignore shell check and use

bash -n


Why in the world would you even reach for plain text when you can get JSON or YAML? Dude, you have machine-processable formats to work with and you figured you'll be clever.

I'd definitely scold him if he was my colleague.

And I seriously don't care about bash weirdness. It's an endless rabbit hole. Happily we have `shellcheck` nowadays but even with that I do my very best to avoid scripting beyond the most basic needs because I've been bitten by unexpected behaviours many times.

Will it all make sense if you work with it full time for 6 months? I am sure it will but most of us will never get such big stretches of uninterrupted bash practice so I just prefer to work around it as much as I can.

It's kind of like Git: if you go all-in you'll emerge very enlightened. But most programmers just want to get in, do their rare-and-hard-to-remember thing and get out. That's why most of us have to Google various Git incantations every time we need them.


In fairness to the author, he does mention JSON and YAML at the end of the blog post. Though he still completely misunderstands how bash handles strings in his conclusion:

> Bash cleans up extra spaces when writing to stdout unless you use variable quoting.

Nope. Bash parses the spaces as parameter delimiters if you don't quote:

  $ echo "a b c"  # is equivalent to echo("a b c") in C
  $ echo a b c    # is equivalent to echo("a", "b", "c") in C
Also remember that variables are expanded by the shell before the command runs, not passed to the called executable (the executable wouldn't know what to do with a $variable, so the shell has to expand it first).
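
To see who does the expanding, compare double and single quotes. This is a minimal sketch and the variable name is arbitrary.

```shell
name='world'

# Double quotes: the shell substitutes the value before printf runs.
expanded=$(printf '%s' "hello $name")

# Single quotes: no expansion, so printf receives the literal characters.
literal=$(printf '%s' 'hello $name')

echo "$expanded"
echo "$literal"
```

printf does nothing with the dollar sign in the second case; it just prints what it was given.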


See, this is exactly what I mean. :D

I would misunderstand as well. Processing plain text is ambiguous and we all carry our own assumptions along for the ride. Bash's traditionally huge (and badly formatted on the web) help pages don't help matters either.

I'd have thought that by 2021 we'd have an official "TL;DR: bash gotchas" but I guess not.


In the shell I'm writing you get a preview of what the command is before it's sent, so it's definitely possible to write better shells. It's not POSIX though, and therein will always lie the problem. You might build a better shell but few will use it.


There are better ways to split a string on spaces in Bash than calling external tools. For example, the following will split the line as intended, and put the results in variables:

    read name state roles ages version <<< "$result"
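
A self-contained version to try, with a made-up sample value. I added `-r` so backslashes are not interpreted and renamed the fourth variable to `age`; both are my choices.

```shell
# Hypothetical sample value with column-aligned spacing.
result='node-0    Ready     <none>    3d        v1.11.1'

# read splits on runs of whitespace and assigns one field per variable.
read -r name state roles age version <<< "$result"

echo "name=$name state=$state roles=$roles age=$age version=$version"
```

The here-string (`<<<`) is a bash feature, so this needs bash rather than a plain POSIX sh.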


Even funnier when you realize bash and zsh do different things (by default) there.

   foo="ls -las"
   $foo
   "$foo"
For zsh to split words, use `setopt SH_WORD_SPLIT`.
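
Here is what bash does with those two forms, plus an array variant that avoids relying on word splitting at all. This is a sketch: `count_args` is a hypothetical helper, and as far as I know the array form behaves the same in zsh.

```shell
foo="ls -las"

# count_args is a hypothetical helper that prints how many arguments it got.
count_args() { echo "$#"; }

split_count=$(count_args $foo)       # bash splits this into "ls" and "-las"
whole_count=$(count_args "$foo")     # one word: "ls -las"

# An array keeps the words separate, so quoting it is safe.
cmd=(ls -las)
array_count=$(count_args "${cmd[@]}")

echo "split=$split_count whole=$whole_count array=$array_count"
```

Running `"$foo"` as a command therefore looks for a program literally named "ls -las", while `"${cmd[@]}"` runs `ls` with `-las` as its argument.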


TL;DR use awk instead of cut
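
For example, with a made-up sample line (the field number is just an illustration):

```shell
# Hypothetical sample value with column-aligned spacing.
result='node-0    Ready     <none>    3d        v1.11.1'

# awk splits on runs of whitespace by default, so quoting the variable is safe.
status=$(echo "$result" | awk '{ print $2 }')

echo "$status"
```

Unlike `cut -d ' '`, awk's default field splitting treats any run of spaces or tabs as one separator.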



