Honestly this is not a bad idea. Since a newly brought-up VM instance will always be in exactly the same state, the shell script is completely deterministic. If any line fails to run, you could simply log an error and automatically shut down the VM.
The bash script itself is also extremely straightforward and is easily testable with a built-in REPL (you know, bash). Anybody who vaguely knows unix is also going to be able to understand and maintain the shell script. Adding new dependencies is simple. It's easily portable and you don't need to do any extra work to add new features. I can't even think of a drawback.
Typically shell scripts are fine in the beginning, but over time the complexity rises, they become an unmaintainable mess, and you end up reimplementing them in a proper programming language.
Shell scripts are not easily portable either, unless by "portable" you mean "works on Linux". Node.js scripts, for example, actually work (mostly) on Linux and on other platforms like Windows.
Until you hit some nasty "path is longer than 254 characters" bugs. Oh well..
I'm making the assumption that you are running a bunch of AWS instances off the same AMI. If you're not, a shell script is not going to be deterministic or maintainable - but most companies these days are just using AWS with an Ubuntu AMI and then running some software on top, x100 for each server. The usual solution to this has been very complex deployment and management programs that need a devops team to keep them maintained.
For something like this, you'd start up an instance, ssh in, set up the server using bash, grab your commands out of bash history into a script, and then deploy 100x instances and have them all run the script. Simple enough that anybody who has used unix now understands your whole ops setup. It doesn't have the perfect rigor that other solutions have - but sometimes that high learning curve and perfect rigor mean that you miss the forest for the trees.
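Something like this for the "grab it out of history" step (untested sketch; the exact fc/history incantation varies by shell, and you'll want to prune the result by hand):
fc -ln 1 | sed 's/^[[:space:]]*//' > provision.sh   # dump this session's commands
$EDITOR provision.sh                                # cut the typos and dead ends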
Just because someone wrote something shiny and new that replaces 10 lines of shell with 20 lines of chef installation and another 10 lines of chef, doesn't mean you have to use it.
for box in box1 box2 box3; do
    cat << 'EOF' | ssh "$box"
# do something
EOF
done
Easy. Even better, just pass some data in via userdata with your autoscaling group. (That userdata field? Just start it with a #! line, just like a shell script... and it'll execute your shell, php, perl, python, etc. script!)
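For instance (untested sketch; the package list and repo URL are just placeholders, and on Ubuntu cloud-init runs this as root at first boot):
#!/bin/bash
# EC2 userdata: the #! line is why cloud-init executes it as a script
set -e
apt-get update
apt-get -y install nginx git
git clone https://github.com/example/myapp.git /opt/myapp   # hypothetical repo
/opt/myapp/start.sh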
Shell scripts are extremely portable and should be the preferred method for a large set of tasks. Properly written, a shell script can run on a 20-year-old Solaris machine, any version of Windows (with tools like Cygwin installed), and any modern Unix variant... claiming Node.js is more portable is ridiculous on so many levels.
The problem is that so many programmers don't take the time (or care to take the time) to learn the well-thought-out design of Unix tools, opting instead to see every problem as a nail corresponding to the latest trend in hammers (programming languages).
This has led a lot of programming types to create advanced tools for managing Unix systems which largely ignore the design of Unix.
It's hard to write a portable shell script. My dotfiles need to be portable, and they involve a lot of shell. Every time I introduce a new OS, I have to make changes. Various oddities get you. These, for example, look really innocent but aren't portable:
find -iname 'foo*' # [1]
... | sed -e 's/ab\+c//' # [2]
... | sed -i -e 's/abc//' # [3]
tar -xf some-archive.tar.gz # [4]
python -c 'anything' # [5]
Things like messing around with /proc are more obvious, but things like curl (is curl installed? what do we do if it isn't? try wget?) can be hard too.
[1]: find doesn't assume CWD on all POSIX OSs
[2]: "+" isn't POSIX. You have to \{1,\} that.
[3]: -i requires an argument on some OSs.
[4]: This is stretching the definition of portable a bit; I've worked on machines where you had to specify -z to tar, given a compressed archive. (tar has been able to figure out compression on extraction for well over a decade now, so -z is usually optional, but some places are really slow to upgrade.)
[5]: Unless anything is a Python 2/3 polyglot, you'd better hope that you guess correctly that Python 2 was installed. (And it's really hard here: python is either python 2 or 3 on some systems, depending on age & configuration, with python2 and python3 pointing to that exact version, but on some machines, python2 doesn't exist even if Python is installed, despite PEP-394.)
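For the record, portable spellings of [1]-[4] aren't much longer (sketches, not exhaustively tested):
find . -iname 'foo*'                                   # [1] name the directory explicitly
... | sed -e 's/ab\{1,\}c//'                           # [2] POSIX interval instead of \+
sed -e 's/abc//' file > file.new && mv file.new file   # [3] avoid -i entirely
gzip -dc some-archive.tar.gz | tar -xf -               # [4] decompress explicitly, pipe to tar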
It's a well-known fact that GNU tools have plenty of extra features which you have to be careful about using if you want portability AND that many of the legacy commercial Unix implementations have positively ancient implementations and feature sets. I wouldn't really say it is so very difficult though.
Not every script you write is going to be portable, but it's not that much of a stretch to endeavor to keep your script simple, not make assumptions, and be mindful of the potentially missing features of some implementations.
I take a special objection to [5], `python -V` isn't difficult at all to run, hoping and guessing are not necessary.
Portability is a red herring anyway. If you pursue it, you'll always end up chasing the lowest common denominator.
YOU can control where the app is deployed (this is largely true even if you're selling your app, just by having installation requirements or by selling appliances instead of installable apps).
> I take a special objection to [5], `python -V` isn't difficult at all to run, hoping and guessing are not necessary.
I mostly meant that in a simple statement of:
python -c "code"
…you're probably forced to assume that it's Python 2 (or write 2/3 polyglot code) and hope that your assumption is right. You can't run `python -V`: you're a script! The point is that it's automated, or we wouldn't be having this discussion.
Of course, you can inspect the output of python -V (or just import sys and look at sys.version_info.major) and figure it out, but now you need to do that, which requires more code, more thought, testing…
I'd argue that you should probably stick to one subset of things in your bootstrap script -- and I'd say grep, awk, sed and (ba)sh go together; anything "higher level" like python/ruby/perl/tcl does not fit within that. You might want to check for python with a combination of "python -V" and the dance described above -- and, as part of bootstrapping, make a symlink (or copy, if you need to support windows and/or a filesystem without symlink support) to e.g. python2. Save that tidbit as "assert-python2.sh" and then run "assert-python2.sh" first, then "check-bootstrap-deps.sh" and finally "bootstrap.sh" :-)
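Roughly this, for the python2 part (completely untested sketch; /usr/local/bin is just an assumption about where you'd want the link):
#!/bin/sh
# assert-python2.sh - make sure "python2" resolves to a Python 2 interpreter
command -v python2 > /dev/null && exit 0
if command -v python > /dev/null
then
    # this line runs under both 2 and 3, so we can ask the interpreter itself
    major=$(python -c 'import sys; print(sys.version_info[0])')
    if [ "$major" = "2" ]
    then
        ln -s "$(command -v python)" /usr/local/bin/python2 && exit 0
    fi
fi
echo "no usable Python 2 found" >&2
exit 1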
Interesting -- for 1 and 4 I immediately assumed those might break. As for tar, I'd generally prefer something like zcat (or, for scripts, gzip -dc) | tar -x... it makes it easier to change format (both gzip to lzma and tar to cpio). For 2 and 3 I'd be wary of sed for anything that needs to be portable in general. For 3, it seems prudent to use a suffix with -i anyway; explicit being better than implicit most of the time.
As for 5: how many systems have python 2 installed, but no python2 binary/symlink? (I've never had to consider this use-case for production.)
Note that a slight benefit of splitting tar into zcat | tar and replacing python with python2 is that you'll get a nice "command not found" error. You could of course do a dance at the top of your script trying to check for dependencies with "command -v"[1]. If nothing else, such a section will serve as documentation of the dependencies.
Something like:
# NOT TESTED IN PRODUCTION ;-)
checkdeps() {
    depsmissing=0
    for d in "${@}"
    do
        if ! command -v "${d}" > /dev/null
        then
            depsmissing=$(( depsmissing + 1 ))
            if [ ${depsmissing} -gt 126 ]
            then
                depsmissing=126 # error values > 126 may be special
            fi
            echo missing dependency: "${d}"
        # debug output
        #else
        #    echo "${d}" found
        fi
    done
    return ${depsmissing}
}
deps="echo zcat foobarz python2"
checkdeps ${deps}
missing=${?}
if [ "${missing}" -gt 0 ]
then
echo "${missing} or more missing deps"
exit 1
else
echo "Deps ok."
fi
# And you could go nuts checking for alts, along the lines of
# pythons="python2 python python3"
# and at some point have a partial implementation of half of
# autotools ;-)
>The problem is so many programmers don't take the time (or care to take the time) to learn to use the well thought out design of Unix tools opting instead to see every problem as a nail corresponding to the latest trend in hammers (programming languages).
Unix tools are arguably the best tools available to a modern user. That, however, does not mean that the Unix tools are well designed; many would argue that the Unix tools are extremely poorly designed or have no discernible design at all. S-expressions are a much more powerful and useful abstraction than a "stream of bytes". POSIX was hacked on many years later in an attempt to make sense out of the mess that shell commands had become. Shell scripts are very fragile and have never been truly portable across the various *nixes. The situation is better than it was twenty years ago, when it was enormously difficult to port scripts across the various commercial Unix installations because they would break in many different and subtle ways.
Portability always comes at a high complexity cost. If you don't see it personally, someone else in your org does. Instead, look at why you think you need portability.
Why do you really need it to be portable outside of Linux? I don't imagine many services will evolve to run on other platforms in their lifetime without significant configuration changes anyway.
I can't really argue the point about complexity though.
Because there are other operating systems out there too, with their own features making them a better choice (or just a preference) for certain tasks over Linux, including FreeBSD, OpenBSD, DragonFlyBSD, SmartOS, Illumos... The world doesn't end with Linux, nor does it start with it ;)
Chances are that you don't have a mixed fleet for a particular application though. So if you have a bunch of FizzBuzz VMs that you need to bring online, you can safely write shell scripts that work on FreeBSD because you know all FizzBuzz machines will be running FreeBSD. If you've also got your BazQux service that needs a Linux fleet, then for that fleet you create shell scripts that work on Linux.
There may be some overlap between things that must be done on both the FizzBuzz and BazQux fleets, but that overlap is probably in simple tasks.
I think the point was that it's not common for someone to switch from running a server on Linux to running a server on something else, and it's even less common for someone to do that without changing the configuration pretty extensively.
> Honestly this is not a bad idea. Since bringing up a new instance VM will always be in exactly the same state, the shell script is completely deterministic.
Wrong, kind of. There are two sources of variability: 1. everything outside the VM, and 2. the script itself.
1 is mostly dependent on what you're doing. If you're just calculating digits of pi, then yes, it's quite probably deterministic; if you're deploying software that's being pulled from github, running some initialization scripts, and attaching some storage, then you're going to run into variables. All of those actions have failed before: github.com might be down (a rarity, but it happened this week!), your scripts contain new code that's not quite up to par, and the cloud provider says the storage is attached to the VM, but it doesn't actually show up.
2 is that the script is probably in a VCS, and people are changing it. Someone is bound to write a line that doesn't work. (In fact this seems to happen quite often when tests are absent…)
> I can't even think of a drawback.
I can. The biggest one is that bash's arcane syntax is a deathtrap. It's a great shell, but for stuff that needs to work and work reliably, it's riddled with holes. Take the article's script:
sudo apt-get -y install build-essential zlib1g-dev libssl-dev libreadline6-dev libyaml-dev
cd /tmp
wget http://ftp.ruby-lang.org/pub/ruby/2.0/ruby-2.0.0-p247.tar.gz
tar -xzf ruby-2.0.0-p247.tar.gz
cd ruby-2.0.0-p247
./configure --prefix=/usr/local
make
sudo make install
rm -rf /tmp/ruby*
Several of these (apt-get, wget, tar, did you just install code downloaded over an insecure channel onto a server?!, ./configure, make, make install) can easily fail; if they do, your fucking shell script will keep plowing along as if nothing happened. Depending on the next action, this can be meh, or WAT. Since it ends with "rm -rf ...", I think if it does blow up horribly, it'll return success. You can say "set -e" at the top to cause it to bail sooner, but `set -e` won't catch failures in all commands (false | true). Fucking shell scripts.
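To make the pipeline case concrete (bash-specific; pipefail isn't plain POSIX sh):
set -e
false | true      # pipeline's exit status is the last command's, so this "succeeds"
echo "still here" # reached, even though false failed
set -o pipefail   # bash extension: a pipeline fails if any stage fails
false | true      # now the pipeline returns 1 and set -e aborts the script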
Don't get me wrong: shell scripts are great, especially if you need something to work NOW. One-off stuff especially. But if it's going to stick around a while, having something that automatically checks for and raises exceptions/errors when stuff fails is great.
The thing I miss from a lot of these automation libraries is being able to annotate dependencies between commands: that the wget and the apt-get can run together. (The rest pretty much has to run in sequence.)
Libraries also allow people who really know how to make this stuff sing to build the low-level functionality in. That make could be make -j $(( $coeff * $number_of_cores )); make install could be similar. Maybe CFLAGS or CXXFLAGS could compile ruby with a few more options for a slightly more optimized install. We might extract the tarball in a directory where an rm -rf /tmp/ruby* won't inadvertently delete something (unlikely if you're on a new VM, but I find that's not always the case).
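(Sketch of what the shell side of that might look like; getconf _NPROCESSORS_ONLN isn't strictly POSIX, and the coefficient is made up:)
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || nproc)
coeff=2                        # hypothetical oversubscription factor
make -j $(( coeff * cores ))
sudo make install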
Shell scripts are a tool. They have a place. Nobody is saying get rid of them, nor is anyone saying get rid of them for deploys. I just want something a little more robust.
Thank you for explaining this with some clarity; it's early here, and I've been struggling to say this very thing.
Bash is fine, but it's not a sophisticated high-level language. For doing sophisticated things, a simplistic tool is not enough. We need processes that are deterministic and that can act intelligently. The "bash is fine" crowd, in my experience, tends to be the same crowd that thinks that servers are special snowflakes that we must feed and care for. Those days are over.