Best practices for shell scripts

We strive to write our scripts in Bash, which has many more built-in features than the older sh.

We currently have shell scripts in two forms:

Ordinary .sh/.bash files
Scripts that are run in a pipeline, usually defined in .gitlab-ci.yml

When not to use shell scripts

Attempting to do general-purpose programming in a shell script, while possible, is discouraged for readability and maintainability reasons.

Shell scripts should be used when you need to automate a series of commands. For more complex string manipulation tasks consider awk or sed.

A good rule of thumb is if bash is getting in the way of what you are trying to achieve, consider more specialized tools.

Writing scripts in Bash

Below you will find a set of guidelines for writing scripts in Bash. These practices have been defined over a couple of years and are based on the following:

1. Headers and executable flag

The script needs a so-called shebang line at the top of the file:

#!/usr/bin/env bash
set -eo pipefail

The reason to use /usr/bin/env is that in some environments bash is not found at /bin/bash.

It is recommended to set -e and -o pipefail so that the script exits as soon as any command, including one inside a pipeline, fails with a non-zero exit code. This is the default behavior in GitLab CI, but it does not carry over to a shell script that you call from a job.
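As a minimal sketch of what -o pipefail changes: the pipeline `false | true` below is purely illustrative and stands in for a real pipeline whose first command fails. The variable names are made up for this example.

```shell
#!/usr/bin/env bash

# Without pipefail the status of a pipeline is that of its LAST command,
# so the failure of 'false' goes unnoticed.
set +o pipefail
if false | true; then
  result_without="success"
else
  result_without="failure"
fi

# With pipefail the pipeline fails if ANY command in it fails.
set -o pipefail
if false | true; then
  result_with="success"
else
  result_with="failure"
fi

echo "without pipefail: ${result_without}"
echo "with pipefail:    ${result_with}"
```

Combined with -e, the second form is what makes a script stop when an early stage of a pipeline breaks.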

The file also needs to be executable. On a Linux machine you add the +x flag using chmod: chmod +x <your-script.sh>.

On Windows you do it when you commit the file to Git:

git update-index --chmod=+x <your-script.sh>

The executable flag can be ignored when the script is within CI YAML files.

2. Format your scripts with `shfmt`

Use shfmt to format your scripts:

shfmt -i 2 -ci -sr -w <your-script.sh>

The rules that the pipeline enforces are:

-i 2: indent with 2 spaces
-ci: indent switch cases
-sr: redirect operators are followed by a space
-w: modify the file instead of printing to stdout

3. Lint your scripts with `shellcheck`

Run shellcheck to lint your scripts and handle any and all problems it finds:

shellcheck <your-script.sh>

4. Use Bash `if` and `else` syntax

Always use double brackets in ifs:

if [[ -f filename ]]; then
  # do something
else
  # do something else
fi

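The reasoning is that the single bracket [ is an ordinary command, equivalent to test, so unquoted variable expansions and globs inside it undergo word splitting and pathname expansion and work in non-obvious ways. The double bracket [[ is shell syntax and does not split or glob its operands.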
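The difference can be sketched with a value that contains spaces. The file name below is hypothetical and is assumed not to exist in the current directory; the unquoted use inside [ is deliberate, to show the failure mode.

```shell
#!/usr/bin/env bash

# Hypothetical file name containing spaces.
file_name="no such file.txt"

# [[ ]] does not word-split the variable: a well-formed test that
# simply reports "not a regular file" (status 1).
double_status=0
[[ -f $file_name ]] || double_status=$?

# [ ] receives the value split into several words and fails with a
# usage error (status 2) instead of answering the question.
single_status=0
[ -f $file_name ] 2> /dev/null || single_status=$?

echo "[[ ]] status: ${double_status}"
echo "[ ]   status: ${single_status}"
```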

5. Always quote your variables

If you use a variable without quotes, its value undergoes word splitting according to the $IFS variable, followed by filename (glob) expansion.
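A small sketch of the effect, assuming the default $IFS. The helper count_args and the variable greeting are invented for this example.

```shell
#!/usr/bin/env bash

# Report how many arguments a command actually receives.
count_args() {
  echo "$#"
}

# Hypothetical value containing spaces.
greeting="hello big world"

unquoted=$(count_args $greeting)   # split into three separate words
quoted=$(count_args "$greeting")   # passed as a single argument

echo "unquoted: ${unquoted} argument(s)"
echo "quoted:   ${quoted} argument(s)"
```

The same splitting happens with file names and paths, which is why an unquoted variable passed to rm or cp can act on the wrong files.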

6. Use builtins instead of relying on external commands

Bash can do somewhat advanced string manipulation directly including string replacement, regex matching and upper/lowercasing.

# Replace
str="/path/to/Foo.java"
echo "${str/Foo/Bar}" # /path/to/Bar.java

# Brace expansion
echo {Foo,Bar}.java # Foo.java Bar.java
echo {1..2}{3..4}   # 13 14 23 24
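The regex matching and case conversion mentioned above can be sketched in the same way. All values are hypothetical, and the case operators assume Bash 4 or newer.

```shell
#!/usr/bin/env bash

# Hypothetical input used only for illustration.
file="/path/to/Foo.java"

# Strip the directory and the extension with parameter expansion.
base="${file##*/}"     # Foo.java
name="${base%.java}"   # Foo

# Upper- and lowercasing (Bash 4 or newer).
upper="${name^^}"      # FOO
lower="${name,,}"      # foo

# Regex matching with =~; capture groups end up in BASH_REMATCH.
version="release-1.23.4"
re='^release-([0-9]+)\.([0-9]+)\.([0-9]+)$'
if [[ $version =~ $re ]]; then
  major="${BASH_REMATCH[1]}"   # 1
  minor="${BASH_REMATCH[2]}"   # 23
fi

echo "${base} ${name} ${upper} ${lower} ${major}.${minor}"
```

Keeping the pattern in a variable avoids quoting surprises on the right-hand side of =~.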

See more in the Devhints.io Bash cheatsheet and Bash Hackers Wiki.

7. Functions

Functions can be written in different ways but the preferred way is:

function name_of_function() {
    # Code
}

Using the function keyword is not syntactically necessary but it makes functions stand out more clearly.

Functions behave as if they were external programs, but they inherit shell variables from their calling scope. A couple of points to remember:

The return keyword returns an exit code
A function can only return an exit code, use echo and printf to pass values out
A function can change variables defined in the global scope, use local VAR=... to declare it local
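These points can be sketched as follows. The function and variable names are invented for this example.

```shell
#!/usr/bin/env bash

counter=0

# Passes a value out on stdout; 'return' only carries an exit code.
function get_extension() {
  local path="$1"
  if [[ $path != *.* ]]; then
    return 1                     # failure: no extension found
  fi
  printf '%s\n' "${path##*.}"
}

function increment_counter() {
  local step=1                   # local: gone when the function returns
  counter=$((counter + step))    # not local: changes the global variable
}

extension=$(get_extension "Foo.java")   # capture the printed value
increment_counter
increment_counter

echo "extension=${extension} counter=${counter} step=${step:-unset}"
```

A caller can also branch on the exit code directly, for example `if get_extension "$file" > /dev/null; then ...`.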

8. `cd` vs `pushd`/`popd`

The recommendation is to use pushd and popd if you intend to go back and forth in a filesystem. When using pushd you push a directory onto a stack and can popd back to the previous directory at any time. If you use cd - to go back you can only jump back once.
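A minimal sketch of the directory stack, using two scratch directories created with mktemp -d purely for illustration. pushd and popd print the stack, so their output is discarded here.

```shell
#!/usr/bin/env bash

start_dir="$PWD"

# Hypothetical working directories.
first_dir=$(mktemp -d)
second_dir=$(mktemp -d)

pushd "$first_dir" > /dev/null    # stack: first, start
pushd "$second_dir" > /dev/null   # stack: second, first, start

popd > /dev/null                  # back in first_dir
after_first_popd="$PWD"
popd > /dev/null                  # back in start_dir
after_second_popd="$PWD"

rmdir "$first_dir" "$second_dir"

echo "after first popd:  ${after_first_popd}"
echo "after second popd: ${after_second_popd}"
```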

Using subshells to isolate a cd call

You can use a subshell to make the pattern of changing into a directory and running a command there a little cleaner:

cd /some/path || exit
# $PWD is /some/path

(
    cd /other/path || exit
    # $PWD is /other/path
)

# $PWD is /some/path

Subshells inherit variables from the parent shell but cannot change variables in it.
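The variable scoping of a subshell can be sketched like this. The variable build_mode is made up for this example.

```shell
#!/usr/bin/env bash

build_mode="debug"

(
  # The subshell starts with a copy of the parent's variables...
  inherited="$build_mode"
  # ...but this assignment is lost when the subshell exits.
  build_mode="release"
  echo "inside:  inherited=${inherited} build_mode=${build_mode}"
)

echo "outside: build_mode=${build_mode}"
```

To get a value out of a subshell, print it and capture it with command substitution, as with functions.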

9. Use a tool that understands the syntax when parsing JSON/YAML/XML

These formats are ubiquitous and you will encounter them fairly often. They cannot be parsed reliably with regular expressions, and attempting to do so will yield incorrect results in all but the most trivial cases. It is therefore recommended to use the following tools for the respective formats:

For JSON use jq
For YAML use yq
For XML use xmllint or xmlstarlet or a combination thereof
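As a sketch for the JSON case, assuming jq is installed. The JSON document and field names are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical JSON document used only for illustration.
json='{"name": "demo", "tags": ["a", "b"], "version": {"major": 1, "minor": 4}}'

if command -v jq > /dev/null; then
  # -r prints raw strings instead of JSON-quoted ones.
  name=$(jq -r '.name' <<< "$json")
  minor=$(jq -r '.version.minor' <<< "$json")
  tag_count=$(jq '.tags | length' <<< "$json")
  echo "name=${name} minor=${minor} tags=${tag_count}"
else
  echo "jq is not installed" >&2
fi
```

Unlike a regular expression, the jq filters keep working when the document is reformatted, reordered or nested differently.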

10. Formatting output

When writing output from your script consider using:

The Bash builtin printf, mostly the same syntax as in C
The column utility for making aligned tables of text
If the script is primarily made for GitLab CI, see here for more
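A short sketch of aligned output with printf. The helper print_row and the values are invented for this example.

```shell
#!/usr/bin/env bash

# Print one row of a two-column table: a left-aligned label padded to
# 10 characters, a space, and a right-aligned integer padded to 5.
print_row() {
  printf '%-10s %5d\n' "$1" "$2"
}

print_row "passed" 128
print_row "failed" 3
```

For tables whose column widths are not known in advance, piping delimiter-separated lines through column -t is usually less work.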

Coloring text

Coloring the output can add a much-needed visual separation. This is achieved using ANSI escape codes.

If your script is meant to be consumed by other scripts or piped to a file, then consider checking whether the output is a TTY and emitting colors only in that case.
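One way to sketch such a check is the -t test, which is true when the given file descriptor is a terminal. The function name print_ok is hypothetical.

```shell
#!/usr/bin/env bash

# Wrap a message in green only when stdout (file descriptor 1) is a TTY.
print_ok() {
  if [[ -t 1 ]]; then
    printf '\033[32m%s\033[0m\n' "$1"   # ANSI green, then reset
  else
    printf '%s\n' "$1"                  # plain text for pipes and files
  fi
}

print_ok "build finished"
```

When the output is captured or redirected, for example `print_ok "done" > log.txt`, no escape codes end up in the result.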

11. Cross-platform paths

When writing Bash scripts on Windows (Git Bash, Cygwin, and WSL when crossing OS barriers) one has to be careful when dealing with paths.

Relative paths are probably safe
The Windows path separator \ marks an escape sequence in Bash, use / instead
Use //dir to pass a unix path as an argument because Git Bash automatically translates unix paths so /dir becomes C:\path-to-git\dir
Absolute paths must be converted into what the receiver understands

Use cygpath on Windows to convert between unix and native paths:

cygpath -w /some/path converts a path from unix to Windows
cygpath -u 'C:\some\path' converts a path from Windows to unix

Detecting Windows

Since the release of Windows Subsystem for Linux we can no longer rely on the built-in $OSTYPE. Instead, check the output of uname:

unameOut=$(uname -a)
case "${unameOut}" in
    *Microsoft*) OS="WSL";;     # WSL 1
    *microsoft*) OS="WSL2";;    # WSL 2
    Linux*)      OS="Linux";;
    Darwin*)     OS="Mac";;
    CYGWIN*)     OS="Cygwin";;
    MINGW*)      OS="Windows";;
    *Msys)       OS="Windows";;
    *)           OS="UNKNOWN:${unameOut}";;
esac

Source: StackOverflow

Example of path mismatch with Docker bind mounts

The following command fails in Git Bash and Cygwin:

# Does not work
docker run -it -v "${PWD}:/app" alpine sh

To make it run in Git Bash and Cygwin, prepend a / to absolute paths:

# Works as intended
docker run -it -v "/${PWD}:/app" alpine sh

Which external commands/tools can I use?

These are guaranteed to be in the Docker images running in GitLab:

GNU coreutils: cat, sort, uniq etc.
jq for JSON and the sister tool yq for YAML
curl and wget for network stuff
gawk the GNU version of AWK, aliased as awk
xmllint and xmlstarlet for your eXtensible Markup needs

For a full list of installed utilities look into the Dockerfile for the shellscripting image in DockerImages.

The language-specific images for Java, Rust and TypeScript have more tools available. See their respective Dockerfiles for a full list.