Cli Tools

Shell Scripting Best Practices for Developers

Essential shell scripting patterns and best practices for software developers, covering bash fundamentals, error handling, portability, testing, and automation workflows.

Shell Scripting Best Practices for Developers

Every developer writes shell scripts. Deployment scripts, build automation, environment setup, data processing pipelines — the shell is the glue that holds your toolchain together. Most shell scripts start as one-liners and grow into unmaintainable monsters because no one applies the same engineering discipline to bash that they apply to application code. This guide covers the practices that keep shell scripts reliable, readable, and maintainable as they grow.

Prerequisites

  • Basic command-line experience
  • Access to a Unix-like shell (bash, zsh, or WSL on Windows)
  • A text editor
  • Willingness to treat shell scripts as real code

The Bash Preamble

Every non-trivial bash script should start with this:

#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'

What each line does:

  • #!/usr/bin/env bash — Uses the bash in $PATH instead of hardcoding /bin/bash. More portable across systems.
  • set -e — Exit immediately if any command fails. Without this, scripts silently continue past errors.
  • set -u — Treat unset variables as errors. Catches typos like $DATABASEURL when you meant $DATABASE_URL.
  • set -o pipefail — A pipeline fails if ANY command in it fails, not just the last one. Without this, curl http://bad-url | jq . succeeds (exit 0) because jq succeeds even though curl failed.
  • IFS=$'\n\t' — Change the Internal Field Separator to newline and tab only. Default IFS includes space, which causes word splitting bugs with filenames containing spaces.
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'

# Your script starts here
echo "This script will fail fast on any error"

Variables and Quoting

Always Quote Variables

# Bad: breaks on filenames with spaces
for file in $FILES; do
  cp $file /backup/
done

# Good: handles spaces correctly
for file in "${FILES[@]}"; do
  cp "$file" /backup/
done

The rule is simple: always double-quote variable expansions. The only exceptions are inside [[ ]] tests (where quoting is optional) and intentional word splitting (which you almost never want).

Variable Declarations

# Use readonly for constants
readonly PROJECT_ROOT="/opt/myapp"
readonly LOG_FILE="${PROJECT_ROOT}/logs/deploy.log"

# Use local in functions
deploy() {
  local version="$1"
  local target_dir="${PROJECT_ROOT}/releases/${version}"
  local timestamp
  timestamp=$(date +%Y%m%d_%H%M%S)

  echo "Deploying version ${version} at ${timestamp}"
}

Default Values

# Default value if unset
DATABASE_HOST="${DATABASE_HOST:-localhost}"
DATABASE_PORT="${DATABASE_PORT:-5432}"

# Error if unset (with set -u, this is automatic, but explicit messages help)
: "${API_KEY:?API_KEY environment variable is required}"

# Default value only if unset (not empty)
LOG_LEVEL="${LOG_LEVEL-info}"

Naming Conventions

# Environment variables and constants: UPPER_CASE
readonly MAX_RETRIES=3
DATABASE_URL="postgresql://localhost/myapp"

# Local variables and function names: lower_case
build_project() {
  local source_dir="$1"
  local output_dir="$2"
  local build_log="/tmp/build_${RANDOM}.log"
}

Error Handling

Trap for Cleanup

#!/usr/bin/env bash
set -euo pipefail

TEMP_DIR=""

cleanup() {
  local exit_code=$?
  if [[ -n "$TEMP_DIR" && -d "$TEMP_DIR" ]]; then
    rm -rf "$TEMP_DIR"
    echo "Cleaned up temp directory"
  fi
  exit $exit_code
}

trap cleanup EXIT

TEMP_DIR=$(mktemp -d)
echo "Working in ${TEMP_DIR}"

# Even if this fails, cleanup runs
cp important_files/* "$TEMP_DIR/"
process_files "$TEMP_DIR"

The trap cleanup EXIT ensures cleanup runs whether the script succeeds, fails, or is interrupted with Ctrl+C.

Error Messages

die() {
  echo "ERROR: $*" >&2
  exit 1
}

warn() {
  echo "WARNING: $*" >&2
}

info() {
  echo "INFO: $*"
}

# Usage
[[ -f "$CONFIG_FILE" ]] || die "Config file not found: $CONFIG_FILE"
command -v docker >/dev/null || die "Docker is not installed"

Always send error messages to stderr (>&2) so they do not pollute stdout output that might be piped to another command.

Retry Logic

retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local cmd=("$@")

  local attempt=1
  while true; do
    if "${cmd[@]}"; then
      return 0
    fi

    if [[ $attempt -ge $max_attempts ]]; then
      echo "ERROR: Command failed after ${max_attempts} attempts: ${cmd[*]}" >&2
      return 1
    fi

    echo "Attempt ${attempt}/${max_attempts} failed. Retrying in ${delay}s..." >&2
    sleep "$delay"
    ((attempt++))
  done
}

# Usage
retry 3 5 curl -sf http://localhost:3000/health
retry 5 2 docker compose exec postgres pg_isready -U appuser

Functions

Writing Good Functions

# Functions should be focused and documented
# Returns 0 on success, 1 on failure

# Check if a port is available
is_port_available() {
  local port="$1"
  ! ss -tlnp 2>/dev/null | grep -q ":${port} " 2>/dev/null
}

# Wait for a service to be ready
wait_for_service() {
  local host="$1"
  local port="$2"
  local timeout="${3:-30}"
  local elapsed=0

  echo "Waiting for ${host}:${port}..."
  while ! nc -z "$host" "$port" 2>/dev/null; do
    if [[ $elapsed -ge $timeout ]]; then
      echo "ERROR: Timed out waiting for ${host}:${port}" >&2
      return 1
    fi
    sleep 1
    ((elapsed++))
  done
  echo "${host}:${port} is ready (${elapsed}s)"
}

# Create a backup of a file before modifying it
backup_file() {
  local file="$1"
  local backup="${file}.bak.$(date +%Y%m%d_%H%M%S)"

  if [[ ! -f "$file" ]]; then
    echo "WARNING: File does not exist: $file" >&2
    return 1
  fi

  cp "$file" "$backup"
  echo "Backup: $backup"
}

Return Values vs Output

# Use return codes for success/failure
file_exists() {
  [[ -f "$1" ]]
}

if file_exists "/app/config.json"; then
  echo "Config found"
fi

# Use stdout for data output
get_git_branch() {
  git rev-parse --abbrev-ref HEAD 2>/dev/null
}

current_branch=$(get_git_branch)
echo "On branch: ${current_branch}"

# NEVER use echo for boolean results
# Bad:
is_production() {
  if [[ "$NODE_ENV" == "production" ]]; then
    echo "true"  # This is fragile
  else
    echo "false"
  fi
}

# Good:
is_production() {
  [[ "${NODE_ENV:-}" == "production" ]]
}

Conditionals and Tests

Use [[ ]] Instead of [ ]

# [[ ]] is a bash keyword with better behavior
# Supports && and || inside
# No word splitting on variables
# Supports regex matching

# String comparison
if [[ "$status" == "healthy" ]]; then
  echo "Service is healthy"
fi

# Regex matching
if [[ "$email" =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]; then
  echo "Valid email"
fi

# File tests
if [[ -f "$config_file" ]]; then echo "File exists"; fi
if [[ -d "$output_dir" ]]; then echo "Directory exists"; fi
if [[ -x "$script" ]]; then echo "Script is executable"; fi
if [[ -s "$log_file" ]]; then echo "File is non-empty"; fi

# Compound conditions
if [[ -f "$config" && -r "$config" ]]; then
  source "$config"
fi

Command Existence Checks

# Check if a command is available
require_command() {
  local cmd="$1"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    die "Required command not found: $cmd"
  fi
}

require_command docker
require_command node
require_command git

Working with Files and Directories

Safe Temporary Files

# Always use mktemp
TEMP_FILE=$(mktemp)
TEMP_DIR=$(mktemp -d)

# Clean up on exit
trap 'rm -f "$TEMP_FILE"; rm -rf "$TEMP_DIR"' EXIT

Safe Directory Operations

# Never cd without checking success
cd "$PROJECT_DIR" || die "Cannot cd to $PROJECT_DIR"

# Or use subshells to avoid changing the parent's directory
(
  cd "$PROJECT_DIR"
  npm install
  npm run build
)
# Back in original directory here

# Use pushd/popd for nested directory changes
pushd "$BUILD_DIR" > /dev/null
make clean && make all
popd > /dev/null

Processing Files Safely

# Handle filenames with spaces and special characters
find "$SOURCE_DIR" -name "*.js" -print0 | while IFS= read -r -d '' file; do
  echo "Processing: $file"
  node --check "$file"
done

# Or use a glob (safer and simpler when possible)
for file in "$SOURCE_DIR"/*.js; do
  [[ -f "$file" ]] || continue  # Skip if no matches
  echo "Processing: $file"
done

Logging and Output

#!/usr/bin/env bash
set -euo pipefail

# Color codes (with fallback for non-color terminals)
if [[ -t 1 ]]; then
  RED='\033[0;31m'
  GREEN='\033[0;32m'
  YELLOW='\033[0;33m'
  BLUE='\033[0;34m'
  NC='\033[0m'
else
  RED='' GREEN='' YELLOW='' BLUE='' NC=''
fi

log_info()  { echo -e "${BLUE}[INFO]${NC}  $*"; }
log_ok()    { echo -e "${GREEN}[OK]${NC}    $*"; }
log_warn()  { echo -e "${YELLOW}[WARN]${NC}  $*" >&2; }
log_error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }

# Usage
log_info "Starting deployment"
log_ok "Database migrated"
log_warn "Cache not configured, using defaults"
log_error "Cannot connect to database"

Script-Level Logging

LOG_FILE="/var/log/deploy-$(date +%Y%m%d).log"

log() {
  local timestamp
  timestamp=$(date '+%Y-%m-%d %H:%M:%S')
  echo "[${timestamp}] $*" | tee -a "$LOG_FILE"
}

log "Deployment started by $(whoami)"

Argument Parsing

Simple Arguments

#!/usr/bin/env bash
set -euo pipefail

usage() {
  cat <<EOF
Usage: $(basename "$0") [OPTIONS] <command>

Commands:
  deploy    Deploy the application
  rollback  Rollback to previous version
  status    Show deployment status

Options:
  -e, --env ENV      Target environment (default: staging)
  -v, --version VER  Version to deploy
  -f, --force        Skip confirmation prompts
  -h, --help         Show this help message

Examples:
  $(basename "$0") deploy -e production -v 1.2.3
  $(basename "$0") rollback -e staging
EOF
}

# Defaults
ENVIRONMENT="staging"
VERSION=""
FORCE=false
COMMAND=""

# Parse arguments
while [[ $# -gt 0 ]]; do
  case "$1" in
    -e|--env)
      ENVIRONMENT="$2"
      shift 2
      ;;
    -v|--version)
      VERSION="$2"
      shift 2
      ;;
    -f|--force)
      FORCE=true
      shift
      ;;
    -h|--help)
      usage
      exit 0
      ;;
    deploy|rollback|status)
      COMMAND="$1"
      shift
      ;;
    *)
      echo "Unknown option: $1" >&2
      usage
      exit 1
      ;;
  esac
done

# Validate
[[ -n "$COMMAND" ]] || { usage; exit 1; }

Practical Script Examples

Deployment Script

#!/usr/bin/env bash
set -euo pipefail

readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly APP_DIR="/opt/myapp"
readonly RELEASES_DIR="${APP_DIR}/releases"
readonly CURRENT_LINK="${APP_DIR}/current"

: "${DEPLOY_USER:?DEPLOY_USER is required}"
: "${DEPLOY_HOST:?DEPLOY_HOST is required}"

log_info()  { echo "[$(date '+%H:%M:%S')] INFO:  $*"; }
log_error() { echo "[$(date '+%H:%M:%S')] ERROR: $*" >&2; }

die() { log_error "$@"; exit 1; }

deploy() {
  local version="$1"
  local release_dir="${RELEASES_DIR}/${version}"

  log_info "Deploying version ${version}"

  # Build
  log_info "Building application..."
  docker build -t "myapp:${version}" .

  # Push to registry
  log_info "Pushing to registry..."
  docker tag "myapp:${version}" "registry.example.com/myapp:${version}"
  docker push "registry.example.com/myapp:${version}"

  # Deploy on remote
  log_info "Deploying to ${DEPLOY_HOST}..."
  ssh "${DEPLOY_USER}@${DEPLOY_HOST}" << REMOTE
    set -euo pipefail
    mkdir -p "${release_dir}"
    docker pull "registry.example.com/myapp:${version}"
    docker stop myapp || true
    docker rm myapp || true
    docker run -d --name myapp \
      --restart unless-stopped \
      -p 3000:3000 \
      --env-file "${APP_DIR}/.env" \
      "registry.example.com/myapp:${version}"

    # Update current symlink
    ln -sfn "${release_dir}" "${CURRENT_LINK}"

    # Health check
    for i in \$(seq 1 30); do
      if curl -sf http://localhost:3000/health > /dev/null; then
        echo "Health check passed"
        exit 0
      fi
      sleep 2
    done
    echo "Health check failed!"
    exit 1
REMOTE

  log_info "Deployment complete: ${version}"
}

# Parse arguments and run
VERSION="${1:?Usage: deploy.sh <version>}"
deploy "$VERSION"

Database Backup Script

#!/usr/bin/env bash
set -euo pipefail

readonly BACKUP_DIR="/backups/postgres"
readonly RETENTION_DAYS=7
readonly TIMESTAMP=$(date +%Y%m%d_%H%M%S)
readonly BACKUP_FILE="${BACKUP_DIR}/myapp_${TIMESTAMP}.sql.gz"

: "${DATABASE_URL:?DATABASE_URL is required}"

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"; }

# Ensure backup directory exists
mkdir -p "$BACKUP_DIR"

# Create backup
log "Starting backup..."
if pg_dump "$DATABASE_URL" | gzip > "$BACKUP_FILE"; then
  local_size=$(du -sh "$BACKUP_FILE" | cut -f1)
  log "Backup created: ${BACKUP_FILE} (${local_size})"
else
  log "ERROR: Backup failed"
  rm -f "$BACKUP_FILE"
  exit 1
fi

# Remove old backups
log "Cleaning up backups older than ${RETENTION_DAYS} days..."
find "$BACKUP_DIR" -name "myapp_*.sql.gz" -mtime "+${RETENTION_DAYS}" -delete
remaining=$(find "$BACKUP_DIR" -name "myapp_*.sql.gz" | wc -l)
log "Cleanup complete. ${remaining} backups remaining."

Common Issues and Troubleshooting

1. Script Fails with "command not found" in Cron

/opt/scripts/deploy.sh: line 15: docker: command not found

Cron runs with a minimal PATH. Add the full PATH at the top of your script:

export PATH="/usr/local/bin:/usr/bin:/bin:$PATH"

2. Script Works Locally but Fails on Server

syntax error near unexpected token `$'\r''

Windows line endings (CRLF) in the script. Fix with:

sed -i 's/\r$//' script.sh
# Or use dos2unix
dos2unix script.sh

3. Variables Expanded in Heredoc When They Shouldn't Be

# Variables ARE expanded (unquoted delimiter)
cat << EOF
  Home: $HOME
EOF

# Variables are NOT expanded (quoted delimiter)
cat << 'EOF'
  Home: $HOME
EOF

4. Subshell Variable Scope

# This doesn't work — pipe creates a subshell
count=0
cat file.txt | while read -r line; do
  ((count++))
done
echo "$count"  # Always 0!

# Fix: use process substitution or redirect
count=0
while read -r line; do
  ((count++))
done < file.txt
echo "$count"  # Correct count

Best Practices

  • Start every script with set -euo pipefail. This catches 90% of bugs that would otherwise go unnoticed.
  • Quote all variable expansions. "$variable" not $variable. Always.
  • Use [[ ]] instead of [ ]. Double brackets handle edge cases better and support regex.
  • Clean up temporary files with trap ... EXIT. Ensure cleanup runs even on errors.
  • Send errors to stderr. echo "error" >&2 keeps error messages separate from data output.
  • Use functions for any logic you would copy-paste. Functions make scripts testable and readable.
  • Use shellcheck on every script. It catches bugs that even experienced shell programmers miss:
    shellcheck deploy.sh
    
  • Prefer portability when possible. Use #!/usr/bin/env bash over #!/bin/bash. Avoid bash-only features if the script might run on sh.
  • Test scripts in a clean environment. Docker containers make this easy — your script should work without your personal shell config.

References

Powered by Contentful