Shell Scripting Best Practices for Developers
Essential shell scripting patterns and best practices for software developers, covering bash fundamentals, error handling, portability, testing, and automation workflows.
Every developer writes shell scripts. Deployment scripts, build automation, environment setup, data processing pipelines — the shell is the glue that holds your toolchain together. Most shell scripts start as one-liners and grow into unmaintainable monsters because no one applies the same engineering discipline to bash that they apply to application code. This guide covers the practices that keep shell scripts reliable, readable, and maintainable as they grow.
Prerequisites
- Basic command-line experience
- Access to a Unix-like shell (bash, zsh, or WSL on Windows)
- A text editor
- Willingness to treat shell scripts as real code
The Bash Preamble
Every non-trivial bash script should start with this:
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
What each line does:
- #!/usr/bin/env bash uses the bash found in $PATH instead of hardcoding /bin/bash, which is more portable across systems.
- set -e exits immediately if any command fails. Without this, scripts silently continue past errors.
- set -u treats unset variables as errors. It catches typos like $DATABASEURL when you meant $DATABASE_URL.
- set -o pipefail makes a pipeline fail if ANY command in it fails, not just the last one. Without this, curl http://bad-url | jq . exits 0 because jq succeeds even though curl failed.
- IFS=$'\n\t' changes the Internal Field Separator to newline and tab only. The default IFS also includes space, which causes word-splitting bugs with filenames that contain spaces.
#!/usr/bin/env bash
set -euo pipefail
IFS=$'\n\t'
# Your script starts here
echo "This script will fail fast on any error"
Variables and Quoting
Always Quote Variables
# Bad: breaks on filenames with spaces
for file in $FILES; do
cp $file /backup/
done
# Good: use an array (FILES=( ... )) and quote each expansion to handle spaces
for file in "${FILES[@]}"; do
cp "$file" /backup/
done
The rule is simple: always double-quote variable expansions. The only exceptions are inside [[ ]] tests (where quoting is optional) and intentional word splitting (which you almost never want).
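A small illustration of that exception, assuming the default IFS (the preamble above replaces it, which hides the space-splitting problem):
path="my file.txt"
# Inside [[ ]] there is no word splitting, so quotes are optional
[[ -f $path ]] && echo "found"
# With [ ] or as a command argument, the unquoted expansion splits into two words
[ -f $path ] && echo "found"    # error: binary operator expected
cat $path                       # cat sees "my" and "file.txt"
cat "$path"                     # correct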
Variable Declarations
# Use readonly for constants
readonly PROJECT_ROOT="/opt/myapp"
readonly LOG_FILE="${PROJECT_ROOT}/logs/deploy.log"
# Use local in functions
deploy() {
local version="$1"
local target_dir="${PROJECT_ROOT}/releases/${version}"
local timestamp
timestamp=$(date +%Y%m%d_%H%M%S)
echo "Deploying version ${version} at ${timestamp}"
}
Default Values
# Default value if unset or empty (the :- form)
DATABASE_HOST="${DATABASE_HOST:-localhost}"
DATABASE_PORT="${DATABASE_PORT:-5432}"
# Error if unset (with set -u, this is automatic, but explicit messages help)
: "${API_KEY:?API_KEY environment variable is required}"
# Default applied only if unset; an empty value is kept (the - form)
LOG_LEVEL="${LOG_LEVEL-info}"
Naming Conventions
# Environment variables and constants: UPPER_CASE
readonly MAX_RETRIES=3
DATABASE_URL="postgresql://localhost/myapp"
# Local variables and function names: lower_case
build_project() {
local source_dir="$1"
local output_dir="$2"
local build_log="/tmp/build_${RANDOM}.log"
}
Error Handling
Trap for Cleanup
#!/usr/bin/env bash
set -euo pipefail
TEMP_DIR=""
cleanup() {
local exit_code=$?
if [[ -n "$TEMP_DIR" && -d "$TEMP_DIR" ]]; then
rm -rf "$TEMP_DIR"
echo "Cleaned up temp directory"
fi
exit $exit_code
}
trap cleanup EXIT
TEMP_DIR=$(mktemp -d)
echo "Working in ${TEMP_DIR}"
# Even if this fails, cleanup runs
cp important_files/* "$TEMP_DIR/"
process_files "$TEMP_DIR"
The trap cleanup EXIT ensures cleanup runs whether the script succeeds, fails, or is interrupted with Ctrl+C.
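A minimal standalone demo of that behavior (not part of the example above): run it, then either let the failing command abort it or interrupt it, and the trap still fires.
#!/usr/bin/env bash
set -euo pipefail
trap 'echo "EXIT trap ran (status: $?)"' EXIT
echo "Press Ctrl+C during the sleep, or wait for the failing command"
sleep 10
false    # set -e aborts the script here; the trap still runs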
Error Messages
die() {
echo "ERROR: $*" >&2
exit 1
}
warn() {
echo "WARNING: $*" >&2
}
info() {
echo "INFO: $*"
}
# Usage
[[ -f "$CONFIG_FILE" ]] || die "Config file not found: $CONFIG_FILE"
command -v docker >/dev/null || die "Docker is not installed"
Always send error messages to stderr (>&2) so they do not pollute stdout output that might be piped to another command.
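That separation is what lets callers capture or pipe the real output. Here check_tools.sh is a hypothetical script built on the helpers above:
output=$(./check_tools.sh)      # captures only stdout; warnings still reach the terminal
./check_tools.sh | sort         # only stdout flows into the pipe
./check_tools.sh 2>>errors.log  # or redirect the diagnostics to a file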
Retry Logic
retry() {
local max_attempts="$1"
local delay="$2"
shift 2
local cmd=("$@")
local attempt=1
while true; do
if "${cmd[@]}"; then
return 0
fi
if [[ $attempt -ge $max_attempts ]]; then
echo "ERROR: Command failed after ${max_attempts} attempts: ${cmd[*]}" >&2
return 1
fi
echo "Attempt ${attempt}/${max_attempts} failed. Retrying in ${delay}s..." >&2
sleep "$delay"
((attempt++))
done
}
# Usage
retry 3 5 curl -sf http://localhost:3000/health
retry 5 2 docker compose exec postgres pg_isready -U appuser
Functions
Writing Good Functions
# Functions should be focused and documented
# Returns 0 on success, 1 on failure
# Check if a port is available
is_port_available() {
local port="$1"
! ss -tln 2>/dev/null | grep -q ":${port} "
}
# Wait for a service to be ready
wait_for_service() {
local host="$1"
local port="$2"
local timeout="${3:-30}"
local elapsed=0
echo "Waiting for ${host}:${port}..."
while ! nc -z "$host" "$port" 2>/dev/null; do
if [[ $elapsed -ge $timeout ]]; then
echo "ERROR: Timed out waiting for ${host}:${port}" >&2
return 1
fi
sleep 1
elapsed=$((elapsed + 1))  # ((elapsed++)) returns 1 when elapsed is 0, which would trip set -e
done
echo "${host}:${port} is ready (${elapsed}s)"
}
# Create a backup of a file before modifying it
backup_file() {
local file="$1"
local backup
backup="${file}.bak.$(date +%Y%m%d_%H%M%S)"
if [[ ! -f "$file" ]]; then
echo "WARNING: File does not exist: $file" >&2
return 1
fi
cp "$file" "$backup"
echo "Backup: $backup"
}
Return Values vs Output
# Use return codes for success/failure
file_exists() {
[[ -f "$1" ]]
}
if file_exists "/app/config.json"; then
echo "Config found"
fi
# Use stdout for data output
get_git_branch() {
git rev-parse --abbrev-ref HEAD 2>/dev/null
}
current_branch=$(get_git_branch)
echo "On branch: ${current_branch}"
# NEVER use echo for boolean results
# Bad:
is_production() {
if [[ "$NODE_ENV" == "production" ]]; then
echo "true" # This is fragile
else
echo "false"
fi
}
# Good:
is_production() {
[[ "${NODE_ENV:-}" == "production" ]]
}
Conditionals and Tests
Use [[ ]] Instead of [ ]
# [[ ]] is a bash keyword with better behavior
# Supports && and || inside
# No word splitting on variables
# Supports regex matching
# String comparison
if [[ "$status" == "healthy" ]]; then
echo "Service is healthy"
fi
# Regex matching
if [[ "$email" =~ ^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$ ]]; then
echo "Valid email"
fi
# File tests
if [[ -f "$config_file" ]]; then echo "File exists"; fi
if [[ -d "$output_dir" ]]; then echo "Directory exists"; fi
if [[ -x "$script" ]]; then echo "Script is executable"; fi
if [[ -s "$log_file" ]]; then echo "File is non-empty"; fi
# Compound conditions
if [[ -f "$config" && -r "$config" ]]; then
source "$config"
fi
Command Existence Checks
# Check if a command is available
require_command() {
local cmd="$1"
if ! command -v "$cmd" >/dev/null 2>&1; then
die "Required command not found: $cmd"
fi
}
require_command docker
require_command node
require_command git
Working with Files and Directories
Safe Temporary Files
# Always use mktemp
TEMP_FILE=$(mktemp)
TEMP_DIR=$(mktemp -d)
# Clean up on exit
trap 'rm -f "$TEMP_FILE"; rm -rf "$TEMP_DIR"' EXIT
Safe Directory Operations
# Never cd without checking success
cd "$PROJECT_DIR" || die "Cannot cd to $PROJECT_DIR"
# Or use subshells to avoid changing the parent's directory
(
cd "$PROJECT_DIR"
npm install
npm run build
)
# Back in original directory here
# Use pushd/popd for nested directory changes
pushd "$BUILD_DIR" > /dev/null
make clean && make all
popd > /dev/null
Processing Files Safely
# Handle filenames with spaces and special characters
find "$SOURCE_DIR" -name "*.js" -print0 | while IFS= read -r -d '' file; do
echo "Processing: $file"
node --check "$file"
done
# Or use a glob (safer and simpler when possible)
for file in "$SOURCE_DIR"/*.js; do
[[ -f "$file" ]] || continue # Skip if no matches
echo "Processing: $file"
done
Logging and Output
#!/usr/bin/env bash
set -euo pipefail
# Color codes (with fallback for non-color terminals)
if [[ -t 1 ]]; then
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[0;33m'
BLUE='\033[0;34m'
NC='\033[0m'
else
RED='' GREEN='' YELLOW='' BLUE='' NC=''
fi
log_info() { echo -e "${BLUE}[INFO]${NC} $*"; }
log_ok() { echo -e "${GREEN}[OK]${NC} $*"; }
log_warn() { echo -e "${YELLOW}[WARN]${NC} $*" >&2; }
log_error() { echo -e "${RED}[ERROR]${NC} $*" >&2; }
# Usage
log_info "Starting deployment"
log_ok "Database migrated"
log_warn "Cache not configured, using defaults"
log_error "Cannot connect to database"
Script-Level Logging
LOG_FILE="/var/log/deploy-$(date +%Y%m%d).log"
log() {
local timestamp
timestamp=$(date '+%Y-%m-%d %H:%M:%S')
echo "[${timestamp}] $*" | tee -a "$LOG_FILE"
}
log "Deployment started by $(whoami)"
Argument Parsing
Simple Arguments
#!/usr/bin/env bash
set -euo pipefail
usage() {
cat <<EOF
Usage: $(basename "$0") [OPTIONS] <command>
Commands:
deploy Deploy the application
rollback Rollback to previous version
status Show deployment status
Options:
-e, --env ENV Target environment (default: staging)
-v, --version VER Version to deploy
-f, --force Skip confirmation prompts
-h, --help Show this help message
Examples:
$(basename "$0") deploy -e production -v 1.2.3
$(basename "$0") rollback -e staging
EOF
}
# Defaults
ENVIRONMENT="staging"
VERSION=""
FORCE=false
COMMAND=""
# Parse arguments
while [[ $# -gt 0 ]]; do
case "$1" in
-e|--env)
ENVIRONMENT="$2"
shift 2
;;
-v|--version)
VERSION="$2"
shift 2
;;
-f|--force)
FORCE=true
shift
;;
-h|--help)
usage
exit 0
;;
deploy|rollback|status)
COMMAND="$1"
shift
;;
*)
echo "Unknown option: $1" >&2
usage
exit 1
;;
esac
done
# Validate
[[ -n "$COMMAND" ]] || { usage; exit 1; }
Practical Script Examples
Deployment Script
#!/usr/bin/env bash
set -euo pipefail
readonly SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
readonly APP_DIR="/opt/myapp"
readonly RELEASES_DIR="${APP_DIR}/releases"
readonly CURRENT_LINK="${APP_DIR}/current"
: "${DEPLOY_USER:?DEPLOY_USER is required}"
: "${DEPLOY_HOST:?DEPLOY_HOST is required}"
log_info() { echo "[$(date '+%H:%M:%S')] INFO: $*"; }
log_error() { echo "[$(date '+%H:%M:%S')] ERROR: $*" >&2; }
die() { log_error "$@"; exit 1; }
deploy() {
local version="$1"
local release_dir="${RELEASES_DIR}/${version}"
log_info "Deploying version ${version}"
# Build
log_info "Building application..."
docker build -t "myapp:${version}" .
# Push to registry
log_info "Pushing to registry..."
docker tag "myapp:${version}" "registry.example.com/myapp:${version}"
docker push "registry.example.com/myapp:${version}"
# Deploy on remote
log_info "Deploying to ${DEPLOY_HOST}..."
ssh "${DEPLOY_USER}@${DEPLOY_HOST}" << REMOTE
set -euo pipefail
mkdir -p "${release_dir}"
docker pull "registry.example.com/myapp:${version}"
docker stop myapp || true
docker rm myapp || true
docker run -d --name myapp \
--restart unless-stopped \
-p 3000:3000 \
--env-file "${APP_DIR}/.env" \
"registry.example.com/myapp:${version}"
# Update current symlink
ln -sfn "${release_dir}" "${CURRENT_LINK}"
# Health check
for i in \$(seq 1 30); do
if curl -sf http://localhost:3000/health > /dev/null; then
echo "Health check passed"
exit 0
fi
sleep 2
done
echo "Health check failed!"
exit 1
REMOTE
log_info "Deployment complete: ${version}"
}
# Parse arguments and run
VERSION="${1:?Usage: deploy.sh <version>}"
deploy "$VERSION"
Database Backup Script
#!/usr/bin/env bash
set -euo pipefail
readonly BACKUP_DIR="/backups/postgres"
readonly RETENTION_DAYS=7
readonly TIMESTAMP=$(date +%Y%m%d_%H%M%S)
readonly BACKUP_FILE="${BACKUP_DIR}/myapp_${TIMESTAMP}.sql.gz"
: "${DATABASE_URL:?DATABASE_URL is required}"
log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $*"; }
# Ensure backup directory exists
mkdir -p "$BACKUP_DIR"
# Create backup
log "Starting backup..."
if pg_dump "$DATABASE_URL" | gzip > "$BACKUP_FILE"; then
backup_size=$(du -sh "$BACKUP_FILE" | cut -f1)
log "Backup created: ${BACKUP_FILE} (${backup_size})"
else
log "ERROR: Backup failed"
rm -f "$BACKUP_FILE"
exit 1
fi
# Remove old backups
log "Cleaning up backups older than ${RETENTION_DAYS} days..."
find "$BACKUP_DIR" -name "myapp_*.sql.gz" -mtime "+${RETENTION_DAYS}" -delete
remaining=$(find "$BACKUP_DIR" -name "myapp_*.sql.gz" | wc -l)
log "Cleanup complete. ${remaining} backups remaining."
Common Issues and Troubleshooting
1. Script Fails with "command not found" in Cron
/opt/scripts/deploy.sh: line 15: docker: command not found
Cron runs with a minimal PATH. Add the full PATH at the top of your script:
export PATH="/usr/local/bin:/usr/bin:/bin:$PATH"
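Most common cron implementations (vixie-cron, cronie) also let you set PATH at the top of the crontab itself. The schedule and log path here are placeholders:
PATH=/usr/local/bin:/usr/bin:/bin
0 3 * * * /opt/scripts/deploy.sh >> /var/log/deploy-cron.log 2>&1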
2. Script Works Locally but Fails on Server
syntax error near unexpected token `$'\r''
Windows line endings (CRLF) in the script. Fix with:
sed -i 's/\r$//' script.sh
# Or use dos2unix
dos2unix script.sh
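To keep CRLF endings from coming back through Git, a .gitattributes rule can force LF for shell scripts (this assumes the scripts are tracked in a Git repository):
# .gitattributes
*.sh text eol=lf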
3. Variables Expanded in Heredoc When They Shouldn't Be
# Variables ARE expanded (unquoted delimiter)
cat << EOF
Home: $HOME
EOF
# Variables are NOT expanded (quoted delimiter)
cat << 'EOF'
Home: $HOME
EOF
4. Subshell Variable Scope
# This doesn't work — pipe creates a subshell
count=0
cat file.txt | while read -r line; do
((count++))
done
echo "$count" # Always 0!
# Fix: use process substitution or redirect
count=0
while read -r line; do
count=$((count + 1))    # also avoids the ((count++)) pitfall under set -e (returns 1 when count is 0)
done < file.txt
echo "$count" # Correct count
Best Practices
- Start every script with set -euo pipefail. It catches most of the bugs that would otherwise slip by silently.
- Quote all variable expansions. "$variable", not $variable. Always.
- Use [[ ]] instead of [ ]. Double brackets handle edge cases better and support regex.
- Clean up temporary files with trap ... EXIT. Ensure cleanup runs even on errors.
- Send errors to stderr. echo "error" >&2 keeps error messages separate from data output.
- Use functions for any logic you would copy-paste. Functions make scripts testable and readable.
- Use shellcheck on every script. It catches bugs that even experienced shell programmers miss: shellcheck deploy.sh
- Prefer portability when possible. Use #!/usr/bin/env bash over #!/bin/bash. Avoid bash-only features if the script might run under sh.
- Test scripts in a clean environment. Docker containers make this easy (see the sketch after this list); your script should work without your personal shell config.
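A rough way to cover the shellcheck and clean-environment points without installing anything locally, assuming Docker is available; the image names are the commonly published ones, and deploy.sh stands in for your own script:
# Lint with the shellcheck container image
docker run --rm -v "$PWD:/mnt" koalaman/shellcheck:stable deploy.sh
# Run the script in a clean bash container (works only if its dependencies exist in the image)
docker run --rm -v "$PWD:/work" -w /work bash:5.2 ./deploy.sh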