Lab 5: Shell Scripting

| Field | Value |
|---|---|
| Tier | 1 (Foundations) |
| Estimated Time | 30 minutes |
| Prerequisites | Bash basics |
| Auto-Grade | Yes |

Scenario

Your team runs five critical services on a bare-metal server: a web server, an API gateway, a cache (Redis), a database (PostgreSQL), and a message queue (RabbitMQ). Currently, there is no monitoring — someone just checks manually every few hours. Last week, the cache went down for six hours before anyone noticed. Customers complained about slow response times and your VP wants answers.

You have been asked to write a monitoring script that runs every minute via cron. The script must check each service, log results, and send an alert (write to an alert file) when any service is down. It should also track uptime percentages and generate a daily summary. The script needs to be production-grade: proper error handling, log rotation awareness, and clean exit codes.

Objectives

  • Create monitor.sh that checks 5 services via their health endpoints
  • Script logs results to /tmp/lab-shell/logs/monitor.log with timestamps
  • Script writes to /tmp/lab-shell/alerts/active.txt when a service is down
  • Script clears alerts when a previously-down service recovers
  • Script exits 0 when all services healthy, 1 when any service is down
  • Script handles missing directories gracefully (creates them if needed)
  • Script is idempotent (safe to run multiple times)
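
The directory-handling and idempotency objectives can be covered by one small helper. The following is a minimal sketch, not the reference solution; `LAB_ROOT` and `ensure_dirs` are hypothetical names introduced here for illustration.

```shell
#!/usr/bin/env bash
# LAB_ROOT is a hypothetical override; the lab's fixed location is /tmp/lab-shell.
LAB_ROOT="${LAB_ROOT:-/tmp/lab-shell}"
LOG_DIR="$LAB_ROOT/logs"
ALERT_DIR="$LAB_ROOT/alerts"

ensure_dirs() {
    # mkdir -p succeeds whether or not the directories already exist,
    # so calling this on every run is safe.
    mkdir -p "$LOG_DIR" "$ALERT_DIR"
}

ensure_dirs
```

Because `mkdir -p` does not fail on an existing directory, running the script every minute from cron leaves the directories untouched after the first run.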

Setup

./setup.sh

The setup script creates simulated service health endpoints under /tmp/lab-shell/services/.

Hints

Hint 1: Checking service health
Each service has a health file at `/tmp/lab-shell/services/<service>/health`. If the file contains "ok", the service is healthy. If it contains anything else or is missing, the service is down.
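
One way to express this check is a function that returns 0 for healthy and 1 for down. This is a sketch under two assumptions: each service has its own directory under /tmp/lab-shell/services/ holding a `health` file, and "contains ok" means the whole content is exactly `ok` once the trailing newline is stripped. `SERVICES_DIR` and `is_healthy` are hypothetical names.

```shell
# SERVICES_DIR is a hypothetical override for the lab's services directory.
SERVICES_DIR="${SERVICES_DIR:-/tmp/lab-shell/services}"

# Return 0 if the service's health file holds exactly "ok", 1 otherwise.
is_healthy() {
    local service="$1"
    local health_file="$SERVICES_DIR/$service/health"
    local status=""
    # A missing or unreadable file counts as down.
    [[ -f "$health_file" && -r "$health_file" ]] || return 1
    status="$(<"$health_file")"   # $(<file) strips trailing newlines
    [[ "$status" == "ok" ]]
}
```

A caller can then branch on it directly, for example `if is_healthy web-server; then ...; fi`. Using the function as an `if` condition also keeps a non-zero return from aborting a script that runs with `set -e`.
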
Hint 2: Timestamps in Bash
Use `date '+%Y-%m-%d %H:%M:%S'` for ISO-like timestamps. Combine with service status: `echo "$(date '+%Y-%m-%d %H:%M:%S') [OK] web-server"`.
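
The timestamp format can be wrapped in a small logging helper so every line has the same shape. This is a sketch only: `log_status` and the `LOG_FILE` override are hypothetical, and the `DOWN` label is an assumption, since the hint shows only `[OK]`.

```shell
# LOG_FILE is a hypothetical override; the lab's log is monitor.log under logs/.
LOG_FILE="${LOG_FILE:-/tmp/lab-shell/logs/monitor.log}"

# Append one line of the form "YYYY-MM-DD HH:MM:SS [LEVEL] service".
log_status() {
    local level="$1" service="$2"
    mkdir -p "$(dirname "$LOG_FILE")"
    printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$level" "$service" >> "$LOG_FILE"
}
```

Appending with `>>` rather than truncating keeps earlier entries. It also means an external rotation tool can move the file away between runs, and the next run simply creates a fresh one.
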
Hint 3: Alert management
Write the names of the services that are down to the alert file, one per line. On each run, rebuild the alert file from scratch based on current status, so that a recovered service drops out automatically. If all services are healthy, the file should be empty or removed.
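
Rebuilding the file on every run could look like the sketch below. It writes to a temporary file and renames it into place, so a reader never sees a half-written list. That rename step is an extra design choice here, not something the lab requires. `write_alerts` and the `ALERT_FILE` override are hypothetical names, and this version leaves an empty file rather than removing it when nothing is down.

```shell
# ALERT_FILE is a hypothetical override; the lab's file is alerts/active.txt.
ALERT_FILE="${ALERT_FILE:-/tmp/lab-shell/alerts/active.txt}"

# Rewrite the alert file so it lists exactly the services passed as arguments.
write_alerts() {
    local tmp
    mkdir -p "$(dirname "$ALERT_FILE")"
    tmp="$(mktemp "$ALERT_FILE.XXXXXX")"
    # With no arguments, printf '%s\n' would still emit one blank line,
    # so only write when at least one service is down.
    if (( $# > 0 )); then
        printf '%s\n' "$@" > "$tmp"
    fi
    mv -f "$tmp" "$ALERT_FILE"
}
```

Calling `write_alerts cache database` lists both services; calling `write_alerts` with no arguments on a later run clears the file once they recover. The service names are placeholders.
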
Hint 4: Exit codes
Track a `failures` counter. Increment it for each unhealthy service. At the end: `exit $(( failures > 0 ? 1 : 0 ))`.
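
The counter logic can be tried in isolation before wiring it to real checks. In this sketch, `exit_status_for` is a hypothetical helper and its arguments are stand-in results, not actual health checks.

```shell
# Map a list of per-service results ("ok" or anything else) to an exit status.
exit_status_for() {
    local failures=0 result
    for result in "$@"; do
        if [[ "$result" != "ok" ]]; then
            # Plain assignment is safe under `set -e`. The shorter
            # (( failures++ )) returns status 1 when failures is 0,
            # which would abort a script running with `set -e`.
            failures=$(( failures + 1 ))
        fi
    done
    echo $(( failures > 0 ? 1 : 0 ))
}

status="$(exit_status_for ok ok down ok ok)"
echo "exit status would be: $status"   # prints: exit status would be: 1
```

In the monitoring script itself, the final line would be `exit "$status"` (or the `exit $(( ... ))` form from the hint) instead of the `echo`.
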
Hint 5: set -euo pipefail
Always start with `set -euo pipefail` for production scripts. Use `|| true` for commands that are expected to fail (like checking a file that might not exist).
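
As an illustration of `|| true`, the sketch below counts matching lines in a file that may not exist yet. `count_matches` is a hypothetical helper, and counting `[DOWN]` entries assumes a log format the lab does not prescribe. The function body runs in a subshell so that strict mode applies only inside it.

```shell
# Print how many lines of FILE contain the literal string PATTERN.
# Prints 0 when the file is missing or nothing matches.
count_matches() (
    set -euo pipefail
    file="$1"
    pattern="$2"
    # grep -c exits 1 when nothing matches and 2 when the file is missing.
    # `|| true` keeps set -e from aborting; 2>/dev/null hides the error text.
    count="$(grep -cF -- "$pattern" "$file" 2>/dev/null || true)"
    echo "${count:-0}"
)
```

For example, `count_matches /tmp/lab-shell/logs/monitor.log '[DOWN]'` prints 0 on a fresh system where the log has not been created, instead of terminating the script.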

Grading

./grade.sh

Solution

See the solution/ directory for a complete monitoring script.