Cron & Job Scheduling Footguns
- **Assuming cron has the same environment as your login shell.** You test your script by running it manually and it works. In cron, it fails because `PATH` is `/usr/bin:/bin`, `HOME` is wrong, no `.bashrc` is loaded, and `SHELL` is `/bin/sh` instead of `/bin/bash`. **Fix:** Use absolute paths for all commands in cron jobs. Set `PATH` explicitly at the top of your crontab. Or use a wrapper script that sources the required environment. **Default trap:** Cron's default `PATH` is `/usr/bin:/bin`. No `/usr/local/bin`, no `/snap/bin`, no `~/bin`. Your `python3` from pyenv, your `kubectl` from a custom install path: none of it exists in cron's world.
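What the top of a defensive crontab can look like; a sketch, with example paths and an example script name:

```
# cron does not read your .bashrc; declare the environment here.
SHELL=/bin/bash
PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=ops@example.com          # example address; cron mails job output here

# Absolute path anyway: the PATH above is a safety net, not a guarantee.
*/10 * * * * /usr/local/bin/sync-reports.sh
```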
- **Not preventing overlapping runs.** Your job runs every 5 minutes but occasionally takes 20 minutes. You now have 4 copies running concurrently, racing on the same files or database, causing corruption or deadlocks. **Fix:** Use `flock -n /var/lock/jobname.lock command` for cron jobs. Use `concurrencyPolicy: Forbid` for Kubernetes CronJobs. Use systemd timers with a `Type=oneshot` service: the timer will not fire again while the previous run is still active, so overlap prevention is built in.
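A minimal sketch of the `flock` pattern as a standalone script, assuming util-linux `flock` is installed; the lock path and job body are placeholders:

```shell
#!/bin/bash
# Guard the whole job with an exclusive lock held on fd 9.
LOCKFILE=/tmp/jobname.lock      # example path; use /var/lock/jobname.lock in production

(
  # -n: give up immediately if another run already holds the lock.
  flock -n 9 || { echo "previous run still active, skipping"; exit 0; }
  echo "job body runs here"     # stand-in for the real work
) 9>"$LOCKFILE"
```

Because the lock is tied to the file descriptor for the lifetime of the subshell, it is released automatically when the job exits, even after a crash.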
- **Scheduling jobs during DST transitions.** You schedule a critical backup at 2:00 AM. During spring-forward, 2:00 AM does not exist, so the job never runs. During fall-back, 2:00 AM happens twice, so the job runs twice. **Fix:** Schedule critical jobs at times outside the DST transition window (not 1:00-3:00 AM local). Or run cron in UTC (`TZ=UTC` in the crontab). Make all jobs idempotent so a double run is harmless. **War story:** A distributed system running Kubernetes CronJobs across nodes with mixed timezones (some UTC, some local) had jobs run at different times after DST, causing race conditions in distributed locks and corrupted nightly reports.
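Pinning the crontab to UTC sidesteps DST entirely; a sketch (most Vixie-cron derivatives honor a `TZ` or `CRON_TZ` variable, but check your cron's manpage; the script path is a placeholder):

```
TZ=UTC
# 03:30 UTC every day; UTC never shifts, so this fires exactly once per day.
30 3 * * * /usr/local/bin/nightly-backup.sh
```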
- **Redirecting output to /dev/null.** You add `> /dev/null 2>&1` to silence cron email notifications. Now when the job fails, you have zero evidence of what went wrong. The job has been silently failing for weeks. **Fix:** Redirect to a log file (`>> /var/log/job.log 2>&1`), not `/dev/null`. Or let systemd timers handle logging via journald. Or use a dead man's switch service that alerts when the job stops pinging.
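The logging pattern as a wrapper script; a sketch using a `/tmp` log path for illustration (a real job would log under `/var/log`):

```shell
#!/bin/bash
# Append stdout and stderr to a log, stamped per run, instead of /dev/null.
LOG=/tmp/job.log                 # illustrative path

{
  echo "=== run started: $(date -u +%FT%TZ) ==="
  echo "doing work"              # stand-in for the real job
} >>"$LOG" 2>&1
```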
- **Using `crontab -r` instead of `crontab -e`.** `-r` is right next to `-e` on the keyboard. One typo and your entire crontab is deleted without confirmation. There is no undo. **Fix:** Use `crontab -ri` (interactive) if your cron supports it. Keep crontab contents in version control. Use `/etc/cron.d/` drop-in files managed by Ansible/Puppet instead of user crontabs. Back up crontabs: `crontab -l > ~/crontab.bak`. **Remember:** `-r` is for remove, `-e` is for edit. They are adjacent on QWERTY keyboards. One keystroke difference between editing your crontab and deleting it forever.
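The backup habit as a one-liner script; the destination name is just an example:

```shell
#!/bin/bash
# Snapshot the current crontab before touching it; restore later with `crontab <file>`.
BACKUP="$HOME/crontab.$(date +%Y%m%d).bak"   # example naming scheme
crontab -l > "$BACKUP" 2>/dev/null || echo "no crontab found for this user"
```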
- **Not monitoring whether scheduled jobs actually run.** You set up a backup cron job and forget about it. Six months later, you need a restore and discover the backups stopped three months ago because a disk filled up. **Fix:** Use a dead man's switch (Healthchecks.io, Cronitor, PagerDuty heartbeat). Add `curl -fsS https://hc-ping.com/uuid` as the last line of your script. The service alerts you if the ping does not arrive on schedule.
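The heartbeat in crontab form; the URL is a placeholder for a per-check UUID, and the script path is an example:

```
# Ping the dead man's switch only when the job itself succeeded.
0 2 * * * /usr/local/bin/backup.sh && curl -fsS --retry 3 https://hc-ping.com/your-uuid-here > /dev/null
```

The `&&` matters: pinging unconditionally would report a failing job as healthy.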
- **Ignoring `startingDeadlineSeconds` in Kubernetes CronJobs.** The CronJob controller is overloaded or was restarted, missing a scheduled trigger. Without `startingDeadlineSeconds`, the job is silently skipped forever. With it set too low, the job misses its window on a busy cluster. **Fix:** Set `startingDeadlineSeconds` to a reasonable value (e.g., 600 for a daily job). This gives the controller a 10-minute window to start the job. Monitor the CronJob's last schedule time and alert on missed runs. **Gotcha:** If a CronJob misses more than 100 consecutive scheduled starts, Kubernetes permanently stops trying and logs "Too many missed start time (> 100)." The CronJob is dead until a human intervenes. This has bitten teams where ephemeral node unavailability quietly accumulated missed schedules.
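A sketch of the relevant CronJob fields; the name and image are placeholders:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report               # example name
spec:
  schedule: "0 2 * * *"
  startingDeadlineSeconds: 600       # 10-minute window to start a missed trigger
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: report
              image: example.com/report:latest   # placeholder image
```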
- **Writing non-idempotent cron jobs.** Your job appends data to a database without checking if it already ran. When the job runs twice (DST, manual trigger, controller retry), you get duplicate records. **Fix:** Make every scheduled job idempotent. Use upsert instead of insert. Check for a completion marker before running. Design jobs so that running them twice produces the same result as running them once.
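A minimal sketch of the completion-marker idea; the marker path is illustrative, and the date in the name makes it per-day:

```shell
#!/bin/bash
set -e                                  # abort before the marker if any step fails
MARKER=/tmp/nightly.$(date +%F).done    # illustrative per-day marker path

if [ -e "$MARKER" ]; then
  echo "already ran today, skipping"
  exit 0
fi

echo "doing the nightly work"           # stand-in for the real job
touch "$MARKER"                         # written only after the work succeeded
```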
- **Not setting resource limits on Kubernetes CronJobs.** Your CronJob has no memory limit. A bug causes it to consume 8GB of RAM, evicting other pods on the node. Or the job runs forever because `activeDeadlineSeconds` is not set. **Fix:** Set `resources.limits` for CPU and memory. Set `activeDeadlineSeconds` to a sane maximum runtime. Set `backoffLimit` to prevent infinite retries of a broken job.
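A sketch of a bounded CronJob; the name and image are placeholders, and the numbers should be tuned to the job:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: bounded-job                  # example name
spec:
  schedule: "*/15 * * * *"
  jobTemplate:
    spec:
      activeDeadlineSeconds: 300     # kill the job if it runs longer than 5 minutes
      backoffLimit: 2                # at most 2 retries of a failing pod
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: example.com/worker:latest   # placeholder image
              resources:
                requests:
                  cpu: 100m
                  memory: 128Mi
                limits:
                  cpu: 500m
                  memory: 512Mi
```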
- **Putting complex logic directly in the crontab line.** You chain 5 commands with `&&` and `||` in a single crontab entry. It is unreadable, untestable, and the quoting is wrong. When it breaks, debugging is nearly impossible. **Fix:** Put the logic in a script. The crontab entry should be one command: the path to the script. Version-control the script. Test it manually before scheduling. Keep the crontab clean and readable.
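One shape this can take: the chained commands move into a version-controlled script (all names here are illustrative) and the crontab entry shrinks to a single path:

```shell
#!/bin/bash
# Illustrative /usr/local/bin/rotate-and-report.sh; the crontab entry is just:
#   0 4 * * * /usr/local/bin/rotate-and-report.sh
set -euo pipefail                 # stop at the first failing step

rotate_logs()   { echo "rotating logs"; }      # stand-ins for the real commands
build_report()  { echo "building report"; }
upload_report() { echo "uploading report"; }

rotate_logs
build_report
upload_report
```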