systemd Service Design, Debugging, and Hardening¶
Scope¶
This document focuses on .service units as they are used in real servers:
- service types
- lifecycle state machine
- restart behavior
- notifications
- watchdogs
- timeouts
- cleanup semantics
- sandboxing/hardening
- debugging broken units
- writing units that do not behave like cursed shell wrappers
Reference anchors:
- https://www.freedesktop.org/software/systemd/man/systemd.service.html
- https://www.freedesktop.org/software/systemd/man/systemd.unit.html
- https://www.freedesktop.org/software/systemd/man/systemd-analyze.html
- https://www.freedesktop.org/software/systemd/man/systemd-journald.service.html
Big Picture¶
A service unit is a contract between PID 1 and a workload.
The contract answers:
- how to start it
- when it is considered ready
- how to stop it
- how to restart it
- what to do if it crashes
- what resources and privileges it gets
- how its logs are captured
- how its children are tracked
If you cannot answer those cleanly, your unit file is probably bad.
Pick the Right Type=¶
This is where a lot of people step on landmines.
Type=simple¶
The default.
systemd considers the service started immediately after spawning the main process.
Use for:
- foreground daemons
- most modern services
- things that do not need a readiness protocol
Type=exec¶
Like simple but start is considered successful only after execve() succeeds.
Often a better default than people realize.
Type=forking¶
For classic daemons that fork into the background. Legacy-heavy. Often used because of habit, not because it is correct.
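When a legacy daemon really cannot stay in the foreground, a minimal sketch of a forking unit might look like the following. The daemon name `legacyd` and its flags are hypothetical; `PIDFile=` and `RuntimeDirectory=` are real directives that help systemd identify the main process and provide a writable directory under /run.

```ini
[Service]
Type=forking
# Hypothetical legacy daemon that forks into the background and writes a PID file.
ExecStart=/usr/sbin/legacyd --daemonize --pidfile /run/legacyd/legacyd.pid
# Tell systemd where to read the main PID once the parent has exited.
PIDFile=/run/legacyd/legacyd.pid
# Creates /run/legacyd at start and removes it when the service stops.
RuntimeDirectory=legacyd
```

If the daemon offers a foreground flag, dropping `Type=forking` and the PID file entirely is usually the simpler design.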
Type=oneshot¶
Run a task and exit.
Often paired with RemainAfterExit=yes when the effect, not the process, is the "state."
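As an illustration, a one-time setup task could be written roughly like this. The unit description and script path are assumptions made up for the sketch.

```ini
[Unit]
Description=Apply custom network tuning (illustrative)

[Service]
Type=oneshot
# Hypothetical script; it runs once and exits.
ExecStart=/usr/local/sbin/apply-net-tuning
# Keep the unit "active" after the process exits, so the applied state stays visible.
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Without `RemainAfterExit=yes` the unit returns to inactive as soon as the script finishes, which is fine for pure tasks but misleading when other units depend on the effect.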
Type=notify¶
Service explicitly tells systemd when it is ready using sd_notify.
Best for services that need a real readiness point.
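A minimal sketch of the unit side, assuming the daemon itself sends `READY=1` through sd_notify once it can serve requests. The binary path mirrors the example later in this document; the timeout value is an arbitrary choice.

```ini
[Service]
Type=notify
ExecStart=/usr/local/bin/example-api --config /etc/example-api/config.yaml
# Accept notifications only from the main process (the implicit default for Type=notify).
NotifyAccess=main
# If READY=1 has not arrived within this time, the start fails.
TimeoutStartSec=30s
```

A service that never sends `READY=1` will sit in "activating" until the timeout fires, which is a common cause of confusing start failures.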
Type=dbus¶
Service readiness tied to D-Bus name acquisition.
Rule of thumb: if the daemon can stay in foreground, do that. Do not daemonize twice like a clown car.
Readiness Is Not Process Existence¶
A spawned process is not the same thing as a ready service.
Examples:
- database process exists but is still replaying logs
- web server process exists but has not bound its sockets
- worker exists but has not loaded its config
If readiness matters, use:
- Type=notify
- socket activation
- or another explicit mechanism
Avoid pretending "pid exists" means "service healthy."
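For the socket activation option, a rough sketch of a companion socket unit is shown below. The file name `example-api.socket` and the port are assumptions; the service must be written to accept the listening socket that systemd passes in (for example via sd_listen_fds).

```ini
# example-api.socket (hypothetical; activates example-api.service by name)
[Unit]
Description=Example API listening socket

[Socket]
# systemd binds the port itself, so clients can connect before the service is ready.
ListenStream=8080

[Install]
WantedBy=sockets.target
```

Because the socket exists before the process does, connections queue instead of failing while the service starts.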
Exec Directives¶
Common ones:
- ExecStart=
- ExecStartPre=
- ExecStartPost=
- ExecReload=
- ExecStop=
- ExecStopPost=
Guidelines:
- keep them short
- avoid giant inline shell pipelines
- prefer dedicated scripts if logic is nontrivial
- know that a failure in ExecStartPre= can stop the unit before the main process starts
- understand that start/stop are part of one service state machine, not random shell hooks
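The guidelines above can be sketched as a short set of directives. The `--check-config` flag and the cleanup helper are hypothetical; the `ExecReload=` line follows the common pattern of signalling the main process.

```ini
[Service]
# Validate config first; a non-zero exit here fails the start before ExecStart= runs.
ExecStartPre=/usr/local/bin/example-api --check-config /etc/example-api/config.yaml
ExecStart=/usr/local/bin/example-api --config /etc/example-api/config.yaml
# Reload by signalling the main process; systemd substitutes $MAINPID.
ExecReload=/bin/kill -HUP $MAINPID
# A leading "-" tells systemd to ignore a failure of this command.
ExecStopPost=-/usr/local/bin/example-api-cleanup
```

Each line is a single command with arguments, not a shell script, which keeps `systemctl status` output readable when something fails.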
Restart Policy¶
Important knobs:
- Restart=
- RestartSec=
- StartLimitIntervalSec=
- StartLimitBurst=
Common values for Restart=:
- no
- on-success
- on-failure
- on-abnormal
- always
Think through semantics carefully.
Example:
- Restart=always for a batch job is probably wrong
- Restart=on-failure for a daemon is usually sane
- combine with sane limits so crash loops do not melt the node
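A minimal sketch of a restart policy with a crash-loop limit. The numbers are illustrative, not recommendations. Note that the start-limit settings belong in the `[Unit]` section.

```ini
[Unit]
# Allow at most 5 starts within 60 seconds; after that the unit fails
# and is not restarted automatically.
StartLimitIntervalSec=60
StartLimitBurst=5

[Service]
# Restart on unclean exits, signals, timeouts, and watchdog failures.
Restart=on-failure
# Wait between attempts so a crash loop does not spin at full speed.
RestartSec=3s
```

Once the limit is hit, `systemctl reset-failed name.service` clears the counter so the unit can be started again.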
Timeouts and Stop Semantics¶
Relevant knobs:
- TimeoutStartSec=
- TimeoutStopSec=
- KillMode=
- KillSignal=
- SendSIGKILL=
- FinalKillSignal=
Because systemd tracks the whole cgroup, stop behavior is far better than old PID-file theater.
KillMode=control-group is usually what you want:
stop the service, not one random ancestor process while children survive like cockroaches.
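The stop sequence can be made explicit in the unit. This sketch mostly restates systemd's defaults so the behavior is visible; the timeout values are assumptions.

```ini
[Service]
TimeoutStartSec=30s
TimeoutStopSec=20s
# Signal every process in the unit's cgroup on stop (this is the default).
KillMode=control-group
# First signal sent on stop (SIGTERM is the default).
KillSignal=SIGTERM
# If processes remain after TimeoutStopSec=, send FinalKillSignal= (SIGKILL by default).
SendSIGKILL=yes
```

A daemon that needs longer to flush state should get a larger `TimeoutStopSec=` rather than a weaker kill policy.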
MainPID and Tracking¶
With cgroup-aware supervision, systemd can track the whole unit even if the original process exits and workers remain.
Still, MainPID matters for status and signal routing.
Bad daemonization models and stale pidfiles are a recurring source of confusion.
This is another reason foreground-first designs are superior.
Environment and Credentials¶
Useful directives:
- Environment=
- EnvironmentFile=
- WorkingDirectory=
- User=
- Group=
- SupplementaryGroups=
Use these deliberately. Do not assume shell login environment semantics. Service environments are not your interactive shell.
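A short sketch of these directives together. The group `api-data`, the variable name, and the environment file path are hypothetical.

```ini
[Service]
User=api
Group=api
# Hypothetical extra group for access to shared data files.
SupplementaryGroups=api-data
WorkingDirectory=/srv/api
Environment=APP_ENV=production
# A leading "-" makes the file optional. It holds KEY=value lines, not shell script.
EnvironmentFile=-/etc/example-api/env
```

Nothing from /etc/profile or a user's shell startup files is read here, so anything the daemon needs must be set explicitly.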
Sandboxing / Hardening¶
This is one of systemd's best server-side features.
Examples:
- NoNewPrivileges=yes
- PrivateTmp=yes
- ProtectSystem=strict
- ProtectHome=yes
- PrivateDevices=yes
- ProtectKernelTunables=yes
- ProtectControlGroups=yes
- MemoryDenyWriteExecute=yes
- CapabilityBoundingSet=
- AmbientCapabilities=
- SystemCallFilter=
- RestrictAddressFamilies=
Real point: you can shrink blast radius even if the daemon is compromised.
Use systemd-analyze security as a starting lens, not holy scripture.
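One way to apply these is a drop-in file, sketched below under the assumption that the service needs only network sockets and one writable state directory. The drop-in path and directory name are illustrative. `MemoryDenyWriteExecute=` is left out because it can break runtimes that generate code at run time.

```ini
# /etc/systemd/system/example-api.service.d/hardening.conf (hypothetical drop-in)
[Service]
NoNewPrivileges=yes
PrivateTmp=yes
PrivateDevices=yes
ProtectSystem=strict
ProtectHome=yes
ProtectKernelTunables=yes
ProtectControlGroups=yes
# ProtectSystem=strict makes the file system read-only; this grants /var/lib/example-api.
StateDirectory=example-api
# An empty assignment drops all capabilities from the bounding set.
CapabilityBoundingSet=
RestrictAddressFamilies=AF_INET AF_INET6 AF_UNIX
SystemCallFilter=@system-service
```

After `systemctl daemon-reload` and a restart, `systemd-analyze security example-api.service` shows how the exposure score changed. Tighten one directive at a time and test, since each can break a daemon in its own way.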
Watchdog¶
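When WatchdogSec= is set, a service (typically Type=notify) can periodically ping systemd's watchdog by sending WATCHDOG=1 through sd_notify.
If the pings stop for longer than WatchdogSec=, PID 1 terminates the service (SIGABRT by default) and restarts it if Restart= allows.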
That protects against some "hung but not dead" failures, which plain restart-on-exit cannot detect.
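A minimal sketch of the unit side, assuming the daemon sends `WATCHDOG=1` through sd_notify at regular intervals. The 30-second value is arbitrary.

```ini
[Service]
Type=notify
ExecStart=/usr/local/bin/example-api --config /etc/example-api/config.yaml
# The service must ping at least this often; pinging at half the interval is the usual advice.
WatchdogSec=30s
# on-failure includes watchdog timeouts, so a hung service gets restarted.
Restart=on-failure
```

systemd exports the interval to the process as `WATCHDOG_USEC`, so the daemon does not need to hard-code it.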
Logging Model¶
stdout/stderr can go straight to journald. That means:
- fewer pidfile-era logging hacks
- structured metadata
- per-unit logs
- boot-scoped investigation
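On most systems the journal is already the default target for service output, so the following sketch only makes that explicit and sets a log tag. The identifier value is an assumption.

```ini
[Service]
# Send both streams to the journal (journal is normally the default).
StandardOutput=journal
StandardError=journal
# Tag used for this service's log lines instead of the process name.
SyslogIdentifier=example-api
```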
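Typical commands:
- journalctl -u name.service (all logs for one unit)
- journalctl -b -u name.service (current boot only)
- journalctl -f -u name.service (follow live output)
- journalctl -p err -u name.service (errors and worse)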
Debugging a Failed Service¶
Use this workflow:
- systemctl status name.service
- journalctl -xeu name.service
- systemctl cat name.service
- systemctl show name.service
- inspect exit code, signal, timeout, readiness, permissions, paths, env
- if needed, run the underlying command manually as the target user/environment
- inspect cgroup/resource issues
Common actual causes:
- wrong path
- wrong user
- missing runtime directory
- daemon forks unexpectedly
- service never signals readiness
- SELinux/AppArmor policy issue
- dependency graph wrong
- start limit hit
Bad Patterns to Avoid¶
Giant bash -c in ExecStart=¶
You lose transparency, quoting gets cursed, and failure modes become muddy.
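To make the contrast concrete, here is a sketch of a typical shell one-liner and its directive-based replacement. Paths and file names are hypothetical.

```ini
[Service]
# Instead of:
#   ExecStart=/bin/bash -c 'cd /srv/api && source ./env.sh && exec ./example-api >> /var/log/api.log 2>&1'
# express each concern as its own directive:
WorkingDirectory=/srv/api
EnvironmentFile=/etc/example-api/env
ExecStart=/srv/api/example-api
# Output goes to the journal by default, so no redirection is needed.
```

With the directive form, systemd knows the real main process, and a failure in any one step is reported by name.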
Type=forking by reflex¶
Legacy habit, not a design principle.
Backgrounding inside the service command¶
Do not fight the supervisor.
Using unit files as config-management junk drawers¶
Keep app config in app config where possible.
No hardening at all¶
A lot of services can safely lose privileges and capabilities.
Example of a Solid Modern Pattern¶
[Unit]
Description=Example API
After=network.target
[Service]
Type=notify
User=api
Group=api
WorkingDirectory=/srv/api
ExecStart=/usr/local/bin/example-api --config /etc/example-api/config.yaml
Restart=on-failure
RestartSec=3s
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
[Install]
WantedBy=multi-user.target
This is usually better than daemonizing, pidfiles, wrapper scripts, and prayer.
Interview-Level Things to Explain¶
You should be able to explain:
- why Type=notify exists
- why Type=forking is legacy-heavy
- how Restart= interacts with crash loops
- how journald helps debugging
- why cgroups make service supervision stronger
- what hardening directives buy you
- how to debug "service says active but app still broken"
Fast Mental Model¶
A good systemd service unit describes a workload's lifecycle, readiness, privileges, limits, and failure policy in a way PID 1 can supervise cleanly.