Incident Replay: SELinux Denying Service
Setup
- System context: RHEL 8 production server. A web application was migrated from /opt/app to /srv/webapp. After the move, the application cannot start: it logs permission denied errors despite correct file permissions.
- Time: Tuesday 13:00 UTC
- Your role: Linux systems engineer
Round 1: Alert Fires
[Pressure cue: "Application migration completed but the app cannot bind to port 8080 and cannot read its config files. Standard file permissions look correct. 'Works on the staging server.'"]
What you see:
ls -la /srv/webapp/ shows correct ownership and permissions (755 for dirs, 644 for files, owned by appuser). But systemctl start webapp fails with "Permission denied" in the journal. Staging server has SELinux set to Permissive.
Choose your action:
- A) Set SELinux to Permissive mode: setenforce 0
- B) Check SELinux audit log for denials
- C) Run the application as root to bypass permissions
- D) Check if the appuser is in the correct groups
If you chose B (recommended):
[Result: ausearch -m avc -ts recent shows 3 denials: (1) httpd_t cannot read files with default_t context in /srv, (2) httpd_t cannot bind to port 8080, (3) httpd_t cannot connect to port 5432 (PostgreSQL). SELinux is blocking the application's access to files, ports, and network connections. Proceed to Round 2.]
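When the audit log holds more than a handful of records, it can help to group the denials by source domain, target type, and object class before deciding on fixes. The following is a minimal sketch of that idea. The function name summarize_avc is hypothetical, and it assumes the usual key=value layout of AVC records (scontext=, tcontext=, tclass=).

```shell
# Hypothetical helper: condense AVC denial records read from stdin into
# "count  source_type -> target_type [class] { permissions }" lines.
summarize_avc() {
    awk '
    /avc:[[:space:]]+denied/ {
        perm = ""; src = ""; tgt = ""; cls = ""
        # The denied permissions sit between the first "{" and "}".
        s = index($0, "{"); e = index($0, "}")
        if (s > 0 && e > s) {
            perm = substr($0, s + 1, e - s - 1)
            gsub(/^ +| +$/, "", perm)
        }
        for (i = 1; i <= NF; i++) {
            # A context looks like user:role:type:level, so the type is
            # the third colon-separated field.
            if ($i ~ /^scontext=/) { split($i, a, ":"); src = a[3] }
            if ($i ~ /^tcontext=/) { split($i, a, ":"); tgt = a[3] }
            if ($i ~ /^tclass=/)   { cls = substr($i, 8) }
        }
        print src " -> " tgt " [" cls "] { " perm " }"
    }' | sort | uniq -c
}

# Usage on the affected host (as root):
#   ausearch -m avc -ts recent | summarize_avc
```

Grouping this way makes it easier to see that the three denials in this scenario are distinct problems (a file type, a port type, and a network connection) rather than one repeated failure.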
If you chose A:
[Result: Application starts and works. But you have disabled a security control. The security team will flag this in the next audit. Not a real fix.]
If you chose C:
[Result: Running as root bypasses DAC permissions but SELinux still enforces MAC. The denials persist because they are type-based, not UID-based.]
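Because the decision is made on the process's SELinux domain rather than its UID, it can be useful to look at the label a process actually carries. A small sketch follows. The helper name context_type is hypothetical, and the unit name webapp is taken from the scenario.

```shell
# Hypothetical helper: print the type field (third colon-separated field)
# of an SELinux context such as user:role:type:level.
context_type() {
    printf '%s\n' "$1" | cut -d: -f3
}

# Usage on the affected host (needs a running service with a main PID):
#   pid=$(systemctl show -p MainPID --value webapp)
#   context_type "$(ps -o label= -p "$pid")"
```

If the type printed is the same whether the service runs as appuser or as root, changing the UID will not change which type enforcement rules apply.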
If you chose D:
[Result: Group membership is correct. The issue is SELinux type enforcement, not DAC permissions.]
Round 2: First Triage Data
[Pressure cue: "SELinux denials identified. Fix them without disabling SELinux — this server must be Enforcing for compliance."]
What you see:
Three SELinux issues: (1) Files in /srv/webapp/ have default_t context instead of httpd_sys_content_t, (2) Port 8080 is not labeled for httpd_t, (3) The httpd_t domain cannot connect to PostgreSQL.
Choose your action:
- A) Fix all three: relabel files, add port, enable network connect
- B) Write a custom SELinux policy module to allow all three
- C) Use chcon to fix file contexts and semanage for port and boolean
- D) Use restorecon and semanage for the proper fixes
If you chose D (recommended):
[Result: (1) semanage fcontext -a -t httpd_sys_content_t "/srv/webapp(/.*)?" then restorecon -Rv /srv/webapp/. (2) semanage port -a -t http_port_t -p tcp 8080 (use -m instead of -a if semanage reports that the port is already defined under another type). (3) setsebool -P httpd_can_network_connect_db on. All three fixes applied. Proceed to Round 3.]
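The three fixes can be collected into one small script so they are applied in the same order every time. This is a sketch under stated assumptions, not a definitive implementation. The names run, apply_selinux_fixes, APP_ROOT, APP_PORT, and DRY_RUN are hypothetical, and the fallback from -a to -m covers the case where the port already has a definition under another type.

```shell
#!/bin/sh
# Hypothetical wrapper around the three persistent fixes from this round.
APP_ROOT="${APP_ROOT:-/srv/webapp}"
APP_PORT="${APP_PORT:-8080}"

run() {
    # With DRY_RUN=1, print the command instead of executing it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_selinux_fixes() {
    # 1. Record the file context rule, then apply it to existing files.
    run semanage fcontext -a -t httpd_sys_content_t "${APP_ROOT}(/.*)?" &&
    run restorecon -Rv "$APP_ROOT" &&
    # 2. Label the port; fall back to modifying an existing definition.
    { run semanage port -a -t http_port_t -p tcp "$APP_PORT" ||
      run semanage port -m -t http_port_t -p tcp "$APP_PORT"; } &&
    # 3. Persistently allow the domain to reach the database.
    run setsebool -P httpd_can_network_connect_db on
}

# Usage (as root): preview first, then apply.
#   DRY_RUN=1 apply_selinux_fixes
#   apply_selinux_fixes
```

The dry-run switch is a design choice for review: the exact commands can be pasted into a change ticket before anything touches the policy store.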
If you chose A:
[Result: Correct intent but the method matters. chcon changes are lost on relabel. Use semanage fcontext + restorecon for persistence.]
If you chose B:
[Result: Custom policy modules are for non-standard use cases. These three issues have standard solutions — fcontext rules, port labels, and booleans.]
If you chose C:
[Result: chcon for file contexts will be overwritten on the next restorecon or relabel. Use semanage fcontext for persistence.]
Round 3: Root Cause Identification
[Pressure cue: "Application starting. Why was this not caught in migration planning?"]
What you see:
Root cause: The migration plan did not account for SELinux contexts. Files copied with cp (without the -Z flag) or rsync are created as new files and inherit the destination directory's context (default_t for /srv/). The staging server had SELinux in Permissive, hiding the issue. Port 8080 was not pre-labeled, and the database connectivity boolean was never enabled.
Choose your action:
- A) Add SELinux context verification to the migration checklist
- B) Ensure staging servers run SELinux in Enforcing mode
- C) Add restorecon to the application deployment automation
- D) All of the above
If you chose D (recommended):
[Result: Migration checklist updated, staging parity enforced, deployment automation includes SELinux steps. Proceed to Round 4.]
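One way to turn the context check into an automated checklist or deployment step is to list every file's context and flag anything that does not carry the expected type. The sketch below assumes GNU find, whose -printf supports %Z for the SELinux context. The names unexpected_contexts and EXPECTED_TYPE are hypothetical.

```shell
# Hypothetical check: read "context path" lines on stdin, print the ones
# whose type differs from EXPECTED_TYPE, and exit non-zero if any do.
EXPECTED_TYPE="${EXPECTED_TYPE:-httpd_sys_content_t}"

unexpected_contexts() {
    awk -v want="$EXPECTED_TYPE" '
    {
        # The context is the first field; its type is the third
        # colon-separated component.
        split($1, ctx, ":")
        if (ctx[3] != want) { print; bad = 1 }
    }
    END { exit (bad ? 1 : 0) }'
}

# Usage in a deployment or checklist step:
#   restorecon -R /srv/webapp
#   find /srv/webapp -printf '%Z %p\n' | unexpected_contexts || echo "context drift"
```

Running the same step on a staging host in Enforcing mode would surface mislabeled files before the production cutover.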
If you chose A:
[Result: Checklist helps but staging parity is what catches issues before production.]
If you chose B:
[Result: Enforcing on staging catches issues earlier but does not automate the fixes.]
If you chose C:
[Result: Automation is good but should be tested in staging first.]
Round 4: Remediation
[Pressure cue: "Application running. Verify SELinux is fully configured."]
Actions:
1. Verify application is running: systemctl status webapp
2. Verify no SELinux denials: ausearch -m avc -ts recent reports <no matches>
3. Verify file contexts: ls -Z /srv/webapp/ shows httpd_sys_content_t
4. Verify port label: semanage port -l | grep 8080
5. Verify boolean: getsebool httpd_can_network_connect_db shows on
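The five checks above can be run as one pass/fail report. This is a minimal sketch: the function names (pass, fail, verify_selinux_setup) and the APP_ROOT and APP_PORT variables are hypothetical, and for brevity only the top-level directory label is checked rather than every file.

```shell
#!/bin/sh
# Hypothetical post-fix verification for the webapp scenario.
APP_ROOT="${APP_ROOT:-/srv/webapp}"
APP_PORT="${APP_PORT:-8080}"

pass() { echo "PASS: $1"; }
fail() { echo "FAIL: $1"; failures=$((failures + 1)); }

verify_selinux_setup() {
    failures=0

    if systemctl is-active --quiet webapp; then
        pass "webapp service is active"
    else
        fail "webapp service is active"
    fi

    # ausearch exits 0 when it finds matching records, so success here
    # means a recent denial was logged. A non-zero exit can also mean
    # the log was unreadable, so run this as root.
    if ausearch -m avc -ts recent >/dev/null 2>&1; then
        fail "no recent AVC denials"
    else
        pass "no recent AVC denials"
    fi

    if ls -Zd "$APP_ROOT" | grep -q httpd_sys_content_t; then
        pass "$APP_ROOT labeled httpd_sys_content_t"
    else
        fail "$APP_ROOT labeled httpd_sys_content_t"
    fi

    if semanage port -l | grep -E '^http_port_t[[:space:]]+tcp' | grep -qw "$APP_PORT"; then
        pass "tcp/$APP_PORT labeled http_port_t"
    else
        fail "tcp/$APP_PORT labeled http_port_t"
    fi

    if getsebool httpd_can_network_connect_db | grep -q -- '--> on$'; then
        pass "httpd_can_network_connect_db is on"
    else
        fail "httpd_can_network_connect_db is on"
    fi

    # The return value is the number of failed checks (0 means all passed).
    return "$failures"
}

# Usage (as root):
#   verify_selinux_setup || echo "SELinux configuration incomplete"
```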
Damage Report
- Total downtime: 45 minutes (application down during migration troubleshooting)
- Blast radius: Single application; users unable to access the web service
- Optimal resolution time: 10 minutes (check audit log -> apply fcontext + port + boolean)
- If every wrong choice was made: 3+ hours including disabling SELinux and security audit findings
Cross-References
- Primer: SELinux & AppArmor
- Primer: Linux Hardening
- Primer: Linux Ops
- Footguns: SELinux & AppArmor