SELinux & Linux Hardening Footguns

Mistakes that leave your servers exposed or lock you out entirely.


1. setenforce 0 as the default fix

A service fails to start. SELinux denial in the logs. You run setenforce 0 and the service starts. You add SELINUX=permissive to the config and move on. Congratulations, you just disabled mandatory access control on a production server because you did not want to learn how SELinux works.

Fix: Read the AVC denial. Use sealert or audit2why to understand what is being denied. Fix the label, set the boolean, or write a targeted policy. Leave SELinux enforcing.

Debug clue: ausearch -m AVC -ts recent | audit2why translates SELinux denials into plain English with suggested fixes. 90% of the time, the fix is either setsebool -P <boolean> on or semanage fcontext -a -t <type> <path> && restorecon -Rv <path>. The other 10% needs a custom policy module via audit2allow.
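The triage flow above can be sketched as a command sequence. The boolean, type, and paths below (httpd_can_network_connect, httpd_sys_rw_content_t, /srv/www/uploads, myapp_local) are illustrative placeholders, not prescriptions:

```shell
# 1. Translate recent AVC denials into plain English with suggestions.
ausearch -m AVC -ts recent | audit2why

# 2a. If the denial maps to a boolean, flip it persistently (-P).
setsebool -P httpd_can_network_connect on

# 2b. If the file has the wrong label, record the correct context
#     and reapply it recursively.
semanage fcontext -a -t httpd_sys_rw_content_t '/srv/www/uploads(/.*)?'
restorecon -Rv /srv/www/uploads

# 3. Last resort: generate a targeted policy module from the denials.
#    Review the generated .te file before loading it!
ausearch -m AVC -ts recent | audit2allow -M myapp_local
semodule -i myapp_local.pp
```

Either way, SELinux stays enforcing the whole time.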


2. Applying CIS benchmarks blindly on a running system

You download the CIS hardening script and run it against production. It disables IP forwarding on a server that routes traffic between subnets. It restricts cron access and breaks your backup jobs. It changes PAM config and locks out service accounts.

Fix: Review every CIS recommendation against your workload. Apply in staging first. Use a profile that matches your use case (server, workstation, L1, L2). Implement incrementally.

War story: A healthcare company applied the full CIS Level 2 benchmark to their Docker hosts without review. The script disabled IP forwarding (net.ipv4.ip_forward=0), breaking all container networking. It also restricted cron to root only, killing their automated backup jobs. The "hardening" caused a 4-hour outage that was worse than any attack they were trying to prevent.
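One way to avoid that outage: gate each benchmark control on what the host actually does before applying it. A minimal sketch for the IP-forwarding control (the Docker check is just one example of a workload signal):

```shell
# Only apply the "disable IP forwarding" control if nothing on this
# host depends on forwarding (container runtimes, inter-subnet routing).
if [ "$(cat /proc/sys/net/ipv4/ip_forward)" = "1" ]; then
    echo "WARNING: host currently forwards traffic; skipping this control"
    # A container runtime is a strong hint forwarding is load-bearing:
    docker info >/dev/null 2>&1 && echo "Docker detected -- control skipped"
else
    sysctl -w net.ipv4.ip_forward=0
fi
```

The same pattern applies to every control: check, stage, then enforce.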


3. SSH hardening without out-of-band access

You push PasswordAuthentication no and AllowUsers deploy via Ansible. The deploy user's key is missing from 3 of the servers. Those 3 servers are now unreachable, and you have no IPMI, iLO, or console access configured.

Fix: Always verify SSH access before restricting it. Run Ansible with --check first. Maintain out-of-band access (IPMI/iDRAC/cloud console) on every server. Validate the config with sshd -t and review the effective settings with sshd -T before restarting.
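A safe rollout looks like this in practice (hostnames and the deploy user are placeholders):

```shell
# 1. Syntax-check the new config before touching the running daemon.
sshd -t -f /etc/ssh/sshd_config

# 2. Confirm the effective values you are about to enforce.
sshd -T | grep -Ei 'passwordauthentication|allowusers'

# 3. Verify key auth actually works BEFORE disabling passwords.
ssh -o BatchMode=yes -o PasswordAuthentication=no deploy@host1 true \
    || echo "key auth FAILED on host1 -- do not proceed"

# 4. Reload, keeping your current session open as a lifeline;
#    only log out after a fresh login succeeds.
systemctl reload sshd
```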


4. chmod 777 to "fix" permission issues

Application cannot write to a directory. You run chmod 777 /opt/myapp/data. Problem solved. Also, every user and process on the system can now read, write, and delete your application data.

Fix: Understand which user needs what access. Use chown to set correct ownership. Use chmod 750 or chmod 770 with group membership. On SELinux systems, fix the file context instead.
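A minimal sketch of the right way to do it; appuser, appgroup, and backupuser are placeholder names for your service accounts, and the SELinux type is illustrative:

```shell
# Give exactly the users that need access, access -- nothing more.
chown -R appuser:appgroup /opt/myapp/data
chmod 750 /opt/myapp/data          # owner rwx, group r-x, others nothing

# If a second account (e.g. a backup user) needs write access,
# use group membership instead of opening the mode to everyone:
usermod -aG appgroup backupuser
chmod 770 /opt/myapp/data

# On SELinux hosts, fix the context instead of loosening the mode:
semanage fcontext -a -t var_lib_t '/opt/myapp/data(/.*)?'
restorecon -Rv /opt/myapp/data
```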


5. auditd rules that log everything

You add -a always,exit -S all to audit rules because "we need full visibility." Auditd generates gigabytes of logs per hour. The disk fills up. The system slows to a crawl. Your actual security events are buried in noise.

Fix: Audit specific syscalls, specific paths, and specific users. Start with CIS-recommended rules. Add rules for your threat model, not for "everything." Monitor audit log volume as a metric.
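A targeted rules file might look like the sketch below. The watch paths and keys are illustrative starting points, not a complete policy:

```shell
# /etc/audit/rules.d/targeted.rules -- narrow, high-signal rules.

# Watch changes to identity and privilege files, not all syscalls.
-w /etc/passwd -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/sudoers -p wa -k privilege
-w /etc/sudoers.d/ -p wa -k privilege

# Record commands executed as root by real logged-in users
# (auid >= 1000 excludes system/daemon activity).
-a always,exit -F arch=b64 -S execve -F euid=0 -F auid>=1000 -F auid!=unset -k priv_exec
```

Compare the log volume this produces against `-S all` and the signal-to-noise difference is obvious.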


6. Hardened sysctl that breaks containerized workloads

You set net.ipv4.ip_forward = 0 on a Docker host. Containers can no longer route traffic. You set kernel.yama.ptrace_scope = 3 and debuggers stop working. You set vm.max_map_count = 65530 (default) and Elasticsearch refuses to start.

Fix: Understand what each sysctl does before applying it. Container hosts need IP forwarding. Elasticsearch needs vm.max_map_count = 262144. Test hardening profiles against your actual workloads.

Default trap: Docker and Kubernetes both set net.ipv4.ip_forward=1 at startup. A hardening script that sets it back to 0 will appear to work until the next container or pod is created — then networking breaks. The symptom is intermittent: existing connections work (established routes in conntrack) but new connections fail.
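A container-host-aware sysctl fragment might look like this sketch (the file name is arbitrary; verify each value against your own workloads):

```shell
# /etc/sysctl.d/99-hardening.conf -- hardening that coexists with containers.

net.ipv4.ip_forward = 1          # REQUIRED on Docker/Kubernetes hosts
kernel.yama.ptrace_scope = 1     # restrict ptrace to descendants;
                                 # 3 disables it entirely and breaks debuggers
vm.max_map_count = 262144        # Elasticsearch's documented minimum
```

Apply with `sysctl --system`, then start a fresh container and confirm its networking still works before calling the change done.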


7. Relabeling the entire filesystem during business hours

You need to relabel the filesystem after enabling SELinux. You run fixfiles -F onboot and reboot a production database server at 2 PM. The relabel takes 45 minutes on a 2 TB filesystem. The database is down for 45 minutes.

Fix: Schedule relabels during maintenance windows. Estimate relabel time on similar hardware first. For large filesystems, relabel incrementally by path: restorecon -Rv /var instead of full-system relabel.
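The incremental approach can be sketched as follows; the paths are examples, and the dry-run step is how you estimate impact before committing:

```shell
# Dry run (-n): report what WOULD be relabeled, change nothing.
restorecon -Rnv /var | wc -l

# Relabel the paths that actually matter, one at a time,
# during a maintenance window:
restorecon -Rv /var
restorecon -Rv /etc
restorecon -Rv /srv

# Full-filesystem relabel only when unavoidable -- schedule the reboot:
fixfiles -F onboot
```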


8. Password policy so strict nobody can comply

You set minlen=20 dcredit=-3 ucredit=-3 ocredit=-3 lcredit=-3 plus 90-day rotation. Users write passwords on sticky notes. They use password patterns like Company2026Spring!!!. Your policy created the illusion of security while making actual security worse.

Fix: Use NIST 800-63B guidelines: longer minimum (12-14 chars), no complexity rules, no forced rotation unless compromised. Encourage passphrases. Deploy MFA instead of complex password rules.

Remember: NIST SP 800-63B (2017, reaffirmed 2024) explicitly recommends against forced password rotation and complexity rules. These requirements increase the rate of weak, predictable passwords. A 16-character passphrase ("correct horse battery staple") is stronger than a 12-character complex password ("P@ssw0rd2026!") that users write on sticky notes.
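On pam_pwquality systems, a NIST-aligned policy might look like this sketch (values are a reasonable starting point, not a mandate):

```shell
# /etc/security/pwquality.conf -- length over complexity theater.

minlen = 14          # long minimum instead of character-class rules
dcredit = 0          # no forced digit
ucredit = 0          # no forced uppercase
lcredit = 0          # no forced lowercase
ocredit = 0          # no forced symbol
dictcheck = 1        # reject dictionary words
usercheck = 1        # reject passwords containing the username

# And in /etc/login.defs, drop forced rotation:
# PASS_MAX_DAYS   99999
```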


9. Removing SUID from binaries you do not understand

You find 40 SUID binaries and remove SUID from all of them because "SUID is dangerous." Now sudo does not work, passwd cannot change passwords, ping cannot open raw sockets, and mount cannot mount filesystems.

Fix: Audit SUID binaries against a known-good baseline. Only remove SUID from binaries your system does not need. Essential SUID binaries: sudo, passwd, su, mount, umount, and ping (though many modern distros ship ping with the cap_net_raw file capability instead of SUID). Remove from: at, unused network tools, legacy binaries.
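The audit-against-baseline workflow can be sketched as (the baseline file is one you create and maintain yourself, ideally right after install):

```shell
# 1. Snapshot what is currently SUID/SGID on local filesystems.
find / -xdev \( -perm -4000 -o -perm -2000 \) -type f 2>/dev/null \
    | sort > /root/suid-current.txt

# 2. Diff against your known-good baseline; investigate every addition.
diff /root/suid-baseline.txt /root/suid-current.txt

# 3. Strip the bit ONLY from binaries you have verified are unneeded:
chmod u-s /usr/bin/at
```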


10. Firewall rules that block monitoring and backups

You lock down iptables to allow only ports 22 and 443. Your Nagios/Zabbix checks stop working (NRPE on 5666). Your backup agent cannot connect (borg over SSH on a non-standard port). Your Prometheus node_exporter is unreachable (9100). You find out when alerting goes silent.

Fix: Inventory all legitimate traffic before writing firewall rules. Include monitoring ports, backup traffic, NTP, DNS, package repos, and cluster communication. Test firewall rules in permissive mode (log but allow) before enforcing.

Gotcha: When hardening iptables, don't forget outbound traffic. Blocking outbound DNS (port 53) breaks package manager updates. Blocking outbound NTP (port 123) causes clock drift. Blocking outbound HTTPS (port 443) breaks apt/yum repos, cloud metadata endpoints, and certificate revocation checks. Start with OUTPUT ACCEPT and tighten incrementally.
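The "log but allow" observation phase can be sketched with iptables like this; the port list comes from the examples above and must be replaced with your own inventory:

```shell
# Explicitly allow known-legitimate traffic.
iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A INPUT -p tcp --dport 22   -j ACCEPT   # SSH
iptables -A INPUT -p tcp --dport 443  -j ACCEPT   # HTTPS
iptables -A INPUT -p tcp --dport 5666 -j ACCEPT   # NRPE (Nagios)
iptables -A INPUT -p tcp --dport 9100 -j ACCEPT   # node_exporter

# Observation phase: log everything else but still let it through.
iptables -A INPUT -j LOG --log-prefix "FW-WOULD-DROP: " --log-level 4
iptables -A INPUT -j ACCEPT
# After the logs run clean for a while, replace that final ACCEPT
# with DROP to start enforcing.
```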