LDAP & Identity Management Footguns

  1. Binding to LDAP over plaintext (ldap:// without STARTTLS). Credentials are sent in the clear. Anyone with a packet sniffer on the network can capture bind DN passwords and user passwords during authentication. This includes SSSD service account credentials. Fix: Always use ldaps:// (port 636) or ldap:// with ldap_id_use_start_tls = true in sssd.conf. Verify with tcpdump -i any port 389 -A — if you see readable text, it is not encrypted.

    War story: A 2024 USENIX study found 526 internet-facing LDAP servers leaking passwords in plaintext, including admin accounts. STARTTLS is also vulnerable to downgrade attacks — an active attacker can strip the STARTTLS upgrade and keep the connection plaintext. Prefer ldaps:// (implicit TLS) over STARTTLS when possible.
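
A minimal sketch of how the check in item 1 could be automated, assuming sssd.conf parses as ordinary INI. The function name `plaintext_ldap_domains` is hypothetical, and treating a missing `ldap_id_use_start_tls` as false is an assumption about your SSSD version's default:

```python
import configparser


def plaintext_ldap_domains(conf_text):
    """Return sssd.conf domain sections whose LDAP traffic may be unencrypted.

    A section is flagged when any ldap_uri uses the ldap:// scheme and
    ldap_id_use_start_tls is not set to true (assumed default: false).
    """
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)

    risky = []
    for section in parser.sections():
        if not section.startswith("domain/"):
            continue  # only [domain/...] sections carry LDAP settings
        raw_uris = parser.get(section, "ldap_uri", fallback="")
        uris = [u.strip().lower() for u in raw_uris.split(",") if u.strip()]
        plain_scheme = any(u.startswith("ldap://") for u in uris)
        start_tls = parser.get(
            section, "ldap_id_use_start_tls", fallback="false"
        ).strip().lower() == "true"
        if plain_scheme and not start_tls:
            risky.append(section)
    return risky
```

Feeding it the contents of /etc/sssd/sssd.conf returns the section names worth a closer look. It does not replace the tcpdump check, which observes what is actually on the wire.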

  2. Getting PAM module order wrong. You put pam_unix.so with requisite (or a pam_deny.so line) before pam_sss.so. Local accounts work, but SSSD users cannot log in: a failed requisite module aborts the stack immediately, so pam_sss.so is never consulted. A sufficient module, by contrast, ends the stack on success and is simply skipped on failure, which is why a common layout is pam_unix.so sufficient, then pam_sss.so sufficient, then pam_deny.so required. Order is everything. Fix: Understand the PAM control flags (required, sufficient, requisite, optional). Use authselect on RHEL 8+ to manage PAM configs instead of hand-editing. Test with pamtester before deploying changes.
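
To make the control flags concrete, here is a deliberately simplified model of how a PAM stack is evaluated. It is a sketch for reasoning about ordering, not a reimplementation of libpam: it ignores the bracketed `[value=action]` syntax, `include` and `substack` lines, and the function name `evaluate_stack` is hypothetical.

```python
def evaluate_stack(stack, results):
    """Evaluate a simplified PAM stack.

    stack   -- list of (control_flag, module_name) in file order
    results -- dict mapping module_name to True (success) or False (failure)
    Returns (allowed, consulted_modules).
    """
    consulted = []
    required_failed = False
    any_success = False

    for control, module in stack:
        consulted.append(module)
        ok = results[module]

        if control == "requisite":
            if not ok:
                return False, consulted  # abort immediately on failure
            any_success = True
        elif control == "required":
            if ok:
                any_success = True
            else:
                required_failed = True  # remember failure, keep going
        elif control == "sufficient":
            if ok and not required_failed:
                return True, consulted  # success ends the stack early
            # a failed sufficient module is ignored
        elif control == "optional":
            pass  # result does not affect the outcome in this model
        else:
            raise ValueError(f"unknown control flag: {control}")

    return any_success and not required_failed, consulted
```

For a directory user (pam_unix.so fails, pam_sss.so succeeds), the stack `sufficient pam_unix.so`, `sufficient pam_sss.so`, `required pam_deny.so` is allowed after consulting the first two modules. Changing the first line to `requisite pam_unix.so` denies the same user without pam_sss.so ever being consulted.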

  3. Forgetting to clear SSSD cache after LDAP changes. You add a user to a group in LDAP but the Linux server does not see it for hours because SSSD caches identity data. You (or the user) conclude the change did not work. Fix: Run sss_cache -u username to invalidate a specific user, or sss_cache -E to flush everything. Set reasonable cache timeouts in sssd.conf (entry_cache_timeout = 300). Document cache behavior for your team.

  4. Setting enumerate = true on large directories. SSSD tries to enumerate (list) every user and group in the directory. On Active Directory with 50,000+ accounts, this causes massive load on the LDAP server, slow logins, and high memory usage on the client. Fix: Set enumerate = false (the default) for any directory with more than a few hundred entries. Users and groups are resolved on demand, not pre-loaded.

  5. Over-permissive LDAP ACLs. The LDAP bind account used by SSSD has write access to the directory, or anonymous binds can read password hashes. An attacker who compromises any SSSD client now has keys to the kingdom. Fix: SSSD needs only read access to the user and group subtrees. Create a dedicated bind DN with minimal permissions. Disable anonymous binds. Audit LDAP ACLs annually.

  6. Not configuring pam_mkhomedir for LDAP users. A user authenticates successfully via SSSD/LDAP but lands in / because their home directory does not exist. They cannot write dotfiles, SSH keys fail, and their shell prompt is broken. Fix: Add session required pam_mkhomedir.so skel=/etc/skel/ umask=077 to the PAM session stack. On RHEL, run authselect enable-feature with-mkhomedir and make sure the oddjobd service is enabled and running (systemctl enable --now oddjobd).

  7. Ignoring Kerberos clock skew. Kerberos authentication fails with cryptic errors because the client and server clocks are more than 5 minutes apart. This is the default tolerance and it is strict. Fix: Configure NTP/chrony on every machine. Verify time sync before troubleshooting Kerberos: timedatectl and chronyc tracking. If joining AD, the domain controller often serves NTP.

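    Debug clue: The Kerberos error Clock skew too great almost always means time drift. Run date -u on both client and KDC and compare. The default tolerance is 300 seconds (5 minutes), controlled by clockskew in krb5.conf. A different error, kinit: KDC reply did not match expectations, usually points to a realm name mismatch (often a lowercase realm) rather than time drift.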
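
The comparison itself is simple arithmetic. A minimal sketch, assuming you have already collected a UTC timestamp from each host (for example from date -u); the function name `skew_exceeds_tolerance` is hypothetical:

```python
from datetime import datetime, timedelta, timezone

# Default Kerberos tolerance: 300 seconds (5 minutes).
DEFAULT_TOLERANCE = timedelta(seconds=300)


def skew_exceeds_tolerance(client_time, kdc_time, tolerance=DEFAULT_TOLERANCE):
    """Return True when the two clocks differ by more than the tolerance.

    Both arguments must be timezone-aware datetimes so the subtraction
    compares absolute instants rather than local wall-clock readings.
    """
    return abs(client_time - kdc_time) > tolerance


kdc = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(skew_exceeds_tolerance(kdc + timedelta(minutes=4), kdc))   # within tolerance
print(skew_exceeds_tolerance(kdc - timedelta(minutes=6), kdc))   # too far apart
```

Note that drift in either direction counts, which is why the difference is taken as an absolute value.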

  8. Leaving sssd.conf world-readable. The SSSD configuration can contain the LDAP bind password (ldap_default_authtok) in plaintext. If the file permissions are not 0600, any local user can read the service account credentials. Fix: chmod 0600 /etc/sssd/sssd.conf and verify ownership is root:root. SSSD will refuse to start if permissions are wrong; this is intentional and correct behavior.

    Debug clue: If SSSD refuses to start with sssd.conf has an invalid permission in the journal, check both permissions AND ownership: stat -c '%a %U:%G' /etc/sssd/sssd.conf. The output should read 600 root:root. Ansible's copy module can silently change ownership if become_user is wrong.
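
The same check can run as a pre-flight step in a deployment script. A minimal sketch for POSIX systems; the function name `sssd_conf_problems` and its `expected_uid`/`expected_gid` parameters are hypothetical, with 0:0 (root:root) as the default expectation described above:

```python
import os
import stat


def sssd_conf_problems(path, expected_uid=0, expected_gid=0):
    """Return a list of human-readable problems with the file's mode/owner.

    An empty list means the file is mode 600 and owned by the expected
    uid:gid (root:root by default).
    """
    st = os.stat(path)
    problems = []

    mode = stat.S_IMODE(st.st_mode)  # permission bits only, no file type
    if mode != 0o600:
        problems.append(f"mode is {mode:o}, expected 600")

    if st.st_uid != expected_uid or st.st_gid != expected_gid:
        problems.append(
            f"owner is {st.st_uid}:{st.st_gid}, "
            f"expected {expected_uid}:{expected_gid}"
        )
    return problems
```

Calling `sssd_conf_problems("/etc/sssd/sssd.conf")` after a configuration push catches both the mode and the ownership case before SSSD is restarted.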

  9. Testing LDAP changes in production instead of staging. You modify a PAM config or SSSD setting on a production server and lock out all LDAP users. Since you are also an LDAP user, you just locked yourself out. Fix: Always keep a local root or admin account that does not depend on LDAP (defined in /etc/passwd, with its password hash in /etc/shadow). Test PAM and SSSD changes on a non-production server first. Keep an active root session open when making auth changes.

    Gotcha: Before touching PAM or SSSD config, open a second root shell and keep it alive. If you lock yourself out, that shell is your only way back in without console/BMC access. Test with pamtester in the second shell before closing your safety net.

  10. Hardcoding LDAP server IPs instead of using DNS SRV records. When an LDAP server is decommissioned or its IP changes, every client config must be manually updated. If you miss one, that client loses authentication. Fix: Use DNS SRV records for service discovery (_ldap._tcp.example.com). In sssd.conf, set ldap_uri = _srv_ (optionally with dns_discovery_domain) instead of listing hardcoded IPs. This way, LDAP server changes are a DNS update, not a config change on every client.
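
To illustrate what a client does with SRV answers, here is a minimal sketch that parses records in the `priority weight port target` form printed by dig +short SRV and orders them. The names `SrvRecord`, `parse_srv`, and `order_servers` are hypothetical, and sorting by weight is a deterministic simplification: RFC 2782 calls for weighted random selection within a priority level.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def parse_srv(rdata):
    """Parse one 'priority weight port target' SRV answer line."""
    priority, weight, port, target = rdata.split()
    # Drop the trailing dot of the fully qualified target name.
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def order_servers(records):
    """Order servers: lowest priority first, then highest weight first.

    Simplification: real resolvers pick randomly within a priority level,
    in proportion to weight, to spread load across servers.
    """
    return sorted(records, key=lambda r: (r.priority, -r.weight))


answers = [
    "10 50 389 ldap2.example.com.",
    "0 100 389 ldap1.example.com.",
    "10 100 389 ldap3.example.com.",
]
for record in order_servers(parse_srv(a) for a in answers):
    print(record.target, record.port)
```

With these sample answers, ldap1.example.com is tried first because it has the lowest priority value; the two priority-10 servers act as fallbacks.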