LDAP & Identity Management Footguns
-
Binding to LDAP over plaintext (`ldap://` without STARTTLS). Credentials are sent in the clear. Anyone with a packet sniffer on the network can capture bind DN passwords and user passwords during authentication. This includes SSSD service account credentials. Fix: Always use `ldaps://` (port 636) or `ldap://` with `ldap_id_use_start_tls = true` in sssd.conf. Verify with `tcpdump -i any port 389 -A`: if you can read bind DNs or passwords in the output, the connection is not encrypted.
War story: A 2024 USENIX study found 526 internet-facing LDAP servers leaking passwords in plaintext, including admin accounts. STARTTLS is also vulnerable to downgrade attacks: an active attacker can strip the STARTTLS upgrade and keep the connection plaintext. Prefer `ldaps://` (implicit TLS) over STARTTLS when possible.
-
Getting PAM module order wrong. You put `pam_unix.so` with `sufficient` before `pam_sss.so`. Local accounts work, but SSSD users cannot log in because of how PAM walks the stack: a `sufficient` module that succeeds ends processing immediately, while one that fails is ignored and evaluation continues with the next line, so a misplaced `requisite` or deny line can reject the user before `pam_sss.so` is ever consulted. Order is everything. Fix: Understand the PAM control flags (`required`, `sufficient`, `requisite`, `optional`). Use `authselect` on RHEL 8+ to manage PAM configs instead of hand-editing. Test with `pamtester` before deploying changes.
-
Forgetting to clear SSSD cache after LDAP changes. You add a user to a group in LDAP but the Linux server does not see it for hours because SSSD caches identity data. You (or the user) conclude the change did not work. Fix: Run `sss_cache -u username` to invalidate a specific user, or `sss_cache -E` to flush everything. Set reasonable cache timeouts in sssd.conf (`entry_cache_timeout = 300`). Document cache behavior for your team.
-
Setting `enumerate = true` on large directories. SSSD tries to enumerate (list) every user and group in the directory. On Active Directory with 50,000+ accounts, this causes massive load on the LDAP server, slow logins, and high memory usage on the client. Fix: Set `enumerate = false` (the default) for any directory with more than a few hundred entries. Users and groups are then resolved on demand, not pre-loaded.
-
Over-permissive LDAP ACLs. The LDAP bind account used by SSSD has write access to the directory, or anonymous binds can read password hashes. An attacker who compromises any SSSD client now has keys to the kingdom. Fix: SSSD needs only read access to the user and group subtrees. Create a dedicated bind DN with minimal permissions. Disable anonymous binds. Audit LDAP ACLs annually.
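Two of the pitfalls above, a plaintext `ldap://` URI without STARTTLS and `enumerate = true`, can be caught with a quick lint of sssd.conf. The following is a minimal sketch, not an official SSSD tool: the function name `check_sssd_conf` is hypothetical, and it assumes each option sits on its own uncommented line.

```shell
# Hypothetical helper: flag two risky sssd.conf settings.
# Usage: check_sssd_conf /etc/sssd/sssd.conf
# Returns 0 if nothing was flagged, 1 otherwise.
check_sssd_conf() {
    conf="$1"
    status=0

    # A plaintext ldap:// URI is only acceptable when STARTTLS is switched on.
    if grep -Eq '^[[:space:]]*ldap_uri[[:space:]]*=.*ldap://' "$conf" &&
       ! grep -Eiq '^[[:space:]]*ldap_id_use_start_tls[[:space:]]*=[[:space:]]*true' "$conf"; then
        echo "WARN: ldap:// URI without ldap_id_use_start_tls = true"
        status=1
    fi

    # Enumeration pre-loads every user and group from the directory.
    if grep -Eiq '^[[:space:]]*enumerate[[:space:]]*=[[:space:]]*true' "$conf"; then
        echo "WARN: enumerate = true"
        status=1
    fi

    return "$status"
}
```

Running `check_sssd_conf /etc/sssd/sssd.conf` as root prints one warning per finding. It only reads the file, so it does not replace the `tcpdump` check on the wire.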
-
Not configuring `pam_mkhomedir` for LDAP users. A user authenticates successfully via SSSD/LDAP but lands in `/` because their home directory does not exist. They cannot write dotfiles, SSH keys fail, and their shell prompt is broken. Fix: Add `session required pam_mkhomedir.so skel=/etc/skel/ umask=077` to the PAM session stack. On RHEL, enable oddjob-mkhomedir via `authselect enable-feature with-mkhomedir`.
-
Ignoring Kerberos clock skew. Kerberos authentication fails with cryptic errors because the client and server clocks are more than 5 minutes apart. This is the default tolerance and it is strict. Fix: Configure NTP/chrony on every machine. Verify time sync before troubleshooting Kerberos: `timedatectl` and `chronyc tracking`. If joining AD, the domain controller often serves NTP.
Debug clue: A Kerberos error like `Clock skew too great` almost always means time drift. `kinit: KDC reply did not match expectations` is more commonly reported for realm-name or DNS mismatches, but rule out drift first. Run `date` on both client and KDC and compare. The default tolerance is 300 seconds (5 minutes).
-
Leaving sssd.conf world-readable. SSSD configuration can contain the LDAP bind password in plaintext. If the file permissions are not 0600, any local user can read the service account credentials. Fix: `chmod 0600 /etc/sssd/sssd.conf` and verify ownership is root:root. SSSD will refuse to start if permissions are wrong; this is intentional and correct behavior.
Debug clue: If SSSD refuses to start with `sssd.conf has an invalid permission` in the journal, check both permissions AND ownership: `stat -c '%a %U:%G' /etc/sssd/sssd.conf`. The output must be `600 root:root` (`stat` prints the mode without the leading zero). Ansible's `copy` module can silently change ownership if `become_user` is wrong.
-
Testing LDAP changes in production instead of staging. You modify a PAM config or SSSD setting on a production server and lock out all LDAP users. Since you are also an LDAP user, you just locked yourself out. Fix: Always keep a local root or admin account that does not depend on LDAP (an entry in `/etc/passwd`, not in the directory). Test PAM and SSSD changes on a non-production server first. Keep an active root session open when making auth changes.
Gotcha: Before touching PAM or SSSD config, open a second root shell and keep it alive. If you lock yourself out, that shell is your only way back in without console/BMC access. Test with `pamtester` in the second shell before closing your safety net.
-
Hardcoding LDAP server IPs instead of using DNS SRV records. When an LDAP server is decommissioned or its IP changes, every client config must be manually updated. If you miss one, that host loses authentication. Fix: Use DNS SRV records for service discovery (`_ldap._tcp.example.com`). In sssd.conf, use `dns_discovery_domain` instead of `ldap_uri` with hardcoded IPs. This way, LDAP server changes are a DNS update, not a config change on every client.
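The sssd.conf permission check and the Kerberos clock-skew tolerance described above lend themselves to a small pre-flight script. This is a sketch under stated assumptions: the function names `check_conf_perms` and `clock_skew_ok` are hypothetical, GNU `stat` is assumed (as in the command quoted above), and the 300-second default mirrors the tolerance the text cites.

```shell
# Hypothetical pre-flight checks; not part of SSSD or Kerberos.

# Return 0 if FILE has mode 600 and the expected owner (default root:root).
check_conf_perms() {
    file="$1"
    want_owner="${2:-root:root}"
    # GNU stat prints the mode without a leading zero, e.g. "600 root:root".
    actual=$(stat -c '%a %U:%G' "$file") || return 2
    if [ "$actual" = "600 $want_owner" ]; then
        echo "OK: $file is $actual"
        return 0
    fi
    echo "FAIL: $file is $actual, expected 600 $want_owner"
    return 1
}

# Return 0 if two epoch timestamps differ by at most TOLERANCE seconds.
clock_skew_ok() {
    local_epoch="$1"
    kdc_epoch="$2"
    tolerance="${3:-300}"   # default matches the 5-minute tolerance above
    diff=$((local_epoch - kdc_epoch))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$tolerance" ]
}
```

For example, `check_conf_perms /etc/sssd/sssd.conf` verifies the file, and `clock_skew_ok "$(date +%s)" "$kdc_time"` compares the local clock with a timestamp you collected from the KDC by running `date +%s` there (`kdc_time` is a placeholder for that value).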