
SSH Deep Dive — Footguns

Security mistakes and misconfigurations that expose your infrastructure, leak credentials, or create persistent vulnerabilities.


1. Agent forwarding on untrusted hosts

ssh -A forwards your SSH agent to the remote host. Anyone with root on that host can use your agent socket to authenticate as you to any other server. If you forward your agent to a compromised bastion, the attacker can SSH to every server your key can reach.

# Dangerous: forwarding your agent to a shared jump box
ssh -A bastion.example.com
# root on bastion can now: SSH_AUTH_SOCK=/tmp/ssh-xxx/agent.12345 ssh prod-db

# The attack is invisible to you. The attacker uses your agent while you're connected.
# They can reach any host your key is authorized on.

# Fix: use ProxyJump instead of agent forwarding
ssh -J bastion.example.com prod-server.internal
# Your key never touches the bastion — TCP connection is proxied

# In ~/.ssh/config:
Host prod-*
    ProxyJump bastion.example.com

Host bastion.example.com
    ForwardAgent no   # explicit deny

# Fix: if you MUST forward (rare), use ssh-agent confirmation
ssh-add -c ~/.ssh/id_ed25519
# Every use of the key triggers a desktop confirmation prompt
# At least you'll know when someone else uses your agent

Rule: Never use ForwardAgent yes in your global SSH config. Use ProxyJump instead of agent forwarding in nearly all cases.


2. World-readable private keys

If your private key is readable by group or others — anything wider than 600 for the file, or 700 for the ~/.ssh directory — OpenSSH refuses to use it, unless you are on a system that skips the check (some Windows SSH clients, some containers). Even when SSH does refuse, the error message is cryptic, and people "fix" it by copying the key somewhere world-readable instead of fixing the permissions.

# SSH refuses to use this key
ssh -i /home/deploy/.ssh/id_ed25519 server
# WARNING: UNPROTECTED PRIVATE KEY FILE!
# Permissions 0644 for '/home/deploy/.ssh/id_ed25519' are too open.

# Common bad "fix": copy the key to /tmp and chmod it there
# Now your private key is in /tmp where anyone can read it

# Correct fix:
chmod 600 ~/.ssh/id_ed25519       # private key: owner read/write only
chmod 644 ~/.ssh/id_ed25519.pub   # public key: anyone can read
chmod 700 ~/.ssh                   # directory: owner only
chmod 600 ~/.ssh/config           # config: owner only
chmod 600 ~/.ssh/authorized_keys  # authorized keys: owner only

# In Docker, COPY defaults to root:root ownership and keeps the (often
# too-open) mode from the build context
# Fix in Dockerfile (--chmod requires BuildKit):
COPY --chown=appuser:appuser --chmod=600 id_ed25519 /home/appuser/.ssh/

3. Password authentication still enabled

You have set up SSH keys for all your users. But password auth is still enabled in sshd_config. Attackers brute-force passwords 24/7. Your keys are irrelevant if the password "admin123" also works.

# Check if password auth is enabled
sudo sshd -T | grep -i passwordauthentication   # needs root to read host keys
# passwordauthentication yes  <- bad

# Check your actual config (not just the file — includes drop-ins)
grep -r 'PasswordAuthentication' /etc/ssh/sshd_config /etc/ssh/sshd_config.d/

# Fix: disable password auth
# /etc/ssh/sshd_config (or a drop-in file):
# PasswordAuthentication no
# KbdInteractiveAuthentication no   # also disable this (PAM keyboard-interactive)

# Apply the change — keep your current session open and test a NEW login first!
sudo systemctl reload sshd   # the unit is named "ssh" on Debian/Ubuntu

# CRITICAL: before disabling password auth, verify:
# 1. Your SSH key works: ssh -o PasswordAuthentication=no user@server
# 2. You have console access (IPMI, cloud console) as a fallback
# 3. All users who need access have their keys deployed
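A quick demo of why the drop-in check above matters: sshd keeps the FIRST value it reads for each keyword, and stock Debian/Ubuntu configs place the Include for sshd_config.d/* at the top of the main file, so a drop-in silently overrides an edit below it. A sketch against a scratch directory (the drop-in filename mimics the one cloud-init commonly ships — an assumption about your distro):

```shell
# Scratch directory stands in for /etc/ssh — safe to run as-is
etc=$(mktemp -d)
mkdir "$etc/sshd_config.d"
echo 'PasswordAuthentication yes' > "$etc/sshd_config.d/50-cloud-init.conf"
echo 'PasswordAuthentication no'  > "$etc/sshd_config"

grep -ri 'passwordauthentication' "$etc"
# Both lines appear; because the drop-in is read first, "yes" wins.
# On a real host, trust `sudo sshd -T` for the effective value.
```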

4. Root login allowed

PermitRootLogin yes means attackers only need to guess one password (or find one key) to get full system access. No audit trail of who logged in — just "root."

# Check current setting
sudo sshd -T | grep -i permitrootlogin   # needs root to read host keys
# permitrootlogin yes  <- bad
# permitrootlogin prohibit-password  <- better (key only)
# permitrootlogin no  <- best

# Fix:
# /etc/ssh/sshd_config:
# PermitRootLogin no

# If you need root access, use sudo:
# User logs in as themselves (audit trail), then sudo
# This way you know WHO did what as root

# For automation (Ansible, etc.) where a service account needs root:
# PermitRootLogin prohibit-password
# And use a dedicated key with forced command:
# In /root/.ssh/authorized_keys:
# command="/usr/local/bin/ansible-pull" ssh-ed25519 AAAA... ansible@deploy

5. Small RSA key size (use Ed25519)

RSA-2048 is the accepted minimum today and RSA-4096 the usual recommendation, but many systems still carry RSA-1024 keys from years ago. Prefer Ed25519: keys are faster, far shorter, and have no weak-parameter pitfalls, at a security level (~128-bit, comparable to RSA-3072) that is more than adequate.

# Check your key type and size
ssh-keygen -l -f ~/.ssh/id_rsa
# 1024 SHA256:xxx... user@host (RSA)  <- dangerously weak
# 2048 SHA256:xxx... user@host (RSA)  <- minimum, should upgrade
# 4096 SHA256:xxx... user@host (RSA)  <- acceptable

# Check what keys are on a server
for f in /etc/ssh/ssh_host_*_key.pub; do
    ssh-keygen -l -f "$f"
done

# Generate a modern key
ssh-keygen -t ed25519 -C "user@host"
# Ed25519: 256-bit, faster than RSA-4096, fixed size, no weak parameters

# If you MUST use RSA (legacy systems that don't support Ed25519):
ssh-keygen -t rsa -b 4096 -C "user@host"

# Remove weak host keys from the server
sudo rm /etc/ssh/ssh_host_dsa_key*     # DSA: deprecated, remove
sudo rm /etc/ssh/ssh_host_ecdsa_key*   # ECDSA: acceptable but Ed25519 preferred
# Regenerate:
sudo ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N ""
sudo systemctl restart sshd
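Client keys on servers deserve the same audit as host keys. A sketch that tallies the key types in an authorized_keys file — scratch file with placeholder key material stands in for the real one:

```shell
# Any ssh-dss entry, or an ssh-rsa from years ago, is a rotation candidate
ak=$(mktemp)
cat > "$ak" <<'EOF'
ssh-rsa AAAAB3placeholder old-laptop
ssh-ed25519 AAAAC3placeholder new-laptop
EOF

# Caveat: lines that start with options (from=, command=) will show the
# option in column 1 instead of the key type
awk '{print $1}' "$ak" | sort | uniq -c

# For exact fingerprints and RSA bit lengths on a real file:
#   ssh-keygen -lf ~/.ssh/authorized_keys
```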

6. Not verifying host keys (MITM vulnerability)

The first time you connect to a host, SSH shows you the host key fingerprint and asks you to verify it. Everyone types "yes" without checking. If an attacker has intercepted the connection, you have just trusted their key. Every future connection to that IP goes through the attacker.

# What you see:
# The authenticity of host 'prod-db (10.0.1.50)' can't be established.
# ED25519 key fingerprint is SHA256:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
# Are you sure you want to continue connecting (yes/no)?

# What everyone does: types "yes" without checking
# What you should do: verify the fingerprint out-of-band

# Get the fingerprint from the server (via console, IPMI, or cloud metadata):
ssh-keygen -l -f /etc/ssh/ssh_host_ed25519_key.pub

# Pre-deploy known host keys via config management (Ansible, etc.)
# This is the real fix — verify once, deploy everywhere
# /etc/ssh/ssh_known_hosts (system-wide):
# prod-db.internal ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIxxx...

# Or use SSHFP DNS records (SSH fingerprints in DNS)
ssh-keygen -r prod-db.internal -f /etc/ssh/ssh_host_ed25519_key.pub
# Add the resulting SSHFP records to your DNS zone
# Then enable verification:
# ~/.ssh/config:
# VerifyHostKeyDNS yes

# After infrastructure changes (server rebuild, IP reuse):
ssh-keygen -R 10.0.1.50  # remove the old key
# Then verify the new key properly before accepting
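ssh-keygen can also query a known_hosts file directly, which beats hand-editing it. A sketch of the lookup/remove cycle against a scratch file (a freshly generated key stands in for a real host key):

```shell
kh=$(mktemp)
d=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$d/hostkey"
awk '{print "prod-db.internal", $1, $2}' "$d/hostkey.pub" > "$kh"

ssh-keygen -F prod-db.internal -f "$kh"   # print the pinned entry
ssh-keygen -R prod-db.internal -f "$kh"   # remove it (e.g. after a rebuild)
# -R keeps a backup as known_hosts.old
```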

7. ForwardAgent yes in global config

Setting ForwardAgent yes in the global section of ~/.ssh/config forwards your agent to EVERY host you connect to. You meant to enable it for one trusted host and instead enabled it everywhere.

# BAD: ~/.ssh/config
Host *
    ForwardAgent yes     # agent forwarded to every host you SSH to
    ServerAliveInterval 60

# Fix: only enable for specific hosts (if you must — see footgun #1)
Host *
    ForwardAgent no      # default deny
    ServerAliveInterval 60

Host trusted-bastion
    HostName bastion.example.com
    ForwardAgent yes     # only here

# Better fix: don't use ForwardAgent at all
Host prod-*
    ProxyJump bastion.example.com
    # No agent forwarding needed — connection is proxied
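To see which settings a host actually gets after all Host/Match blocks and defaults are merged, ask ssh itself with -G (OpenSSH 6.8+). The hostname below is a placeholder:

```shell
# Prints the fully-resolved config without connecting
ssh -G prod-web-1.internal | grep -iE '^(forwardagent|proxyjump)'
# forwardagent always appears (default: no); proxyjump only when set
```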

8. Leaving SSH tunnels open

You set up a tunnel for debugging, then forgot about it. That tunnel is now a persistent backdoor — it exposes an internal service (database, admin panel, metrics) on a local or remote port indefinitely.

# "Quick tunnel for debugging" — left running for 6 months
ssh -L 5432:prod-db:5432 bastion -N &
# Local port 5432 now reaches the production database
# Anyone on your laptop (or anyone who compromises your laptop) has DB access

# Remote tunnel left open:
ssh -R 8080:localhost:8080 public-server -N &
# Your local dev server is now exposed on the internet

# Fix: always use tunnels with explicit timeouts
ssh -L 5432:prod-db:5432 bastion -N -o ServerAliveInterval=30 -o ServerAliveCountMax=3 &
TUNNEL_PID=$!
sleep 3600  # one hour
kill $TUNNEL_PID

# Fix: use autossh with monitoring
autossh -M 0 -L 5432:prod-db:5432 bastion -N \
    -o ServerAliveInterval=30 -o ServerAliveCountMax=3

# Fix: list and kill forgotten tunnels
ps aux | grep 'ssh.*-[LRD]'  # find tunnel processes
ss -tlnp | grep ssh           # find listening ports from SSH

# Fix: audit /etc/ssh/sshd_config on servers
# AllowTcpForwarding no          # disable all forwarding
# Or restrict:
# AllowTcpForwarding local       # only local forwarding
# PermitOpen prod-db:5432        # only to specific destinations
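Simpler than the sleep/kill dance above: wrap the tunnel in coreutils `timeout` so it gets a hard deadline no matter what. Here `sleep` stands in for the ssh command so the sketch runs without a server; the real invocation would be something like `timeout 1h ssh -N -L 5432:prod-db:5432 bastion`:

```shell
status=0
timeout 2 sleep 60 || status=$?   # sleep plays the role of the tunnel
echo "exit status: $status"       # 124 means timeout fired and killed it
```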

9. SSH key without passphrase for automation

Automation needs unattended SSH access, so the key has no passphrase. If that key is stolen (from a CI server, a container image, a backup), the attacker has permanent access to every server the key is authorized on.

# The common pattern:
ssh-keygen -t ed25519 -f /deploy/key -N ""  # no passphrase
# Used by: CI/CD, Ansible, cron jobs, backup scripts

# Risk: key is stored on disk, in CI secrets, in container images
# Anyone who gets the key has access forever (until you rotate)

# Mitigation 1: restrict the key on the server side
# /home/deploy/.ssh/authorized_keys:
# command="/usr/local/bin/deploy.sh",no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-ed25519 AAAA...
# Key can ONLY run deploy.sh — nothing else

# Mitigation 2: restrict by source IP
# from="10.0.0.0/8,192.168.1.0/24" ssh-ed25519 AAAA...
# Key only works from specific networks

# Mitigation 3: use short-lived certificates instead of permanent keys
# Sign a key that expires in 1 hour:
ssh-keygen -s /ca/key -I "deploy-$(date +%s)" -V +1h -n deploy deploy_key.pub
# The certificate expires automatically — no rotation needed

# Mitigation 4: use SSH certificate authorities (HashiCorp Vault, etc.)
# vault write ssh/sign/deploy public_key=@deploy_key.pub ttl=1h
# Vault issues a time-limited certificate
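The whole short-lived-certificate flow fits in a few commands. A self-contained sketch in a scratch directory — names are illustrative, and the CA key here is throwaway:

```shell
d=$(mktemp -d)
ssh-keygen -q -t ed25519 -N '' -f "$d/ca"           # the certificate authority
ssh-keygen -q -t ed25519 -N '' -f "$d/deploy_key"   # the automation key
ssh-keygen -q -s "$d/ca" -I deploy-demo -n deploy -V +1h "$d/deploy_key.pub"

ssh-keygen -L -f "$d/deploy_key-cert.pub"
# Shows the principal ("deploy") and the Valid: window.
# Servers opt in with: TrustedUserCAKeys /etc/ssh/user_ca.pub
```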

10. StrictHostKeyChecking=no everywhere

To "fix" the host key verification prompt (see footgun #6), someone adds StrictHostKeyChecking=no to the SSH config. Now SSH accepts ANY host key without verification. MITM attacks succeed silently.

# The "fix" that breaks security:
# Host *
#     StrictHostKeyChecking no
#     UserKnownHostsFile /dev/null

# This is in SO MANY Docker images, CI configs, and Ansible setups
# "It was the only way to make it work in automation"

# What this does: connects to ANYTHING without verification
# An attacker who redirects your DNS or sits on the network path
# can intercept every SSH connection

# Fix for automation: pre-deploy known host keys
# In your Ansible playbook or Docker image:
# COPY known_hosts /home/deploy/.ssh/known_hosts
# Or:
ssh-keyscan -t ed25519 prod-server.internal >> /home/deploy/.ssh/known_hosts

# Fix for dynamic infrastructure (cloud VMs that change IPs):
# Use SSH certificates — the CA key is trusted, individual host keys don't matter
# /etc/ssh/ssh_known_hosts:
# @cert-authority *.example.com ssh-ed25519 AAAA... (CA public key)
# Any host presenting a certificate signed by this CA is trusted

# Fix: accept-new (trust on first use, reject changes)
# Host *
#     StrictHostKeyChecking accept-new
# First connection: accepts and saves the key
# Later connections: rejects if the key changed (detects MITM/rebuild)
# This is a reasonable middle ground for non-critical environments

11. .ssh/config and directory permissions too open

SSH checks permissions on ~/.ssh/ and its contents. If the directory or files are group/world readable, SSH may silently ignore your config or refuse to use keys. The error messages are often misleading.

# Symptoms:
# - SSH ignores your config (wrong host, wrong user, wrong key)
# - "Bad owner or permissions on ~/.ssh/config"
# - Key authentication fails silently, falls back to password

# Check permissions
ls -la ~/.ssh/
# drwxr-xr-x  deploy deploy  .ssh/           <- too open (755)
# -rw-r--r--  deploy deploy  config          <- too open (644)
# -rw-r--r--  deploy deploy  id_ed25519      <- WAY too open (644)

# Fix all permissions at once
chmod 700 ~/.ssh
chmod 600 ~/.ssh/config
chmod 600 ~/.ssh/id_ed25519
chmod 644 ~/.ssh/id_ed25519.pub   # public key: fine to be world-readable
chmod 600 ~/.ssh/authorized_keys
chmod 600 ~/.ssh/known_hosts

# In Docker: files copied with COPY inherit build context permissions
# which are often too open
# Fix:
RUN mkdir -p /home/appuser/.ssh && chmod 700 /home/appuser/.ssh
COPY --chown=appuser:appuser ssh_config /home/appuser/.ssh/config
RUN chmod 600 /home/appuser/.ssh/config

# The home directory itself matters too
chmod 750 /home/deploy   # or 700
# If the home directory or ~/.ssh is group- or world-writable, sshd's
# StrictModes check refuses key authentication entirely
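The checks above can be automated: `find -perm /077` matches anything with any group/other permission bit set, which is exactly what SSH objects to. A sketch against a scratch directory so it is safe to run as-is — substitute ~/.ssh for a real audit:

```shell
dir=$(mktemp -d)
touch "$dir/id_demo"
chmod 644 "$dir/id_demo"            # deliberately too open, like a bad key

find "$dir" -type f -perm /077      # prints the offending file
chmod 600 "$dir/id_demo"            # the fix
find "$dir" -type f -perm /077      # prints nothing now
```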