Skip to content

Symptoms: Ansible Playbook Hangs, SSH Agent Forwarding Broken, Root Cause Is Firewall Rule

Domains: devops_tooling | linux_ops | networking Level: L2 Estimated time: 30-45 min

Initial Alert

CI/CD notification at 16:30 UTC:

:warning: Ansible Playbook TIMEOUT  infrastructure-update
Playbook: devops/ansible/playbooks/rolling-update.yml
Status: Hung at task "Pull updated config from git repo" on app-server-03
Duration: 15 minutes (timeout: 10 minutes)
Previous 2 servers completed successfully

Observable Symptoms

  • The Ansible playbook hangs on app-server-03 at a task that clones a Git repository from an internal GitLab server.
  • The same playbook ran successfully on app-server-01 and app-server-02 (the first two hosts in the inventory).
  • SSH from the Ansible control node to app-server-03 works fine (Ansible can connect and run simple tasks).
  • Tasks before the Git clone (package updates, config file copies) completed successfully on app-server-03.
  • Manually running git clone git@gitlab.internal:infra/configs.git from app-server-03 hangs indefinitely.
  • The same git clone from app-server-01 and app-server-02 works fine.
  • app-server-03 was freshly provisioned last week. The other servers have been running for months.

The Misleading Signal

An Ansible playbook hanging on a specific host looks like a host-level issue — maybe SSH is slow, a task has a dependency that is missing on this host, or there is a resource exhaustion problem. The fact that it is a newly provisioned server makes engineers suspect a provisioning issue — missing packages, wrong SSH keys, or an incomplete configuration. The focus goes to comparing app-server-03's configuration against the working servers.