Diagnostic Questions¶
Before revealing the investigation path:¶
-
The Ansible playbook hangs on one host at a
git clonetask but completes successfully on two other hosts. What are the first three things you would check to determine if the problem is host-specific, network-specific, or task-specific? -
SSH from the Ansible control node to
app-server-03works fine, butgit clone git@gitlab.internalfromapp-server-03hangs. What does this tell you about the difference between the two SSH connections? -
You find that SSH agent forwarding works (
ssh-add -lshows the key) but thebecome: deploysudo context stripsSSH_AUTH_SOCK. Is this the root cause of the hang, or a contributing factor? How would you distinguish between agent forwarding issues and network connectivity issues? -
nc -zv gitlab.internal 22times out fromapp-server-03but succeeds fromapp-server-01. The servers are in different security groups. Why is the fix a networking change (security group) rather than a Linux change (sudoers) or an Ansible change (playbook config)? -
What alternative to SSH agent forwarding would eliminate this entire class of failures? What are the trade-offs?