Ansible Ops

40 cards: 🟢 10 easy | 🟡 18 medium | 🔴 12 hard

🟢 Easy (10)

1. What is an Ansible playbook?

A playbook is a YAML file containing one or more "plays", which map a set of tasks to a group of hosts. It defines what tasks to run on which hosts (and in what order) to automate configuration or deployment.

Remember: playbook = 'the script', play = 'a scene', task = 'a line.' A playbook has plays, plays have tasks.

Example: ansible-playbook site.yml runs the playbook. Add -C for check mode (dry run), -D for diff output.
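
As a rough illustration of the playbook, play, and task structure described above, a minimal site.yml might look like the sketch below. The group names `webservers` and `dbservers` and the package choices are assumptions made for illustration only.

```yaml
# site.yml: one playbook containing two plays (hypothetical groups)
- name: Configure web servers          # play 1
  hosts: webservers
  become: true
  tasks:
    - name: Install nginx              # one task = one module call
      ansible.builtin.apt:
        name: nginx
        state: present

- name: Configure database servers     # play 2
  hosts: dbservers
  become: true
  tasks:
    - name: Install PostgreSQL
      ansible.builtin.apt:
        name: postgresql
        state: present
```

Plays run in the order they appear in the file, and tasks run in order within each play.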

2. What are Ansible "playbooks"?

YAML scripts that declare the desired state of a system.

Example: a playbook targeting [webservers] installs nginx, templates the config, and notifies a handler to restart — all in one YAML file.

Name origin: from sports — a 'playbook' is a collection of plays (strategies) to run in sequence.

Remember: Ansible operations best practice — always use version control for playbooks, test in staging before production, and document your inventory structure for the team.

3. How do you run an Ansible playbook on a specific group of hosts?

By using the -l (limit) flag or by specifying hosts in the playbook. For example: ansible-playbook site.yml -l webservers would run the playbook only on hosts in the "webservers" group.

Example: ansible-playbook -i inventory.ini site.yml -l webservers --check --diff — runs in check mode on webservers only, showing what would change.
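
For reference, here is a sketch of an inventory that defines the group being limited to. It uses Ansible's YAML inventory format rather than the INI file named above, and the hostnames are hypothetical.

```yaml
# inventory.yml (hypothetical hosts)
all:
  children:
    webservers:
      hosts:
        web1.example.com:
        web2.example.com:
    dbservers:
      hosts:
        db1.example.com:
```

With this file, `ansible-playbook -i inventory.yml site.yml -l webservers` would act only on web1 and web2, even if a play in site.yml targets `all`.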

4. What is a task in Ansible?

A task is the smallest unit of action in a playbook, typically calling an Ansible module with specific arguments (e.g., a task to install a package or copy a file).

Remember: task = one action on one or more hosts. A task calls a module (apt, copy, service) with parameters. Tasks execute in order within a play.

Example: `- name: Install nginx` followed by `apt: name=nginx state=present` is one task calling the apt module.

5. What is "Ansible Vault"?

A tool for encrypting sensitive data like passwords within playbooks.

Example: ansible-vault encrypt secrets.yml encrypts in place. Use --ask-vault-pass at runtime.

Under the hood: uses AES-256 symmetric encryption. The password never goes in the repo.

Remember: Vault = AES-256 encryption for secrets in YAML. Never commit vault passwords — use --vault-password-file pointing to a CI/CD secret.
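
A minimal sketch of how a vaulted file is typically consumed by a play. The group name, template, destination path, and the variable name `vault_db_password` are assumptions for illustration.

```yaml
- name: Deploy app with vaulted secrets
  hosts: app                       # hypothetical group
  vars_files:
    - secrets.yml                  # encrypted with ansible-vault; assumed to define vault_db_password
  tasks:
    - name: Render database config
      ansible.builtin.template:
        src: db.conf.j2            # hypothetical template that references vault_db_password
        dest: /etc/myapp/db.conf   # hypothetical path
        mode: "0600"
      no_log: true                 # keep the secret out of task output and logs
```

Ansible decrypts secrets.yml in memory at runtime once the vault password is supplied with `--ask-vault-pass` or `--vault-password-file`.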

6. What is a "strategy" in Ansible? What is the default strategy?

A strategy describes how Ansible executes tasks across hosts. The default is the linear strategy: each task runs on all hosts before Ansible proceeds to the next task.

Remember: Linear = 'task-at-a-time across all hosts.' Free = 'all tasks on each host as fast as possible.' Debug = 'step-by-step interactive.'

7. What are Ansible "tags"?

Labels used to selectively run or skip specific tasks.

Example: ansible-playbook site.yml --tags=deploy runs only deploy-tagged tasks.

Gotcha: without --tags, every task runs, tagged or not. With --tags=deploy, untagged tasks are skipped. Use the special 'always' tag for must-run tasks.

Remember: tags = surgical targeting. --tags=deploy runs only deploy tasks. --skip-tags=setup skips setup tasks. Great for partial runs.
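
The sketch below shows how the tags mentioned above might be attached to tasks. The file names and paths are hypothetical.

```yaml
- name: Tagged tasks (sketch)
  hosts: webservers
  become: true
  tasks:
    - name: Install packages
      ansible.builtin.apt:
        name: nginx
        state: present
      tags: [setup]

    - name: Deploy application config
      ansible.builtin.copy:
        src: app.conf                # hypothetical file
        dest: /etc/myapp/app.conf    # hypothetical path
      tags: [deploy]

    - name: Report which host is being processed
      ansible.builtin.debug:
        msg: "Running on {{ inventory_hostname }}"
      tags: [always]                 # runs for any --tags selection unless explicitly skipped
```

Here `--tags=deploy` would run the second and third tasks, and `--skip-tags=setup` would run everything except the first.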

8. What is "Ansible Tower" or "AWX"?

A web-based interface for managing Ansible at scale.

Fun fact: AWX is the open-source upstream; Tower (now Ansible Automation Platform) is the Red Hat commercial build.

Under the hood: adds RBAC, job scheduling, credential vaults, REST API, and audit trail for team-scale Ansible.

Remember: AWX = free upstream, Tower = paid Red Hat product (now 'Ansible Automation Platform'). Both add web UI, RBAC, scheduling, and REST API.

9. What's your experience with Ansible?

I've used Ansible heavily for server provisioning, patching, switch configuration, and enforcing consistency across large server fleets. I write clean, modular roles, use inventories effectively, and rely on Jinja2 templating for dynamic configs. I've automated PXE/bootstrap workflows and built idempotent playbooks for both servers and network gear.

Remember: in interviews, structure your experience answer as: scope (how many servers/services), tools used, key challenges solved, and measurable impact (time saved, incidents reduced).

10. What is the "Control Node"?

The machine where Ansible is installed and commands are run.

Gotcha: must run Linux or macOS — Windows is not supported as control node (use WSL2). Windows works only as a managed node via WinRM.

Under the hood: reads playbooks, resolves inventory, opens SSH connections, aggregates results.

Remember: control node = 'command center.' Must be Linux/macOS. Windows support is managed-node only (via WinRM).

🟡 Medium (18)

1. What steps would you take to debug an issue with Ansible Container?

Debug by:
* Reviewing Ansible Container logs.
* Checking container runtime logs.
* Inspecting container build outputs.
* Using the --debug option for detailed debugging information.

Gotcha: Ansible Container was archived/deprecated in 2019. Modern container workflows use ansible-builder for Execution Environments or standard Dockerfiles.

2. What strategies are you familiar with in Ansible?

- Linear: the default strategy in Ansible. Run each task on all hosts before proceeding.
- Free: For each host, run all the tasks until the end of the play as soon as possible
- Debug: Run tasks in an interactive way

Remember: Linear (default) = task-by-task across all hosts. Free = each host runs independently. Debug = interactive step-through. Set via strategy: free in the play.
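
A short sketch of selecting a non-default strategy at play level. The upgrade task is only an illustrative workload.

```yaml
- name: Patch hosts independently
  hosts: all
  become: true
  strategy: free            # each host moves through the tasks without waiting for the others
  tasks:
    - name: Upgrade all packages
      ansible.builtin.apt:
        upgrade: dist
        update_cache: true
```

Omitting the `strategy` key leaves the play on the default linear strategy.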

3. How does Ansible ensure idempotence?

Ansible ensures idempotence by executing tasks only if the desired state is different from the current state. Tasks are designed to be repeatable without causing unintended changes, ensuring consistency in configurations.

Example: 'Ensure nginx is installed' is idempotent — it checks first. 'Run apt install nginx' via shell is not — it runs every time.

4. What is the serial keyword used for?

It specifies the number (or percentage) of hosts on which to run the full play before moving on to the next batch of hosts in the group.

For example:
```
- name: Some play
  hosts: databases
  serial: 4
```

If your group has 8 hosts, Ansible runs the whole play on 4 hosts and then runs the same play on the remaining 4.

Example: serial: '25%' updates a quarter of your fleet at a time. Combine with max_fail_percentage: 10 to abort if too many hosts fail.
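
Put together, the percentage form and the failure threshold might look like this sketch. The group name and the restart task are assumptions.

```yaml
- name: Rolling change in batches
  hosts: webservers
  become: true
  serial: "25%"                # a quarter of the group per batch
  max_fail_percentage: 10      # abort the play if more than 10% of the current batch fails
  tasks:
    - name: Restart the service
      ansible.builtin.service:
        name: nginx
        state: restarted
```

The threshold has to be exceeded, not merely reached, before Ansible aborts the play.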

5. How do you manage secrets in Ansible?

Never inline plaintext secrets. Options:

**Ansible Vault**:
```bash
ansible-vault create secrets.yml
ansible-vault edit secrets.yml
ansible-playbook --ask-vault-pass playbook.yml
```
Good for: Smaller teams, simple needs.

**External secret managers**:
* HashiCorp Vault (`hashi_vault` lookup)
* AWS Secrets Manager
* Azure Key Vault
* CyberArk, etc.

Good for: Enterprise, dynamic secrets, audit requirements.

**Best practices**:
* Separate vault files per environment
* Use vault IDs for multiple passwords
* CI/CD: vault password from secure variable, never committed
* Rotate secrets regularly

6. True or False? By default, Ansible will execute all the tasks in play on a single host before proceeding to the next host

False. Ansible will execute a single task on all hosts before moving to the next task in a play. As of today, it uses 5 forks by default, so up to 5 hosts run a task in parallel.
This behavior is described as "strategy" in Ansible and it's configurable.

7. Why is shell in Ansible dangerous?

The `shell` and `command` modules should be a last resort:

**Breaks idempotence**: Shell commands run every time unless you add complex `creates`/`removes` or `when` conditions.

**Hides failures**: Exit codes aren't always meaningful. Silent failures corrupt state.

**Non-portable**: Shell commands vary across distros, versions, shells.

**Hard to test**: No structured output to validate.

**Example - bad**:
```yaml
- shell: useradd myuser
```

**Example - good**:
```yaml
- user:
    name: myuser
    state: present
```

The module handles idempotence, cross-platform differences, and returns structured results.
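
When no module exists and a raw command is unavoidable, the `creates` argument and `changed_when` can restore some of the idempotence discussed above. This is a sketch only: the command, data directory, and marker file are hypothetical.

```yaml
- name: Initialise the data directory only once
  ansible.builtin.command:
    cmd: /usr/local/bin/init-db --data-dir /var/lib/myapp   # hypothetical command
    creates: /var/lib/myapp/.initialised                     # skipped when this marker exists (assumed to be written by the command)

- name: Read the running kernel version
  ansible.builtin.command: uname -r
  register: kernel_version
  changed_when: false          # a read-only query never changes the host
```

The first task is skipped on later runs once the marker exists, and the second never reports "changed", so repeated runs stay quiet.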

8. How can you test and validate a dynamic inventory script?

Test the script by running it manually and examining output. Validate by checking if it produces JSON-formatted data with required host information.

Example: ./inventory.py --list | python -m json.tool validates JSON output. Compare against ansible-inventory -i inventory.py --graph to see how Ansible interprets the groups.

9. What is an Ansible "handler"?

A task triggered by another task that only runs if notified.

Example: a handler 'restart nginx' triggers via notify on config change. Runs once at end of play, even if notified multiple times.

Gotcha: handlers execute in definition order, not notification order. Use meta: flush_handlers for immediate execution.

Remember: handler = 'only run if something changed.' A notify on config file change triggers a restart handler. Runs once at end of play.
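
A sketch of the notify and flush pattern described above, assuming a hypothetical nginx.conf.j2 template exists next to the playbook.

```yaml
- name: Configure nginx
  hosts: webservers
  become: true
  tasks:
    - name: Template nginx config
      ansible.builtin.template:
        src: nginx.conf.j2             # hypothetical template
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx            # queued only if this task reports "changed"

    - name: Run pending handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Check that nginx answers
      ansible.builtin.uri:
        url: http://localhost/
        status_code: 200

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```

Without the `flush_handlers` task, the restart would wait until the end of the play and the check would run against the old configuration.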

10. How do you create and manage inventories in Ansible Tower?

Inventories in Ansible Tower can be managed through the web interface. You can create and organize inventories, define variables, and configure sources such as static files, dynamic scripts, or cloud providers. Tower also supports syncing with external inventory systems.

11. Explain how to use the --debug option with Ansible Container commands.

Append --debug to Ansible Container commands for increased verbosity. Example:
```bash
ansible-container build --debug
```

Gotcha: Ansible Container has been archived/deprecated since 2019. For modern container builds, use ansible-builder or standard Dockerfile workflows instead.

12. Explain how to view job output and logs in Ansible Tower.

View job output in the Ansible Tower UI under the specific job details. Logs can be accessed through the UI or retrieved using the Tower API. Additionally, logs are stored in the Tower log directory on the Tower server.

13. Explain the process of editing an encrypted file with Ansible Vault.

Use the ansible-vault edit command to edit an encrypted file.
Example:
```bash
ansible-vault edit secret_file.yml
```
Ansible will prompt for the Vault password before allowing access.

Under the hood: ansible-vault edit decrypts to a temp file, opens your $EDITOR, then re-encrypts on save. The plaintext never touches disk persistently.

14. What does idempotency mean in Ansible and why is it important?

Idempotency means running the same operation multiple times produces the same
result as running it once. In Ansible, a task is idempotent if running it
repeatedly doesn't change the system after the first application.

Why it matters:
- Safe to re-run playbooks (won't break things)
- Enables "desired state" configuration
- Allows for drift detection and correction
- Makes automation predictable and reliable

Good (idempotent):
```yaml
- name: Ensure nginx is installed
  apt:
    name: nginx
    state: present
```

Example: 'Ensure nginx is installed' is idempotent — it checks first. 'Run apt install nginx' via shell is not — it runs every time.

15. How do you limit Ansible to run on one host at a time (serial execution)?

In a playbook, use the serial keyword (e.g., serial: 1 in a play limits Ansible to configure one host at a time from the inventory).

Gotcha: serial: 1 is essential for rolling updates — it ensures one host is fully configured and healthy before moving to the next, preventing fleet-wide outages from a bad config.

Example: serial: '25%' updates a quarter of your fleet at a time. Combine with max_fail_percentage: 10 to abort if too many hosts fail.

16. How do you create an encrypted file using Ansible Vault?

Use the ansible-vault create command to create an encrypted file.
Example:
```bash
ansible-vault create secret_file.yml
```

Under the hood: ansible-vault create opens $EDITOR for a new file and encrypts it on save using AES-256. The encrypted file can be committed to git safely.

17. Explain the role of orchestration in automating complex tasks across servers and networking devices.

* Task Sequencing: Orchestration involves the coordination and sequencing of multiple tasks across servers and networking devices to achieve a specific objective.
* Workflow Automation: Orchestration tools automate workflows by defining the order and dependencies of tasks, ensuring that each step is executed in the correct sequence.
* Cross-Platform Integration: Orchestration facilitates the integration of tasks across diverse platforms, allowing for the automation of complex processes involving servers, networking devices, and external services.

Remember: orchestration = coordinating tasks across multiple systems in a defined order. Ansible handles orchestration through plays (host targeting), serial (rolling), and delegation (task routing).

18. Have you used automation tools like PowerShell or Ansible for server management tasks?

Automation tools such as PowerShell and Ansible streamline server management tasks:

**PowerShell**:
* Windows Environment: PowerShell is a scripting language and automation framework designed for Windows environments.
* Task Automation: It allows the automation of various tasks, including server configuration, software deployment, and system administration.
* Scripting Capabilities: PowerShell scripts can be written to execute commands and tasks, making it efficient for managing Windows servers.

Remember: PowerShell is Windows-centric (though cross-platform via pwsh). Ansible is Linux-first but supports Windows via WinRM. Choose based on your fleet's OS mix.

🔴 Hard (12)

1. Managing Multiple Environments

To manage multiple environments, I would use separate inventory files and group variables for each environment.

Example directory structure:

```
inventories/
  development/
    hosts
    group_vars/
      all.yml
  staging/
    hosts
    group_vars/
      all.yml
  production/
    hosts
    group_vars/
      all.yml
site.yml
```

When running site.yml, select the inventory for the target environment with -i:

```
ansible-playbook -i inventories/development/hosts site.yml
ansible-playbook -i inventories/staging/hosts site.yml
ansible-playbook -i inventories/production/hosts site.yml
```

This approach keeps environment-specific configurations separate and manageable.

2. What's the most common non-obvious Ansible scalability killer?

Fact gathering - enabled by default, runs on every host, before any tasks.

The problem:
- `gather_facts: true` is default
- Collects extensive system info via setup module
- Runs sequentially per fork
- Takes 2-10 seconds PER HOST
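- 1000 hosts * 5 seconds = 5000 seconds of fact gathering: about 83 minutes if run one host at a time, or roughly 17 minutes at the default 5 forks, before the first task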

Other scalability killers:

1. SSH connection setup
- New connection per host per play
- SSH handshake overhead
- Mitigation: pipelining, ControlPersist
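
One way to cut the fact-gathering cost described above is to disable it per play and collect only what is needed. The sketch below assumes the plays really do not depend on the larger fact subsets.

```yaml
- name: Play that needs no facts
  hosts: all
  gather_facts: false              # skip the implicit setup run entirely
  tasks:
    - name: Ping hosts
      ansible.builtin.ping:

- name: Play that needs only a few facts
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect just the minimal fact subset
      ansible.builtin.setup:
        gather_subset:
          - "!all"                 # drops the larger subsets, leaving the minimal facts

    - name: Use a fact from the minimal subset
      ansible.builtin.debug:
        msg: "{{ ansible_facts['distribution'] }}"
```

Fact caching and a higher fork count are complementary options, both configured outside the playbook.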

3. Migrating Legacy Scripts to Ansible

To migrate legacy scripts to Ansible:

Identify Tasks: Break down the shell script into discrete tasks.

Use Ansible Modules: Replace shell commands with equivalent Ansible modules.

Structure Playbooks: Organize tasks into roles and playbooks for better management.

Example migration:

Original shell script:

```
#!/bin/bash
apt-get update
apt-get install -y nginx
echo "Hello, World!" > /var/www/html/index.html
```

Migrated Ansible playbook:

```
- hosts: web
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes

    - name: Install Nginx
      apt:
        name: nginx
        state: present

    - name: Create index.html
      copy:
        content: "Hello, World!"
        dest: /var/www/html/index.html
```

4. Configuring Ansible for Network Automation

For network automation, I would use Ansible network modules and collections like ansible.netcommon and vendor-specific collections.

Example playbook for Cisco devices:

```
- hosts: cisco_routers
  gather_facts: no
  tasks:
    - name: Configure interface
      # ios_interfaces is the resource module that superseded the older ios_interface
      cisco.ios.ios_interfaces:
        config:
          - name: GigabitEthernet1
            description: "Configured by Ansible"
            enabled: yes
        state: merged
```
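
Network modules usually also need connection settings for the device group. A sketch of group variables for the `cisco_routers` group follows. The account name is hypothetical, and credentials would normally come from Ansible Vault.

```yaml
# group_vars/cisco_routers.yml (sketch)
ansible_connection: ansible.netcommon.network_cli   # CLI over SSH instead of running Python on the device
ansible_network_os: cisco.ios.ios
ansible_user: netops                                # hypothetical account
ansible_become: true
ansible_become_method: enable                       # enter privileged EXEC mode
```

With these set, the tasks run on the control node and talk to the device over an SSH CLI session.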

5. Error Handling in Playbooks

To handle errors:

Ignore Errors: Use ignore_errors: yes for non-critical tasks.

```
- name: Task that might fail
  command: /bin/false
  ignore_errors: yes
```

Retries: Use retries and delay for tasks that might fail intermittently.

```
- name: Retry task
  command: /path/to/command
  retries: 5
  delay: 10
  until: result.rc == 0
  register: result
```

Rescue and Always: Use block, rescue, and always for structured error handling.

```
- name: Structured error handling
  block:
    - name: Task that might fail
      command: /bin/false
  rescue:
    - name: Handle failure
      debug:
        msg: "Task failed"
  always:
    - name: Always run
      debug:
        msg: "This always runs"
```

6. Optimizing Playbook Performance

To optimize playbook performance:

Parallelism: Increase the number of forks to run tasks in parallel.

```
ansible-playbook -i inventory playbook.yml -f 10
```

Delegate Tasks: Delegate tasks to appropriate hosts to distribute the load.

```
- name: Fetch something
  command: /path/to/command   # placeholder action; substitute the real module call
  delegate_to: localhost
```

Limit Scope: Use --limit to target specific hosts.

```
ansible-playbook -i inventory playbook.yml --limit web_servers
```

Asynchronous Tasks: Use asynchronous tasks for long-running operations.
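
A sketch of the asynchronous pattern: start the job, move on, and check back later. The command path is hypothetical and the timings are arbitrary.

```yaml
- name: Start a long-running job without blocking
  ansible.builtin.command: /usr/local/bin/long-job   # hypothetical command
  async: 3600          # allow the job up to one hour
  poll: 0              # do not wait here; continue with the next task
  register: long_job

- name: Wait for the job to finish
  ansible.builtin.async_status:
    jid: "{{ long_job.ansible_job_id }}"
  register: job_result
  until: job_result.finished
  retries: 120         # 120 checks, 30 seconds apart, covers the one-hour budget
  delay: 30
```

Other tasks can be placed between the two so the hosts do useful work while the job runs.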

7. Integrating Ansible with CI/CD Pipelines

To integrate Ansible with a CI/CD pipeline, I would use tools like Jenkins, GitLab CI, or GitHub Actions. The CI/CD pipeline would trigger Ansible playbooks to deploy or update infrastructure.

Example using GitLab CI:

```
stages:
  - deploy

deploy:
  stage: deploy
  script:
    - ansible-playbook -i inventory playbook.yml
```

8. Rolling Updates with Zero Downtime

To implement a rolling update with zero downtime, I would use a combination of Ansible playbooks and a load balancer. The process involves updating a subset of servers at a time while ensuring the load balancer only directs traffic to healthy nodes. Here’s a high-level approach:

Drain Traffic: Use Ansible to interact with the load balancer API to drain traffic from the first subset of servers.

Update Servers: Apply the updates to the drained servers.

Health Check: Ensure the updated servers pass health checks.
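
The steps above can be sketched as a single play. The load balancer API URLs, the health endpoint, the package name, and the batch size are all hypothetical. A real setup would use whatever module or API the load balancer actually provides.

```yaml
- name: Rolling update behind a load balancer
  hosts: webservers
  serial: 2                          # assumed batch size
  pre_tasks:
    - name: Drain this host from the load balancer
      ansible.builtin.uri:
        url: "https://lb.example.com/api/backends/{{ inventory_hostname }}/drain"    # hypothetical API
        method: POST
      delegate_to: localhost

  tasks:
    - name: Update the application package
      ansible.builtin.apt:
        name: myapp                  # hypothetical package
        state: latest
        update_cache: true
      become: true

  post_tasks:
    - name: Wait until the health endpoint returns 200
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8080/health"    # hypothetical endpoint
        status_code: 200
      register: health
      until: health.status == 200
      retries: 10
      delay: 6
      delegate_to: localhost

    - name: Put this host back into rotation
      ansible.builtin.uri:
        url: "https://lb.example.com/api/backends/{{ inventory_hostname }}/enable"   # hypothetical API
        method: POST
      delegate_to: localhost
```

Because of `serial`, the next batch is not drained until the current batch has passed its health check and rejoined the pool.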

9. Multi-Tier Application Deployment

To handle a multi-tier application deployment efficiently, I would use a modular approach with Ansible roles. Each tier (web server, application server, database server) would have its own role. The directory structure might look like this:

```
site.yml
roles/
  web/
    tasks/
      main.yml
    templates/
      web.conf.j2
  app/
    tasks/
      main.yml
    templates/
      app.conf.j2
  db/
    tasks/
      main.yml
    templates/
      db.conf.j2
inventory/
  production/
    hosts
    group_vars/
```
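
A site.yml that ties the three roles to their tiers might look like the sketch below. The group names `db`, `app`, and `web` are assumed to be defined in inventory/production/hosts.

```yaml
# site.yml (sketch)
- name: Database tier
  hosts: db
  roles:
    - db

- name: Application tier
  hosts: app
  roles:
    - app

- name: Web tier
  hosts: web
  roles:
    - web
```

Ordering the plays from the database outward means each tier's dependencies are configured before the tier that relies on them.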

10. Handling Configuration Drift

To ensure servers remain in the desired state, I would implement regular configuration enforcement using Ansible Tower/AWX or a cron job that runs playbooks periodically. Additionally, I would use Ansible’s check mode to detect drift without making changes.

Example cron job:

```
0 2 * * * ansible-playbook -i inventory playbook.yml
```
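
The schedule itself can be managed by Ansible. The sketch below installs one enforcing entry and one report-only entry that uses check mode. The paths, log file, and entry names are hypothetical.

```yaml
- name: Schedule drift handling on the control node
  hosts: localhost
  tasks:
    - name: Enforce desired state every night at 02:00
      ansible.builtin.cron:
        name: nightly-ansible-enforcement
        minute: "0"
        hour: "2"
        job: "ansible-playbook -i /opt/ansible/inventory /opt/ansible/playbook.yml"   # hypothetical paths

    - name: Report drift at noon without changing anything
      ansible.builtin.cron:
        name: midday-ansible-drift-report
        minute: "0"
        hour: "12"
        job: "ansible-playbook -i /opt/ansible/inventory /opt/ansible/playbook.yml --check --diff >> /var/log/ansible-drift.log 2>&1"
```

In the report-only run, any task shown as "changed" marks a host that has drifted from the playbook.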

11. Ansible Dynamic Inventory

To manage a dynamic infrastructure, I would use Ansible’s dynamic inventory feature. This can be achieved by using inventory scripts or plugins that query external data sources such as cloud provider APIs (e.g., AWS, Azure).

Example using AWS EC2 dynamic inventory:

Install the Python libraries that the AWS inventory plugin depends on (boto3 and botocore; the legacy boto package is not needed):

```
pip install boto3 botocore
```

Configure the AWS Inventory Plugin:

```
plugin: aws_ec2
regions:
  - us-east-1
filters:
  tag:Environment: production
```

Use the Dynamic Inventory in Playbooks:

```
ansible-playbook -i aws_ec2.yml playbook.yml
```
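
The plugin can also build groups from instance metadata, so playbooks can target roles rather than individual hosts. The sketch below extends the configuration above and assumes instances carry a `Role` tag.

```yaml
# aws_ec2.yml (extended sketch)
plugin: aws_ec2
regions:
  - us-east-1
filters:
  tag:Environment: production
keyed_groups:
  - key: tags.Role            # assumes a "Role" tag on each instance
    prefix: role              # yields groups such as role_web or role_db
hostnames:
  - private-ip-address        # use the private IP as the inventory hostname
```

Running `ansible-inventory -i aws_ec2.yml --graph` shows the resulting groups before any playbook runs.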

12. Troubleshooting Playbook Failures

To troubleshoot a failing playbook:

Check Recent Changes: Review recent changes in the playbook, roles, or inventory files.

Verbose Output: Run the playbook with increased verbosity (-vvv) to get detailed output and identify where it fails.

Environment Consistency: Ensure the environment where the playbook is run hasn’t changed (e.g., different Ansible version, OS updates, or network configurations).

Isolate the Issue: Isolate the failing task by running it independently or within a minimal playbook.
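
Isolating a task could look like the minimal playbook below. The hostname and command are placeholders for whichever host and action are actually failing.

```yaml
# debug-one-task.yml (sketch)
- name: Reproduce a single failing task
  hosts: web1.example.com              # hypothetical host where the failure occurs
  gather_facts: false
  tasks:
    - name: Run the suspect command
      ansible.builtin.command: /path/to/command   # placeholder for the failing action
      register: result
      ignore_errors: true              # keep going so the result can be inspected

    - name: Show everything the module returned
      ansible.builtin.debug:
        var: result
```

Running it with `-vvv` adds connection-level detail on top of the registered result.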
