Ansible for Infrastructure Automation - Primer

Why This Matters

Ansible automates configuration management, application deployment, and orchestration. Unlike Terraform (which creates infrastructure), Ansible configures what's already running: installing packages, managing config files, deploying applications, and enforcing system state. It's agentless (uses SSH), has a low learning curve, and is used everywhere from small teams to enterprise scale.

Who made it: Ansible was created by Michael DeHaan and released as open source in February 2012. DeHaan previously wrote Cobbler (a Linux provisioning server) and co-authored Func (Fedora Unified Network Controller). Red Hat acquired Ansible, Inc. in October 2015. The name comes from Ursula K. Le Guin's 1966 novel Rocannon's World (later popularized in Orson Scott Card's Ender's Game), where an "ansible" is a device for instantaneous communication across any distance — fitting for a tool that communicates with hundreds of servers simultaneously.

Fun fact: Ansible's "agentless" design was a deliberate reaction to Chef and Puppet, which typically require agents installed on every managed node. DeHaan's philosophy: if a machine has SSH and Python, it is ready for configuration management. This zero-bootstrap approach is why Ansible became a default choice for network device automation: switches and routers expose SSH but cannot run Ruby or install agents (for those devices the modules run on the control node, so Python on the device is not required either).

Core Concepts

How Ansible Works

Control Node (your laptop / CI server)
     |
     | SSH (or WinRM for Windows)
     v
Managed Nodes (target servers)

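No agent is required on managed nodes, just Python and SSH. Ansible connects, pushes modules, executes them, and returns results. Most modules are idempotent: running the same playbook twice leaves the system in the same state, and the second run reports no changes. The main exceptions are command, shell, and raw tasks, which run every time unless you guard them.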
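To make the idempotency idea concrete, here is a minimal sketch of a play whose single task describes a state rather than an action. The path /opt/myapp is a hypothetical example, not something from this primer's environment.

```yaml
---
- name: Demonstrate a state-based (idempotent) task
  hosts: all
  become: true

  tasks:
    - name: Ensure the application directory exists
      ansible.builtin.file:
        path: /opt/myapp        # hypothetical path, for illustration only
        state: directory        # desired state, not "run mkdir"
        owner: root
        group: root
        mode: "0755"
```

On the first run the task reports "changed" if the directory is missing or its ownership or mode differs; on a second run it reports "ok" because the state already matches.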

Inventory

Inventory defines your target hosts. It can be a static file or dynamic (generated from cloud APIs).

# inventory/hosts.ini (INI format)
[webservers]
web1.example.com
web2.example.com ansible_host=10.0.1.10

[dbservers]
db1.example.com ansible_port=2222

[production:children]
webservers
dbservers

[webservers:vars]
http_port=8080
app_env=production

# inventory/hosts.yml (YAML format)
all:
  children:
    webservers:
      hosts:
        web1.example.com:
        web2.example.com:
          ansible_host: 10.0.1.10
      vars:
        http_port: 8080
    dbservers:
      hosts:
        db1.example.com:
          ansible_port: 2222

Dynamic inventory plugins pull hosts from AWS, GCP, Azure, etc. The file passed with -i is a plugin configuration rather than a host list:

ansible-inventory -i aws_ec2.yml --list
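
The aws_ec2.yml file passed above could look roughly like the following. This is a sketch that assumes the amazon.aws collection and boto3 are installed on the control node and that AWS credentials are available in the environment; the region and the "Role" tag are hypothetical.

```yaml
# aws_ec2.yml
# Configuration for the amazon.aws.aws_ec2 inventory plugin.
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1                    # assumed region
filters:
  instance-state-name: running   # only include running instances
keyed_groups:
  - key: tags.Role               # hypothetical "Role" tag on each instance
    prefix: role                 # e.g. Role=web becomes group "role_web"
```

Grouping by tag keeps playbooks pointed at stable group names while the set of hosts behind them changes.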

Playbooks

Playbooks are YAML files that define the desired state. A playbook contains plays; each play targets hosts and runs tasks.

---
- name: Configure web servers
  hosts: webservers
  become: yes                    # Run as root (sudo)

  vars:
    app_port: 8080
    packages:
      - nginx
      - python3
      - certbot

  tasks:
    - name: Install required packages
      apt:
        name: "{{ packages }}"
        state: present
        update_cache: yes

    - name: Copy nginx config
      template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/sites-available/default
        owner: root
        group: root
        mode: '0644'
      notify: Restart nginx

    - name: Ensure nginx is running
      service:
        name: nginx
        state: started
        enabled: yes

  handlers:
    - name: Restart nginx
      service:
        name: nginx
        state: restarted

Key Module Categories

Category | Modules | Purpose
Package | apt, dnf, pip | Install/remove packages
File | file, copy, template, lineinfile | Manage files and content
Service | service, systemd | Manage services
User | user, group, authorized_key | Manage users and access
Command | command, shell, script, raw | Run arbitrary commands
Cloud | ec2_instance, gcp_compute_instance | Cloud resource management (from the amazon.aws and google.cloud collections)

Important: Prefer purpose-built modules over command/shell. Modules check the current state before acting, so most of them are idempotent; raw commands usually aren't.
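
As an illustration of the difference, the two tasks below aim at the same result. The sysctl setting is only an example chosen for this sketch.

```yaml
# Not idempotent: appends another copy of the line on every run.
- name: Set swappiness with a raw shell command
  ansible.builtin.shell: echo "vm.swappiness=10" >> /etc/sysctl.conf

# Idempotent: replaces a matching line or adds it once, then reports "ok".
- name: Set swappiness with a module
  ansible.builtin.lineinfile:
    path: /etc/sysctl.conf
    regexp: '^vm\.swappiness='
    line: vm.swappiness=10
```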

Variables and Facts

Variable precedence (simplified, lowest to highest):

1. Role defaults (defaults/main.yml)
2. Inventory vars
3. Group vars (group_vars/)
4. Host vars (host_vars/)
5. Play vars
6. Task vars
7. Extra vars (-e on the command line), which always win

# group_vars/webservers.yml
http_port: 8080
app_user: deploy

# host_vars/web1.example.com.yml
http_port: 9090    # Override for this specific host
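
One way to see the override in action is a small play that prints the resolved value per host. This is a sketch; the playbook file name used in the note below is hypothetical.

```yaml
---
- name: Show the effective http_port for each web server
  hosts: webservers
  gather_facts: false      # facts are not needed just to print a variable

  tasks:
    - name: Print the value this host resolves
      ansible.builtin.debug:
        var: http_port
```

Saved as show_port.yml and run against the files above, web1.example.com should report 9090 and the other web servers 8080. Adding -e http_port=7070 on the command line should make every host report 7070, since extra vars outrank both files.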

Facts are system information that Ansible gathers automatically at the start of each play (unless gather_facts is disabled):

- debug:
    msg: "OS: {{ ansible_distribution }} {{ ansible_distribution_version }}"
    # Output: "OS: Ubuntu 22.04"

- debug:
    msg: "IP: {{ ansible_default_ipv4.address }}"
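
Facts are most useful in conditions. The sketch below picks a package module per OS family; nginx is used only because it appears elsewhere in this primer.

```yaml
- name: Install the web server on Debian-family hosts
  ansible.builtin.apt:
    name: nginx
    state: present
  when: ansible_os_family == "Debian"

- name: Install the web server on Red Hat-family hosts
  ansible.builtin.dnf:
    name: nginx
    state: present
  when: ansible_os_family == "RedHat"
```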

Templates (Jinja2)

Templates generate config files with dynamic content:

{# templates/nginx.conf.j2 #}
server {
    listen {{ http_port }};
    server_name {{ ansible_fqdn }};

    location / {
        proxy_pass http://127.0.0.1:{{ app_port }};
    }

{% if enable_ssl | default(false) %}
    listen 443 ssl;
    ssl_certificate /etc/ssl/{{ ansible_fqdn }}.crt;
{% endif %}
}

Roles

Roles are the standard way to organize reusable Ansible content:

roles/
  webserver/
    tasks/main.yml       # Task list
    handlers/main.yml    # Handlers
    templates/           # Jinja2 templates
    files/               # Static files
    vars/main.yml        # Variables (high precedence)
    defaults/main.yml    # Default variables (low precedence)
    meta/main.yml        # Role metadata and dependencies

Using roles in a playbook:

- hosts: webservers
  roles:
    - common
    - webserver
    - { role: monitoring, tags: ['monitoring'] }
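
The files inside a role are ordinary YAML. A minimal sketch of the webserver role could look like this, with each document below living in its own file; the webserver_package and webserver_service variable names are hypothetical.

```yaml
---
# roles/webserver/defaults/main.yml
webserver_package: nginx      # low precedence, easy to override from inventory
webserver_service: nginx
---
# roles/webserver/tasks/main.yml
- name: Install the web server package
  ansible.builtin.package:
    name: "{{ webserver_package }}"
    state: present
  notify: Restart web server
---
# roles/webserver/handlers/main.yml
- name: Restart web server
  ansible.builtin.service:
    name: "{{ webserver_service }}"
    state: restarted
```

Putting tunables in defaults/main.yml rather than vars/main.yml keeps them overridable by group_vars and host_vars.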

Ansible Vault

Encrypt sensitive data (passwords, keys, certificates):

# Create an encrypted file
ansible-vault create secrets.yml

# Edit an encrypted file
ansible-vault edit secrets.yml

# Encrypt an existing file
ansible-vault encrypt vars/passwords.yml

# Run playbook with vault password
ansible-playbook site.yml --ask-vault-pass
ansible-playbook site.yml --vault-password-file ~/.vault_pass

# Encrypt a single variable
ansible-vault encrypt_string 'my_secret_password' --name 'db_password'
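
A common layout, sketched here with hypothetical file and variable names, keeps the secret in an encrypted file and references it from a plain-text one so that variable names stay searchable:

```yaml
---
# group_vars/production/vars.yml (plain text)
db_user: app
db_password: "{{ vault_db_password }}"   # points at the vaulted value
---
# group_vars/production/vault.yml (shown before encryption;
# encrypt it with: ansible-vault encrypt group_vars/production/vault.yml)
vault_db_password: change-me             # placeholder, not a real secret
```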

Default trap: Ansible's variable precedence has 22 levels, and extra vars (-e on the command line) always win — they override everything, including role vars and host vars. This is powerful for one-off overrides but dangerous if used as a habit. A common mistake: passing -e app_env=staging in a CI pipeline that deploys to production, overriding the inventory's app_env=production. Use extra vars sparingly and prefer inventory-based variable hierarchy for environment-specific config.

Handlers

Handlers run only when a notifying task reports a change. By default they run once, at the end of the play, even if notified multiple times:

tasks:
  - name: Update nginx config
    template:
      src: nginx.conf.j2
      dest: /etc/nginx/nginx.conf
    notify: Restart nginx

  - name: Update SSL cert
    copy:
      src: cert.pem
      dest: /etc/ssl/cert.pem
    notify: Restart nginx    # Won't restart twice

handlers:
  - name: Restart nginx
    service:
      name: nginx
      state: restarted
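
When a later task depends on the restart having happened, handlers can be triggered early with meta: flush_handlers. The play below is a sketch; the port in the health check is an assumption.

```yaml
---
- name: Reconfigure nginx and verify it
  hosts: webservers
  become: true

  tasks:
    - name: Update nginx config
      ansible.builtin.template:
        src: nginx.conf.j2
        dest: /etc/nginx/nginx.conf
      notify: Restart nginx

    - name: Run any notified handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Check that nginx answers after the restart
      ansible.builtin.uri:
        url: http://localhost:8080/     # assumed port
        status_code: 200

  handlers:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted
```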

What Experienced People Know

  • command and shell modules are not idempotent. If you use them, add creates: or when: conditions to make them safe to re-run.
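  • --check mode (dry run) skips command/shell tasks because Ansible cannot predict what they would change, so later tasks that depend on their output may fail or be skipped in a dry run. Design your playbooks to be check-mode compatible.
  • Variable precedence in Ansible is complex (22 levels). When in doubt, run ansible hostname -m debug -a "var=my_variable" to check what value a host sees (this shows inventory-derived values, not play or role vars).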
  • ansible-playbook --diff shows what changed in files. Use it with --check for a safe preview.
  • Tags let you run specific parts of a playbook: ansible-playbook site.yml --tags "nginx". Design your tags from the start.
  • Test with a single host first: ansible-playbook site.yml --limit web1.example.com.
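
Several of the points above can be combined in a pair of guarded command tasks. This is a sketch: the init-db script and the marker file are hypothetical, and the script is assumed to create the marker itself.

```yaml
- name: Initialize the application database once
  ansible.builtin.command: /opt/myapp/bin/init-db    # hypothetical script
  args:
    creates: /var/lib/myapp/.initialized             # skip if this file exists
  tags: [app]

- name: Read the nginx version without reporting a change
  ansible.builtin.command: nginx -v
  register: nginx_version
  changed_when: false     # read-only, so never report "changed"
  check_mode: false       # safe to run even under --check
  tags: [nginx]
```

With creates: set, a re-run skips the first task once the marker exists; changed_when: false and check_mode: false keep the second task quiet and usable in dry runs.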
