Ansible in Depth: Architecture, Commands, Playbooks, and Production Patterns

Ansible is an agentless automation engine for configuration management, application deployment, orchestration, and ad-hoc operations. You describe desired state in YAML playbooks (or one-line module calls), push changes over SSH or WinRM, and let idempotent modules do the work—no daemon on managed hosts. This guide explains how Ansible fits beside Terraform and CI/CD, how its CLI tools map to real tasks, and what operators need to run it safely at scale.

In short

Ansible uses a control node (your laptop, bastion, or CI runner) plus an inventory of hosts. ansible runs ad-hoc modules; ansible-playbook runs ordered tasks with variables, handlers, and roles. Success depends on inventory truth, vault for secrets, check mode in CI, and treating playbooks like application code—reviewed in Git, tested before prod.

What Ansible is—and what it is not

Ansible was created by Michael DeHaan in 2012 and later acquired by Red Hat. It belongs to the configuration management family alongside Chef, Puppet, and Salt—but unlike those classic agents, Ansible is agentless: it connects to targets, runs modules, and exits. Managed nodes need only Python (on Linux, usually preinstalled) and SSH access (or WinRM on Windows).

Ansible is not a cloud provisioning tool like Terraform. It excels at what runs on machines after they exist: packages, users, files, services, firewall rules, app configs, rolling restarts. Many teams use Terraform for infrastructure and Ansible for bootstrapping and drift correction on VMs and bare metal.

For where config management sits in the DevOps timeline, see Historical foundations of DevOps. For delivery pipelines that invoke Ansible, see GitHub CI/CD, Jenkins CI/CD, and GitOps principles.

Ansible vs Terraform vs Puppet/Chef

Tool | Primary job | State model | Typical interface
Ansible | Configure OS and apps on existing hosts; orchestrate steps across fleet | Procedural playbooks + idempotent modules (no central CMDB required) | YAML playbooks, SSH push
Terraform | Provision cloud/network/IAM resources | Declarative desired state with .tfstate | HCL, provider APIs
Puppet / Chef | Long-running desired state on servers | Agent pulls catalog from server | DSL + agent daemon

Ansible can call cloud APIs (via collections like amazon.aws), but that is not its sweet spot. Use the right tool per layer: Terraform creates the VPC; Ansible hardens the instances and installs the app.

Architecture: control node, inventory, modules, collections

  • Control node — Machine where you install Ansible and run commands. Must be Linux, macOS, or another Unix-like system; Windows works as a control node only through WSL, not natively.
  • Managed nodes — Hosts in inventory; no Ansible install required on most Unix targets.
  • Inventory — List of hosts and groups (/etc/ansible/hosts, inventory.ini, or dynamic scripts/plugins).
  • Modules — Units of work (apt, yum, copy, template, service, user, …). Idempotent: second run should report ok if nothing changed.
  • Plugins — Connection (SSH), inventory (AWS, Azure), callbacks (logging), lookup (secrets).
  • Collections — Packaged modules, plugins, and roles distributed through Ansible Galaxy under a namespace.name scheme (e.g. community.general, ansible.posix).
  • Playbooks — YAML files listing plays (host pattern + tasks).
  • Roles — Reusable bundles: tasks, handlers, templates, defaults, vars.
# Mental model
You write playbook (desired steps)
  → ansible-playbook reads inventory + vars
    → for each host: SSH (or winrm) + module JSON over connection plugin
      → module executes on target (often via Python)
        → result (changed/ok/failed/skipped) returned to control node
          → handlers run if notified; play recap printed

Installation and version lines

Modern Ansible ships as the ansible-core package plus optional ansible “batteries included” bundle (many collections). Check version:

ansible --version
# ansible [core 2.16.x] ... python version = 3.11.x   (example output; yours will differ)

Common install paths:

  • pip (venv recommended): python3 -m pip install ansible
  • OS packages: dnf install ansible-core (RHEL/Fedora), apt install ansible (Ubuntu—verify version)
  • Execution environments: Container images with locked collections for AWX/Automation Controller

Pin versions in CI and production controllers. Collection major bumps can change module FQCNs (fully qualified collection names).

CLI tools at a glance

Command | Purpose
ansible | Ad-hoc module runs against inventory patterns
ansible-playbook | Run playbooks (main automation path)
ansible-inventory | View, graph, or export inventory
ansible-config | Dump or validate ansible.cfg
ansible-doc | Module and plugin documentation
ansible-galaxy | Install roles and collections
ansible-vault | Encrypt/decrypt sensitive YAML
ansible-console | Interactive REPL for ad-hoc tasks
ansible-pull | Run playbooks from target (pull model)
ansible-test | Test collections/plugins (developer)

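Flags shared by ansible and ansible-playbook (several also apply to ansible-console and ansible-pull): -i INVENTORY, -u REMOTE_USER, -k (SSH password prompt), -K (become password prompt), -b / --become, -e KEY=VAL extra vars, -v verbosity (-vvv for connection debugging), --private-key, -l LIMIT host pattern limit, -C check mode, -D diff. One exception to remember: in ansible-pull, -C selects the branch to check out, not check mode.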

Inventory: static, YAML, and dynamic

Inventory defines hosts, groups, and group_vars / host_vars.

INI style

[web]
web01.example.com ansible_host=10.0.1.11
web02.example.com

[db]
db01.example.com

[web:vars]
http_port=8080
ansible_user=deploy

YAML style

all:
  children:
    web:
      hosts:
        web01.example.com:
          ansible_host: 10.0.1.11
        web02.example.com:
    db:
      hosts:
        db01.example.com:

Dynamic inventory — Plugins or scripts query AWS EC2, Azure, VMware, etc., and build groups at runtime (ansible-inventory -i aws_ec2.yml --graph). Prefer inventory plugins over legacy scripts when possible.
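As an illustration of the plugin approach, here is a minimal sketch of an aws_ec2.yml inventory source. It assumes the amazon.aws collection and boto3 are installed on the control node. The region, the Environment tag, and the Role tag are hypothetical placeholders, not values from this article.

```yaml
# aws_ec2.yml
# The file name must end in aws_ec2.yml or aws_ec2.yaml for the plugin to accept it.
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1                     # assumption: example region
filters:
  instance-state-name: running    # ignore stopped or terminated instances
  "tag:Environment": prod         # hypothetical tag used to scope the inventory
keyed_groups:
  # Builds groups such as role_web or role_db from a hypothetical "Role" tag.
  - key: tags.Role
    prefix: role
    separator: "_"
hostnames:
  - private-ip-address            # name hosts by private IP
compose:
  ansible_host: private_ip_address
```

Running ansible-inventory -i aws_ec2.yml --graph against a file like this shows which groups the plugin produced before any playbook depends on them.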

For modules that talk to APIs (such as cloud modules) rather than to a remote host, add localhost ansible_connection=local to the inventory so they run on the control node without SSH.
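The same effect can be had per play. The sketch below runs an API call on the control node with connection: local; the URL is a hypothetical endpoint used only for illustration.

```yaml
---
- name: Call an HTTP API from the control node
  hosts: localhost
  connection: local        # run modules locally instead of over SSH
  gather_facts: false      # facts are not needed for a simple API call
  tasks:
    - name: Query a service endpoint
      ansible.builtin.uri:
        url: "https://api.example.com/v1/status"   # hypothetical endpoint
        return_content: true
      register: api_status

    - name: Show the HTTP status code that came back
      ansible.builtin.debug:
        var: api_status.status
```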

ansible — ad-hoc commands

Syntax: ansible <pattern> -m <module> [-a "args"]

# Ping all hosts in inventory
ansible all -m ping

# Uptime on web group
ansible web -m command -a "uptime"

# Install package (become root)
ansible web -b -m apt -a "name=nginx state=present update_cache=yes"

# Copy file
ansible web -m copy -a "src=./app.conf dest=/etc/app/app.conf owner=root mode=0644"

# Gather facts only
ansible db -m setup

# Parallel forks (default 5)
ansible all -m ping -f 20

Flag | Meaning
-m | Module name (ping, command, shell, apt, …)
-a | Module arguments as quoted string or key=value pairs
-i | Inventory path or plugin config
-l | Limit to subset pattern (web01, web:!web02)
-f | Forks (parallelism)
-b / --become | Escalate privileges (sudo)
--become-user | Target user after become (default root)
-C | Check mode (dry run where supported)
-o | One-line output (useful in scripts)
-t | Not tags: in ad-hoc runs -t has meant --tree (log output to a directory); tag selection with -t exists only in ansible-playbook

Prefer command or shell only when no module exists—modules capture idempotency and structured change data.
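A short sketch of the difference, written as playbook tasks. The package name, script path, and marker file are hypothetical; the point is how each task reports change.

```yaml
# Tasks for an existing play; paths and names below are illustrative only.
- name: Module form reports "changed" only when the package was actually installed
  ansible.builtin.apt:
    name: htop
    state: present

- name: When a command is unavoidable, make it idempotent with "creates"
  ansible.builtin.command:
    cmd: /opt/app/bin/init-db               # hypothetical one-time initializer
    creates: /var/lib/app/.initialized      # task is skipped once this file exists

- name: Read-only commands should never count as a change
  ansible.builtin.command: uptime
  register: uptime_result
  changed_when: false
```

Without creates or changed_when, a command task reports changed on every run, which hides real drift in the play recap.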

ansible-playbook — playbooks and plays

Syntax: ansible-playbook site.yml [options]

---
# site.yml — minimal play
- name: Configure web tier
  hosts: web
  become: true
  vars:
    app_version: "2.4.1"
  tasks:
    - name: Ensure nginx package
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true

    - name: Deploy config from template
      ansible.builtin.template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        validate: nginx -t -c %s
      notify: Reload nginx

  handlers:
    - name: Reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded

Common ansible-playbook flags:

Flag | Use
--syntax-check | Validate YAML and playbook structure without running
--check | Check mode (predict changes)
--diff | Show file diffs for template/copy modules
-t / --tags | Run tagged tasks only
--skip-tags | Skip tagged tasks
-l / --limit | Limit hosts
-e / --extra-vars | Override variables (-e @vars.json)
--start-at-task | Resume from named task (break-glass)
-v to -vvvv | Increase verbosity for debugging
--vault-password-file | Decrypt vault vars non-interactively (protect file permissions)

ansible-playbook site.yml --syntax-check
ansible-playbook site.yml --check --diff -l web01
ansible-playbook deploy.yml -e "version=3.0.0" -t deploy
ansible-playbook site.yml --skip-tags never_prod

Playbook building blocks

Variables and precedence

Ansible merges variables from many sources (lowest to highest precedence simplified): role defaults → inventory → play vars → task vars → extra vars (-e, highest). Use group_vars/ and host_vars/ directories beside inventory for clarity.
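To make the ordering concrete, the sketch below defines the same variable at three levels. The role name app and the file paths are assumptions chosen for illustration; each YAML document stands for a separate file.

```yaml
# roles/app/defaults/main.yml (role default, lowest precedence shown here)
http_port: 8080
---
# inventory/group_vars/web.yml (inventory group variable, overrides the role default)
http_port: 8081
---
# playbook.yml (play vars override inventory variables)
- name: Show which http_port wins
  hosts: web
  gather_facts: false
  vars:
    http_port: 8082
  roles:
    - app                      # hypothetical role that ships the default above
  tasks:
    - name: Print the effective value
      ansible.builtin.debug:
        var: http_port
# Run normally, the debug task should report the play value (8082).
# Run with "-e http_port=9000", extra vars take over and it should report 9000.
```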

Facts

setup module gathers facts (ansible_os_family, ansible_distribution_version, IP addresses). Reference in templates: {{ ansible_hostname }}. Cache facts in large fleets with fact caching (Redis, JSON file) configured in ansible.cfg.
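Facts are most often used to branch on the target platform. A minimal sketch, assuming a web group that mixes Debian-family and RedHat-family hosts:

```yaml
---
- name: Install nginx with the package manager that matches each host
  hosts: web
  become: true
  tasks:
    - name: Install on Debian-family hosts
      ansible.builtin.apt:
        name: nginx
        state: present
      when: ansible_facts['os_family'] == 'Debian'

    - name: Install on RedHat-family hosts
      ansible.builtin.dnf:
        name: nginx
        state: present
      when: ansible_facts['os_family'] == 'RedHat'
```

The ansible_facts['os_family'] form reads the same data as the ansible_os_family variable mentioned above and keeps working if injection of top-level fact variables is turned off.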

Registers, conditionals, loops

- name: Check app health
  ansible.builtin.uri:
    url: "http://127.0.0.1:8080/health"
    status_code: 200
  register: health
  failed_when: false   # record the result instead of failing the host on a non-200 reply

- name: Restart only if unhealthy
  ansible.builtin.service:
    name: myapp
    state: restarted
  when: health.status != 200

- name: Create users
  ansible.builtin.user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
  loop:
    - { name: alice, groups: wheel }
    - { name: bob, groups: users }

Handlers

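Handlers run once, at the end of the play's task section, and only if a task that notified them reported changed. That makes them ideal for service restarts: attach notify to the tasks that change config, and the service is left alone on runs where nothing changed.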
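Sometimes a later task needs the restart to have happened already. One way to handle that, sketched here with a hypothetical myapp service, template path, and port, is to flush pending handlers in the middle of the play:

```yaml
---
- name: Deploy config and verify the service came back
  hosts: web
  become: true
  tasks:
    - name: Deploy application config
      ansible.builtin.template:
        src: templates/app.conf.j2        # hypothetical template
        dest: /etc/app/app.conf
      notify: Restart myapp

    - name: Run any notified handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Wait for the application port to accept connections
      ansible.builtin.wait_for:
        port: 8080                        # assumption: port the app listens on
        timeout: 30

  handlers:
    - name: Restart myapp
      ansible.builtin.service:
        name: myapp
        state: restarted
```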

Blocks, rescue, always

- block:
    - name: Risky change
      ansible.builtin.command: /opt/migrate.sh
  rescue:
    - name: Rollback marker
      ansible.builtin.file:
        path: /var/run/migration_failed
        state: touch
  always:
    - name: Notify ops
      ansible.builtin.debug:
        msg: "Migration attempt finished"

Roles and Ansible Galaxy

Role layout:

roles/nginx/
  defaults/main.yml
  vars/main.yml
  tasks/main.yml
  handlers/main.yml
  templates/nginx.conf.j2
  files/static.html
  meta/main.yml   # dependencies, platforms

In playbook:

- hosts: web
  roles:
    - role: nginx
      vars:
        worker_processes: 4

ansible-galaxy commands:

ansible-galaxy role install geerlingguy.nginx
ansible-galaxy collection install community.docker
ansible-galaxy init my_role
ansible-galaxy role list
ansible-galaxy collection list

Declare dependencies in requirements.yml for reproducible CI:

---
roles:
  - name: geerlingguy.nginx
collections:
  - name: community.general
  - name: amazon.aws

ansible-galaxy install -r requirements.yml
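For runs that really are reproducible, each entry can carry a version constraint. The version numbers below are placeholders for illustration, not recommendations; substitute the releases you have tested.

```yaml
---
# requirements.yml with pinned versions (all version values are placeholders)
roles:
  - name: geerlingguy.nginx
    version: "3.2.0"              # placeholder: pin to a tag you have tested
collections:
  - name: community.general
    version: ">=9.0.0,<10.0.0"    # placeholder range: stay within one major version
  - name: amazon.aws
    version: "8.2.1"              # placeholder exact pin
```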

ansible-inventory

ansible-inventory -i inventory.yml --list
ansible-inventory -i aws_ec2.yml --graph
ansible-inventory --host web01.example.com

Use --graph to debug group membership; --list exports JSON for external tools.

ansible-config and ansible.cfg

Configuration search order (the first file found wins; files are not merged): ANSIBLE_CONFIG environment variable → ./ansible.cfg → ~/.ansible.cfg → /etc/ansible/ansible.cfg.

ansible-config dump
ansible-config dump --only-changed
ansible-config list

Production-friendly snippets:

[defaults]
inventory = ./inventory
remote_user = deploy
host_key_checking = True
retry_files_enabled = False
stdout_callback = yaml
interpreter_python = auto_silent

[privilege_escalation]
become = True
become_method = sudo

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s

ansible-doc

ansible-doc apt
ansible-doc -l | grep docker
ansible-doc ansible.builtin.template
ansible-doc -s copy   # snippet: args only

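The module FQCN format is namespace.collection.module, for example ansible.builtin.copy. The short name (copy) still resolves for ansible.builtin modules, but the FQCN avoids ambiguity when several collections are installed. Always read the RETURN and NOTES sections for cloud modules.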

ansible-vault

Encrypt secrets at rest in Git. Never commit plaintext prod passwords.

ansible-vault create group_vars/prod/secrets.yml
ansible-vault edit group_vars/prod/secrets.yml
ansible-vault encrypt_string 's3cr3t' --name db_password
ansible-vault view group_vars/prod/secrets.yml

ansible-playbook site.yml --ask-vault-pass
ansible-playbook site.yml --vault-password-file ~/.vault_pass

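In CI, inject the vault password from an OIDC-backed secret store, with the same discipline as GitHub Actions secrets. When people leave the team, change the vault password with ansible-vault rekey and also rotate the secrets it protected, since rekeying does not invalidate values someone has already read.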
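A common convention, shown here as a sketch rather than a requirement, is to keep a plaintext vars file that points at vault-prefixed names, so reviewers can see which variables exist without seeing their values. The vars.yml file name and the vault_ prefix are assumptions; the value is a placeholder.

```yaml
# group_vars/prod/vars.yml (plaintext, reviewed in Git)
db_user: app
db_password: "{{ vault_db_password }}"   # indirection to the encrypted value
---
# group_vars/prod/secrets.yml (shown before encryption with ansible-vault)
vault_db_password: "change-me"           # placeholder, never a real secret
```

Playbooks reference db_password as usual; only secrets.yml needs the vault password to be read.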

ansible-console and ansible-pull

# Interactive ad-hoc (tab-complete hosts/modules)
ansible-console web
# ansible> apt name=htop state=present

# Pull model: target clones repo and runs playbook locally
ansible-pull -U https://git.example.com/config.git -C main site.yml

Pull mode suits autoscaling workers without inbound SSH from a central controller—at the cost of distributed trust on the repo URL and keys.

Connection plugins and targets

Plugin | When | Inventory vars
ssh (default) | Linux, BSD, network devices with SSH | ansible_host, ansible_user, ansible_ssh_private_key_file
winrm | Windows Server | ansible_connection=winrm, cert validation settings
local | Module runs on control node | ansible_connection=local
docker / podman | Container as target | Via the community.docker collection (docker) or containers.podman collection (podman)
network_cli / netconf | Router/switch automation | ansible_network_os, ansible_become

Bastion hops: set ansible_ssh_common_args='-o ProxyJump=bastion' or use ansible_ssh_args in group_vars.
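In a YAML group_vars file the bastion setting might look like the sketch below. The bastion host name and the deploy user on it are hypothetical.

```yaml
# group_vars/web.yml
ansible_user: deploy
# Route every SSH connection for this group through the jump host.
ansible_ssh_common_args: "-o ProxyJump=deploy@bastion.example.com"   # hypothetical bastion
```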

Templates, files, and Jinja2

template renders .j2 files with Jinja2; copy pushes static files. Use validate on templates when a bad config could brick a service (e.g. nginx -t).

# templates/app.env.j2 (uses app_env because "environment" is a reserved keyword in Ansible)
APP_ENV={{ app_env }}
DB_HOST={{ hostvars[groups['db'][0]]['ansible_host'] | default(groups['db'][0]) }}
{% for user in app_users %}
ALLOW_USER={{ user }}
{% endfor %}
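The validate option applies to more than web servers. A sudoers drop-in is a typical case where a syntax error can lock you out, so the sketch below checks the rendered file with visudo before it is moved into place. The template name is hypothetical.

```yaml
# Task for an existing play that runs with become: true
- name: Install sudoers drop-in for the deploy user
  ansible.builtin.template:
    src: templates/deploy-sudoers.j2      # hypothetical template
    dest: /etc/sudoers.d/deploy
    owner: root
    group: root
    mode: "0440"
    # %s is replaced with the path of the temporary rendered file;
    # the file is only installed if visudo exits successfully.
    validate: /usr/sbin/visudo -cf %s
```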

Execution strategies and performance

  • linear (default) — Each task waits for all hosts before next task.
  • free — Hosts proceed independently (faster, order not guaranteed across hosts).
  • serial — Not a strategy but a play keyword that batches hosts for rolling updates: serial: "25%" on a play gives canary-style deploys.

Speed tips: enable SSH pipelining; increase forks cautiously; install packages through a native module such as ansible.builtin.package rather than shelling out with command apt; and for very large fleets consider the third-party Mitogen strategy plugin, testing it before production use.
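A rolling update with serial might look like the following sketch. The myapp service and the health URL are assumptions carried over from the earlier examples; max_fail_percentage: 0 is one way to stop the rollout as soon as any host in a batch fails.

```yaml
---
- name: Rolling restart of the web tier
  hosts: web
  become: true
  serial: "25%"              # work through the group a quarter at a time
  max_fail_percentage: 0     # abort the play if any host in a batch fails
  tasks:
    - name: Restart the application
      ansible.builtin.service:
        name: myapp
        state: restarted

    - name: Wait until the health endpoint answers before the next batch starts
      ansible.builtin.uri:
        url: "http://127.0.0.1:8080/health"
        status_code: 200
      register: health
      retries: 10
      delay: 3
      until: health.status == 200
```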

AWX and Red Hat Ansible Automation Platform

AWX is the upstream open-source web UI and API for scheduling job templates, credentials, inventories, and RBAC. Automation Controller is the supported enterprise product (formerly Tower). Features you get beyond CLI:

  • Job templates and workflow templates (orchestrate multiple playbooks)
  • Credential types injected at runtime (machine, cloud, vault)
  • Survey variables for operators
  • RBAC and audit logs for regulated environments
  • Execution environments (containerized toolchain)

Pattern: developers run ansible-playbook locally; production runs only from Controller with signed collections and approved inventories.

Security and operations

  • Use SSH keys, disable password SSH on servers (Linux in depth).
  • Limit become with sudoers templates; never store become passwords in Git.
  • Run playbooks in CI with --check on PRs; apply on merge with approval.
  • Pin collections; scan playbooks for shell injection (shell module with user input).
  • Separate inventories per environment (prod vs staging); require -l confirmation in wrapper scripts for prod.
  • Log to centralized SIEM via callback plugins; avoid logging vault contents.
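Keeping secrets out of logs is usually handled per task with no_log. A minimal sketch, assuming a db_password variable that comes from a vaulted file and a hypothetical target path:

```yaml
# Task for an existing play that runs with become: true
- name: Write database credentials without echoing them to output or callbacks
  ansible.builtin.copy:
    content: "DB_PASSWORD={{ db_password }}\n"
    dest: /etc/app/db.env          # hypothetical path
    owner: root
    group: root
    mode: "0600"
  no_log: true                     # hides arguments and results, even at -vvv
```

The trade-off is that a failure in such a task is harder to debug, so apply no_log to the few tasks that handle secrets rather than to whole plays.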

Troubleshooting playbook

Symptom | Likely cause | What to check
UNREACHABLE | SSH firewall, wrong IP, key not loaded | ssh deploy@host, -vvv, security groups
Permission denied (become) | Sudoers, missing -b, wrong become user | ansible <host> -m ping -b, sudo -l on target
Failed to find required executable python | Minimal image without Python | raw bootstrap or interpreter_python discovery
Works ad-hoc, playbook fails | Different inventory, missing vars, tag skip | --syntax-check, ansible-inventory --host
Handler never runs | Notifying task reported ok (changed: false), or a later task failed before handlers ran | Confirm the task reports changed; use force_handlers so notified handlers still run after a failure
Slow runs | Low forks, no pipelining, gather_facts every time | gather_facts: false when safe, fact cache, SSH multiplexing
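For the missing-Python case, a raw bootstrap can be sketched as below. It assumes Debian or Ubuntu targets with apt-get available; the changed_when test is a heuristic based on apt output, not an official interface.

```yaml
---
- name: Bootstrap Python on minimal Debian/Ubuntu images
  hosts: all
  become: true
  gather_facts: false        # fact gathering itself needs Python, so skip it at first
  tasks:
    - name: Install python3 only if it is missing (raw needs no Python on the target)
      ansible.builtin.raw: test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)
      register: bootstrap
      changed_when: "'Setting up python3' in bootstrap.stdout"   # heuristic

    - name: Gather facts now that Python is available
      ansible.builtin.setup:
```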

Example: layered project layout

ansible/
  ansible.cfg
  inventory/
    production/
      hosts.yml
      group_vars/all.yml
      group_vars/web.yml
  playbooks/
    site.yml
    common.yml
    web.yml
    deploy.yml
  roles/
    common/
    nginx/
    app/
  group_vars/
  host_vars/
  requirements.yml

site.yml imports other plays:

---
- import_playbook: common.yml
- import_playbook: web.yml
- import_playbook: deploy.yml

Learning path (hands-on)

  1. Install ansible-core in a Python venv; create inventory/hosts with one Vagrant or cloud VM.
  2. Run ansible all -m ping and ansible all -m setup | less.
  3. Write a playbook that installs a package and templates a config; run with --check --diff.
  4. Extract tasks into a role; publish nothing yet—use ansible-galaxy init.
  5. Add ansible-vault for a fake DB password; run playbook with vault password.
  6. Add a GitHub Actions job that runs ansible-playbook --syntax-check and molecule test (optional) on PR.
  7. Pair with Terraform: provision VM → Ansible play for hardening and app install.
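Step 6 could start from a workflow like this sketch. The workflow file name, the ansible/ directory layout, the Python version, and the ansible-core pin are assumptions to adapt to your repository.

```yaml
# .github/workflows/ansible-check.yml (hypothetical file name)
name: ansible-check
on:
  pull_request:
    paths:
      - "ansible/**"            # assumption: playbooks live under ansible/
jobs:
  syntax:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install a pinned ansible-core
        run: python3 -m pip install "ansible-core==2.16.*"   # assumption: pin to your tested line
      - name: Install roles and collections
        run: ansible-galaxy install -r requirements.yml
        working-directory: ansible
      - name: Syntax check
        run: ansible-playbook playbooks/site.yml --syntax-check
        working-directory: ansible
```

A syntax check needs no SSH access or vault password, which makes it a safe first gate before adding check-mode runs against a staging inventory.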

Further reading on this site

Blog index · Terraform & IaC · Linux in depth · GitHub CI/CD


Ansible® is a registered trademark of Red Hat, Inc. Command behavior may vary slightly by ansible-core version; verify against your installed docs with ansible-doc.