Ansible in Depth: Architecture, Commands, Playbooks, and Production Patterns
Ansible is an agentless automation engine for configuration management, application deployment, orchestration, and ad-hoc operations. You describe desired state in YAML playbooks (or one-line module calls), push changes over SSH or WinRM, and let idempotent modules do the work—no daemon on managed hosts. This guide explains how Ansible fits beside Terraform and CI/CD, how its CLI tools map to real tasks, and what operators need to run it safely at scale.
In short
Ansible uses a control node (your laptop, bastion, or CI runner) plus an inventory of hosts. ansible runs ad-hoc modules; ansible-playbook runs ordered tasks with variables, handlers, and roles. Success depends on inventory truth, vault for secrets, check mode in CI, and treating playbooks like application code—reviewed in Git, tested before prod.
What Ansible is—and what it is not
Ansible was created by Michael DeHaan in 2012 and later acquired by Red Hat. It belongs to the configuration management family alongside Chef, Puppet, and Salt, but unlike those agent-based tools, Ansible is agentless: it connects to targets, runs modules, and exits. Linux and Unix managed nodes need only Python (usually preinstalled) and SSH access; Windows nodes are reached over WinRM and run PowerShell-based modules instead of Python.
Ansible is not a cloud provisioning tool like Terraform. It excels at what runs on machines after they exist: packages, users, files, services, firewall rules, app configs, rolling restarts. Many teams use Terraform for infrastructure and Ansible for bootstrapping and drift correction on VMs and bare metal.
For where config management sits in the DevOps timeline, see Historical foundations of DevOps. For delivery pipelines that invoke Ansible, see GitHub CI/CD, Jenkins CI/CD, and GitOps principles.
Ansible vs Terraform vs Puppet/Chef
| Tool | Primary job | State model | Typical interface |
|---|---|---|---|
| Ansible | Configure OS and apps on existing hosts; orchestrate steps across fleet | Procedural playbooks + idempotent modules (no central CMDB required) | YAML playbooks, SSH push |
| Terraform | Provision cloud/network/IAM resources | Declarative desired state with .tfstate | HCL, provider APIs |
| Puppet / Chef | Long-running desired state on servers | Agent pulls catalog from server | DSL + agent daemon |
Ansible can call cloud APIs (via collections like amazon.aws), but that is not its sweet spot. Use the right tool per layer: Terraform creates the VPC; Ansible hardens the instances and installs the app.
Architecture: control node, inventory, modules, collections
- Control node: Machine where you install Ansible and run commands. Must be Linux/macOS/WSL for full support (Windows control node is limited).
- Managed nodes: Hosts in inventory; no Ansible install required on most Unix targets.
- Inventory: List of hosts and groups (/etc/ansible/hosts, inventory.ini, or dynamic scripts/plugins).
- Modules: Units of work (apt, yum, copy, template, service, user, …). Idempotent: a second run should report ok if nothing changed.
- Plugins: Connection (SSH), inventory (AWS, Azure), callbacks (logging), lookup (secrets).
- Collections: Packaged modules/roles on the Ansible Galaxy namespace model (e.g. community.general, ansible.posix).
- Playbooks: YAML files listing plays (host pattern + tasks).
- Roles: Reusable bundles: tasks, handlers, templates, defaults, vars.
# Mental model
You write playbook (desired steps)
→ ansible-playbook reads inventory + vars
→ for each host: SSH (or winrm) + module JSON over connection plugin
→ module executes on target (often via Python)
→ result (changed/ok/failed/skipped) returned to control node
→ handlers run if notified; play recap printed
Installation and version lines
Modern Ansible ships as the ansible-core package plus optional ansible “batteries included” bundle (many collections). Check version:
ansible --version
# ansible-core 2.16.x ... python 3.11+
Common install paths:
- pip (venv recommended): python3 -m pip install ansible
- OS packages: dnf install ansible-core (RHEL/Fedora), apt install ansible (Ubuntu; verify version)
- Execution environments: Container images with locked collections for AWX/Automation Controller
Pin versions in CI and production controllers. Collection major bumps can change module FQCNs (fully qualified collection names).
CLI tools at a glance
| Command | Purpose |
|---|---|
| ansible | Ad-hoc module runs against inventory patterns |
| ansible-playbook | Run playbooks (main automation path) |
| ansible-inventory | View, graph, or export inventory |
| ansible-config | Dump or validate ansible.cfg |
| ansible-doc | Module and plugin documentation |
| ansible-galaxy | Install roles and collections |
| ansible-vault | Encrypt/decrypt sensitive YAML |
| ansible-console | Interactive REPL for ad-hoc tasks |
| ansible-pull | Run playbooks from target (pull model) |
| ansible-test | Test collections/plugins (developer) |
Global flags shared by most commands: -i INVENTORY, -u REMOTE_USER, -k (SSH password prompt), -K (become password), -b / --become, -e KEY=VAL extra vars, -v verbosity (-vvv for connection debugging), --private-key, -l LIMIT host pattern limit, -C check mode, -D diff.
Inventory: static, YAML, and dynamic
Inventory defines hosts, groups, and group_vars / host_vars.
INI style
[web]
web01.example.com ansible_host=10.0.1.11
web02.example.com
[db]
db01.example.com
[web:vars]
http_port=8080
ansible_user=deploy
YAML style
all:
  children:
    web:
      hosts:
        web01.example.com:
          ansible_host: 10.0.1.11
        web02.example.com:
    db:
      hosts:
        db01.example.com:
Dynamic inventory — Plugins or scripts query AWS EC2, Azure, VMware, etc., and build groups at runtime (ansible-inventory -i aws_ec2.yml --graph). Prefer inventory plugins over legacy scripts when possible.
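As a rough sketch of the plugin approach, an aws_ec2.yml file might look like the following. The region, tag names, and group prefix are assumptions for illustration only; the amazon.aws collection and its Python dependencies are assumed to be installed on the control node, with AWS credentials supplied by the environment.

```yaml
# aws_ec2.yml: the plugin only accepts filenames ending in aws_ec2.yml or aws_ec2.yaml
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1                    # assumed region
filters:
  instance-state-name: running   # ignore stopped or terminated instances
  "tag:Environment": prod        # assumed tag name and value
keyed_groups:
  - key: tags.Role               # assumed tag; produces groups such as role_web
    prefix: role
hostnames:
  - private-ip-address           # use the private IP as the inventory hostname
```

Running ansible-inventory -i aws_ec2.yml --graph against a file like this is a quick way to confirm that the groups come out as expected before any playbook uses them.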
localhost: set ansible_connection=local for modules that talk to APIs (such as cloud modules), so they run on the control node without SSH.
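A minimal sketch of a play that stays on the control node could look like this. The URL is a placeholder, not a real endpoint, and the play-level connection keyword is used here as an alternative to setting ansible_connection in inventory.

```yaml
---
# local-api.yml: hypothetical playbook name
- name: Call an HTTP API from the control node
  hosts: localhost
  connection: local        # no SSH; modules execute on the control node
  gather_facts: false      # facts about the control node are not needed here
  tasks:
    - name: Query a placeholder status endpoint
      ansible.builtin.uri:
        url: https://api.example.com/status   # placeholder URL
        return_content: true
      register: api_status

    - name: Show the HTTP status code that came back
      ansible.builtin.debug:
        var: api_status.status
```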
ansible — ad-hoc commands
Syntax: ansible <pattern> -m <module> [-a "args"]
# Ping all hosts in inventory
ansible all -m ping
# Uptime on web group
ansible web -m command -a "uptime"
# Install package (become root)
ansible web -b -m apt -a "name=nginx state=present update_cache=yes"
# Copy file
ansible web -m copy -a "src=./app.conf dest=/etc/app/app.conf owner=root mode=0644"
# Gather facts only
ansible db -m setup
# Parallel forks (default 5)
ansible all -m ping -f 20
| Flag | Meaning |
|---|---|
| -m | Module name (ping, command, shell, apt, …) |
| -a | Module arguments as quoted string or key=value pairs |
| -i | Inventory path or plugin config |
| -l | Limit to subset pattern (web01, web:!web02) |
| -f | Forks (parallelism) |
| -b / --become | Escalate privileges (sudo) |
| --become-user | Target user after become (default root) |
| -C | Check mode (dry run where supported) |
| -o | One-line output (useful in scripts) |
| -t / --tags | Tag selection; applies to ansible-playbook only, not to ad-hoc runs |
Prefer command or shell only when no module exists—modules capture idempotency and structured change data.
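To make the difference concrete, here is a small sketch that contrasts a module with a guarded command task. The script path and marker file are hypothetical, and the sketch assumes the script itself creates the marker when it succeeds.

```yaml
---
- name: Module versus command
  hosts: web
  become: true
  tasks:
    - name: Install nginx with a module (reports changed only on a real change)
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Run a one-off script only while its marker file is absent
      ansible.builtin.command:
        cmd: /opt/app/bin/init-db.sh            # hypothetical script
        creates: /var/lib/app/.db_initialized   # task is skipped once this file exists
```

Without creates (or an explicit changed_when), a command task reports changed on every run, which hides real drift in the play recap.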
ansible-playbook — playbooks and plays
Syntax: ansible-playbook site.yml [options]
---
# site.yml: minimal play
- name: Configure web tier
  hosts: web
  become: true
  vars:
    app_version: "2.4.1"
  tasks:
    - name: Ensure nginx package
      ansible.builtin.apt:
        name: nginx
        state: present
        update_cache: true
    - name: Deploy config from template
      ansible.builtin.template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        validate: nginx -t -c %s
      notify: Reload nginx
  handlers:
    - name: Reload nginx
      ansible.builtin.service:
        name: nginx
        state: reloaded
Common ansible-playbook flags:
| Flag | Use |
|---|---|
| --syntax-check | Validate YAML and playbook structure without running |
| --check | Check mode (predict changes) |
| --diff | Show file diffs for template/copy modules |
| -t / --tags | Run tagged tasks only |
| --skip-tags | Skip tagged tasks |
| -l / --limit | Limit hosts |
| -e / --extra-vars | Override variables (-e @vars.json) |
| --start-at-task | Resume from named task (break-glass) |
| -v … -vvvv | Increase verbosity for debugging |
| --vault-password-file | Decrypt vault vars non-interactively (protect file permissions) |
ansible-playbook site.yml --syntax-check
ansible-playbook site.yml --check --diff -l web01
ansible-playbook deploy.yml -e "version=3.0.0" -t deploy
ansible-playbook site.yml --skip-tags never_prod
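The tag flags above only have an effect if tasks carry tags. A sketch of what a deploy.yml with the deploy and never_prod tags might contain follows; the archive name, paths, and the cache-wipe task are assumptions for illustration.

```yaml
---
# deploy.yml: hypothetical contents
- name: Deploy application
  hosts: web
  become: true
  tasks:
    - name: Unpack the release archive
      ansible.builtin.unarchive:
        src: "files/app-{{ version }}.tar.gz"   # version comes from -e "version=3.0.0"
        dest: /opt/app                           # assumed to exist already
      tags: [deploy]

    - name: Wipe the application cache
      ansible.builtin.file:
        path: /var/cache/app
        state: absent
      tags: [never_prod]    # a custom tag, excluded with --skip-tags never_prod
```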
Playbook building blocks
Variables and precedence
Ansible merges variables from many sources (lowest to highest precedence simplified): role defaults → inventory → play vars → task vars → extra vars (-e, highest). Use group_vars/ and host_vars/ directories beside inventory for clarity.
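One way to see the ordering is a throwaway playbook that defines the same variable at several levels. This sketch assumes the INI inventory shown earlier, where [web:vars] sets http_port=8080; the playbook name is hypothetical.

```yaml
---
# precedence-demo.yml: hypothetical playbook
- name: Show which definition of http_port wins
  hosts: web
  gather_facts: false
  vars:
    http_port: 8081      # play vars override the inventory group value (8080)
  tasks:
    - name: Print the value seen at play level
      ansible.builtin.debug:
        var: http_port

    - name: Print the value when the task defines its own
      ansible.builtin.debug:
        var: http_port
      vars:
        http_port: 8082  # task vars override play vars for this task only
```

Run as is, the first task should print 8081 and the second 8082. Adding -e http_port=9090 should make both print 9090, because extra vars sit at the top of the precedence list.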
Facts
The setup module gathers facts (ansible_os_family, ansible_distribution_version, IP addresses), and plays run it automatically unless gather_facts: false is set. Reference facts in templates, for example {{ ansible_hostname }}. In large fleets, enable fact caching (Redis, JSON file) in ansible.cfg to avoid re-gathering on every run.
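Facts are most often used to branch on the platform. The following is a small sketch that relies on the os_family fact; the chrony package is only an example.

```yaml
---
- name: Install a time-sync daemon per OS family
  hosts: all
  become: true
  tasks:
    - name: Install chrony on Debian-family hosts
      ansible.builtin.apt:
        name: chrony
        state: present
      when: ansible_facts['os_family'] == 'Debian'

    - name: Install chrony on Red Hat-family hosts
      ansible.builtin.dnf:
        name: chrony
        state: present
      when: ansible_facts['os_family'] == 'RedHat'
```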
Registers, conditionals, loops
- name: Check app health
  ansible.builtin.uri:
    url: "http://127.0.0.1:8080/health"
    status_code: 200
  register: health
  # Without this, a non-200 response fails the task and the restart below never runs
  failed_when: false

- name: Restart only if unhealthy
  ansible.builtin.service:
    name: myapp
    state: restarted
  when: health.status != 200

- name: Create users
  ansible.builtin.user:
    name: "{{ item.name }}"
    groups: "{{ item.groups }}"
  loop:
    - { name: alice, groups: wheel }
    - { name: bob, groups: users }
Handlers
Handlers run once at end of play if notified—ideal for service restarts. Use notify on tasks that change config; avoid restarting on every play if nothing changed.
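When a later task depends on the restarted service, waiting until the end of the play is too late. A sketch of forcing pending handlers to run earlier with meta: flush_handlers follows; the template path, service name, and port are assumptions.

```yaml
---
- name: Handler ordering sketch
  hosts: web
  become: true
  tasks:
    - name: Deploy app config
      ansible.builtin.template:
        src: templates/app.conf.j2    # hypothetical template
        dest: /etc/app/app.conf
      notify: Restart myapp           # queued only if the file changed

    - name: Run any queued handlers now instead of at the end of the play
      ansible.builtin.meta: flush_handlers

    - name: Wait for the app port before continuing
      ansible.builtin.wait_for:
        port: 8080                    # assumed port
        timeout: 30

  handlers:
    - name: Restart myapp
      ansible.builtin.service:
        name: myapp
        state: restarted
```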
Blocks, rescue, always
- block:
    - name: Risky change
      ansible.builtin.command: /opt/migrate.sh
  rescue:
    - name: Rollback marker
      ansible.builtin.file:
        path: /var/run/migration_failed
        state: touch
  always:
    - name: Notify ops
      ansible.builtin.debug:
        msg: "Migration attempt finished"
Roles and Ansible Galaxy
Role layout:
roles/nginx/
  defaults/main.yml
  vars/main.yml
  tasks/main.yml
  handlers/main.yml
  templates/nginx.conf.j2
  files/static.html
  meta/main.yml    # dependencies, platforms
In playbook:
- hosts: web
  roles:
    - role: nginx
      vars:
        worker_processes: 4
ansible-galaxy commands:
ansible-galaxy role install geerlingguy.nginx
ansible-galaxy collection install community.docker
ansible-galaxy init my_role
ansible-galaxy role list
ansible-galaxy collection list
Declare dependencies in requirements.yml for reproducible CI:
---
roles:
  - name: geerlingguy.nginx
collections:
  - name: community.general
  - name: amazon.aws
ansible-galaxy install -r requirements.yml
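For reproducibility, the same file can carry version constraints. The ranges below are illustrative only, not recommendations; check which releases your playbooks have actually been tested against before copying them.

```yaml
---
# requirements.yml with version constraints (ranges are illustrative)
roles:
  - name: geerlingguy.nginx          # add a version key once you have chosen a release
collections:
  - name: community.general
    version: ">=8.0.0,<9.0.0"        # stay within one major line
  - name: amazon.aws
    version: ">=7.0.0,<8.0.0"
```

Holding each collection to a single major line keeps CI stable while still picking up minor and patch releases.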
ansible-inventory
ansible-inventory -i inventory.yml --list
ansible-inventory -i aws_ec2.yml --graph
ansible-inventory --host web01.example.com
Use --graph to debug group membership; --list exports JSON for external tools.
ansible-config and ansible.cfg
Configuration search order (the first file found is used; settings are not merged across files): ANSIBLE_CONFIG env → ./ansible.cfg → ~/.ansible.cfg → /etc/ansible/ansible.cfg.
ansible-config dump
ansible-config dump --only-changed
ansible-config list
Production-friendly snippets:
[defaults]
inventory = ./inventory
remote_user = deploy
host_key_checking = True
retry_files_enabled = False
stdout_callback = yaml
interpreter_python = auto_silent
[privilege_escalation]
become = True
become_method = sudo
[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=60s
ansible-doc
ansible-doc apt
ansible-doc -l | grep docker
ansible-doc ansible.builtin.template
ansible-doc -s copy # snippet: args only
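Module FQCN format: namespace.collection.module, for example ansible.builtin.copy. Short names such as copy still resolve for built-in modules, but FQCNs avoid ambiguity when several collections are installed. Always read the RETURN and NOTES sections for cloud modules.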
ansible-vault
Encrypt secrets at rest in Git. Never commit plaintext prod passwords.
ansible-vault create group_vars/prod/secrets.yml
ansible-vault edit group_vars/prod/secrets.yml
ansible-vault encrypt_string 's3cr3t' --name db_password
ansible-vault view group_vars/prod/secrets.yml
ansible-playbook site.yml --ask-vault-pass
ansible-playbook site.yml --vault-password-file ~/.vault_pass
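In CI, inject the vault password from an OIDC-backed secret store, with the same discipline you apply to GitHub Actions secrets. When people leave the team, change the vault password with ansible-vault rekey and rotate the secrets it protected, since anyone holding an old clone and the old password can still decrypt the old ciphertext.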
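A common layout, sketched here under assumed file and variable names, keeps a plaintext vars file that points at vault-prefixed variables stored in the encrypted file. The three YAML documents below stand for three separate files; the template path and destination are hypothetical.

```yaml
---
# File 1: group_vars/prod/vars.yml (plaintext, reviewable in Git)
db_user: app
db_password: "{{ vault_db_password }}"   # indirection to the encrypted value
---
# File 2: group_vars/prod/secrets.yml (shown before ansible-vault encrypt)
vault_db_password: "example-only-not-a-real-secret"
---
# File 3: a task that consumes the secret
- name: Render database credentials
  ansible.builtin.template:
    src: templates/db.env.j2    # hypothetical template
    dest: /etc/app/db.env
    mode: "0600"
  no_log: true                  # keep the rendered values out of task output
```

The indirection means reviewers can still grep for where db_password is defined even though the value itself is encrypted.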
ansible-console and ansible-pull
# Interactive ad-hoc (tab-complete hosts/modules)
ansible-console web
# ansible> apt name=htop state=present
# Pull model: target clones repo and runs playbook locally
ansible-pull -U https://git.example.com/config.git -C main site.yml
Pull mode suits autoscaling workers without inbound SSH from a central controller—at the cost of distributed trust on the repo URL and keys.
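One possible way to schedule pull mode is a cron entry that a bootstrap play installs on each worker. In this sketch the workers group, the interval, and the log path are assumptions, and ansible-pull is assumed to be on cron's PATH on the target.

```yaml
---
- name: Schedule ansible-pull on autoscaled workers
  hosts: workers            # hypothetical inventory group
  become: true
  tasks:
    - name: Run ansible-pull every 30 minutes
      ansible.builtin.cron:
        name: ansible-pull
        minute: "*/30"
        job: "ansible-pull -U https://git.example.com/config.git -C main site.yml >> /var/log/ansible-pull.log 2>&1"
```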
Connection plugins and targets
| Plugin | When | Inventory vars |
|---|---|---|
| ssh (default) | Linux, BSD, network devices with SSH | ansible_host, ansible_user, ansible_ssh_private_key_file |
| winrm | Windows Server | ansible_connection=winrm, cert validation settings |
| local | Module runs on control node | ansible_connection=local |
| docker / podman | Container as target | Via the community.docker collection (docker) or containers.podman (podman) |
| network_cli / netconf | Router/switch automation | ansible_network_os, ansible_become |
Bastion hops: set ansible_ssh_common_args='-o ProxyJump=bastion' or use ansible_ssh_args in group_vars.
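In YAML group_vars the same setting might be written as follows. The file path, jump user, and bastion hostname are placeholders.

```yaml
---
# inventory/production/group_vars/web.yml (hypothetical path)
ansible_user: deploy
# Route every SSH connection for this group through the bastion host
ansible_ssh_common_args: "-o ProxyJump=jump@bastion.example.com"
```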
Templates, files, and Jinja2
template renders .j2 files with Jinja2; copy pushes static files. Use validate on templates when a bad config could brick a service (e.g. nginx -t).
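# templates/app.env.j2
{# app_env is used instead of "environment", which is a reserved name in Ansible #}
APP_ENV={{ app_env }}
DB_HOST={{ hostvars[groups['db'][0]]['ansible_host'] | default(groups['db'][0]) }}
{% for user in app_users %}
ALLOW_USER={{ user }}
{% endfor %}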
Execution strategies and performance
- linear (default): Each task waits for all hosts before the next task starts.
- free: Hosts proceed independently (faster, order not guaranteed across hosts).
- serial: A play keyword rather than a strategy; it drives rolling updates. Set serial: "25%" on the play for canary-style deploys.
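A rolling restart along those lines could be sketched as below. The batch sizes, the failure threshold, and the port are assumptions chosen for illustration.

```yaml
---
- name: Rolling restart of the web tier
  hosts: web
  become: true
  serial:
    - 1                       # one canary host first
    - "25%"                   # then a quarter of the group per batch
  max_fail_percentage: 0      # stop the rollout if any host in a batch fails
  tasks:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx
        state: restarted

    - name: Wait for the HTTP port before moving to the next batch
      ansible.builtin.wait_for:
        port: 80              # assumed port
        timeout: 30
```

Because each batch runs the whole play before the next one starts, a failure on the canary host halts the rollout before the remaining hosts are touched.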
Speed tips: enable SSH pipelining, increase forks cautiously, prefer package modules (ansible.builtin.package or the native apt/dnf modules) over command: apt-get, and consider the third-party Mitogen plugin for very large fleets. Test before prod.
AWX and Red Hat Ansible Automation Platform
AWX is the upstream open-source web UI and API for scheduling job templates, credentials, inventories, and RBAC. Automation Controller is the supported enterprise product (formerly Tower). Features you get beyond CLI:
- Job templates and workflow templates (orchestrate multiple playbooks)
- Credential types injected at runtime (machine, cloud, vault)
- Survey variables for operators
- RBAC and audit logs for regulated environments
- Execution environments (containerized toolchain)
Pattern: developers run ansible-playbook locally; production runs only from Controller with signed collections and approved inventories.
Security and operations
- Use SSH keys, disable password SSH on servers (Linux in depth).
- Limit become with sudoers templates; never store become passwords in Git.
- Run playbooks in CI with --check on PRs; apply on merge with approval.
- Pin collections; scan playbooks for shell injection (shell module with user input).
- Separate inventories per environment (prod vs staging); require -l confirmation in wrapper scripts for prod.
- Log to centralized SIEM via callback plugins; avoid logging vault contents.
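The CI item in the list above could be sketched as a GitHub Actions workflow like this one. The workflow path, the inventory/staging directory, and the ansible-core version range are assumptions, and the final step presumes the runner can reach staging with credentials provided by steps that are not shown.

```yaml
# .github/workflows/ansible-check.yml (hypothetical path and names)
name: ansible-check
on:
  pull_request:
    paths:
      - "ansible/**"
jobs:
  check:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ansible
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install a pinned ansible-core
        run: python3 -m pip install "ansible-core>=2.16,<2.17"
      - name: Install roles and collections
        run: ansible-galaxy install -r requirements.yml
      - name: Syntax check
        run: ansible-playbook -i inventory/staging playbooks/site.yml --syntax-check
      - name: Dry run against staging
        # Assumes network access to staging plus SSH and vault credentials
        # supplied by earlier steps that are omitted from this sketch.
        run: ansible-playbook -i inventory/staging playbooks/site.yml --check --diff
```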
Troubleshooting playbook
| Symptom | Likely cause | What to check |
|---|---|---|
| UNREACHABLE | SSH firewall, wrong IP, key not loaded | ssh deploy@host, -vvv, security groups |
| Permission denied become | Sudoers, missing -b, wrong become user | ansible <host> -b -m ping, sudo -l on target |
| Failed to find required executable python | Minimal image without Python | raw bootstrap or interpreter_python discovery |
| Works ad-hoc, playbook fails | Different inventory, missing vars, tag skip | --syntax-check, ansible-inventory --host |
| Handler never runs | Task did not change (changed: false), notify name does not match the handler, or the play failed before handlers ran | Confirm the task reports changed; match notify to the handler name; use force_handlers if a later failure aborts the play |
| Slow runs | Low forks, no pipelining, gather_facts every time | gather_facts: false when safe, fact cache, SSH multiplexing |
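For the missing-Python row, a raw bootstrap might look like the sketch below. It assumes Debian-family targets with apt-get available; other distributions would need a different install command.

```yaml
---
- name: Bootstrap Python on minimal images
  hosts: all
  become: true
  gather_facts: false     # fact gathering needs Python, which is not there yet
  tasks:
    - name: Install python3 over plain SSH (raw does not need Python on the target)
      ansible.builtin.raw: "test -e /usr/bin/python3 || (apt-get update && apt-get install -y python3)"
      # raw cannot tell whether anything changed, so this task always reports changed

    - name: Gather facts now that Python is available
      ansible.builtin.setup:
```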
Example: layered project layout
ansible/
  ansible.cfg
  inventory/
    production/
      hosts.yml
      group_vars/all.yml
      group_vars/web.yml
  playbooks/
    site.yml
    deploy.yml
  roles/
    common/
    nginx/
    app/
  group_vars/
  host_vars/
  requirements.yml
site.yml imports other plays:
---
- import_playbook: common.yml
- import_playbook: web.yml
- import_playbook: deploy.yml
Learning path (hands-on)
- Install ansible-core in a Python venv; create inventory/hosts with one Vagrant or cloud VM.
- Run ansible all -m ping and ansible all -m setup | less.
- Write a playbook that installs a package and templates a config; run with --check --diff.
- Extract tasks into a role; publish nothing yet, just scaffold it with ansible-galaxy init.
- Add ansible-vault for a fake DB password; run playbook with vault password.
- Add a GitHub Actions job that runs ansible-playbook --syntax-check and molecule test (optional) on PR.
- Pair with Terraform: provision VM → Ansible play for hardening and app install.
Further reading on this site
- Terraform & IaC for everyone — provision before you configure
- Linux in depth — SSH, sudo, systemd targets Ansible manages
- Docker’s hidden side — container connection plugins
- Kubernetes architecture — when to prefer operators over SSH playbooks
- GitHub CI/CD in depth — run Ansible in pipelines with OIDC
- Jenkins CI/CD in depth — legacy estates often invoke Ansible from jobs
- GitOps principles — cluster desired state vs SSH push
- Historical foundations of DevOps — Chef, Puppet, Ansible in context
Ansible® is a registered trademark of Red Hat, Inc. Command behavior may vary slightly by ansible-core version; verify against your installed docs with ansible-doc.