Anyone Can Terraform: Why IaC Exists, How It Works, and the Anatomy of .tf and State
Infrastructure as Code means you describe servers, networks, and permissions in files—then a tool makes the cloud match those files. Terraform is the most common tool for that job. You do not need to be a “cloud architect” to start: if you can read a recipe and follow steps, you can read a .tf file and run terraform plan.
In short
Write what you want in .tf files, run plan to preview changes, run apply to create or update real resources. Terraform remembers what it created in a state file so the next run knows what to change—not what you clicked in a console last week.
What is Infrastructure as Code (IaC)?
Infrastructure as Code treats your cloud setup—VPCs, VMs, databases, IAM roles—the same way you treat application source code: stored in Git, reviewed in pull requests, tested, and applied by automation.
Before IaC, “how production works” often lived in one engineer’s head, a wiki page, or fifty console clicks nobody wrote down. That is fragile: people leave, regions fail, and nobody can rebuild staging at 2 a.m.
IaC flips the model: the files are the truth. The tool (Terraform, AWS CloudFormation, Pulumi, Ansible for configuration, and others) turns that truth into real infrastructure.
Why teams adopt IaC (the “why” in plain language)
- Repeatability: Dev, staging, and production can share the same module with different variables instead of “we clicked something different in prod.”
- Review before change: A pull request shows a diff such as “this will add one subnet and open port 443.” That is safer than solo console work.
- Audit trail: Git history answers who changed the firewall rule and when.
- Faster recovery: Rebuild a lost environment from code instead of guessing.
- Less drift: When someone fixes prod by hand, reality and code disagree. IaC makes that gap visible (plan shows the unexpected difference).
- Documentation that runs: The config stays up to date because a stale config breaks the next apply.
IaC is not magic. You still need good design, secrets hygiene, and backups. But it turns infrastructure work from artisanal clicking into engineering you can teach, test, and improve.
How Terraform works (the loop everyone uses)
Think of Terraform as a translator between your description (HCL in .tf files) and cloud APIs (AWS, Azure, GCP, Kubernetes, GitHub, and hundreds more via providers).
- Write: You declare resources, e.g. “one S3 bucket named logs-prod with versioning on.”
- Init: terraform init downloads the provider plugins and prepares the working directory.
- Plan: terraform plan compares desired config + state to reality and prints a summary such as “Plan: 1 to add, 0 to change, 0 to destroy.”
- Apply: terraform apply (after you approve) calls the APIs and makes it so.
- State update: Terraform records IDs and metadata in the state file so the next plan is accurate.
Important idea: Terraform is declarative. You do not script “step 1 create VPC, step 2 create subnet.” You declare the end state; Terraform figures out the order (dependency graph) and the minimal API calls.
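As a small illustration of declaring the end state, the sketch below lists a subnet before the VPC it depends on. The resource names and CIDR ranges are placeholders chosen for this example. Terraform should still create the VPC first, because the reference in vpc_id gives it the ordering.

```hcl
# Illustrative only: names and CIDR ranges are assumptions for this sketch.
# The subnet is written first on purpose; file order does not matter.
resource "aws_subnet" "app" {
  vpc_id     = aws_vpc.main.id # this reference creates the dependency
  cidr_block = "10.0.1.0/24"
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}
```

Running terraform plan on a config like this shows both resources as additions; the apply step then creates the VPC before the subnet.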
Core vocabulary (memorize these five)
| Term | Meaning |
|---|---|
| Provider | Plugin that talks to a platform (aws, azurerm, google) |
| Resource | Something Terraform manages (aws_s3_bucket, aws_instance) |
| Data source | Read-only lookup of existing things (e.g. “find the VPC named prod”) |
| Variable | Input you pass in (environment = "staging") |
| Output | Value Terraform prints after apply (bucket ARN, load balancer DNS) |
Your first project layout (what goes in the folder)
A small repo is enough to learn:
my-terraform/
├── main.tf # resources (or split across files)
├── variables.tf # inputs
├── outputs.tf # exports
├── terraform.tfvars # values for variables (often not committed if sensitive)
└── .terraform/ # plugins (created by init; usually gitignored)
After your first apply, you will also see terraform.tfstate locally unless you configure remote state (recommended for teams—see below).
Anatomy of a .tf file (syntax anyone can read)
Terraform files use HCL (HashiCorp Configuration Language). It looks like JSON with less punctuation: blocks, labels, and key = value pairs.
1. Terraform block — version and backends
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  # Optional: store state in S3 instead of a laptop file
  backend "s3" {
    bucket = "my-org-terraform-state"
    key    = "networking/terraform.tfstate"
    region = "ap-south-1"
  }
}
required_providers constrains which plugin versions init downloads. The backend block sets where state lives for the team (S3 + DynamoDB lock table is a common AWS pattern).
2. Provider block — credentials and region
provider "aws" {
  region = var.aws_region
}
Providers inherit authentication from the environment (AWS CLI profile, environment variables, or CI OIDC). Avoid hard-coding access keys in .tf files.
3. Resource block — the workhorse
Syntax: resource "<TYPE>" "<LOCAL_NAME>" { ... }
- TYPE: Provider resource, e.g. aws_s3_bucket
- LOCAL_NAME: Your label inside this project (e.g. logs); used in references
- Arguments: Settings the API understands
resource "aws_s3_bucket" "logs" {
  bucket = "${var.project}-logs-${var.environment}"

  tags = {
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_s3_bucket_versioning" "logs" {
  bucket = aws_s3_bucket.logs.id

  versioning_configuration {
    status = "Enabled"
  }
}
Notice aws_s3_bucket.logs.id: that is a reference from one resource to another. Terraform builds a dependency graph—versioning waits until the bucket exists.
4. Variable block — inputs
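variable "environment" {
  description = "dev, staging, or prod"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "environment must be dev, staging, or prod."
  }
}

variable "aws_region" {
  type    = string
  default = "ap-south-1"
}

# Declares the var.project value used in the bucket name above
variable "project" {
  description = "Short project name used in resource names"
  type        = string
}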
Values come from terraform.tfvars, -var flags, or environment variables named TF_VAR_<name> (handy in CI).
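A terraform.tfvars file is plain key = value assignments, one per variable. The sketch below uses placeholder values, and it assumes a variable named project is declared, since the bucket example references var.project.

```hcl
# terraform.tfvars (all values are placeholders for this sketch)
environment = "staging"
aws_region  = "ap-south-1"
project     = "my-project" # assumes a matching variable "project" block exists

# The same input can be passed without a file, for example:
#   terraform plan -var="environment=staging"
#   TF_VAR_environment=staging terraform plan
```

Because of the validation block, a value such as "stagin" should fail at plan time with the error message defined on the variable.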
5. Output block — values for humans and other stacks
output "logs_bucket_arn" {
  description = "ARN of the centralized logs bucket"
  value       = aws_s3_bucket.logs.arn
}
6. Data source — read without owning
data "aws_caller_identity" "current" {}

output "account_id" {
  value = data.aws_caller_identity.current.account_id
}
Use data sources when resources already exist (shared VPC, corporate DNS zone) or when you need context (current account, AMI lookup).
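For the "resource already exists" case, a lookup might look like the sketch below. The tag value, the local names, and the subnet CIDR are assumptions for illustration; the point is that the VPC is read, not managed, so destroying this config would leave the VPC alone.

```hcl
# Look up a VPC that another team owns; the Name tag "prod" is a placeholder.
data "aws_vpc" "shared" {
  tags = {
    Name = "prod"
  }
}

# Create a subnet inside it without managing the VPC itself.
resource "aws_subnet" "reports" {
  vpc_id     = data.aws_vpc.shared.id
  cidr_block = "10.0.42.0/24" # assumed to be free inside the VPC's range
}
```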
7. Locals, expressions, and functions
locals {
  name_prefix = "${var.project}-${var.environment}"

  common_tags = {
    Project     = var.project
    Environment = var.environment
  }
}

# Examples: string interpolation, conditionals, for expressions
# "${local.name_prefix}-app"
# var.enable_nat ? 1 : 0
# [for s in var.subnets : s.cidr]
Common functions: merge, tolist, jsonencode, file, templatefile. You learn them as you need them—start with strings and maps.
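To show locals and a couple of these functions together, here is a minimal sketch. It assumes the locals block above and the project and environment variables; the app_assets bucket and the tags_as_json output are names invented for this example.

```hcl
# Assumes local.name_prefix and local.common_tags from the locals block above.
resource "aws_s3_bucket" "app_assets" {
  bucket = "${local.name_prefix}-assets"

  # merge() combines maps; later arguments win on duplicate keys.
  tags = merge(local.common_tags, {
    ManagedBy = "terraform"
  })
}

# jsonencode() turns a map into a JSON string, e.g. for policies or debugging.
output "tags_as_json" {
  value = jsonencode(local.common_tags)
}
```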
8. Modules — reusable packages
module "vpc" {
  source      = "./modules/vpc"
  cidr_block  = "10.0.0.0/16"
  environment = var.environment
}
Modules are just folders of .tf files with their own variables and outputs. Platform teams publish “golden path” VPC or EKS modules; product teams call them with different inputs.
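What the ./modules/vpc folder contains is not shown here, so the sketch below is a hypothetical minimal version: two inputs that match the module call, one resource, and one output that the calling config can read as module.vpc.vpc_id.

```hcl
# modules/vpc/variables.tf (hypothetical module contents)
variable "cidr_block" {
  type = string
}

variable "environment" {
  type = string
}

# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block

  tags = {
    Environment = var.environment
  }
}

# modules/vpc/outputs.tf
output "vpc_id" {
  value = aws_vpc.this.id
}

# Root module, next to the module "vpc" call: read the module's output.
output "vpc_id" {
  value = module.vpc.vpc_id
}
```

A real platform module would usually add subnets, route tables, and more outputs; the shape (inputs in, outputs out) stays the same.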
State file anatomy (terraform.tfstate)
After apply, Terraform writes state: a JSON map of everything it manages and the cloud IDs it received back. Without state, Terraform would not know whether aws_instance.web is the same server you created yesterday or a brand-new one.
Golden rule: treat state as sensitive. It often contains secrets and always describes your architecture. For teams, store it remotely with locking—not on one laptop.
Top-level fields you will see
{
  "version": 4,
  "terraform_version": "1.8.2",
  "serial": 42,
  "lineage": "a1b2c3d4-....",
  "outputs": { },
  "resources": [ ],
  "check_results": null
}
| Field | Purpose |
|---|---|
| version | State file format version (managed by Terraform) |
| terraform_version | Which CLI last wrote the file |
| serial | Increments on each write; helps detect concurrent edits |
| lineage | Unique ID for this state lineage; prevents accidentally pointing at the wrong backend |
| outputs | Cached output values from the last apply |
| resources | The list of managed objects and their attributes |
Inside one resource entry
{
  "mode": "managed",
  "type": "aws_s3_bucket",
  "name": "logs",
  "provider": "provider[\"registry.terraform.io/hashicorp/aws\"]",
  "instances": [
    {
      "schema_version": 0,
      "attributes": {
        "id": "my-project-logs-prod",
        "arn": "arn:aws:s3:::my-project-logs-prod",
        "bucket": "my-project-logs-prod",
        "tags": { "Environment": "prod", "ManagedBy": "terraform" }
      },
      "dependencies": ["aws_iam_role.logging"],
      "sensitive_attributes": []
    }
  ]
}
- mode: managed (Terraform owns lifecycle) or data (read-only snapshot)
- type / name: Match resource "aws_s3_bucket" "logs"
- instances: Supports count and for_each (multiple similar resources)
- attributes: Everything the provider returned; includes real cloud IDs
- dependencies: Graph edges Terraform used for ordering
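To see why instances is a list, consider a resource that uses for_each. The team names and the naming scheme below are made up for this sketch; with a config like this, the single resource entry in state should hold one instance per key, each tagged with its key (stored as index_key).

```hcl
# Hypothetical example: one bucket per team, driven by a set of names.
resource "aws_s3_bucket" "team" {
  for_each = toset(["analytics", "billing"])

  bucket = "my-project-${each.key}-logs" # placeholder naming scheme
}

# Each instance is addressed by its key, for example:
#   aws_s3_bucket.team["analytics"]
#   aws_s3_bucket.team["billing"]
```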
Remote state and locking (team hygiene)
When state lives in S3 (or Terraform Cloud, Azure Blob, GCS), collaborators share one truth. A lock (DynamoDB table on AWS) stops two people from applying at once and corrupting state.
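A backend block with locking could look like the sketch below. The bucket, key, and region repeat the earlier example; the table name terraform-locks is an assumption, and the table itself has to exist already with a string partition key named LockID. Newer Terraform releases may offer other locking options, so check the S3 backend documentation for the version you run.

```hcl
terraform {
  backend "s3" {
    bucket  = "my-org-terraform-state"
    key     = "networking/terraform.tfstate"
    region  = "ap-south-1"
    encrypt = true # server-side encryption for the state object

    # Hypothetical table name; the table needs a string hash key called LockID.
    dynamodb_table = "terraform-locks"
  }
}
```

After changing a backend block, terraform init has to be run again so Terraform can switch to (or migrate state into) the new backend.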
Never edit terraform.tfstate by hand unless you know exactly why—use terraform state mv, terraform import, or terraform state rm for surgery.
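Some of that surgery can also be written as config, which keeps it reviewable in a pull request. The sketch below assumes Terraform 1.5 or later; the names audit_logs and legacy and the bucket name are placeholders, and the moved block assumes the resource block itself was renamed from logs to audit_logs.

```hcl
# Rename a resource without destroying it
# (a config-driven alternative to terraform state mv).
moved {
  from = aws_s3_bucket.logs
  to   = aws_s3_bucket.audit_logs
}

# Adopt an existing bucket into state
# (a config-driven alternative to the terraform import command).
import {
  to = aws_s3_bucket.legacy
  id = "my-existing-bucket" # placeholder: the real bucket name goes here
}

resource "aws_s3_bucket" "legacy" {
  bucket = "my-existing-bucket"
}
```

In both cases the next terraform plan should report the move or import instead of a destroy and create, which is the signal that the state change is safe to apply.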
Plan output: how to read the diff
A plan line like + aws_s3_bucket.logs means create. ~ means update in place. - means destroy. -/+ means destroy and recreate (often when a force-new attribute changes).
Always read the plan before apply in production. If the plan says it will destroy something you did not expect, stop and fix the config—do not “just apply and see.”
A ten-minute first apply (local practice)
- Install Terraform from HashiCorp and configure AWS credentials (or use another provider you have access to).
- Create a folder with main.tf containing a single harmless resource (e.g. an S3 bucket with a unique name, or a random_id resource for dry practice).
- Run terraform init, then terraform plan, then terraform apply.
- Open terraform.tfstate and find your resource’s id and arn.
- Change a tag in main.tf, plan again, apply, and watch state update.
- Run terraform destroy when finished so you do not pay for stray resources.
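For the dry-practice route, a main.tf along these lines should be enough. It uses the random provider, so no cloud credentials are involved; the names practice and practice_hex and the keeper value are arbitrary choices for this sketch. Since random_id has no tags or ARN, editing the keeper value stands in for the "change a tag" step.

```hcl
# main.tf: a dry-practice config that creates nothing in any cloud account.
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    random = {
      source  = "hashicorp/random"
      version = "~> 3.0"
    }
  }
}

resource "random_id" "practice" {
  byte_length = 4

  # Changing a keeper value forces a new ID; edit it to see a -/+ plan.
  keepers = {
    note = "first-run"
  }
}

output "practice_hex" {
  value = random_id.practice.hex
}
```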
Common mistakes (and simple fixes)
- Committing secrets: Use environment variables, AWS Secrets Manager, or SSM; never put passwords in Git.
- Committing state to a public repo: Use a remote backend; add *.tfstate* to .gitignore.
- Console edits alongside Terraform: Next plan shows drift; either update the config to match the change or revert the manual fix.
- Giant monolithic main.tf: Split by concern (network.tf, iam.tf) or into modules when it hurts readability.
- No variable validation: Typos in environment become expensive prod buckets.
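For the secrets item, two patterns are sketched below. The variable name db_password and the parameter path are hypothetical. Note that either way the value still ends up in state, which is one more reason to treat state as sensitive.

```hcl
# Option 1: take the secret as an input, supplied outside Git
# (for example through the TF_VAR_db_password environment variable).
variable "db_password" {
  type      = string
  sensitive = true # redacts the value in plan and apply output
}

# Option 2: read it from SSM Parameter Store at plan time.
# The parameter path is a placeholder and must already exist.
data "aws_ssm_parameter" "db_password" {
  name = "/prod/db/password"
}

# Reference it as data.aws_ssm_parameter.db_password.value where needed.
```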
How this fits your learning path
Terraform is the glue behind much of what I write about elsewhere: AWS network design, IAM policies, GitOps repos, and DevOps history. For configuring what Terraform provisions on hosts, see Ansible in depth. Learn IaC once; the mindset travels across clouds and tools.
Next steps when you are comfortable: remote state + CI pipeline that runs plan on every PR and apply only on merge; then reusable modules; then policy checks (Sentinel or OPA) alongside terraform validate in CI.
Further reading
- HashiCorp — Terraform documentation (providers, language, CLI)
- Gruntwork — Terraform: Up & Running (practical patterns)
- AWS — Well-Architected Operational Excellence pillar (IaC and change management)