
Terraform Drills

Remember: Terraform's core loop is init (download providers and modules, initialize the backend) -> plan (dry-run diff) -> apply (execute). In CI, always run terraform plan -out=plan.tfplan and then terraform apply plan.tfplan. Applying the saved plan executes exactly the changes that were reviewed, and Terraform rejects the plan as stale if the state has changed since it was created. Mnemonic: "IPA" (Init, Plan, Apply).

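Gotcha: if terraform destroy -target=X fails partway through, state and real infrastructure can disagree. If the resource is gone but still recorded in state, the next plan normally notices this during refresh and proposes to recreate it; if the entry is truly orphaned, remove it with terraform state rm. If re-apply fails with "resource already exists", the opposite has happened: the object exists but is missing from state, so bring it back with terraform import rather than state rm.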

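Debug clue: If terraform plan shows unexpected changes on resources you did not touch, someone or something modified the infrastructure outside of Terraform (console click, CLI command, another pipeline). This is called drift. Run terraform plan -refresh-only to see what changed outside Terraform (the standalone terraform refresh command is deprecated), accept those changes into state with terraform apply -refresh-only if they are expected, then plan again to see the true delta.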

Drill 1: Init, Plan, Apply

Difficulty: Easy

Q: What are the three core Terraform commands and what does each do?

Answer

terraform init    # Download providers, initialize backend, install modules
terraform plan    # Show what will change (dry-run). No modifications.
terraform apply   # Execute the changes. Prompts for confirmation.

Always `plan` before `apply`. In CI, use `terraform plan -out=plan.tfplan` then `terraform apply plan.tfplan` for deterministic applies.

Drill 2: State File

Difficulty: Easy

Q: What is the Terraform state file? Why should you never store it in Git?

Answer

The state file (`terraform.tfstate`) maps your config to real infrastructure. It contains:

- Resource IDs, attributes, and metadata
- **Sensitive values** (passwords, tokens) in plaintext

Never in Git because:

- Contains secrets
- Concurrent edits cause corruption
- No locking mechanism

Use a **remote backend** instead:

terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    dynamodb_table = "terraform-locks"  # State locking (Terraform 1.10+ also offers S3-native locking via use_lockfile)
    encrypt        = true
  }
}

Drill 3: Variables and Outputs

Difficulty: Easy

Q: Create a variable for instance type with a default, and an output that exposes the instance's public IP.

Answer

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"

  validation {
    condition     = can(regex("^t3\\.", var.instance_type))
    error_message = "Only t3 instance types are allowed."
  }
}

output "public_ip" {
  description = "Public IP of the instance"
  value       = aws_instance.web.public_ip
  sensitive   = false
}

Override variables: `terraform apply -var="instance_type=t3.small"` or via `terraform.tfvars`.
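As a minimal sketch of the file-based override, a `terraform.tfvars` in the working directory might contain the following. The value is illustrative; any value must still pass the `validation` block above.

```hcl
# terraform.tfvars (loaded automatically from the working directory)
# Overrides the "t3.micro" default declared in the variable block.
instance_type = "t3.small"
```

A `-var` flag on the command line takes precedence over `terraform.tfvars`, so the file works well for shared defaults and the flag for one-off overrides.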

Drill 4: Resource Dependencies

Difficulty: Medium

Q: What's the difference between implicit and explicit dependencies? When do you need depends_on?

Answer

**Implicit** (preferred): Terraform infers from references.

resource "aws_instance" "web" {
  subnet_id = aws_subnet.main.id  # Terraform knows: create subnet first
}

**Explicit** (`depends_on`): When there's no direct reference but ordering matters.

resource "aws_instance" "web" {
  # No direct reference to the IAM role, but needs it at boot
  depends_on = [aws_iam_role_policy_attachment.web_policy]
}

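Use `depends_on` sparingly. It hides the reason for the ordering and can make Terraform produce more conservative plans, so reserve it for dependencies Terraform cannot see and prefer an attribute reference whenever one exists.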
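One way to restructure is to replace a hardcoded ID plus `depends_on` with a reference to the other resource's attribute. The sketch below is illustrative only: the security group, `aws_vpc.main`, and `data.aws_ami.ubuntu` are assumed to be defined elsewhere in the configuration.

```hcl
# Hypothetical example: resource names and arguments are illustrative.
resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id # assumes a VPC named "main" exists in this config
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id # assumes an AMI data source is defined
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.main.id

  # Referencing the attribute (rather than hardcoding "sg-..." and adding
  # depends_on) lets Terraform infer that the security group comes first.
  vpc_security_group_ids = [aws_security_group.web.id]
}
```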

Drill 5: Import Existing Resources

Difficulty: Medium

Q: There's an existing S3 bucket my-data-bucket not managed by Terraform. How do you bring it under management?

Answer

# 1. Write the resource block
resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"
}

# 2. Import into state
terraform import aws_s3_bucket.data my-data-bucket

# 3. Run plan to see drift
terraform plan
# Fix any differences between config and actual state

# Alternative for Terraform 1.5+: declarative import block
# (replaces step 2; the import is planned and applied with plan/apply)
import {
  to = aws_s3_bucket.data
  id = "my-data-bucket"
}

After import, `terraform plan` should show no changes if your config matches reality.

Drill 6: Modules

Difficulty: Medium

Q: Create a reusable module for a VPC and call it from your root config.

Answer

modules/vpc/
├── main.tf
├── variables.tf
└── outputs.tf

# modules/vpc/variables.tf
variable "cidr_block" { type = string }
variable "name"       { type = string }

# modules/vpc/main.tf
resource "aws_vpc" "this" {
  cidr_block = var.cidr_block
  tags       = { Name = var.name }
}

# modules/vpc/outputs.tf
output "vpc_id" { value = aws_vpc.this.id }

# Root main.tf
module "production_vpc" {
  source     = "./modules/vpc"
  cidr_block = "10.0.0.0/16"
  name       = "production"
}

# Reference module output
resource "aws_subnet" "web" {
  vpc_id = module.production_vpc.vpc_id
  # ...
}

Drill 7: State Manipulation

Difficulty: Hard

Q: You renamed a resource from aws_instance.web to aws_instance.app. Plan shows destroy + create. How do you avoid downtime?

Answer

# Option 1: moved block (Terraform 1.1+, preferred)
moved {
  from = aws_instance.web
  to   = aws_instance.app
}

# Option 2: state mv
terraform state mv aws_instance.web aws_instance.app

# Verify
terraform plan  # Should show no create or destroy actions

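The `moved` block is better because it's declarative and version-controlled, and it takes effect in every state that uses the config (teammates, CI, other workspaces), whereas `state mv` changes only the state you run it against.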
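The same mechanism also covers refactors beyond a simple rename, such as moving a resource into a module. A minimal sketch, assuming a hypothetical local module `./modules/app` that declares the instance as `aws_instance.this`:

```hcl
# Hypothetical refactor: aws_instance.app now lives inside module "app",
# where it is declared as aws_instance.this.
module "app" {
  source = "./modules/app" # assumed local module path
}

# Tells Terraform the existing object is the same one at a new address,
# so it updates state instead of planning a destroy and create.
moved {
  from = aws_instance.app
  to   = module.app.aws_instance.this
}
```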

Drill 8: Workspaces vs Directory Structure

Difficulty: Medium

Q: How do you manage multiple environments (dev, staging, prod) in Terraform?

Answer

**Option 1: Workspaces** (simple, same config)

terraform workspace new staging
terraform workspace new production
terraform workspace select staging
terraform apply -var-file=staging.tfvars
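Inside the configuration, the selected workspace can drive per-environment values. The following is a minimal sketch; the size mapping is hypothetical, and `data.aws_ami.ubuntu` is assumed to be defined elsewhere (as in Drill 11).

```hcl
locals {
  # Hypothetical per-workspace sizing; "default" is the initial workspace.
  instance_types = {
    default    = "t3.micro"
    staging    = "t3.small"
    production = "t3.large"
  }

  # Fall back to the smallest size for any workspace not listed above.
  instance_type = lookup(local.instance_types, terraform.workspace, "t3.micro")
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = local.instance_type
  tags          = { Name = "web-${terraform.workspace}" }
}
```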

The current workspace name is available in config as `terraform.workspace`.

**Option 2: Directory structure** (recommended for real teams)

infrastructure/
├── modules/           # Reusable modules
├── environments/
│   ├── dev/
│   │   ├── main.tf    # Calls modules with dev values
│   │   └── terraform.tfvars
│   ├── staging/
│   └── prod/

Directory structure is better because:

- Separate state files (blast radius)
- Different providers/versions per env
- Clearer PR diffs
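As a rough sketch of what `environments/dev/main.tf` could contain under this layout, each environment pins its own state key and passes its own values to the shared modules. The bucket name, state key, and CIDR below are assumptions for illustration; the module is the VPC module from Drill 6.

```hcl
# environments/dev/main.tf (hypothetical contents)
terraform {
  backend "s3" {
    bucket = "my-terraform-state"
    key    = "dev/terraform.tfstate" # one state file per environment
    region = "us-east-1"
  }
}

module "vpc" {
  source     = "../../modules/vpc" # shared module, relative to environments/dev
  cidr_block = "10.10.0.0/16"      # dev-specific value
  name       = "dev"
}
```

The staging and prod directories would repeat this shape with their own key and values, which is what keeps a mistake in dev from touching prod state.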

Drill 9: Destroy and Taint

Difficulty: Easy

Q: How do you destroy a single resource without tearing down everything? How do you force recreation?

Answer

# Destroy one resource
terraform destroy -target=aws_instance.web

# Force recreation (Terraform 0.15.2+)
terraform apply -replace=aws_instance.web

# Old way (deprecated since Terraform 0.15.2; avoid in new code)
terraform taint aws_instance.web  # deprecated
terraform apply

`-replace` is preferred over `taint`: it is a single command, the replacement shows up in the plan for review, and state is not modified until the apply runs.

Drill 10: Prevent Accidental Destruction

Difficulty: Medium

Q: How do you protect critical resources from accidental terraform destroy?

Answer

# 1. Lifecycle block
resource "aws_db_instance" "production" {
  # ...
  lifecycle {
    prevent_destroy = true
  }
}

# 2. Backend state locking (prevents concurrent operations)
terraform {
  backend "s3" {
    dynamodb_table = "terraform-locks"
  }
}

# 3. Policy-as-code (Sentinel or OPA)
# Block destroys of tagged resources in CI

# 4. Separate state files for critical infra
# Database state != application state

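`prevent_destroy = true` causes Terraform to error if any plan would destroy the resource. It only protects while the resource block stays in the config: deleting the block removes the setting too, and Terraform will then plan the destroy.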

Drill 11: Data Sources

Difficulty: Easy

Q: Look up an existing AMI and use it in an instance without hardcoding the ID.

Answer

data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]  # Canonical

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
}

Data sources read existing infrastructure without managing it. Use them to reference shared resources, look up AMIs, and query remote state.
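Querying remote state uses the built-in `terraform_remote_state` data source. A minimal sketch, assuming a separate network configuration stores its state at a hypothetical key and exposes a hypothetical output named `subnet_id`:

```hcl
# Read outputs from another configuration's state (read-only).
data "terraform_remote_state" "network" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "prod/network.tfstate" # hypothetical key for the network stack
    region = "us-east-1"
  }
}

# Hypothetical consumer of the shared subnet.
resource "aws_instance" "worker" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro"
  # Only root-module outputs of the other configuration are visible here.
  subnet_id     = data.terraform_remote_state.network.outputs.subnet_id
}
```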

Drill 12: Troubleshoot "State Lock"

Difficulty: Medium

Q: Terraform says "Error acquiring the state lock." What do you do?

Answer

# 1. Check who holds the lock
# The error message includes a Lock ID and who acquired it

# 2. Wait — someone else might be running terraform
# Check with your team

# 3. If the lock is stale (crashed terraform, killed CI job):
terraform force-unlock <LOCK_ID>

# 4. Verify
terraform plan

Never force-unlock while someone else is actually running Terraform: two concurrent writers can corrupt state. Always check first.

Prerequisites

Terraform / IaC (Topic Pack, L1)

Related Content

Case Study: SSH Timeout — MTU Mismatch, Fix Is Terraform Variable (Case Study, L2) — Terraform
Case Study: Terraform Apply Fails — State Lock Stuck, DynamoDB Throttle (Case Study, L2) — Terraform
Crossplane (Topic Pack, L2) — Terraform
Deep Dive: Terraform State Internals (deep_dive, L2) — Terraform
Mental Models (Core Concepts) (Topic Pack, L0) — Terraform
OpenTofu & Terraform Ecosystem (Topic Pack, L2) — Terraform
Pulumi (Topic Pack, L2) — Terraform
Runbook: Cloud Capacity Limit Hit (Runbook, L2) — Terraform
Runbook: Terraform Drift Detection Response (Runbook, L2) — Terraform
Runbook: Terraform State Lock Stuck (Runbook, L2) — Terraform
