Terraform Modules: Building Infrastructure LEGOs

Topics: Terraform modules, code reuse, versioning, composition, testing, governance, VPC design
Level: L2 (Operations)
Time: 60–75 minutes
Prerequisites: None required; basic Terraform familiarity helpful
The Mission¶
Your team's Terraform codebase has a problem. Three different engineers wrote three different VPC configurations for dev, staging, and production. They started as copies. Over six months, they drifted:
# environments/dev/vpc.tf — CIDR 10.0.0.0/16, 2 AZs, no NAT gateway
# environments/staging/vpc.tf — CIDR 10.1.0.0/16, 2 AZs, single NAT gateway
# environments/prod/vpc.tf — CIDR 10.2.0.0/16, 3 AZs, NAT per AZ, flow logs enabled
Dev is missing flow logs that compliance requires. Staging has a subnet CIDR overlap with prod because someone fat-fingered it. Prod has a security group rule that was hotfixed during an incident and never backported. Nobody knows which version is "right."
Your job: refactor these three snowflakes into a single VPC module that all environments share. Along the way, you'll learn why modules exist, how to build them, how to version them, and how to avoid the mistakes that make module upgrades terrifying.
Why Modules (The 3-Minute Case)¶
You could keep three separate VPC configs. Here's what happens:
| Without modules | With modules |
|---|---|
| Bug fix? Patch 3 files (and remember all 3) | Bug fix? Patch 1 module, all envs get it |
| New requirement? Add to 3 files (differently) | New requirement? Add once, parameterize |
| Security audit? Review 3 implementations | Security audit? Review 1 module |
| New engineer? "Which VPC file is canonical?" | New engineer? "Read the module" |
| Drift between envs? Guaranteed | Drift between envs? Only where inputs intentionally differ |
Mental Model: A Terraform module is a function. It takes inputs (variables), does work (creates resources), and returns outputs. Just like you wouldn't copy-paste a function body into three places in your code, you shouldn't copy-paste infrastructure definitions.
The three reasons modules exist, in order of importance:
- Consistency — every VPC looks the same because they come from the same code
- DRY — fix a bug once, not N times
- Governance — your platform team publishes the approved VPC module; app teams consume it
That third one is underappreciated. Modules aren't just about saving typing. They're how organizations enforce standards without writing policy documents nobody reads.
Module Anatomy: What's in the Box¶
A module is a directory of .tf files. That's it. No special syntax, no magic. Every
Terraform configuration you've ever written is already a module — the "root module."
Here's the standard layout:
modules/vpc/
├── main.tf # Resources — the actual infrastructure
├── variables.tf # Inputs — what the caller passes in
├── outputs.tf # Outputs — what the caller gets back
├── versions.tf # Provider and Terraform version constraints
├── locals.tf # Internal computed values
└── README.md # How to use this module
Each file has a job:
| File | Purpose | Analogy |
|---|---|---|
| `variables.tf` | Function parameters | `def create_vpc(cidr, azs, environment):` |
| `main.tf` | Function body | The actual resource creation logic |
| `outputs.tf` | Return values | `return {"vpc_id": vpc.id, "subnet_ids": [...]}` |
| `versions.tf` | Compatibility contract | "Works with Terraform >= 1.5 and AWS provider ~> 5.0" |
| `locals.tf` | Internal scratch space | Local variables you don't expose |
Trivia: The Terraform Registry enforces this layout. To publish a module, you need the standard structure, a GitHub repo with semantic version tags, and the naming convention `terraform-<PROVIDER>-<NAME>` (e.g., `terraform-aws-vpc`). The most downloaded module on the registry — `terraform-aws-modules/vpc/aws` — has been downloaded over 50 million times.
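The one file in that layout not shown later in this lesson is `versions.tf`. A minimal sketch, matching the compatibility contract in the table above:

```hcl
# modules/vpc/versions.tf — a minimal sketch matching the table above
terraform {
  required_version = ">= 1.5"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```

Keeping this file in the module (rather than only in the root) means callers get a clear error at `terraform init` time when their Terraform or provider version is incompatible.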
Building the VPC Module (Hands On)¶
Let's build the module that replaces those three snowflake VPCs. We'll start simple and add complexity as the requirements demand it.
Step 1: Variables — The Module's API¶
# modules/vpc/variables.tf
variable "vpc_name" {
description = "Name prefix for all resources"
type = string
validation {
condition = length(var.vpc_name) > 0 && length(var.vpc_name) <= 32
error_message = "VPC name must be 1-32 characters."
}
}
variable "vpc_cidr" {
description = "CIDR block for the VPC (e.g., 10.0.0.0/16)"
type = string
validation {
condition = can(cidrnetmask(var.vpc_cidr))
error_message = "Must be a valid IPv4 CIDR block."
}
}
variable "availability_zones" {
description = "List of AZs to deploy into"
type = list(string)
validation {
condition = length(var.availability_zones) >= 2
error_message = "At least 2 AZs required for high availability."
}
}
variable "environment" {
description = "Environment name (dev, staging, prod)"
type = string
validation {
condition = contains(["dev", "staging", "prod"], var.environment)
error_message = "Environment must be dev, staging, or prod."
}
}
variable "enable_nat_gateway" {
type = bool
default = false
}
variable "single_nat_gateway" {
type = bool
default = true
}
variable "enable_flow_logs" {
type = bool
default = true # Secure default — teams must explicitly opt out
}
variable "common_tags" {
type = map(string)
default = {}
}
The validation blocks catch mistakes at terraform plan time. The CIDR validation uses
can(cidrnetmask(...)) — a Terraform built-in that returns false if the CIDR is malformed.
Gotcha: Validation blocks can only reference their own variable. For cross-variable validation ("if NAT is enabled, you need at least 2 AZs"), use `precondition` blocks on resources (Terraform 1.2+).
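A hedged sketch of such a cross-variable check (the resource it attaches to is illustrative, and its real arguments are omitted):

```hcl
# Illustrative only — precondition blocks live inside a resource's
# lifecycle block (Terraform 1.2+); real aws_nat_gateway arguments omitted
resource "aws_nat_gateway" "example" {
  lifecycle {
    precondition {
      condition     = !var.enable_nat_gateway || length(var.availability_zones) >= 2
      error_message = "enable_nat_gateway requires at least 2 availability zones."
    }
  }
}
```

Unlike a `validation` block, a `precondition` can reference any variable, local, or resource attribute, so it is the right tool whenever a rule spans more than one input.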
Step 2: Resources — The Module's Body¶
# modules/vpc/main.tf
locals {
nat_gateway_count = var.enable_nat_gateway ? (var.single_nat_gateway ? 1 : length(var.availability_zones)) : 0
tags = merge(var.common_tags, { Environment = var.environment, ManagedBy = "terraform" })
}
resource "aws_vpc" "this" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(local.tags, { Name = "${var.vpc_name}-vpc" })
}
resource "aws_subnet" "public" {
for_each = toset(var.availability_zones)
vpc_id = aws_vpc.this.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, index(var.availability_zones, each.value))
availability_zone = each.value
map_public_ip_on_launch = true
tags = merge(local.tags, { Name = "${var.vpc_name}-public-${each.value}", Tier = "public" })
}
resource "aws_subnet" "private" {
for_each = toset(var.availability_zones)
vpc_id = aws_vpc.this.id
cidr_block = cidrsubnet(var.vpc_cidr, 8, index(var.availability_zones, each.value) + 100)
availability_zone = each.value
tags = merge(local.tags, { Name = "${var.vpc_name}-private-${each.value}", Tier = "private" })
}
resource "aws_internet_gateway" "this" {
vpc_id = aws_vpc.this.id
tags = merge(local.tags, { Name = "${var.vpc_name}-igw" })
}
Two things to notice:
- `for_each` instead of `count`. Adding an AZ creates one subnet. Removing an AZ from the middle of a `count` list would shift indexes and destroy/recreate everything after it.
- `cidrsubnet` for automatic CIDR math. `cidrsubnet("10.0.0.0/16", 8, 0)` gives `10.0.0.0/24`; index 100 gives `10.0.100.0/24`. No more fat-fingered CIDRs.
Under the Hood: `cidrsubnet(prefix, newbits, netnum)` adds `newbits` to the prefix length (/16 + 8 = /24) and selects the `netnum`-th network of that size. Binary math on IP addresses, guaranteed correct.
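The `locals` block above computes `nat_gateway_count`, but the NAT resources themselves were elided. A sketch of what they might look like (one EIP per gateway, gateways placed in the public subnets; `count` is acceptable here because the addressing is purely numeric):

```hcl
# modules/vpc/main.tf (continued) — sketch of the NAT pieces that the
# module's nat_gateway_ids output refers to
resource "aws_eip" "nat" {
  count  = local.nat_gateway_count
  domain = "vpc"
  tags   = merge(local.tags, { Name = "${var.vpc_name}-nat-eip-${count.index}" })
}

resource "aws_nat_gateway" "this" {
  count         = local.nat_gateway_count
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = values(aws_subnet.public)[count.index].id
  tags          = merge(local.tags, { Name = "${var.vpc_name}-nat-${count.index}" })

  # The IGW must exist before a NAT gateway can route traffic out
  depends_on = [aws_internet_gateway.this]
}
```

With `enable_nat_gateway = false` the count is 0 and nothing is created; with `single_nat_gateway = false` you get one gateway per AZ, matching the prod configuration.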
Step 3: Outputs — The Module's Return Values¶
# modules/vpc/outputs.tf
output "vpc_id" {
description = "ID of the VPC"
value = aws_vpc.this.id
}
output "vpc_cidr" {
description = "CIDR block of the VPC"
value = aws_vpc.this.cidr_block
}
output "public_subnet_ids" {
description = "IDs of public subnets, keyed by AZ"
value = { for az, subnet in aws_subnet.public : az => subnet.id }
}
output "private_subnet_ids" {
description = "IDs of private subnets, keyed by AZ"
value = { for az, subnet in aws_subnet.private : az => subnet.id }
}
output "nat_gateway_ids" {
description = "IDs of NAT gateways (empty if NAT disabled)"
value = [for nat in aws_nat_gateway.this : nat.id]
}
Outputs are your module's API contract. Downstream callers depend on these names and types. Change an output name and you break every caller. This is why output stability matters — and why semantic versioning matters for modules.
Calling the Module: Three Environments, One Source¶
Now the payoff. Each environment is a thin wrapper:
# environments/dev/main.tf
module "vpc" {
  source             = "../../modules/vpc"
  vpc_name           = "dev"
  vpc_cidr           = "10.0.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b"]
  environment        = "dev"
  enable_nat_gateway = false
}

# environments/prod/main.tf
module "vpc" {
  source             = "../../modules/vpc"
  vpc_name           = "prod"
  vpc_cidr           = "10.2.0.0/16"
  availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
  environment        = "prod"
  enable_nat_gateway = true
  single_nat_gateway = false
}
Same module, different inputs. Dev skips the NAT gateway (~$32/month savings). Prod gets one
per AZ. Both get flow logs because the module defaults to true.
Remember: Module defaults are your governance lever. Set `enable_flow_logs = true` by default, and teams must explicitly opt out. The PR review catches the opt-out — compare that to a policy doc nobody reads.
Flashcard Check #1¶
Cover the right column. Test yourself.
| Question | Answer |
|---|---|
| What three files does every module need at minimum? | `main.tf` (resources), `variables.tf` (inputs), `outputs.tf` (return values) |
| Why `for_each` instead of `count` for subnets? | `count` uses numeric indexes — removing an item shifts all subsequent indexes, causing destroy/recreate. `for_each` uses stable keys. |
| What does `can(cidrnetmask(var.vpc_cidr))` do in a validation block? | Returns true if the string is a valid CIDR block, false otherwise. Catches malformed CIDRs at plan time. |
| How do you access a module's output? | `module.<NAME>.<OUTPUT>` — e.g., `module.vpc.vpc_id` |
| What's the Terraform Registry naming convention? | `terraform-<PROVIDER>-<NAME>` — e.g., `terraform-aws-vpc` |
Local vs. Remote Modules: Where Does the Code Live?¶
So far we used a local path (source = "../../modules/vpc"). That works for a single repo.
But when multiple repos need the same module, or when you need versioning, local paths break
down.
# Local path — no versioning, tied to repo structure
module "vpc" {
source = "../../modules/vpc"
}
# Terraform Registry — versioned, discoverable, documented
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.5.0"
}
# GitHub — versioned via Git tags
module "vpc" {
source = "git::https://github.com/mycompany/terraform-modules.git//vpc?ref=v2.1.0"
}
# S3 — for air-gapped or private environments
module "vpc" {
source = "s3::https://s3-us-east-1.amazonaws.com/mycompany-modules/vpc/v2.1.0.zip"
}
| Source | Versioning | Best for |
|---|---|---|
| Local path | None (whatever's on disk) | Rapid iteration within one repo |
| Terraform Registry | Semantic version constraints | Public modules, shared across orgs |
| Git (GitHub/GitLab) | Tag or SHA ref | Private modules, org-wide sharing |
| S3/GCS | Directory per version | Air-gapped environments, artifact-based workflows |
Gotcha: Every time you change the `source` of a module, you must run `terraform init` (or `terraform init -upgrade`). Terraform caches modules in `.terraform/modules/` and won't notice a source change without re-initialization.
The Versioning Problem (Or: Why "Latest" Is a Four-Letter Word)¶
Here's a story that happens at every organization using Terraform at scale.
War Story: A platform team published v2.0.0 of their VPC module. It renamed an output from `private_subnets` to `private_subnet_ids` for consistency. Reasonable change, clearly a major version bump. But 12 application teams had `source` pointed at the module without version pinning. Monday morning, 50 CI pipelines broke simultaneously. Engineers across 4 time zones filed tickets against the platform team. The fix took 10 minutes per team — just pin the version — but coordinating it took two days. The postmortem action item: "All module references MUST include a version constraint."
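A rename like the one in that story can also be staged rather than forced: publish the new output name while keeping the old one as a deprecated alias for a release. This is a common convention, not something the story's team did:

```hcl
output "private_subnet_ids" {
  description = "IDs of private subnets, keyed by AZ"
  value       = { for az, subnet in aws_subnet.private : az => subnet.id }
}

# Deprecated alias; remove in the next major version
output "private_subnets" {
  description = "DEPRECATED: use private_subnet_ids instead"
  value       = { for az, subnet in aws_subnet.private : az => subnet.id }
}
```

Callers then migrate on their own schedule, and the alias is dropped in the next major version, where breakage is expected.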
The rules:
# BAD — pulls latest on every terraform init
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
}
# BAD — pins to a Git branch that can change under you
module "vpc" {
source = "git::https://github.com/mycompany/terraform-modules.git//vpc?ref=main"
}
# GOOD — exact version pin
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "5.5.0"
}
# GOOD — allows patch updates but not minor/major
module "vpc" {
source = "terraform-aws-modules/vpc/aws"
version = "~> 5.5.0" # >= 5.5.0, < 5.6.0
}
# GOOD — immutable Git SHA
module "vpc" {
source = "git::https://github.com/mycompany/terraform-modules.git//vpc?ref=abc123def"
}
Name Origin: The `~>` operator is borrowed from Ruby's Bundler, where it's called the "twiddle-wakka" or "pessimistic version constraint." `~> 5.5.0` means ">= 5.5.0, < 5.6.0" (patches only). `~> 5.0` means ">= 5.0, < 6.0" (minor updates too).
Module Composition: Modules Calling Modules¶
Real infrastructure is modules wired together. Outputs from one become inputs to the next:
# environments/prod/main.tf — root module composes everything
module "network" {
source = "../../modules/vpc"
vpc_name = "prod"
vpc_cidr = "10.2.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
environment = "prod"
enable_nat_gateway = true
single_nat_gateway = false
}
module "eks" {
source = "../../modules/eks"
cluster_name = "prod-cluster"
vpc_id = module.network.vpc_id # network → EKS
subnet_ids = values(module.network.private_subnet_ids)
}
module "rds" {
source = "../../modules/rds"
identifier = "prod-db"
vpc_id = module.network.vpc_id # network → RDS
subnet_ids = values(module.network.private_subnet_ids)
}
Terraform builds the dependency graph from these references: network first, then EKS + RDS
in parallel. No depends_on needed.
Mental Model: Module composition is LEGO. The VPC module is the baseplate. EKS and RDS snap onto it. Each piece has defined connection points (outputs/inputs). Swap the RDS module for Aurora without rebuilding the baseplate — as long as it provides the same outputs.
The Two-Level Rule¶
Keep nesting to two levels max. Deeper nesting makes state paths unreadable:
module.network.aws_vpc.this # Good — readable
module.platform.module.network.aws_vpc.this # Pain starts here
module.prod.module.platform.module.network.aws_vpc.this # Nobody can debug this
for_each and count at the Module Level¶
Since Terraform 0.13, you can use for_each on module blocks — create per-region
infrastructure from a single definition:
module "vpc" {
source = "../../modules/vpc"
for_each = {
"us-east-1" = { cidr = "10.0.0.0/16", azs = ["us-east-1a", "us-east-1b", "us-east-1c"] }
"eu-west-1" = { cidr = "10.1.0.0/16", azs = ["eu-west-1a", "eu-west-1b"] }
}
vpc_name = "prod-${each.key}"
vpc_cidr = each.value.cidr
availability_zones = each.value.azs
environment = "prod"
enable_nat_gateway = true
}
# Access: module.vpc["us-east-1"].vpc_id
Adding ap-southeast-1 later only creates new resources — existing regions are untouched.
Gotcha: `for_each` keys must be known at plan time. If keys are derived from resource attributes, you get: `The "for_each" map includes keys derived from resource attributes that cannot be determined until apply.` Fix: use static keys, not dynamic ones.
Testing Modules: Trust, But Verify¶
The HCL-Native Testing Framework (Terraform 1.6+)¶
Terraform has a built-in test framework. Test files use .tftest.hcl extension:
# tests/vpc.tftest.hcl
variables {
vpc_name = "test"
vpc_cidr = "10.99.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b"]
environment = "dev"
enable_nat_gateway = false
enable_flow_logs = false
}
# Plan-only test — fast, free, no real resources
run "validates_inputs" {
command = plan
assert {
condition = aws_vpc.this.cidr_block == "10.99.0.0/16"
error_message = "VPC CIDR doesn't match input."
}
assert {
condition = length(aws_subnet.public) == 2
error_message = "Expected 2 public subnets."
}
}
# Test input validation catches bad CIDRs
run "rejects_invalid_cidr" {
command = plan
expect_failures = [var.vpc_cidr]
variables {
vpc_cidr = "not-a-cidr"
}
}
terraform test # Run all tests
terraform test -filter=tests/vpc.tftest.hcl # Specific file
terraform test -verbose # See each assertion
| Test type | `command = plan` | `command = apply` |
|---|---|---|
| Speed | Seconds | Minutes |
| Cost | Free | Creates real cloud resources |
| Catches | Config errors, validation, logic | API errors, permission issues, provider bugs |
| Use when | Fast feedback during development | CI pipeline before publishing a module version |
Under the Hood: `terraform test` creates an isolated state per test run. Apply tests create real infrastructure, run assertions, then destroy everything at the end. If a test crashes mid-run, resources are orphaned. Test environments need aggressive cost alerts.
Before the native framework, Terratest (a Go library) was the standard. It's still useful for cross-module integration tests and validating things the Terraform provider doesn't expose — like making HTTP calls to deployed services.
Flashcard Check #2¶
| Question | Answer |
|---|---|
| What does `version = "~> 5.5.0"` mean? | >= 5.5.0 and < 5.6.0 (patch updates only) |
| Why should you never point a module source at a Git branch like `main`? | Branches change — your next `terraform init` could pull breaking changes without warning |
| What command do you run after changing a module's source? | `terraform init` (or `terraform init -upgrade`) |
| What's the max recommended nesting depth for modules? | Two levels (root → child → grandchild) |
| `terraform test` with `command = plan` vs `command = apply` — which costs money? | `apply` creates real cloud resources; `plan` is free |
| What does `expect_failures` do in a test block? | Asserts that the specified variable or resource validation should fail — used to test that input validation catches bad inputs |
Anti-Patterns: Modules Gone Wrong¶
The God Module¶
# DON'T: One module that creates everything
module "platform" {
source = "../../modules/platform"
# 47 input variables covering VPC, EKS, RDS, ElastiCache,
# S3, CloudFront, Route53, ACM, WAF, and monitoring
vpc_cidr = "10.0.0.0/16"
cluster_name = "prod"
db_instance_class = "db.r6g.xlarge"
cache_node_type = "cache.r6g.large"
# ... 43 more variables ...
}
A god module has the blast radius of a monolith. Change anything, risk everything. It's also impossible to test — you can't test the VPC logic without also provisioning an EKS cluster and a database.
Fix: One module per concern. A VPC module, an EKS module, an RDS module. Compose them in the root module.
Hidden Provider Configuration¶
Modules should never contain `provider` blocks — they hardcode the region and account. The caller passes providers in via the `providers` argument.
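Concretely, the caller-supplied wiring might look like this (a sketch; the `eu` alias and `vpc_eu` name are illustrative):

```hcl
# Root module owns all provider configuration
provider "aws" {
  region = "us-east-1"
}

provider "aws" {
  alias  = "eu"
  region = "eu-west-1"
}

# The module receives its provider explicitly from the caller
module "vpc_eu" {
  source = "../../modules/vpc"

  providers = {
    aws = aws.eu
  }

  # ...other inputs...
}
```

The module itself stays region-agnostic, so the same code can be instantiated once per region or account without modification.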
Circular Dependencies¶
Module A outputs a security group. Module B uses it and outputs a subnet. Module A needs that subnet. Terraform can't resolve cycles. Fix: extract shared resources into a third module, or restructure so dependencies flow one direction.
Overly Generic Inputs¶
map(any) hides required structure. Use typed objects instead — they're self-documenting and
catch type errors at plan time:
# DON'T
variable "config" {
  type = map(any)
}

# DO
variable "config" {
  type = object({
    cidr = string
    azs  = list(string)
  })
}
Module Governance: Scaling Beyond One Team¶
At scale, you need guardrails: private registries (Terraform Cloud, Artifactory, or S3-backed) for hosting approved modules, and policy-as-code (Sentinel or OPA) to enforce rules on plans before they apply:
# OPA: require encryption on all S3 buckets
deny[msg] {
resource := input.resource_changes[_]
resource.type == "aws_s3_bucket"
resource.change.actions[_] == "create"
not resource.change.after.server_side_encryption_configuration
msg := sprintf("S3 bucket %s must have encryption enabled", [resource.name])
}
The pipeline: Engineer writes Terraform → CI runs terraform plan → Plan JSON evaluated
against policies → Violations block the apply.
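One common way to wire that pipeline is with conftest as the OPA runner (an assumption here; the source describes OPA generically, and the file names are illustrative):

```shell
terraform plan -out=plan.tfplan
terraform show -json plan.tfplan > plan.json
conftest test plan.json --policy policy/   # non-zero exit code blocks the apply
```

CI treats a non-zero exit as a failed check, so no human has to remember to run the policy step.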
Interview Bridge: "How would you enforce infrastructure standards across 50 teams?" Modules encode how to build things right. Sentinel/OPA enforce that teams must use them.
War Story: The Module Upgrade That Broke 50 Environments¶
War Story: An infrastructure team maintained a shared RDS module used by 50 service teams. Version 3.2.0 added a parameter group with `log_min_duration_statement = 1000` (log queries over 1 second). Sensible default. But the parameter group name was derived from the database identifier using a new naming scheme. When teams upgraded from 3.1.x to 3.2.0, Terraform detected the parameter group name change and planned a replacement — which on RDS means a database reboot. Fifty databases, fifty reboots. The teams that ran `terraform plan` caught it. The three teams that had `auto-approve` in CI did not. Three production databases rebooted during business hours. The fix: the module team released 3.2.1 within hours, using `lifecycle { create_before_destroy = true }` on the parameter group and preserving the old naming scheme with a deprecation notice. The postmortem action items: (1) Never change resource naming schemes in a minor version. (2) All module upgrades require `terraform plan` review in a PR before apply. (3) `auto-approve` in production CI is banned.
This story illustrates why module versioning isn't academic. A naming change in a module can cascade into infrastructure destruction across dozens of teams. Semantic versioning is a contract: patch versions fix bugs, minor versions add features without breaking existing behavior, major versions may break things.
Exercises¶
Exercise 1: Spot the Anti-Pattern (2 minutes)¶
What's wrong with this module call?
module "vpc" {
source = "git::https://github.com/company/tf-modules.git//vpc?ref=main"
config = {
cidr = "10.0.0.0/16"
azs = ["us-east-1a"]
}
}
Solution
Three problems:

1. **`ref=main`** — pointing at a branch, not a version tag. Any push to `main` changes what you get on `terraform init`.
2. **Single AZ** — no high availability. The module should validate that at least 2 AZs are provided.
3. **`config = {}`** — untyped map input. Should be individual typed variables for clarity and validation.

Fixed: pin `ref` to a version tag, pass at least two AZs, and replace the untyped `config` map with individual typed variables.

Exercise 2: Cross-Variable Validation (5 minutes)¶
Write a precondition that allows any instance_type in production but restricts
non-production to t3.micro, t3.small, and t3.medium.
Solution
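One possible solution (a sketch; `aws_instance.app`, `var.ami_id`, and the variable names are illustrative):

```hcl
variable "instance_type" {
  type = string
}

resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type

  lifecycle {
    precondition {
      # Production may use anything; everything else is restricted
      condition = (
        var.environment == "prod" ||
        contains(["t3.micro", "t3.small", "t3.medium"], var.instance_type)
      )
      error_message = "Non-production environments may only use t3.micro, t3.small, or t3.medium."
    }
  }
}
```

The key move is the `||` short-circuit: when `environment` is `prod` the condition is true regardless of instance type, so the restriction only bites elsewhere.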
Exercise 3: Refactor Copy-Paste into a Module (15 minutes)¶
You have two nearly identical security group definitions — one in dev/ and one in prod/.
They differ only in allowed CIDR ranges and the VPC ID. Create a module at
modules/web-sg/ that takes vpc_id, allowed_cidrs, and environment as inputs, and
outputs the security group ID.
Solution
# modules/web-sg/variables.tf
variable "vpc_id" { type = string }
variable "allowed_cidrs" { type = list(string) }
variable "environment" { type = string }
# modules/web-sg/main.tf
resource "aws_security_group" "web" {
name_prefix = "${var.environment}-web-"
vpc_id = var.vpc_id
ingress {
from_port = 443
to_port = 443
protocol = "tcp"
cidr_blocks = var.allowed_cidrs
}
egress {
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
}
tags = { Name = "${var.environment}-web-sg", ManagedBy = "terraform" }
}
# modules/web-sg/outputs.tf
output "security_group_id" { value = aws_security_group.web.id }
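Each environment's call site then shrinks to a few lines (a sketch; `module.network` assumes the VPC module built earlier in this lesson, and the CIDR is illustrative):

```hcl
# environments/prod/main.tf
module "web_sg" {
  source        = "../../modules/web-sg"
  vpc_id        = module.network.vpc_id
  allowed_cidrs = ["10.0.0.0/8"]
  environment   = "prod"
}
```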
Cheat Sheet¶
Pin this to your wall.
| Task | Command / Syntax |
|---|---|
| Initialize modules | terraform init |
| Update module version | Change version, run terraform init -upgrade |
| List modules in state | terraform state list \| grep module |
| Move resource into module | terraform state mv aws_vpc.main module.network.aws_vpc.this |
| Module output reference | module.<NAME>.<OUTPUT> |
| Version pin (exact) | version = "5.5.0" |
| Version pin (patch range) | version = "~> 5.5.0" (>= 5.5.0, < 5.6.0) |
| Version pin (minor range) | version = "~> 5.0" (>= 5.0, < 6.0) |
| Run module tests | terraform test |
| Run specific test | terraform test -filter=tests/vpc.tftest.hcl |
| Validate config | terraform validate |
| Format module code | terraform fmt -recursive |
Module design rules of thumb:
| Rule | Why |
|---|---|
| One module per concern | Blast radius, testability |
| No `provider` blocks inside modules | Caller controls region/account |
| Typed variables, not `map(any)` | Self-documenting, validates at plan time |
| Minimal outputs | Fewer outputs = smaller API surface = fewer breaking changes |
| Default to secure | `enable_encryption = true`, `enable_flow_logs = true` |
| Max 2 levels of nesting | Deeper nesting = unreadable state paths |
| Pin versions in production | Unpinned modules are ticking time bombs |
Takeaways¶
- Modules are functions for infrastructure. Inputs, logic, outputs. If you're copy-pasting `.tf` files between directories, you need a module.
- Version pinning is not optional. An unpinned module source is a production incident waiting for someone to push a breaking change upstream.
- `for_each` over `count`, always. Index-based addressing (`count`) causes cascade destruction when you remove items. Key-based addressing (`for_each`) is surgical.
- Module defaults are governance. Set secure defaults (`encryption = true`, `flow_logs = true`) and make teams explicitly opt out. The PR review becomes the policy enforcement.
- Test modules before publishing. The native `terraform test` framework catches bugs at plan time for free. Apply tests catch the rest.
- Keep modules small. A module that creates a VPC is good. A module that creates a VPC, EKS cluster, RDS database, and monitoring stack is a liability.
Related Lessons¶
- The Terraform State Disaster — what happens when state goes wrong and how to recover
- Terraform vs Ansible vs Helm — when to use which tool for infrastructure and configuration
- GitOps: The Repo Is the Truth — how module versioning fits into a GitOps workflow
- The Cloud Bill Surprise — cost implications of module design decisions (NAT gateways, instance types)