
How We Got Here: Infrastructure as Code

Arc: Infrastructure · Eras covered: 6 · Timeline: ~2010-2025 · Read time: ~12 min


The Original Problem

In 2010, provisioning cloud infrastructure meant clicking through the AWS Management Console. You'd click "Launch Instance," choose an AMI, pick a security group (or create one inline), attach an EBS volume, and hope you remembered all the settings when you needed to do it again. There was no record of what you did, no way to reproduce it, and no way to review it. "Infrastructure" lived in someone's head and in the console's current state. When that person left the company, the knowledge left with them.

Worse, the console was the audit trail. "Who created this S3 bucket with public access?" Nobody knew. CloudTrail existed but nobody read it proactively. The gap between "we should track infrastructure changes" and "we actually do" was enormous.


Era 1: ClickOps and Console Cowboying (~2006-2012)

The Solution

The AWS Console, Azure Portal, and GCP Console were the primary interfaces. For slightly more sophisticated teams, the AWS CLI and SDKs provided scriptable access. But most infrastructure was created and modified through web UIs.

What It Looked Like

# The "infrastructure as code" of 2010:
1. Log into AWS Console
2. Navigate to EC2 → Launch Instance
3. Click through 7 configuration screens
4. Forget to add the tag for cost allocation
5. Realize 3 days later you chose the wrong subnet
6. Create another instance in the right subnet
7. Forget to terminate the old one
8. Get a $400 bill surprise at month-end

Why It Was Better

  • Accessible to anyone who could use a web browser
  • Visual feedback — you could see what you were creating
  • No tooling to install or learn

Why It Wasn't Enough

  • Not reproducible — doing it again required remembering every click
  • Not reviewable — no pull request for infrastructure changes
  • Not auditable — who changed what, when, and why?
  • Not testable — can't unit test a click sequence
  • Environments diverged immediately (dev never matched prod)

Legacy You'll Still See

ClickOps is alive and well. Many teams still create "quick" resources through the console, intending to codify them later (they don't). The AWS Console is often the first place people go to debug — and sometimes the first place they go to "fix" things, bypassing their own IaC pipeline.


Era 2: Imperative Scripts and SDKs (~2010-2014)

The Solution

Teams wrote scripts using cloud SDKs (boto for Python/AWS, fog for Ruby, azure-sdk for .NET) that called cloud APIs directly. This was at least reproducible and version-controllable.

What It Looked Like

# boto script to create a VPC, subnet, and instance (~2012)
import boto.ec2
import boto.vpc

conn_vpc = boto.vpc.connect_to_region('us-east-1')
vpc = conn_vpc.create_vpc('10.0.0.0/16')
subnet = conn_vpc.create_subnet(vpc.id, '10.0.1.0/24')

conn_ec2 = boto.ec2.connect_to_region('us-east-1')
reservation = conn_ec2.run_instances(
    'ami-12345678',
    instance_type='t1.micro',  # period-accurate: the t2 family didn't launch until 2014
    subnet_id=subnet.id,
    key_name='deploy-key',
)

Why It Was Better

  • Scriptable and repeatable
  • Could be version controlled in Git
  • Could be parameterized (different values for dev/staging/prod)
  • Teams could build libraries of common patterns

Why It Wasn't Enough

  • Imperative: scripts described steps, not desired state
  • No built-in idempotency — running twice created duplicate resources
  • No dependency graph — order mattered and was error-prone
  • No state tracking — scripts didn't know what already existed
  • Deleting resources required separate teardown scripts
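The idempotency gap was the defining flaw: a raw SDK script calls create unconditionally, so running it twice makes two VPCs. Teams bolted the guard on by hand — describe first, create only if absent. A minimal sketch of that pattern with an in-memory stand-in for the cloud API (names are illustrative; no real AWS calls):

```python
# In-memory stand-in for the cloud: Name tag -> resource id.
# A real script would call DescribeVpcs / CreateVpc here instead.
cloud = {}

def ensure_vpc(name, cidr):
    """Hand-rolled idempotency: look the resource up, create only if missing."""
    if name in cloud:                     # the "describe" call
        return cloud[name]                # reuse instead of duplicating
    vpc_id = "vpc-%08d" % len(cloud)      # the "create" call
    cloud[name] = vpc_id
    return vpc_id

first = ensure_vpc("production", "10.0.0.0/16")
second = ensure_vpc("production", "10.0.0.0/16")  # rerun: no duplicate
assert first == second and len(cloud) == 1
```

Declarative tools (Eras 3 and later) internalize exactly this describe-compare-create loop, which is a large part of why they displaced raw scripts.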

Legacy You'll Still See

SDK-based provisioning scripts persist among data engineering and ML teams that build custom automation. Lambda functions that create resources on demand use this pattern. "Quick scripts" for one-off tasks often follow this model.


Era 3: CloudFormation and ARM Templates (~2011-2016)

The Solution

AWS CloudFormation (2011) introduced declarative infrastructure. You wrote a JSON (later YAML) template describing your desired resources, and CloudFormation created, updated, and deleted them to match. Azure Resource Manager (ARM) templates (2014) followed the same model.

What It Looked Like

# CloudFormation template
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  MyVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
      Tags:
        - Key: Name
          Value: production-vpc

  WebSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref MyVPC
      CidrBlock: 10.0.1.0/24
      AvailabilityZone: us-east-1a

  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t3.micro
      SubnetId: !Ref WebSubnet
      ImageId: ami-0abcdef1234567890

Why It Was Better

  • Declarative: describe what you want, not how to create it
  • Automatic dependency resolution
  • Stack-based lifecycle: create, update, delete as a unit
  • Drift detection (eventually)
  • Free — no additional cost beyond the resources themselves
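"Automatic dependency resolution" means the engine builds a graph from the template's !Ref edges and creates resources in dependency order — the ordering Era 2 scripts had to get right by hand. An illustrative sketch of that ordering using the three resources above (not CloudFormation's actual implementation):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

# resource -> resources it !Ref's (the edges in the template above)
refs = {
    "MyVPC": set(),
    "WebSubnet": {"MyVPC"},
    "WebServer": {"WebSubnet"},
}

# static_order() yields every resource only after all of its dependencies
order = list(TopologicalSorter(refs).static_order())
assert order.index("MyVPC") < order.index("WebSubnet") < order.index("WebServer")
```

Deletion runs the same graph in reverse, which is why a stack tears down cleanly as a unit.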

Why It Wasn't Enough

  • Vendor-locked: CloudFormation is AWS-only, ARM is Azure-only
  • Verbose: simple infrastructure required hundreds of lines of YAML/JSON
  • Error messages were cryptic ("UPDATE_ROLLBACK_FAILED" was feared)
  • Rollback behavior was unpredictable and sometimes destructive
  • No real programming constructs (loops, conditionals were hacks)

Legacy You'll Still See

CloudFormation is still heavily used, especially in organizations where AWS is the only cloud. Many CDK and SAM projects compile down to CloudFormation. ARM templates persist in Azure-first enterprises. If you see a 2000-line YAML file with AWSTemplateFormatVersion at the top, you're in this era.


Era 4: Terraform (~2014-2022)

The Solution

HashiCorp released Terraform in 2014. It used a custom language (HCL) that was more readable than JSON/YAML, supported multiple cloud providers through a plugin system, and maintained a state file that tracked what it had created. For the first time, one tool could manage AWS, Azure, GCP, and dozens of other providers.

What It Looked Like

# main.tf
provider "aws" {
  region = "us-east-1"
}

resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
  tags = {
    Name = "production-vpc"
  }
}

resource "aws_subnet" "web" {
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.1.0/24"
  availability_zone = "us-east-1a"
}

resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "t3.micro"
  subnet_id     = aws_subnet.web.id
}

terraform init    # download providers, set up the backend
terraform plan    # show what will change
terraform apply   # make it happen

Why It Was Better

  • Multi-cloud with a single tool and language
  • Plan before apply — see changes before they happen
  • State file tracks resource-to-code mapping
  • Module system for reusable components (Terraform Registry)
  • Massive community and provider ecosystem

Why It Wasn't Enough

  • State file management became its own discipline (locking, backends, drift)
  • HCL is not a real programming language — complex logic is awkward
  • The HashiCorp BSL license change (2023) shook community trust
  • Large state files became slow and fragile
  • Terraform Cloud/Enterprise pushed a commercial model that not everyone wanted
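The first bullet is worth making concrete: "state management as a discipline" in practice starts with a remote backend with locking, configured before any resources. A minimal sketch (the bucket and table names are illustrative and must be created beforehand):

```hcl
terraform {
  backend "s3" {
    bucket         = "example-tf-state"      # illustrative: pre-created S3 bucket
    key            = "prod/network.tfstate"  # one state file per environment/stack
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "example-tf-locks"      # illustrative: lock table against concurrent applies
  }
}
```

Skip the lock table and two concurrent terraform apply runs can corrupt state — exactly the operational sharp edge the bullet refers to.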

Legacy You'll Still See

Terraform is the most widely used IaC tool today. Most job postings that mention IaC mean Terraform. The OpenTofu fork (after the license change) is gaining traction but the ecosystem is still Terraform-centric. If you do IaC professionally, you will write HCL.


Era 5: Pulumi and CDK (~2018-2024)

The Solution

AWS CDK (2018) and Pulumi (2018) asked: why invent a new language when developers already know TypeScript, Python, Go, and Java? Both let you define infrastructure using real programming languages with real IDEs, real debuggers, real testing frameworks, and real abstractions like classes and functions.

What It Looked Like

// Pulumi — TypeScript
import * as aws from "@pulumi/aws";

const vpc = new aws.ec2.Vpc("main", {
  cidrBlock: "10.0.0.0/16",
  tags: { Name: "production-vpc" },
});

const subnet = new aws.ec2.Subnet("web", {
  vpcId: vpc.id,
  cidrBlock: "10.0.1.0/24",
  availabilityZone: "us-east-1a",
});

const server = new aws.ec2.Instance("web", {
  ami: "ami-0abcdef1234567890",
  instanceType: "t3.micro",
  subnetId: subnet.id,
});

# AWS CDK — Python
from aws_cdk import Stack, aws_ec2 as ec2
from constructs import Construct

class WebStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs):
        super().__init__(scope, id, **kwargs)
        vpc = ec2.Vpc(self, "MainVpc", max_azs=2)
        ec2.Instance(self, "WebServer",
            vpc=vpc,
            instance_type=ec2.InstanceType("t3.micro"),
            machine_image=ec2.AmazonLinuxImage(),
        )

Why It Was Better

  • Real programming languages: loops, conditionals, type checking, IDE support
  • Testable with standard testing frameworks (Jest, pytest)
  • Reusable abstractions using classes, functions, packages
  • CDK compiles to CloudFormation — familiar deployment model
  • Pulumi manages its own state — no separate state file gymnastics
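The "real language" bullet is concrete, not cosmetic: carving a VPC into per-AZ subnets is one comprehension over the standard library, where HCL needs cidrsubnet() arithmetic inside for_each. A standalone sketch (plain Python, no cloud calls; the function name is illustrative):

```python
import ipaddress

def carve_subnets(vpc_cidr, azs, new_prefix=24):
    """Split a VPC CIDR into one subnet per availability zone."""
    nets = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    return {az: str(net) for az, net in zip(azs, nets)}

subnets = carve_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
assert subnets["us-east-1a"] == "10.0.0.0/24"
assert subnets["us-east-1b"] == "10.0.1.0/24"
```

Because this is ordinary code, it unit-tests with pytest like any other function — the testability bullet above, in practice.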

Why It Wasn't Enough

  • CDK is AWS-only (CDKTF for Terraform exists but is a second-class citizen)
  • Generated CloudFormation templates are enormous and hard to debug
  • "Turing-complete IaC" can lead to over-engineering
  • Pulumi's managed state service is a commercial dependency
  • The community is smaller than Terraform's — fewer examples, fewer modules

Legacy You'll Still See

CDK is growing fast in AWS-heavy shops. Pulumi is popular with developer-centric teams. Both coexist with Terraform — often in the same organization. The "should IaC be a DSL or a real language?" debate is ongoing.


Era 6: Crossplane and Control Plane IaC (~2022-2025)

The Solution

Crossplane (2019, mainstream ~2022) brings IaC into Kubernetes. Instead of running terraform apply from a CI pipeline, you declare cloud resources as Kubernetes custom resources. The Crossplane controller reconciles them continuously, just like Kubernetes reconciles pods. Infrastructure becomes another Kubernetes workload.
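Continuous reconciliation is the core idea: a controller repeatedly compares desired state (the custom resource) against observed state (the cloud) and issues whatever calls close the gap. A minimal in-memory sketch of one level-triggered pass (illustrative, not Crossplane's implementation):

```python
def reconcile(desired, observed):
    """One pass: mutate observed to match desired, return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in observed:
            actions.append(("create", name))
            observed[name] = dict(spec)
        elif observed[name] != spec:
            actions.append(("update", name))   # drift detected and corrected
            observed[name] = dict(spec)
    for name in [n for n in observed if n not in desired]:
        actions.append(("delete", name))       # resource no longer declared
        del observed[name]
    return actions

desired = {"production-db": {"engine": "postgres", "class": "db.t3.micro"}}
observed = {"production-db": {"engine": "postgres", "class": "db.t3.large"}}  # drifted
assert reconcile(desired, observed) == [("update", "production-db")]
assert reconcile(desired, observed) == []  # converged: next pass is a no-op
```

The controller runs this loop forever, which is why manual console edits get reverted rather than silently persisting — the opposite of Terraform's apply-then-walk-away model.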

What It Looked Like

# Crossplane Composition — a managed Postgres
apiVersion: database.aws.crossplane.io/v1beta1
kind: RDSInstance
metadata:
  name: production-db
spec:
  forProvider:
    region: us-east-1
    dbInstanceClass: db.t3.micro
    engine: postgres
    engineVersion: "15"
    masterUsername: admin
  writeConnectionSecretToRef:
    name: db-credentials
    namespace: production

Why It Was Better

  • Continuous reconciliation — drift is automatically corrected
  • Uses Kubernetes RBAC, namespaces, and policies for access control
  • Compositions let platform teams build self-service abstractions
  • Single control plane for both workloads and infrastructure
  • GitOps-native — ArgoCD can manage infrastructure and applications

Why It Wasn't Enough

  • Requires Kubernetes — which is a significant prerequisite
  • Provider coverage lags behind Terraform
  • Debugging is harder (Kubernetes events + cloud API errors)
  • Community is smaller and documentation is thinner
  • The "everything through Kubernetes" philosophy is not universally embraced

Legacy You'll Still See

Crossplane is gaining adoption in platform engineering teams but is far from mainstream. Most organizations still use Terraform. The pattern of "infrastructure as Kubernetes resources" is the direction, but the tooling is still maturing.


Where We Are Now

Terraform dominates, with CDK and Pulumi growing in developer-centric teams. Crossplane is emerging for platform engineering use cases. CloudFormation persists in AWS-only shops. Most organizations use one primary tool with exceptions for edge cases. The state management problem (Terraform state, CloudFormation stacks, Pulumi state) remains one of the biggest operational challenges.

Where It's Going

The convergence of IaC and GitOps is the clearest trend — infrastructure declared in Git, reconciled continuously by controllers. AI-assisted IaC generation (describe what you want, get working code) is arriving but not yet reliable for production use. The most impactful near-term change may be the OpenTofu/Terraform split forcing organizations to choose sides.

The Pattern

Every generation of IaC tries to be the single abstraction layer between intent and infrastructure. The winning tool is always the one with the largest ecosystem of providers and modules, because infrastructure diversity is the fundamental challenge.

Key Takeaway for Practitioners

Learn Terraform deeply — it's the lingua franca. But understand that IaC is a practice, not a tool. The discipline of "all infrastructure changes go through code review" matters more than which tool generates the API calls.

Cross-References