Infrastructure Testing: Trivia & Interesting Facts

Surprising, historical, and little-known facts about infrastructure testing.

Terratest was created because Terraform had no native testing framework

Gruntwork open-sourced Terratest, a Go library, in 2018 because Terraform had no built-in way to test infrastructure code. Terratest deploys real infrastructure, validates it, and tears it down, so the tests are genuine integration tests against cloud providers. This "deploy and verify" approach is still widely regarded as the gold standard for IaC testing, though it is slow and incurs real cloud costs.
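The deploy, validate, tear down cycle can be sketched in a few lines. This is a minimal illustration of the pattern only, not Terratest's API (Terratest itself is a Go library); `FakeCloud`, `deployed`, and `check_bucket_is_private` are hypothetical names, and the in-memory "cloud" stands in for a real provider so the sketch runs anywhere.

```python
from contextlib import contextmanager


class FakeCloud:
    """In-memory stand-in for a cloud provider (hypothetical, for illustration)."""

    def __init__(self):
        self.resources = {}

    def create(self, name, config):
        self.resources[name] = dict(config)
        return name

    def destroy(self, name):
        self.resources.pop(name, None)


@contextmanager
def deployed(cloud, name, config):
    """Deploy a resource, hand it to the test, and always tear it down."""
    resource_id = cloud.create(name, config)
    try:
        yield resource_id
    finally:
        # Runs even if an assertion fails, so no resources are left behind.
        cloud.destroy(resource_id)


def check_bucket_is_private(cloud):
    with deployed(cloud, "logs-bucket", {"acl": "private"}) as bucket:
        # Validate the deployed state, not the code that describes it.
        assert cloud.resources[bucket]["acl"] == "private"


cloud = FakeCloud()
check_bucket_is_private(cloud)
print(cloud.resources)  # empty again once the test has torn everything down
```

The `try`/`finally` teardown is the important design choice: against a real provider, a failed assertion that skips cleanup leaves billable resources running.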

Netflix's Chaos Monkey was the beginning of infrastructure testing in production

When Netflix open-sourced Chaos Monkey in 2012, it popularized the then-radical idea that production infrastructure should be actively tested by injecting failures. Chaos Monkey was part of the wider Simian Army (Chaos Gorilla for AZ failures, Latency Monkey for network delays), and the approach spawned the entire chaos engineering discipline.
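The core idea of failure injection can be sketched as follows. This is a toy model under stated assumptions, not Netflix's implementation: `Cluster`, `inject_instance_failure`, and the instance ids are hypothetical, and "terminating" an instance only removes it from an in-memory set.

```python
import random


class Cluster:
    """Toy model of a service running on several instances (hypothetical)."""

    def __init__(self, instance_ids, min_healthy):
        self.healthy = set(instance_ids)
        self.min_healthy = min_healthy

    def terminate(self, instance_id):
        self.healthy.discard(instance_id)

    def is_serving(self):
        # The service survives as long as enough instances remain healthy.
        return len(self.healthy) >= self.min_healthy


def inject_instance_failure(cluster, rng):
    """Terminate one randomly chosen healthy instance and return its id."""
    victim = rng.choice(sorted(cluster.healthy))
    cluster.terminate(victim)
    return victim


rng = random.Random(42)  # seeded so the experiment is repeatable
cluster = Cluster(["i-a", "i-b", "i-c"], min_healthy=2)
victim = inject_instance_failure(cluster, rng)
print(f"terminated {victim}; still serving: {cluster.is_serving()}")
```

With three instances and a minimum of two, the service should survive one injected failure; a second injection would drop it below the threshold, which is exactly the kind of weakness such an experiment is meant to expose.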

InSpec has over 500 built-in resource types for compliance testing

Chef InSpec, launched in 2015, includes over 500 built-in resource types that can test everything from SSH configuration to AWS S3 bucket policies. Writing a compliance test can be as simple as this one-liner: describe sshd_config do its('PermitRootLogin') { should eq 'no' } end. This human-readable syntax helped bridge the gap between written security policies and automated tests.

Most infrastructure code has zero tests

A 2022 analysis of public Terraform repositories on GitHub found that fewer than 15% included any form of automated testing. Most infrastructure code is tested through "apply it and see what happens" — the equivalent of testing software by running it in production. The absence of testing culture in IaC is one of the biggest gaps in modern DevOps.

Molecule revolutionized Ansible role testing

Molecule, created by John Dewey in 2015, provides a framework for testing Ansible roles in isolated environments (Docker containers, Vagrant VMs, or cloud instances). Before Molecule, testing an Ansible role meant running it against a real or manually created test server. Molecule made test-driven Ansible development practical for the first time.

Policy-as-code testing catches misconfigurations before deployment

Tools like Open Policy Agent (OPA), Checkov, and tfsec check infrastructure code against security and compliance policies before deployment. Checkov alone has over 1,000 built-in policies covering AWS, Azure, GCP, and Kubernetes. Because these tools run as static checks at the pull request stage, they catch misconfigurations before anything reaches production.
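A policy check of this kind is, at its core, a function from planned resources to a list of violations. The sketch below illustrates that idea only; it does not use the OPA, Checkov, or tfsec APIs, and the resource dictionaries are a hypothetical, simplified shape rather than any tool's real input format.

```python
PUBLIC_ACLS = {"public-read", "public-read-write"}


def check_bucket_not_public(resource):
    """Policy: S3 buckets must not use a public ACL."""
    if resource["type"] != "aws_s3_bucket":
        return None
    if resource["values"].get("acl") in PUBLIC_ACLS:
        return f"{resource['name']}: bucket ACL is public"
    return None


def check_ssh_not_open(resource):
    """Policy: security groups must not allow SSH from anywhere."""
    if resource["type"] != "aws_security_group":
        return None
    for rule in resource["values"].get("ingress", []):
        covers_ssh = rule["from_port"] <= 22 <= rule["to_port"]
        if covers_ssh and "0.0.0.0/0" in rule.get("cidr_blocks", []):
            return f"{resource['name']}: SSH open to 0.0.0.0/0"
    return None


POLICIES = [check_bucket_not_public, check_ssh_not_open]


def evaluate(resources):
    """Run every policy against every resource; return violation messages."""
    violations = []
    for resource in resources:
        for policy in POLICIES:
            message = policy(resource)
            if message is not None:
                violations.append(message)
    return violations


# Hypothetical planned resources, as a CI job might see them before apply.
planned = [
    {"type": "aws_s3_bucket", "name": "logs", "values": {"acl": "private"}},
    {"type": "aws_s3_bucket", "name": "assets", "values": {"acl": "public-read"}},
    {"type": "aws_security_group", "name": "bastion", "values": {
        "ingress": [{"from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]}],
    }},
]

violations = evaluate(planned)
for line in violations:
    print("FAIL", line)
```

In a pipeline, a non-empty `violations` list would fail the pull request check, which is how the misconfiguration gets stopped before deployment.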

The "test pyramid" for infrastructure is inverted compared to software

In software, the test pyramid has many unit tests, fewer integration tests, and few end-to-end tests. In infrastructure, the pyramid is inverted: the most valuable tests are integration tests (does the actual infrastructure work?) and end-to-end tests (can traffic flow through the system?). Unit-level checks for IaC, such as syntax validation, catch relatively few real-world issues.

Goss validates server state in seconds, not minutes

Goss, created by Ahmed Elsabbahy in 2016, can validate server configuration (packages installed, services running, ports listening, files existing) in under a second. It compiles to a single binary and generates test specs from running systems with goss autoadd. This speed makes it practical to run server validation as part of CI/CD pipelines and health checks.
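The kind of state validation described here can be sketched with the standard library. This is an illustration of the idea, not Goss itself: the `validate` function and the dictionary layout of `spec` are hypothetical simplifications, not Goss's spec format, and only file and port checks are covered.

```python
import os
import socket
import tempfile


def file_exists(path):
    return os.path.isfile(path)


def port_listening(host, port, timeout=0.5):
    """True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def validate(spec):
    """Evaluate every check in the spec and return {check name: passed}."""
    results = {}
    for path, want in spec.get("file", {}).items():
        results[f"file:{path}"] = file_exists(path) == want["exists"]
    for (host, port), want in spec.get("port", {}).items():
        results[f"port:{host}:{port}"] = port_listening(host, port) == want["listening"]
    return results


# Demo against state we create ourselves: a temp file and a local listener.
with tempfile.NamedTemporaryFile() as present:
    listener = socket.socket()
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    spec = {
        "file": {
            present.name: {"exists": True},
            present.name + ".missing": {"exists": False},
        },
        "port": {("127.0.0.1", port): {"listening": True}},
    }
    results = validate(spec)
    listener.close()

for check, passed in results.items():
    print("PASS" if passed else "FAIL", check)
```

Checks like these only read local state, which is why this style of validation is fast enough to run on every pipeline execution or as a health check.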

LitmusChaos brought chaos engineering to Kubernetes as a CRD

LitmusChaos, created by MayaData and donated to the CNCF, defines chaos experiments as Kubernetes custom resources, backed by Custom Resource Definitions such as ChaosEngine and ChaosExperiment. This means chaos tests can be version-controlled, reviewed, and applied through GitOps workflows, treating infrastructure testing as declarative configuration rather than imperative scripting.

The cost of not testing infrastructure is measured in outage hours

A 2023 study by Firefly found that organizations without automated infrastructure testing experienced 3.5x more configuration-related outages than those with testing in place. The average configuration-related outage lasted 2.7 hours. The math clearly favors investing in infrastructure testing, yet adoption remains low due to the complexity of setting up test environments.
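To make the arithmetic concrete, the two figures quoted above can be combined with a baseline outage count. The baseline of 4 outages per year is purely a hypothetical assumption for illustration and does not come from the study; only the 3.5x ratio and the 2.7-hour average are taken from the text.

```python
OUTAGE_RATIO = 3.5      # from the text: outages without testing vs. with testing
AVG_OUTAGE_HOURS = 2.7  # from the text: average configuration-related outage


def annual_outage_hours(outages_per_year):
    """Total outage hours per year at the quoted average duration."""
    return outages_per_year * AVG_OUTAGE_HOURS


baseline_outages = 4  # ASSUMPTION: hypothetical count for an org with testing

with_tests = annual_outage_hours(baseline_outages)
without_tests = annual_outage_hours(baseline_outages * OUTAGE_RATIO)

print(f"with tests:    {with_tests:.1f} h/year")
print(f"without tests: {without_tests:.1f} h/year")
print(f"difference:    {without_tests - with_tests:.1f} h/year")
```

Under that assumed baseline, the gap is roughly 27 outage hours per year (about 10.8 versus 37.8), and it scales linearly with whatever the real baseline is.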