AWS Route 53 - Street-Level Ops

Real-world Route 53 operational workflows for DNS migrations, failover setup, debugging, and hybrid architectures.

Migrating DNS to Route 53

The most common Route 53 operation: moving an existing domain from another DNS provider.

# Step 1: Export your existing zone file from the current provider
# Most providers offer zone file export (BIND format).
# If not, manually list all records.

# Step 2: Create the hosted zone in Route 53
ZONE_ID=$(aws route53 create-hosted-zone \
  --name example.com \
  --caller-reference "migration-$(date +%s)" \
  --query 'HostedZone.Id' --output text | sed 's|/hostedzone/||')

echo "Zone ID: $ZONE_ID"

# Step 3: Get the NS records Route 53 assigned
aws route53 get-hosted-zone --id $ZONE_ID \
  --query 'DelegationSet.NameServers[]' --output text
# Example output:
# ns-1234.awsdns-56.org
# ns-789.awsdns-01.co.uk
# ns-456.awsdns-23.com
# ns-012.awsdns-78.net

# Step 4: BEFORE updating the registrar, lower TTLs at your current provider
# Set all records to TTL 60-300 seconds
# Wait at least the OLD TTL duration (e.g., if it was 86400, wait 24 hours)
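A quick spot check before proceeding: the second field of each `dig` answer is the TTL the old provider is currently serving, which should now be in the 60-300 range.

```shell
# Confirm the lowered TTLs are actually being served
dig +noall +answer example.com A
dig +noall +answer www.example.com CNAME
dig +noall +answer example.com MX
# e.g. "example.com. 300 IN A 203.0.113.10" (the 300 is the live TTL)
```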

# Step 5: Import records into Route 53
# Create a JSON change batch for all your records:
cat > /tmp/import-records.json << 'EOF'
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "A",
        "TTL": 300,
        "ResourceRecords": [{"Value": "203.0.113.10"}]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "www.example.com",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [{"Value": "example.com"}]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "MX",
        "TTL": 3600,
        "ResourceRecords": [
          {"Value": "10 mail.example.com"},
          {"Value": "20 mail2.example.com"}
        ]
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "TXT",
        "TTL": 3600,
        "ResourceRecords": [
          {"Value": "\"v=spf1 include:_spf.google.com ~all\""}
        ]
      }
    }
  ]
}
EOF

aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch file:///tmp/import-records.json

# Step 6: Verify records resolve correctly via Route 53 NS directly
dig @ns-1234.awsdns-56.org example.com A +short
dig @ns-1234.awsdns-56.org www.example.com CNAME +short
dig @ns-1234.awsdns-56.org example.com MX +short
dig @ns-1234.awsdns-56.org example.com TXT +short

# Step 7: Update NS records at the registrar to the Route 53 name servers
# For most registrars this is done in their web UI, not via the AWS CLI.
# Replace existing NS records with the four Route 53 NS records.
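One exception: if the domain is registered with Route 53 Domains itself, the hand-off can be scripted. A sketch (the `route53domains` API lives in us-east-1; substitute your zone's actual NS names from Step 3):

```shell
aws route53domains update-domain-nameservers \
  --region us-east-1 \
  --domain-name example.com \
  --nameservers Name=ns-1234.awsdns-56.org \
                Name=ns-789.awsdns-01.co.uk \
                Name=ns-456.awsdns-23.com \
                Name=ns-012.awsdns-78.net
```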

# Step 8: Monitor propagation
watch -n 30 'dig example.com NS +short'
# Wait until you see the Route 53 NS records consistently.
# Full propagation can take 24-48 hours.

# Step 9: After propagation is complete, raise TTLs to production values
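Step 9 is just another UPSERT with a higher TTL. A sketch for the apex A record (values mirror the Step 5 import; repeat for each record):

```shell
cat > /tmp/raise-ttl.json << 'EOF'
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "A",
        "TTL": 86400,
        "ResourceRecords": [{"Value": "203.0.113.10"}]
      }
    }
  ]
}
EOF
# Sanity-check the JSON before submitting
python3 -m json.tool /tmp/raise-ttl.json > /dev/null && echo "batch OK"
```

Submit it with the same call as Step 5: `aws route53 change-resource-record-sets --hosted-zone-id $ZONE_ID --change-batch file:///tmp/raise-ttl.json`.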

Setting Up Failover Between Regions

Active-passive failover with health checks.

# Step 1: Create health checks for both regions
PRIMARY_HC=$(aws route53 create-health-check \
  --caller-reference "primary-hc-$(date +%s)" \
  --health-check-config '{
    "Type": "HTTPS",
    "FullyQualifiedDomainName": "us-east-1-alb.example.com",
    "Port": 443,
    "ResourcePath": "/health",
    "RequestInterval": 10,
    "FailureThreshold": 3
  }' --query 'HealthCheck.Id' --output text)

SECONDARY_HC=$(aws route53 create-health-check \
  --caller-reference "secondary-hc-$(date +%s)" \
  --health-check-config '{
    "Type": "HTTPS",
    "FullyQualifiedDomainName": "us-west-2-alb.example.com",
    "Port": 443,
    "ResourcePath": "/health",
    "RequestInterval": 10,
    "FailureThreshold": 3
  }' --query 'HealthCheck.Id' --output text)

# Step 2: Tag the health checks (for identification)
aws route53 change-tags-for-resource \
  --resource-type healthcheck --resource-id $PRIMARY_HC \
  --add-tags Key=Name,Value=primary-us-east-1

aws route53 change-tags-for-resource \
  --resource-type healthcheck --resource-id $SECONDARY_HC \
  --add-tags Key=Name,Value=secondary-us-west-2

# Step 3: Create failover records
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "primary-us-east-1",
          "Failover": "PRIMARY",
          "AliasTarget": {
            "HostedZoneId": "Z35SXDOTRQ7X7K",
            "DNSName": "us-east-1-alb.example.com",
            "EvaluateTargetHealth": true
          },
          "HealthCheckId": "'$PRIMARY_HC'"
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "secondary-us-west-2",
          "Failover": "SECONDARY",
          "AliasTarget": {
            "HostedZoneId": "Z1H1FL5HABSF5",
            "DNSName": "us-west-2-alb.example.com",
            "EvaluateTargetHealth": true
          },
          "HealthCheckId": "'$SECONDARY_HC'"
        }
      }
    ]
  }'

# Step 4: Test the failover by checking health check status
aws route53 get-health-check-status --health-check-id $PRIMARY_HC \
  --query 'HealthCheckObservations[].{Region:Region,Status:StatusReport.Status}'
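One way to rehearse the actual failover without touching the application is to invert the primary health check, which makes Route 53 treat a healthy endpoint as unhealthy (a drill sketch; run it in a maintenance window):

```shell
# Force a failover drill: invert the primary health check status
aws route53 update-health-check \
  --health-check-id $PRIMARY_HC --inverted

# Wait out the failure threshold (10s interval x 3 failures, plus margin)
sleep 60
dig @ns-1234.awsdns-56.org app.example.com A +short
# Expect the us-west-2 ALB addresses while inverted

# Restore normal evaluation
aws route53 update-health-check \
  --health-check-id $PRIMARY_HC --no-inverted
```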

Weighted Routing for Canary Deploys

Shift a percentage of traffic to a new version.

# Start: 100% to v1, 0% to v2
# Then: 95/5, 90/10, 75/25, 50/50, 0/100

# Create weighted records
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "v1-stable",
          "Weight": 95,
          "AliasTarget": {
            "HostedZoneId": "Z35SXDOTRQ7X7K",
            "DNSName": "v1-alb.example.com",
            "EvaluateTargetHealth": true
          }
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "v2-canary",
          "Weight": 5,
          "AliasTarget": {
            "HostedZoneId": "Z35SXDOTRQ7X7K",
            "DNSName": "v2-alb.example.com",
            "EvaluateTargetHealth": true
          }
        }
      }
    ]
  }'

# Shift traffic gradually — update weights
# To go to 50/50:
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "v1-stable",
          "Weight": 50,
          "AliasTarget": {
            "HostedZoneId": "Z35SXDOTRQ7X7K",
            "DNSName": "v1-alb.example.com",
            "EvaluateTargetHealth": true
          }
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "app.example.com",
          "Type": "A",
          "SetIdentifier": "v2-canary",
          "Weight": 50,
          "AliasTarget": {
            "HostedZoneId": "Z35SXDOTRQ7X7K",
            "DNSName": "v2-alb.example.com",
            "EvaluateTargetHealth": true
          }
        }
      }
    ]
  }'

# Rollback: set canary weight to 0
# Note: weight 0 means Route 53 never returns that record (unless all others are 0 too)
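The gradual shift and the rollback are the same change batch with different numbers, so it helps to wrap it in a small helper. A sketch, assuming the record names, SetIdentifiers, and ALB targets used above:

```shell
# Emit a weighted change batch for the given v1/v2 weights
make_weights() {  # usage: make_weights <v1_weight> <v2_weight>
  cat <<EOF
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com", "Type": "A",
        "SetIdentifier": "v1-stable", "Weight": $1,
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "v1-alb.example.com",
          "EvaluateTargetHealth": true
        }
      }
    },
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "app.example.com", "Type": "A",
        "SetIdentifier": "v2-canary", "Weight": $2,
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "v2-alb.example.com",
          "EvaluateTargetHealth": true
        }
      }
    }
  ]
}
EOF
}

# Rollback example: all traffic back to v1
make_weights 100 0 > /tmp/weights.json
```

Apply with the same call as before: `aws route53 change-resource-record-sets --hosted-zone-id $ZONE_ID --change-batch file:///tmp/weights.json`.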

Debugging Route 53 Resolution

When DNS is not resolving as expected.

# 1. Check what Route 53 thinks the records are
aws route53 list-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --query "ResourceRecordSets[?Name=='app.example.com.']"

# 2. Test resolution directly against Route 53 name servers
# (bypasses caching resolvers)
NS=$(aws route53 get-hosted-zone --id $ZONE_ID \
  --query 'DelegationSet.NameServers[0]' --output text)
dig @$NS app.example.com A +short
dig @$NS app.example.com A +trace

# 3. Test resolution from the Route 53 resolver's perspective
aws route53 test-dns-answer \
  --hosted-zone-id $ZONE_ID \
  --record-name app.example.com \
  --record-type A
# This shows exactly what Route 53 would return, including
# routing policy evaluation and health check status

# 4. Check health check status (if routing policy depends on it)
aws route53 list-health-checks \
  --query 'HealthChecks[].{Id:Id,FQDN:HealthCheckConfig.FullyQualifiedDomainName,Type:HealthCheckConfig.Type}'

# For each health check (IDs are UUIDs; take them from the listing above):
aws route53 get-health-check-status --health-check-id 11111111-2222-3333-4444-555555555555 \
  --query 'HealthCheckObservations[].{Region:Region,IP:IPAddress,Status:StatusReport.Status}'

# 5. Check query logs if enabled (public zone query logs go to CloudWatch
# Logs in us-east-1; 'date -d' is GNU — on macOS use: date -v-1H +%s000)
aws logs filter-log-events \
  --log-group-name /aws/route53/example.com \
  --filter-pattern "app.example.com" \
  --start-time $(date -d '1 hour ago' +%s000) \
  --limit 20 \
  --query 'events[].message' --output text

# 6. Check from external resolvers
dig @8.8.8.8 app.example.com A +short     # Google
dig @1.1.1.1 app.example.com A +short     # Cloudflare
dig @208.67.222.222 app.example.com A +short  # OpenDNS
# If Route 53 NS answers correctly but these do not, it is a caching/TTL issue

Private Hosted Zone Across Accounts

Share a private hosted zone with VPCs in other AWS accounts.

# In the account that owns the hosted zone:
# Authorize the other account's VPC to associate
aws route53 create-vpc-association-authorization \
  --hosted-zone-id Z0987654321XYZ \
  --vpc VPCRegion=us-east-1,VPCId=vpc-other-account-123

# In the OTHER account:
# Associate the VPC with the private hosted zone
aws route53 associate-vpc-with-hosted-zone \
  --hosted-zone-id Z0987654321XYZ \
  --vpc VPCRegion=us-east-1,VPCId=vpc-other-account-123

# Back in the owning account: delete the authorization (no longer needed once the association exists)
aws route53 delete-vpc-association-authorization \
  --hosted-zone-id Z0987654321XYZ \
  --vpc VPCRegion=us-east-1,VPCId=vpc-other-account-123

# Verify the association
aws route53 get-hosted-zone --id Z0987654321XYZ \
  --query 'VPCs[].{Region:VPCRegion,VPC:VPCId}'

Route 53 + CloudFront Setup

The standard pattern for serving a static site or CDN-fronted app.

# Prerequisite: CloudFront distribution with an ACM certificate
# (certificate must be in us-east-1 for CloudFront)

# Create alias record pointing to CloudFront
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch '{
    "Changes": [
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "www.example.com",
          "Type": "A",
          "AliasTarget": {
            "HostedZoneId": "Z2FDTNDATAQYW2",
            "DNSName": "d1234567890.cloudfront.net",
            "EvaluateTargetHealth": false
          }
        }
      },
      {
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "www.example.com",
          "Type": "AAAA",
          "AliasTarget": {
            "HostedZoneId": "Z2FDTNDATAQYW2",
            "DNSName": "d1234567890.cloudfront.net",
            "EvaluateTargetHealth": false
          }
        }
      }
    ]
  }'

# Z2FDTNDATAQYW2 is always the hosted zone ID for CloudFront — it is a constant.

# Verify
dig www.example.com A +short
# Should return CloudFront edge IPs

Route 53 Resolver for Hybrid DNS

On-premises servers need to resolve AWS private hosted zones. AWS resources need to resolve on-prem domains.

# Scenario:
# On-prem DNS: corp.internal (10.100.1.53, 10.100.2.53)
# AWS private zone: aws.internal
# Goal: bidirectional resolution

# Step 1: Inbound endpoint (on-prem → AWS)
# On-prem DNS servers forward aws.internal queries here
INBOUND=$(aws route53resolver create-resolver-endpoint \
  --creator-request-id "inbound-$(date +%s)" \
  --name "on-prem-to-aws" \
  --security-group-ids sg-resolver-inbound \
  --direction INBOUND \
  --ip-addresses \
    SubnetId=subnet-priv1a,Ip=10.0.1.10 \
    SubnetId=subnet-priv1b,Ip=10.0.2.10 \
  --query 'ResolverEndpoint.Id' --output text)

echo "Configure on-prem DNS to forward aws.internal to 10.0.1.10 and 10.0.2.10"

# Step 2: Outbound endpoint (AWS → on-prem)
OUTBOUND=$(aws route53resolver create-resolver-endpoint \
  --creator-request-id "outbound-$(date +%s)" \
  --name "aws-to-on-prem" \
  --security-group-ids sg-resolver-outbound \
  --direction OUTBOUND \
  --ip-addresses \
    SubnetId=subnet-priv1a \
    SubnetId=subnet-priv1b \
  --query 'ResolverEndpoint.Id' --output text)

# Step 3: Create forwarding rule for on-prem domain
RULE=$(aws route53resolver create-resolver-rule \
  --creator-request-id "fwd-corp-$(date +%s)" \
  --name "forward-corp-internal" \
  --rule-type FORWARD \
  --domain-name "corp.internal" \
  --resolver-endpoint-id $OUTBOUND \
  --target-ips "Ip=10.100.1.53,Port=53" "Ip=10.100.2.53,Port=53" \
  --query 'ResolverRule.Id' --output text)

# Step 4: Associate the rule with VPCs
aws route53resolver associate-resolver-rule \
  --resolver-rule-id $RULE \
  --vpc-id vpc-abc123

# Step 5: Verify from an EC2 instance in the VPC
# SSH to the instance and run:
dig server1.corp.internal +short
# Should resolve to the on-prem IP

# Step 6: Verify from on-prem (after configuring conditional forwarder)
# On Windows DNS: Add conditional forwarder for aws.internal → 10.0.1.10
# On BIND: zone "aws.internal" { type forward; forwarders { 10.0.1.10; 10.0.2.10; }; };

Calculating Route 53 Costs

# Approximate rates (verify against the current AWS pricing page):
# Hosted zones: $0.50/month each
# Queries (standard): $0.40 per million (first 1B/month)
# Queries (latency-based): $0.60 per million
# Queries (geo): $0.70 per million
# Health checks (basic): $0.50/month (AWS endpoint), $0.75/month (non-AWS)
# Health checks (HTTPS): add $1.00/month
# Health checks (string match): add $2.00/month
# Health checks (fast interval 10s): doubles the cost
# Traffic flow policy record: $50/month

# Example: 3 hosted zones, 10M queries/month, 4 health checks (HTTPS, 10s)
ZONES=3
QUERIES_M=10
HC_COUNT=4

ZONE_COST=$(echo "$ZONES * 0.50" | bc)
QUERY_COST=$(echo "$QUERIES_M * 0.40" | bc)
HC_COST=$(echo "$HC_COUNT * (0.75 + 1.00) * 2" | bc)  # non-AWS, HTTPS, fast

echo "Zones:   \$$ZONE_COST/month"
echo "Queries: \$$QUERY_COST/month"
echo "Health:  \$$HC_COST/month"
echo "Total:   \$$(echo "$ZONE_COST + $QUERY_COST + $HC_COST" | bc)/month"
# Zones: $1.50, Queries: $4.00, Health: $14.00 = $19.50/month

Bulk Record Management with CLI and Terraform

For managing dozens or hundreds of records, use the CLI batch API or Terraform.

# CLI: batch upsert from a JSON file
# A change batch supports up to 1,000 ResourceRecord elements (and 32,000 characters) per call

# Generate a change batch from a CSV
cat records.csv
# name,type,ttl,value
# app.example.com,A,300,203.0.113.10
# api.example.com,A,300,203.0.113.20
# cdn.example.com,CNAME,3600,d123.cloudfront.net

# Convert CSV to Route 53 change batch
python3 -c "
import csv, json, sys
changes = []
for row in csv.DictReader(open('records.csv')):
    changes.append({
        'Action': 'UPSERT',
        'ResourceRecordSet': {
            'Name': row['name'],
            'Type': row['type'],
            'TTL': int(row['ttl']),
            'ResourceRecords': [{'Value': row['value']}]
        }
    })
json.dump({'Changes': changes}, sys.stdout, indent=2)
" > /tmp/batch-changes.json

# Submit the batch, capturing the change ID from the same call
# (submitting once and then re-running just to grab the ID would apply the batch twice)
CHANGE_ID=$(aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch file:///tmp/batch-changes.json \
  --query 'ChangeInfo.Id' --output text)

# Check the change status
aws route53 get-change --id $CHANGE_ID \
  --query 'ChangeInfo.Status' --output text
# PENDING → INSYNC (usually within 60 seconds)
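Instead of polling `get-change` by hand, the CLI's built-in waiter blocks until the batch reaches INSYNC:

```shell
# Blocks (polling internally) until the change is INSYNC
aws route53 wait resource-record-sets-changed --id $CHANGE_ID
echo "Change $CHANGE_ID is INSYNC"
```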

Terraform approach for infrastructure-as-code DNS management:

# Terraform: manage Route 53 records declaratively
resource "aws_route53_zone" "main" {
  name = "example.com"
}

resource "aws_route53_record" "app" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "app.example.com"
  type    = "A"

  alias {
    name                   = aws_lb.app.dns_name
    zone_id                = aws_lb.app.zone_id
    evaluate_target_health = true
  }
}

resource "aws_route53_health_check" "app" {
  fqdn              = "app.example.com"
  port               = 443
  type               = "HTTPS"
  resource_path      = "/health"
  failure_threshold  = 3
  request_interval   = 10

  tags = {
    Name = "app-health-check"
  }
}