
AWS EC2 - Primer

Why This Matters

EC2 is the workhorse of AWS. Even in a container-first, serverless-leaning world, EC2 instances underpin EKS worker nodes, RDS databases, ElastiCache clusters, and anything else that needs a virtual machine. Understanding instance types, storage options, networking behavior, and pricing models directly impacts your ability to run reliable, cost-effective infrastructure.

When an instance is unreachable, when CPU credits run out at 2 AM, when you lose data because you did not understand the difference between instance store and EBS — that is when EC2 knowledge pays for itself.

Core Concepts

1. Instance Types and Families

Remember: Instance family letter mnemonic: Most workloads (general), Compute, RAM (memory), Tiny-burst (burstable), I/O (storage), Parallel (GPU). The letter tells you what the instance is optimized for.

Every instance type follows the naming convention: <family><generation>[<attributes>].<size>

m7g.xlarge
│││ └── Size: xlarge (4 vCPU, 16 GiB)
││└──── Attribute (optional): g = Graviton processor
│└───── Generation: 7th
└────── Family: m (general purpose)
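The naming convention can be checked mechanically. The sketch below is a hypothetical helper (`parse_instance_type` is not an AWS tool) that splits a type name with a regular expression. It assumes lowercase names of the form shown above and will reject unusual names such as `u-6tb1.metal`.

```shell
# Hypothetical helper: split an EC2 instance type name into its parts.
# Assumes the <family><generation>[<attributes>].<size> shape described above.
parse_instance_type() {
  local parsed
  parsed=$(printf '%s\n' "$1" | sed -nE \
    's/^([a-z]+)([0-9]+)([a-z0-9-]*)\.([a-z0-9-]+)$/family=\1 generation=\2 attributes=\3 size=\4/p')
  if [ -z "$parsed" ]; then
    echo "unrecognized instance type: $1" >&2
    return 1
  fi
  echo "$parsed"
}

parse_instance_type m7g.xlarge    # family=m generation=7 attributes=g size=xlarge
parse_instance_type c5n.18xlarge  # family=c generation=5 attributes=n size=18xlarge
```

The attributes field is empty for types such as `t3.micro`, which have no processor or feature suffix.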

Instance families:

Family	Optimized For	Examples
m (general)	Balanced CPU/memory	Web servers, app servers, dev/test
c (compute)	CPU-intensive	Batch processing, ML inference, gaming
r (memory)	Memory-intensive	Databases, in-memory caches, analytics
i/d (storage)	High I/O, local NVMe	Databases needing IOPS, data warehousing
t (burstable)	Variable workloads	Dev/test, small databases, microservices
p/g (accelerated)	GPU workloads	ML training, video encoding, HPC

# List the m7-family instance types available in your region
aws ec2 describe-instance-types \
  --query 'InstanceTypes[].{Type:InstanceType,vCPU:VCpuInfo.DefaultVCpus,MemMiB:MemoryInfo.SizeInMiB}' \
  --filters "Name=instance-type,Values=m7*" \
  --output table

# Check pricing (use the pricing API or the website)
aws pricing get-products \
  --service-code AmazonEC2 \
  --filters "Type=TERM_MATCH,Field=instanceType,Value=m7g.xlarge" \
  --region us-east-1

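Graviton instances (a g after the generation number, as in m7g; not to be confused with the g family of GPU instances): ARM-based, typically quoted at 20-40% better price-performance than comparable x86 instances. AWS's own claim is "up to 40%", and the real gain depends on the workload. Use them unless your software requires x86.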

2. AMIs and the Boot Process

An Amazon Machine Image (AMI) is a snapshot of a root volume plus metadata (kernel, block device mapping, permissions). Every instance launches from an AMI.

# Find the latest Amazon Linux 2023 AMI
aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=al2023-ami-2023*-x86_64" \
  --query 'sort_by(Images, &CreationDate)[-1].{ID:ImageId,Name:Name,Date:CreationDate}' \
  --output table

# Find the latest Ubuntu 22.04 AMI
aws ec2 describe-images \
  --owners 099720109477 \
  --filters "Name=name,Values=ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*" \
  --query 'sort_by(Images, &CreationDate)[-1].{ID:ImageId,Name:Name}'

# Create your own AMI from a running instance
aws ec2 create-image \
  --instance-id i-abc123 \
  --name "app-server-$(date +%Y%m%d)" \
  --description "App server with deps pre-installed" \
  --no-reboot  # skip reboot (risk: inconsistent filesystem)

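Boot order:

1. EC2 places the instance on a host and attaches the root volume created from the AMI (state: pending)
2. The instance enters the "running" state and the operating system boots
3. Cloud-init runs (reads user data, configures SSH keys, sets hostname)
4. User data script executes (if provided)

"running" only means the virtual machine has started. It does not mean cloud-init or your user data script has finished.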

3. Instance Store vs EBS

This is one of the most critical distinctions in EC2.

EBS (Elastic Block Store): network-attached persistent storage. Survives instance stop/start. Can be snapshotted. Can be detached and reattached to another instance.

Instance Store: physically attached NVMe SSDs on the host. Extremely fast but ephemeral — data is lost when the instance is stopped, terminated, or the underlying host fails.

War story: A team ran a self-managed Elasticsearch cluster on i3 instances for the local NVMe performance. When AWS performed scheduled maintenance and stopped the instances, all data on the instance store volumes vanished. They had no replicas configured because "we had three nodes." All three were on the same maintenance schedule. The cluster was empty when it came back up.

Instance Store:                    EBS:
├── Blazing fast (local NVMe)      ├── Persistent (survives stop)
├── Free (included in price)       ├── Costs per GB/month + IOPS
├── DATA LOST on stop/terminate    ├── Snapshots for backup
├── Cannot be detached             ├── Can resize, change type
└── Fixed size per instance type   └── Up to 64 TiB per volume

EBS volume types:

Type	IOPS	Throughput	Use Case
gp3	3,000 baseline (up to 16,000)	125 MiB/s baseline (up to 1,000 MiB/s)	Default for most workloads
io2	Up to 64,000	Up to 1,000 MiB/s	Databases needing consistent IOPS
st1	Up to 500	40 MiB/s per TiB baseline (up to 500 MiB/s)	Sequential big data, log processing
sc1	Up to 250	12 MiB/s per TiB baseline (up to 250 MiB/s)	Cold storage, infrequent access
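The "per TiB" baselines for st1 and sc1 mean throughput grows with volume size. The following is a minimal sketch of that arithmetic; `hdd_baseline_mibps` is a hypothetical helper, and the per-TiB rates and caps are simply the figures from the table above.

```shell
# Hypothetical helper: baseline throughput of an st1/sc1 volume.
# Rates and caps are taken from the table above, not queried from AWS.
hdd_baseline_mibps() {
  local per_tib cap
  case "$1" in
    st1) per_tib=40; cap=500 ;;
    sc1) per_tib=12; cap=250 ;;
    *) echo "expected st1 or sc1, got: $1" >&2; return 1 ;;
  esac
  # $2 is the volume size in GiB; 1 TiB = 1024 GiB.
  awk -v gib="$2" -v rate="$per_tib" -v cap="$cap" 'BEGIN {
    v = gib / 1024 * rate
    if (v > cap) v = cap
    printf "%.1f\n", v
  }'
}

hdd_baseline_mibps st1 2048   # 2 TiB of st1 -> 80.0
```

A small st1 volume therefore has a low baseline: 500 GiB works out to roughly 19.5 MiB/s.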

# Create a gp3 volume with custom IOPS and throughput
aws ec2 create-volume \
  --volume-type gp3 \
  --size 100 \
  --iops 6000 \
  --throughput 400 \
  --availability-zone us-east-1a

# Attach to instance
aws ec2 attach-volume \
  --volume-id vol-abc123 \
  --instance-id i-abc123 \
  --device /dev/xvdf

# Modify volume (resize or change type; applied online with no downtime,
# but the filesystem must be grown separately after a resize)
aws ec2 modify-volume --volume-id vol-abc123 --size 200 --iops 8000
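Because gp3 IOPS and throughput are set independently, it can help to sanity-check a combination before calling the API. This is a rough sketch only: `check_gp3` is a hypothetical function, the ranges come from the volume type table above, and the two ratio limits (500 IOPS per GiB, 0.25 MiB/s per provisioned IOPS) are assumptions that should be verified against the current EBS documentation.

```shell
# Hypothetical pre-flight check for gp3 settings (not part of the AWS CLI).
# Arguments: size in GiB, provisioned IOPS, provisioned throughput in MiB/s.
check_gp3() {
  local size_gib="$1" iops="$2" tput="$3"
  if [ "$iops" -lt 3000 ] || [ "$iops" -gt 16000 ]; then
    echo "IOPS must be between 3000 and 16000"; return 1
  fi
  if [ "$tput" -lt 125 ] || [ "$tput" -gt 1000 ]; then
    echo "throughput must be between 125 and 1000 MiB/s"; return 1
  fi
  # Assumed ratio limit: above the 3000 baseline, at most 500 IOPS per GiB.
  if [ "$iops" -gt 3000 ] && [ "$iops" -gt $((size_gib * 500)) ]; then
    echo "IOPS exceeds 500 per GiB for a ${size_gib} GiB volume"; return 1
  fi
  # Assumed ratio limit: at most 0.25 MiB/s of throughput per provisioned IOPS.
  if [ $((tput * 4)) -gt "$iops" ]; then
    echo "throughput exceeds 0.25 MiB/s per provisioned IOPS"; return 1
  fi
  echo "ok"
}

check_gp3 100 6000 400   # the create-volume values used above; prints "ok"
```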

4. Key Pairs and SSH Access

EC2 uses SSH key pairs for Linux access. AWS stores the public key; you keep the private key.

# Create a key pair
aws ec2 create-key-pair --key-name prod-key \
  --query 'KeyMaterial' --output text > prod-key.pem
chmod 400 prod-key.pem

# SSH to instance
ssh -i prod-key.pem ec2-user@<public-ip>

Better alternatives to key pairs:

EC2 Instance Connect: push a temporary SSH key for 60 seconds

aws ec2-instance-connect send-ssh-public-key \
  --instance-id i-abc123 \
  --availability-zone us-east-1a \
  --instance-os-user ec2-user \
  --ssh-public-key file://~/.ssh/id_rsa.pub

SSM Session Manager: no SSH, no open ports, no key management
```
aws ssm start-session --target i-abc123
```

5. User Data Scripts

User data runs once at first boot (by default) or every boot (with cloud-init directives). Used for bootstrapping.

#!/bin/bash
# User data example: install and start nginx
yum update -y
yum install -y nginx
systemctl enable nginx
systemctl start nginx

# Write app config from instance metadata
TOKEN=$(curl -sX PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
INSTANCE_ID=$(curl -sH "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id)
echo "INSTANCE_ID=$INSTANCE_ID" >> /etc/app.conf
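If user data is configured to run on every boot, each step should be safe to repeat. One common pattern, sketched here under assumptions, is a marker-file guard. `run_once` and the marker directory `/var/lib/bootstrap-markers` are hypothetical names, not cloud-init features.

```shell
# Hypothetical guard for bootstrap steps that must run only once.
# MARKER_DIR is an assumed location; any persistent directory works.
MARKER_DIR="${MARKER_DIR:-/var/lib/bootstrap-markers}"

run_once() {
  local name="$1"
  shift
  mkdir -p "$MARKER_DIR" || return 1
  if [ -e "$MARKER_DIR/$name.done" ]; then
    echo "skip: $name (already done)"
    return 0
  fi
  # Run the step; record the marker only if it succeeded.
  "$@" && touch "$MARKER_DIR/$name.done"
}

# Example usage inside a user data script:
#   run_once install-nginx yum install -y nginx
#   run_once enable-nginx systemctl enable --now nginx
```

A failed step leaves no marker, so it is retried on the next boot.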

# Launch an instance with user data
aws ec2 run-instances \
  --image-id ami-abc123 \
  --instance-type m7g.large \
  --key-name prod-key \
  --security-group-ids sg-web \
  --subnet-id subnet-pub1a \
  --user-data file://bootstrap.sh \
  --iam-instance-profile Name=ec2-app-profile

# Retrieve user data from a running instance (base64 encoded)
aws ec2 describe-instance-attribute \
  --instance-id i-abc123 --attribute userData \
  --query 'UserData.Value' --output text | base64 -d

6. Instance Metadata Service (IMDSv2)

The metadata service at 169.254.169.254 provides instance information, security credentials, and user data. Always use IMDSv2 (token-required) — IMDSv1 (no token) is vulnerable to SSRF attacks.

Under the hood: The metadata service IP 169.254.169.254 is a link-local address. It is not a real server on the network — the hypervisor (Nitro) intercepts packets to this address and responds directly. This is why it works even without a default gateway configured. The Capital One breach of 2019 exploited IMDSv1 via SSRF to steal IAM credentials from this endpoint, which led AWS to create IMDSv2.

# IMDSv2: get a session token first
TOKEN=$(curl -sX PUT "http://169.254.169.254/latest/api/token" \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

# Then use the token for all requests
curl -sH "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-id

curl -sH "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/instance-type

curl -sH "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/placement/availability-zone

# IAM role credentials (temporary, auto-rotated)
curl -sH "X-aws-ec2-metadata-token: $TOKEN" \
  http://169.254.169.254/latest/meta-data/iam/security-credentials/ec2-app-role

# Enforce IMDSv2 (disable IMDSv1) — do this on all instances
aws ec2 modify-instance-metadata-options \
  --instance-id i-abc123 \
  --http-tokens required \
  --http-endpoint enabled

7. Placement Groups

Placement groups control how instances are physically placed on hardware.

Type	Behavior	Use Case
Cluster	Instances packed close together inside a single Availability Zone	HPC, low-latency networking
Spread	Each instance on different hardware	Critical instances that must survive hardware failure
Partition	Groups of instances on separate racks	Large distributed systems (HDFS, Cassandra)

aws ec2 create-placement-group \
  --group-name hpc-cluster \
  --strategy cluster

8. Pricing Models

Model	Commitment	Discount	Best For
On-Demand	None	0%	Short-term, unpredictable workloads
Reserved	1 or 3 years	Up to 72%	Steady-state, predictable workloads
Savings Plans	$/hour commitment for 1 or 3 years	Up to 72%	Flexible across instance families/regions
Spot	None (can be reclaimed)	Up to 90%	Fault-tolerant, flexible workloads
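To make the discount column concrete, the sketch below computes a monthly bill for one instance at each maximum discount. The $0.10/hour On-Demand rate is a made-up figure for illustration, 730 is an assumed average number of hours per month, and `monthly_cost` is a hypothetical helper.

```shell
# Hypothetical helper: monthly cost for one instance.
# Arguments: hourly On-Demand rate, discount percent, hours (default 730).
monthly_cost() {
  awk -v rate="$1" -v pct="$2" -v hours="${3:-730}" 'BEGIN {
    printf "%.2f\n", rate * (1 - pct / 100) * hours
  }'
}

monthly_cost 0.10 0    # On-Demand at an assumed $0.10/hour -> 73.00
monthly_cost 0.10 72   # best-case Reserved / Savings Plans  -> 20.44
monthly_cost 0.10 90   # best-case Spot                      -> 7.30
```

The "up to" discounts are upper bounds, so real savings are usually smaller than these figures.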

Spot instances are spare capacity sold at a discount. AWS can reclaim them with 2-minute notice.
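One way to use the 2-minute notice is to poll the instance metadata path `spot/instance-action`, which returns 404 until an interruption is scheduled. The sketch below assumes it runs on the Spot instance itself. `FETCH_CMD` is a hypothetical override hook for testing off-instance, and `drain_and_checkpoint` stands in for whatever cleanup your workload needs.

```shell
# Sketch: detect a pending Spot interruption via IMDSv2.
IMDS="http://169.254.169.254/latest"

fetch_spot_action() {
  local token
  token=$(curl -sf --max-time 2 -X PUT "$IMDS/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 1
  # -f makes curl exit non-zero on the 404 returned when nothing is scheduled.
  curl -sf --max-time 2 -H "X-aws-ec2-metadata-token: $token" \
    "$IMDS/meta-data/spot/instance-action"
}

spot_interruption_pending() {
  local body
  # FETCH_CMD (hypothetical) lets you substitute a stub for the real fetch.
  body=$("${FETCH_CMD:-fetch_spot_action}") || return 1
  case "$body" in
    *'"action"'*) echo "$body"; return 0 ;;
  esac
  return 1
}

# Example loop (drain_and_checkpoint is a placeholder for your own cleanup):
#   until spot_interruption_pending; do sleep 5; done
#   drain_and_checkpoint
```

Polling every few seconds leaves most of the two minutes for the cleanup itself.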

Fun fact: Spot instances were originally an auction (you bid a maximum price). AWS retired bidding in November 2017: Spot prices now move gradually with longer-term supply and demand, and you pay the current Spot price rather than your bid. The --spot-price parameter still exists but acts as a price ceiling, not a bid.

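# Request spot instances (legacy API; AWS now recommends run-instances with
# --instance-market-options, or an Auto Scaling group, for new workloads)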
aws ec2 request-spot-instances \
  --spot-price "0.05" \
  --instance-count 5 \
  --type "one-time" \
  --launch-specification '{
    "ImageId": "ami-abc123",
    "InstanceType": "m7g.large",
    "SecurityGroupIds": ["sg-abc123"],
    "SubnetId": "subnet-priv1a"
  }'

# Check spot pricing history (date -d is GNU date syntax)
aws ec2 describe-spot-price-history \
  --instance-types m7g.large \
  --start-time "$(date -u -d '1 day ago' +%Y-%m-%dT%H:%M:%S)" \
  --product-descriptions "Linux/UNIX" \
  --query 'SpotPriceHistory[].[AvailabilityZone,SpotPrice]' \
  --output table

9. Auto Scaling Groups (ASG)

ASGs maintain a fleet of instances, automatically scaling based on demand.

# Create launch template (replaces launch configurations)
aws ec2 create-launch-template \
  --launch-template-name app-template \
  --version-description "v1" \
  --launch-template-data '{
    "ImageId": "ami-abc123",
    "InstanceType": "m7g.large",
    "SecurityGroupIds": ["sg-app"],
    "IamInstanceProfile": {"Name": "ec2-app-profile"},
    "UserData": "'$(base64 -w 0 bootstrap.sh)'",
    "MetadataOptions": {"HttpTokens": "required"}
  }'

# Create auto scaling group
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name app-asg \
  --launch-template LaunchTemplateName=app-template,Version='$Latest' \
  --min-size 2 \
  --max-size 10 \
  --desired-capacity 3 \
  --vpc-zone-identifier "subnet-priv1a,subnet-priv1b" \
  --target-group-arns arn:aws:elasticloadbalancing:...:targetgroup/app-tg/... \
  --health-check-type ELB \
  --health-check-grace-period 300

Scaling policies:

- Target tracking: maintain a metric at a target value (e.g., CPU at 60%)
- Step scaling: add/remove instances based on alarm thresholds
- Scheduled: scale at specific times (e.g., scale up before business hours)
- Predictive: ML-based forecasting of traffic patterns

# Target tracking policy: keep CPU around 60%
# (EC2 Auto Scaling target tracking has no per-policy cooldown fields;
#  --estimated-instance-warmup tells it how long new instances need)
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name app-asg \
  --policy-name cpu-target \
  --policy-type TargetTrackingScaling \
  --estimated-instance-warmup 300 \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": {
      "PredefinedMetricType": "ASGAverageCPUUtilization"
    },
    "TargetValue": 60.0,
    "DisableScaleIn": false
  }'
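For intuition, target tracking can be approximated as proportional scaling: if the fleet average is above target, capacity grows by roughly the same ratio. The function below is only that approximation, clamped to the group's min and max. `estimate_desired_capacity` is a hypothetical name, and the real service applies additional logic that is not modeled here.

```shell
# Rough approximation of target tracking, for intuition only.
# Arguments: current capacity, current metric (%), target (%), min size, max size.
estimate_desired_capacity() {
  local current="$1" metric="$2" target="$3" min="$4" max="$5" desired
  # Integer ceiling of current * metric / target.
  desired=$(( (current * metric + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  echo "$desired"
}

estimate_desired_capacity 3 90 60 2 10   # 3 instances at 90% CPU, target 60% -> 5
```

With the group above (min 2, max 10), a fleet idling at 30% CPU would be estimated at 2 instances, and a spike to 300% would be capped at 10.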

10. Instance Lifecycle

pending → running → stopping → stopped → pending → running
                  → shutting-down → terminated

Key transitions:
- Stop:      instance store data LOST, EBS persists, public IP released
- Start:     new host, new public IP (unless Elastic IP), instance store empty
- Reboot:    same host, same IPs, instance store preserved
- Terminate: everything gone (unless EBS has DeleteOnTermination=false)
- Hibernate: RAM saved to EBS, faster resume (must be pre-configured)
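The state diagram can be encoded as a small lookup, which is handy when writing automation that waits on state changes. This sketch covers only the edges drawn in the diagram above, and `can_transition` is a hypothetical helper.

```shell
# Hypothetical helper: is "from -> to" an edge in the lifecycle diagram above?
can_transition() {
  case "$1:$2" in
    pending:running|running:stopping|stopping:stopped|stopped:pending)
      return 0 ;;
    running:shutting-down|shutting-down:terminated)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

can_transition running stopping && echo "running -> stopping is valid"
can_transition terminated running || echo "terminated is final"
```

Note that a stopped instance passes through pending again before it is running, which is when it may land on a new host.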

# Stop (EBS-backed only)
aws ec2 stop-instances --instance-ids i-abc123

# Start
aws ec2 start-instances --instance-ids i-abc123

# Reboot (preferred over stop/start when possible)
aws ec2 reboot-instances --instance-ids i-abc123

# Terminate (destructive!)
aws ec2 terminate-instances --instance-ids i-abc123

# Enable termination protection
aws ec2 modify-instance-attribute \
  --instance-id i-abc123 \
  --disable-api-termination

11. Nitro System

Nitro is AWS's custom hypervisor and hardware platform. All modern instance types run on Nitro. Benefits:

- Near bare-metal performance (hypervisor offloaded to dedicated hardware)
- Enhanced networking (up to 100 Gbps)
- EBS-optimized by default
- NVMe-based storage interface
- Security: hardware root of trust, encrypted memory

If an instance type uses Nitro, EBS volumes appear as /dev/nvme* devices instead of /dev/xvd*.
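Because the /dev/nvme* numbering does not tell you which EBS volume is which, a common trick is to read the NVMe serial number. This sketch assumes the serial is the volume ID without its hyphen (for example `vol0123456789abcdef0`), and `serial_to_volume_id` is a hypothetical helper.

```shell
# Hypothetical helper: turn an NVMe serial into an EBS volume ID.
# Assumes EBS serials look like "vol0123..." (the volume ID minus the hyphen).
serial_to_volume_id() {
  case "$1" in
    vol-*) echo "$1" ;;
    vol*)  echo "vol-${1#vol}" ;;
    *)     echo "not an EBS volume serial: $1" >&2; return 1 ;;
  esac
}

# Example usage on a Nitro instance (lsblk prints device name and serial):
#   lsblk -dn -o NAME,SERIAL | while read -r name serial; do
#     echo "/dev/$name -> $(serial_to_volume_id "$serial")"
#   done

serial_to_volume_id vol0123456789abcdef0   # prints vol-0123456789abcdef0
```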

12. EC2 Instance Connect

A safer alternative to managing SSH key pairs. Pushes a temporary public key to the instance metadata for 60 seconds.

# Connect via CLI
aws ec2-instance-connect ssh --instance-id i-abc123

# Or push key and connect manually
aws ec2-instance-connect send-ssh-public-key \
  --instance-id i-abc123 \
  --instance-os-user ec2-user \
  --ssh-public-key file://~/.ssh/id_rsa.pub

# Connect within 60 seconds
ssh -i ~/.ssh/id_rsa ec2-user@<ip>

Key Takeaways

- Use Graviton instances (g suffix) for 20-40% better price-performance unless you need x86
- Instance store is ephemeral: data vanishes on stop/terminate/host failure
- gp3 is the default EBS volume type, and you can tune IOPS and throughput independently
- Always enforce IMDSv2 (token-required) to prevent SSRF credential theft
- T-family burstable instances have CPU credits, so understand baseline vs burst
- SSM Session Manager is preferred over SSH for production access
- Spot instances save up to 90% but can be reclaimed with 2-minute notice
- Auto Scaling Groups with launch templates are the modern standard for fleets
- Stop/start changes the underlying host and public IP; reboot does not

Prerequisites

Cloud Ops Basics (Topic Pack, L1)
AWS Networking (Topic Pack, L1)

Related Content

AWS CloudWatch (Topic Pack, L2) — Cloud Deep Dive
AWS Compute Flashcards (CLI) (flashcard_deck, L1) — AWS EC2
AWS Devops Flashcards (CLI) (flashcard_deck, L1) — Cloud Deep Dive
AWS ECS (Topic Pack, L2) — Cloud Deep Dive
AWS General Flashcards (CLI) (flashcard_deck, L1) — Cloud Deep Dive
AWS IAM (Topic Pack, L1) — Cloud Deep Dive
AWS Lambda (Topic Pack, L2) — Cloud Deep Dive
AWS Networking (Topic Pack, L1) — Cloud Deep Dive
AWS Route 53 (Topic Pack, L2) — Cloud Deep Dive
AWS S3 Deep Dive (Topic Pack, L1) — Cloud Deep Dive
