Cloud Provider Deep-Dive (AWS & GCP) - Primer

Why This Matters

The cloud-ops-basics pack covers fundamentals such as "what is a VPC". This pack covers the real-world configurations that cause outages and cost overruns. If you run Kubernetes in production on AWS or GCP, you need a deeper understanding of IAM policies, VPC networking, load balancers, and managed services.

AWS Deep-Dive

IAM - The Security Backbone

Policy Structure

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowS3Read",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:RequestedRegion": "us-east-1"
        }
      }
    }
  ]
}

IAM for Kubernetes (IRSA)

IAM Roles for Service Accounts lets K8s pods assume AWS IAM roles without static credentials:

# 1. Create IAM role with trust policy for the OIDC provider
# 2. Annotate the K8s ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grokdevops
  namespace: grokdevops
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/grokdevops-role

The pod receives a projected service account token, which the AWS SDK exchanges with STS (AssumeRoleWithWebIdentity) for temporary credentials. No static access keys are needed.
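Step 1 above (the role's trust policy) is where most IRSA setups go wrong, so a minimal sketch of its shape may help. The account ID and OIDC issuer below are placeholders, not values from this pack, and the helper name irsa_trust_policy is hypothetical.

```python
import json


def irsa_trust_policy(account_id: str, oidc_issuer: str,
                      namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets one K8s ServiceAccount assume the role.

    oidc_issuer is the cluster's OIDC issuer without the "https://" prefix.
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_issuer}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": provider_arn},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                # Pin the role to exactly one namespace/ServiceAccount pair.
                f"{oidc_issuer}:sub":
                    f"system:serviceaccount:{namespace}:{service_account}",
                f"{oidc_issuer}:aud": "sts.amazonaws.com",
            }},
        }],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        account_id="123456789012",  # placeholder account ID
        oidc_issuer="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",  # placeholder
        namespace="grokdevops",
        service_account="grokdevops",
    )
    print(json.dumps(policy, indent=2))
```

Scoping the sub condition to a single ServiceAccount keeps other pods in the cluster from assuming the same role.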

Common IAM Mistakes

Mistake | Impact | Fix
Using * in actions | Over-permissive | List specific actions
* in resources | Access to everything | Scope to specific ARNs
No conditions | No guardrails | Add region/tag conditions
Long-lived access keys | Credential rotation risk | Use IRSA/instance profiles
Admin access for apps | Blast radius | Least privilege + IRSA
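The first three mistakes in the table can be caught mechanically. The following is a rough sketch of such a check, not a replacement for a real policy analyzer; the function name lint_policy and the wording of the findings are illustrative assumptions.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """IAM accepts a single value or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def lint_policy(policy: dict) -> list[str]:
    """Return findings for Allow statements in an IAM policy document."""
    findings = []
    for i, stmt in enumerate(_as_list(policy.get("Statement"))):
        if stmt.get("Effect") != "Allow":
            continue
        label = stmt.get("Sid", f"statement[{i}]")
        actions = _as_list(stmt.get("Action"))
        resources = _as_list(stmt.get("Resource"))
        # "*" and "service:*" both grant more than a named action list.
        if any(a == "*" or a.endswith(":*") for a in actions):
            findings.append(f"{label}: wildcard action")
        if any(r == "*" for r in resources):
            findings.append(f"{label}: wildcard resource")
        if not stmt.get("Condition"):
            findings.append(f"{label}: no condition")
    return findings


if __name__ == "__main__":
    risky = {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
    }
    for finding in lint_policy(risky):
        print(finding)
```

Run against the AllowS3Read policy from the Policy Structure section, this check should report nothing, since that statement names its actions, scopes its resources, and carries a region condition.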

VPC Networking

VPC (10.0.0.0/16)
|
+-- Public Subnet (10.0.1.0/24) -- Internet Gateway
|     +-- NAT Gateway
|     +-- Load Balancer
|
+-- Private Subnet (10.0.10.0/24) -- NAT Gateway (outbound)
|     +-- EKS Worker Nodes
|     +-- RDS Database
|
+-- Private Subnet (10.0.11.0/24) -- No internet
      +-- Internal services
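Before creating a layout like the one above, it can be worth checking that every subnet sits inside the VPC range and that no two subnets overlap. A minimal sketch using only the Python standard library; the helper name check_layout is hypothetical.

```python
import ipaddress
from itertools import combinations


def check_layout(vpc_cidr: str, subnet_cidrs: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the layout is consistent."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnets = [ipaddress.ip_network(c) for c in subnet_cidrs]
    problems = []
    for subnet in subnets:
        if not subnet.subnet_of(vpc):
            problems.append(f"{subnet} is outside {vpc}")
    for a, b in combinations(subnets, 2):
        if a.overlaps(b):
            problems.append(f"{a} overlaps {b}")
    return problems


if __name__ == "__main__":
    # The three subnets from the diagram above.
    print(check_layout("10.0.0.0/16",
                       ["10.0.1.0/24", "10.0.10.0/24", "10.0.11.0/24"]))
```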

Key Networking Concepts

Concept | What it does | Gotcha
Security Group | Stateful firewall per ENI | Rules are per-SG, not per-subnet
NACL | Stateless firewall per subnet | Must allow return traffic explicitly
NAT Gateway | Outbound internet for private subnets | $0.045/hr + $0.045/GB processed
VPC Endpoints | Private access to AWS services | Saves NAT costs for S3/DynamoDB/ECR
VPC Peering | Connect two VPCs | No transitive routing
Transit Gateway | Hub-and-spoke VPC connectivity | $0.05/hr per attachment + $0.02/GB processed (us-east-1)
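To make the NAT Gateway pricing concrete, here is a small worked estimate. It uses the two rates from the table and assumes a 730-hour month; actual rates vary by region, so treat the numbers as illustrative.

```python
NAT_HOURLY_USD = 0.045   # per NAT Gateway hour (rate from the table)
NAT_PER_GB_USD = 0.045   # per GB processed (rate from the table)
HOURS_PER_MONTH = 730    # assumption: average month length


def nat_monthly_cost(gb_processed: float, gateways: int = 1) -> float:
    """Estimate monthly NAT Gateway spend in USD."""
    fixed = gateways * NAT_HOURLY_USD * HOURS_PER_MONTH
    variable = gb_processed * NAT_PER_GB_USD
    return round(fixed + variable, 2)


if __name__ == "__main__":
    # One gateway, 1 TB processed: 32.85 fixed + 45.00 data.
    print(nat_monthly_cost(1000))
    # One gateway per AZ across three AZs, same traffic.
    print(nat_monthly_cost(1000, gateways=3))
```

The per-GB term grows with traffic while the hourly term does not, which is why routing S3, DynamoDB, and ECR traffic through VPC endpoints tends to pay off first on busy clusters.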

EKS Networking

# EKS uses the VPC CNI plugin (each pod gets a real VPC IP)
# This means:
# - Pods can talk directly to RDS, ElastiCache, etc.
# - Security groups work at the pod level (SG for Pods)
# - Subnet IP exhaustion is a real risk

# Check available IPs in subnets
aws ec2 describe-subnets --subnet-ids subnet-abc123 \
  --query 'Subnets[].AvailableIpAddressCount'
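The exhaustion risk is easier to judge with numbers. The sketch below applies the commonly cited VPC CNI pod limit (ENIs x (IPs per ENI - 1) + 2, without prefix delegation) and the five addresses AWS reserves in every subnet. The instance figures used in the example (3 ENIs, 10 IPv4 addresses each, as on an m5.large) are stated as an assumption; check the limits for your instance type.

```python
import ipaddress

AWS_RESERVED_IPS_PER_SUBNET = 5  # AWS reserves 5 addresses in every subnet


def usable_ips(cidr: str) -> int:
    """Addresses in the subnet that AWS will actually hand out."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_IPS_PER_SUBNET


def max_pods(enis: int, ips_per_eni: int) -> int:
    """Pod limit per node: one IP per ENI is the ENI's own primary address."""
    return enis * (ips_per_eni - 1) + 2


def nodes_per_subnet(cidr: str, enis: int, ips_per_eni: int) -> int:
    """Worst case: every node claims all addresses on all of its ENIs."""
    return usable_ips(cidr) // (enis * ips_per_eni)


if __name__ == "__main__":
    print(usable_ips("10.0.10.0/24"))               # 251
    print(max_pods(enis=3, ips_per_eni=10))         # 29
    print(nodes_per_subnet("10.0.10.0/24", 3, 10))  # 8
```

Under these assumptions a /24 worker subnet holds only about eight fully loaded nodes, which is why larger subnets or prefix delegation are common in production clusters.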

Load Balancers

Type | Layer | Use case | K8s integration
ALB | L7 (HTTP) | Web apps, path-based routing | AWS LB Controller Ingress
NLB | L4 (TCP/UDP) | Low latency, static IPs | Service type: LoadBalancer
CLB | L4/L7 (legacy) | Don't use for new projects | Default (legacy)

# AWS Load Balancer Controller: ALB Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grokdevops
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
    alb.ingress.kubernetes.io/healthcheck-path: /health
spec:
  ingressClassName: alb  # preferred over kubernetes.io/ingress.class annotation
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grokdevops
                port:
                  number: 8000

GCP Deep-Dive

IAM

GCP IAM uses a different model: roles are bound to principals (members) at different levels of the resource hierarchy.

Organization
  +-- Folder
       +-- Project
            +-- Resource

Roles can be granted at any level and inherit downward.
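Downward inheritance can be modeled as a walk from a resource up to the organization, collecting every role bound to the principal along the way. This is a simplified sketch that ignores deny policies and IAM conditions; the class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """One level in the hierarchy: organization, folder, project, or resource."""
    name: str
    parent: Optional["Node"] = None
    bindings: dict = field(default_factory=dict)  # role -> set of principals

    def grant(self, role: str, principal: str) -> None:
        self.bindings.setdefault(role, set()).add(principal)


def effective_roles(node: Node, principal: str) -> set:
    """Union of roles bound to the principal on the node and all its ancestors."""
    roles = set()
    current = node
    while current is not None:
        for role, members in current.bindings.items():
            if principal in members:
                roles.add(role)
        current = current.parent
    return roles


if __name__ == "__main__":
    org = Node("org")
    folder = Node("folder", parent=org)
    project = Node("project", parent=folder)
    org.grant("roles/viewer", "user:alice@example.com")
    project.grant("roles/storage.admin", "user:alice@example.com")
    print(sorted(effective_roles(project, "user:alice@example.com")))
    print(sorted(effective_roles(folder, "user:alice@example.com")))
```

The practical consequence is that a broad role granted at the organization or folder level cannot be narrowed by a binding lower down, so wide grants belong as low in the hierarchy as possible.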

Workload Identity (GKE)

# 1. Create GCP service account
# gcloud iam service-accounts create grokdevops-sa

# 2. Bind K8s SA to GCP SA
# gcloud iam service-accounts add-iam-policy-binding \
#   grokdevops-sa@PROJECT.iam.gserviceaccount.com \
#   --role roles/iam.workloadIdentityUser \
#   --member "serviceAccount:PROJECT.svc.id.goog[grokdevops/grokdevops]"

# 3. Annotate K8s ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: grokdevops
  namespace: grokdevops
  annotations:
    iam.gke.io/gcp-service-account: grokdevops-sa@PROJECT.iam.gserviceaccount.com
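The member string in step 2 and the annotation in step 3 must agree on project, namespace, and account names, and a typo in either fails silently until the pod requests credentials. A small sketch that derives both from the same inputs; the function names and the project ID in the example are illustrative.

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Value for --member in step 2: identifies the K8s ServiceAccount."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def gsa_email(project_id: str, gsa_name: str) -> str:
    """Value for the iam.gke.io/gcp-service-account annotation in step 3."""
    return f"{gsa_name}@{project_id}.iam.gserviceaccount.com"


if __name__ == "__main__":
    project = "my-project"  # placeholder project ID
    print(workload_identity_member(project, "grokdevops", "grokdevops"))
    print(gsa_email(project, "grokdevops-sa"))
```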

VPC & Networking

GCP concept | AWS equivalent | Notes
VPC | VPC | Global (not regional)
Subnet | Subnet | Regional (spans all zones)
Firewall Rules | Security Groups + NACLs | Applied by tags/SAs, not subnets
Cloud NAT | NAT Gateway | Per-region, auto-scaling
Private Service Connect | VPC Endpoints | Access Google APIs privately
VPC Peering | VPC Peering | Same limitations

GKE Networking

# GKE has two networking modes:
# 1. Routes-based (legacy): each node gets a /24 pod CIDR
# 2. VPC-native (recommended): pods get IPs from secondary ranges

# Show the cluster's pod CIDR (compare its size against node count to spot exhaustion)
gcloud container clusters describe CLUSTER --zone ZONE \
  --format="get(clusterIpv4Cidr)"
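Once the pod CIDR is known, the node ceiling follows from simple arithmetic: each node takes one per-node range out of the cluster range. The sketch below assumes a /24 per node, as described for routes-based mode above; a different per-node size would change the result.

```python
import ipaddress


def max_nodes(pod_cidr: str, per_node_prefix: int = 24) -> int:
    """How many per-node pod ranges fit into the cluster's pod CIDR."""
    net = ipaddress.ip_network(pod_cidr)
    if per_node_prefix < net.prefixlen:
        return 0  # the cluster range is smaller than a single node's range
    return 2 ** (per_node_prefix - net.prefixlen)


if __name__ == "__main__":
    print(max_nodes("10.4.0.0/14"))  # 1024
    print(max_nodes("10.4.0.0/20"))  # 16
```

A /20 pod range therefore caps the cluster at 16 nodes no matter how many pods each node actually runs.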

Load Balancers

Type | Layer | GKE integration
External HTTP(S) LB | L7 | GKE Ingress (default)
Internal HTTP(S) LB | L7 | Ingress with annotation kubernetes.io/ingress.class: gce-internal
Network LB | L4 | Service type: LoadBalancer
Internal TCP/UDP LB | L4 | Service with cloud.google.com/load-balancer-type: Internal

Cross-Cloud Comparison

Feature | AWS | GCP
Managed K8s | EKS | GKE
Pod networking | VPC CNI (real IPs) | VPC-native (alias IPs)
Workload identity | IRSA | Workload Identity
Node autoscaling | Karpenter / CA | GKE Autopilot / CA
Serverless K8s | Fargate | GKE Autopilot
Container registry | ECR | Artifact Registry
Object storage | S3 | GCS
Managed database | RDS/Aurora | Cloud SQL/AlloyDB
Secret management | Secrets Manager | Secret Manager

Common Pitfalls

  1. Subnet IP exhaustion: the EKS VPC CNI assigns real VPC IPs to pods. A /24 subnet (251 usable IPs after the 5 that AWS reserves) fills fast.
  2. NAT Gateway costs: every byte processed by a NAT Gateway costs money. Use VPC endpoints for S3, DynamoDB, and ECR traffic.
  3. Cross-AZ data transfer: $0.01/GB each way in AWS. Use topology-aware routing.
  4. IAM key leaks: use IRSA/Workload Identity instead of static credentials.
  5. Default security groups: too permissive. Create specific SGs per workload.
  6. No encryption at rest: enable by default for EBS, S3, GCE disks.
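For pitfall 3, a quick estimate shows why topology-aware routing matters. The sketch uses the $0.01/GB each-way rate quoted above, so a gigabyte that crosses an AZ boundary costs $0.02 in total. The traffic volume and cross-AZ fractions in the example are made-up inputs.

```python
CROSS_AZ_USD_PER_GB_EACH_WAY = 0.01  # rate quoted in the list above


def cross_az_monthly_cost(gb_total: float, cross_az_fraction: float) -> float:
    """Estimate monthly cross-AZ transfer spend in USD.

    Each GB that crosses an AZ boundary is billed on both sides.
    """
    gb_crossing = gb_total * cross_az_fraction
    return round(gb_crossing * CROSS_AZ_USD_PER_GB_EACH_WAY * 2, 2)


if __name__ == "__main__":
    # Three AZs with random routing: about 2/3 of service traffic crosses AZs.
    print(cross_az_monthly_cost(30_000, 2 / 3))
    # Same traffic with most requests kept in-zone.
    print(cross_az_monthly_cost(30_000, 0.1))
```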
