Kubernetes

138 assets across 31 topics

K8s core, networking, storage, RBAC, autoscaling, operators, etcd


Core Skills (Tier 1)

Start here — these are the topics a mid-level engineer uses weekly.


All Content

📘 Topic Packs

Asset | Topics | Level | Time | Status
API Gateways & Ingress | API Gateways & Ingress, Kubernetes Networking | L2 | 5h | solid
Argo Workflows | Argo Workflows | L2 | 4h | solid
Chaos Engineering & Fault Injection | Kubernetes Core | L2 | 5h | solid
Cilium & eBPF Networking | Cilium & eBPF Networking, Kubernetes Networking | L2 | 4h | solid
Container Images | Container Base Images, Container Image Optimization (alias → container_images) | L1 | 3h | solid
CrashLoopBackOff | CrashLoopBackOff (alias) | L1 | 2h | solid
Database Operations on Kubernetes | Kubernetes Storage | L2 | 4h | solid
K8s Ecosystem | K8s Ecosystem, Kubernetes Operators | L0 | 6h | solid
K8s Networking | Kubernetes Networking | L1 | 2h | solid
K8s RBAC | RBAC | L1 | 2h | solid
K8s Storage | Kubernetes Storage | L1 | 2h | solid
Kubernetes Concept Chain | Kubernetes Core | L0 | 2h | solid
Kubernetes Debugging Playbook | Kubernetes Core, Kubernetes Debugging | L2 | 4h | solid
Kubernetes Node Lifecycle | Node Lifecycle & Maintenance | L2 | 4h | solid
Kubernetes Ops (Production) | CrashLoopBackOff, HPA / Autoscaling, K8s HPA (alias → k8s_ops) | L2 | 5h | solid
Kubernetes Pods & Scheduling | Kubernetes Core, Kubernetes Pods & Scheduling | L1 | 4h | solid
Kubernetes Services & Ingress | Kubernetes Networking, Kubernetes Services & Ingress | L1 | 4h | solid
Kustomize | Kubernetes Core, Kustomize | L1 | 3h | solid
Mental Models (Core Concepts) | Kubernetes Core | L0 | 4h | solid
Multi-Tenancy Patterns | Kubernetes Networking, RBAC, Multi-Tenancy Patterns | L2 | 5h | solid
Node Maintenance | Node Lifecycle & Maintenance | L1 | 2h | solid
OOMKilled | OOMKilled (alias) | L1 | 2h | solid
Platform Engineering Patterns | Kubernetes Core | L2 | 5h | solid
Policy Engines (OPA / Kyverno) | RBAC, Policy Engines | L2 | 4h | solid
Progressive Delivery | Progressive Delivery | L2 | 4h | solid
Service Mesh | Kubernetes Networking, Service Mesh | L3 | 5h | solid
The Ops of AI/ML Workloads | Kubernetes Core | L2 | 4h | solid
cert-manager | cert-manager | L1 | 3h | solid
etcd | etcd | L1 | 2h | solid

🏋️ Exercise Sets

Asset | Topics | Level | Time | Status
Chaos Engineering Scripts (CLI) | Kubernetes Core | L2 | 2h | solid
Incident Simulator (18 scenarios) (CLI) | CrashLoopBackOff, HPA / Autoscaling, OOMKilled | L2 | 4h | solid
Kubernetes Exercises (Quest Ladder) (CLI) | HPA / Autoscaling, Kubernetes Core, Kubernetes Networking | L1 | 10h | solid

🔬 Labs

Asset | Topics | Level | Time | Status
Lab: HPA Live Scaling (CLI) | HPA / Autoscaling, Kubernetes Core | L1 | 30m | solid
Lab: Readiness Probe Failure (CLI) | Kubernetes Core, Probes (Liveness/Readiness) | L1 | 30m | solid
Lab: Resource Limits OOMKilled (CLI) | Kubernetes Core, OOMKilled | L1 | 30m | solid

⚡ Drills

Asset | Topics | Level | Time | Status
Kubernetes Operators Drills | K8s Ecosystem | L3 | 30m | solid
Policy Engine Drills | Policy Engines | L2 | 30m | solid
Service Mesh Drills | Service Mesh | L3 | 30m | solid
etcd Drills | etcd | L2 | 30m | solid
kubectl Drills | Kubernetes Core | L1 | 45m | solid

🎯 Scenarios

Asset | Topics | Level | Time | Status
Adversarial Interview Gauntlet (30 sequences) | Kubernetes Core | L2 | 15h | solid
Interview: Deployment Stuck Progressing | Kubernetes Core, Probes (Liveness/Readiness) | L2 | 15m | solid
Interview: HPA Not Scaling | HPA / Autoscaling | L2 | 15m | solid
Interview: Ingress 404 | Kubernetes Networking | L2 | 15m | solid
Interview: Kyverno Blocking Deploys | Policy Engines | L2 | 15m | solid
Interview: Pods OOMKilled | OOMKilled | L2 | 15m | solid
Interview: RBAC Forbidden | RBAC | L2 | 15m | solid
Interview: Service Mesh 503s | Service Mesh | L3 | 15m | solid
Interview: etcd Space Exceeded | etcd | L3 | 15m | solid
Scenario: etcd Troubleshooting | etcd | L3 | 30m | solid

📋 Runbooks

Asset | Topics | Level | Time | Status
Runbook: DNS Resolution Failure | Kubernetes Networking | L1 | 15m | solid
Runbook: Deployment Stuck / Rollout Stalled | Kubernetes Core | L1 | 15m | solid
Runbook: Disaster Recovery | Kubernetes Core | L2 | 20m | solid
Runbook: HPA Not Scaling | HPA / Autoscaling | L2 | 15m | solid
Runbook: HPA Thrashing (Rapid Scale Up/Down) | HPA / Autoscaling, Kubernetes Core | L2 | 15m | solid
Runbook: ImagePullBackOff | Kubernetes Core | L1 | 15m | solid
Runbook: Ingress 404 | Kubernetes Networking | L1 | 15m | solid
Runbook: Ingress 502 Bad Gateway | Kubernetes Networking, Kubernetes Services & Ingress | L2 | 15m | solid
Runbook: Istio 503 Errors | Service Mesh | L3 | 15m | solid
Runbook: Kyverno Blocking Workloads | Policy Engines | L2 | 15m | solid
Runbook: NetworkPolicy Block | Kubernetes Networking | L2 | 15m | solid
Runbook: Node NotReady | Node Lifecycle & Maintenance | L1 | 15m | solid
Runbook: OOMKilled Container | Kubernetes Core, OOMKilled | L1 | 15m | solid
Runbook: PVC Stuck in Pending | Kubernetes Storage | L1 | 15m | solid
Runbook: Pod CrashLoopBackOff | CrashLoopBackOff, Kubernetes Core | L1 | 15m | solid
Runbook: Pod Eviction | Kubernetes Core | L2 | 15m | solid
Runbook: RBAC Forbidden | RBAC | L2 | 15m | solid
Runbook: Readiness Probe Failed | Kubernetes Core, Probes (Liveness/Readiness) | L1 | 15m | solid
Runbook: Velero Backup & Restore | Kubernetes Core | L2 | 15m | solid
Runbook: etcd Backup & Restore | etcd | L2 | 20m | solid
Runbook: etcd High Latency / Slow Operations | etcd | L3 | 20m | solid

📝 Assessments

Asset | Topics | Level | Time | Status
Skillcheck: Kubernetes | HPA / Autoscaling, Kubernetes Core, Probes (Liveness/Readiness) | L1 | 45m | solid
Skillcheck: Kubernetes Operators | K8s Ecosystem | L3 | 30m | solid
Skillcheck: Kubernetes Under the Covers | Kubernetes Core, Kubernetes Networking, Node Lifecycle & Maintenance | L2 | 45m | solid
Skillcheck: Policy Engines | Policy Engines | L2 | 30m | solid
Skillcheck: Service Mesh | Service Mesh | L3 | 30m | solid
Skillcheck: etcd | etcd | L2 | 30m | solid

📎 References

Asset | Topics | Level | Time | Status
Track: Kubernetes Core | Kubernetes Core, Kubernetes Networking, RBAC | L1 | 15h | solid
kubectl Debugging Cheatsheet | Kubernetes Core | L1 | 15m | solid

🔍 Case Studies

Asset | Topics | Level | Time | Status
Case Study: Alert Storm — Flapping Health Checks | Kubernetes Core | L2 | 30m | solid
Case Study: CNI Broken After Restart | Kubernetes Networking | L2 | 30m | solid
Case Study: Canary Deploy Routing to Wrong Backend — Ingress Misconfigured | Kubernetes Core, Kubernetes Networking | L2 | 30m | solid
Case Study: CoreDNS Timeout Pod DNS | Kubernetes Networking | L2 | 30m | solid
Case Study: CrashLoopBackOff No Logs | Kubernetes Core | L1 | 30m | solid
Case Study: DNS Looks Broken — TLS Expired, Fix Is Cert-Manager | Kubernetes Core | L2 | 30m | solid
Case Study: DaemonSet Blocks Eviction | Kubernetes Core, Node Lifecycle & Maintenance | L2 | 30m | solid
Case Study: Deployment Stuck — ImagePull Auth Failure, Vault Secret Rotation | Kubernetes Core | L2 | 30m | solid
Case Study: Drain Blocked by PDB | Kubernetes Core | L2 | 30m | solid
Case Study: Grafana Dashboard Empty — Prometheus Blocked by NetworkPolicy | Kubernetes Networking | L2 | 30m | solid
Case Study: HPA Flapping — Metrics Server Clock Skew, Fix Is NTP | Kubernetes Core | L2 | 30m | solid
Case Study: ImagePullBackOff Registry Auth | Kubernetes Core | L1 | 30m | solid
Case Study: Job Queue Backlog — Worker Pod CPU Throttled by cgroup | Kubernetes Core | L2 | 30m | solid
Case Study: Node NotReady — NIC Firmware Bug, Fix Is Ansible Playbook | Kubernetes Core | L2 | 30m | solid
Case Study: Node Pressure Evictions | Kubernetes Core, OOMKilled | L2 | 30m | solid
Case Study: Persistent Volume Stuck Terminating | Kubernetes Core, Kubernetes Storage | L2 | 30m | solid
Case Study: Pod OOMKilled — Memory Leak in Sidecar, Fix Is Helm Values | Kubernetes Core, OOMKilled | L2 | 30m | solid
Case Study: Resource Quota Blocking Deploy | Kubernetes Core | L2 | 30m | solid
Case Study: Service Mesh 503s — Envoy Misconfigured, RBAC Policy | Kubernetes Networking | L2 | 30m | solid
Case Study: Service No Endpoints | Kubernetes Core, Kubernetes Networking | L1 | 30m | solid
Case Study: User Auth Failing — OIDC Cert Expired, Cloud KMS Rotation | Kubernetes Core | L2 | 30m | solid
Ops Archaeology: The 5% That Can't Resolve | Kubernetes Networking | L2 | 30m | solid
Ops Archaeology: The Alerts That Stopped Firing | Kubernetes Core | L2 | 30m | solid
Ops Archaeology: The Cluster That Disagrees With Itself | Kubernetes Core | L2 | 30m | solid
Ops Archaeology: The DR That Looks Ready But Isn't | Kubernetes Core | L2 | 30m | solid
Ops Archaeology: The Deploy That Didn't Deploy | Kubernetes Core | L1 | 30m | solid
Ops Archaeology: The Job That Succeeded Wrong | Kubernetes Core | L2 | 30m | solid
Ops Archaeology: The Pods That Won't Schedule | Kubernetes Core | L1 | 30m | solid
Ops Archaeology: The Requests That Vanish | Kubernetes Networking | L2 | 30m | solid
Ops Archaeology: The Session Store That Keeps Dying | Kubernetes Core, OOMKilled | L2 | 30m | solid