GrokDevOps Training Hub

New here? Start at training/START_HERE.md -- the single best onboarding doc.

Looking for everything? See the Content Hub -- browse all content by type, domain, or tier.

One place to start learning. This hub indexes all learning content across the repository and adds runtime labs, chaos scripts, runbooks, and interview scenarios.

Surprise Me

Jump to random content and learn something unexpected: see the Random Discovery Page or the Content Hub.

Key Entry Points

Resource	Purpose
START_HERE.md	10-minute warm-up + orientation
Content Hub	Browse all content by type, domain, or tier
Learning Paths	Breaking Into DevOps / Daily Driver / Crash Course / Interview Prep / Comprehensive
Drills	Quick muscle-memory exercises
Flashcards	178 decks, 6255 cards — browse online
Quiz Bank	722 self-test questions by topic
Solutions	Hint ladders + answer keys for labs

Quickstart

1. Deploy the full stack

make deploy-all    # Deploys observability stack + grokdevops app
make status        # Verify everything is running
make port-forward  # Access app at localhost:8000
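After `make port-forward`, the app may take a moment to answer on localhost:8000. A minimal sketch of a retry helper for that wait is shown here. The `retry` function is hypothetical (it is not part of this repo), and the assumption that the app answers a plain GET on `/` is not confirmed by this page.

```shell
# retry MAX DELAY CMD...: run CMD until it succeeds, at most MAX times,
# sleeping DELAY seconds between attempts. Hypothetical helper, not a repo tool.
retry() {
  max=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    if "$@"; then
      return 0            # command succeeded
    fi
    if [ "$attempt" -ge "$max" ]; then
      return 1            # out of attempts
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example (assumes the port-forward is running in another terminal and that
# the app responds on the root path):
#   retry 10 2 curl -fsS -o /dev/null http://localhost:8000/
```

The helper only returns success or failure, so it can be chained with `&&` before opening the app or starting a lab.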

2. Use the exercise system (250 exercises across 5 tracks)

source activate.sh                          # From repo root (sets up PATH)
quest list                                  # See all 250 exercises
quest list bash                             # Filter by track
cd training/interactive/exercises/levels/level-01/bash-exit-codes
quest info                                  # See the objective
quest run                                   # Run the broken artifact
quest hint 1                                # Get a nudge if stuck
quest solution                              # See the reference answer

3. Run a runtime lab

make lab-list                              # See available labs
make lab LAB=lab-runtime-01 MODE=break     # Introduce failure
# ... investigate and fix ...
make lab LAB=lab-runtime-01 MODE=verify    # Check your fix
make lab LAB=lab-runtime-01 MODE=teardown  # Clean up
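It is easy to forget the teardown step when verification fails. The sketch below wraps the verify and teardown commands from above so that teardown always runs. `lab_finish` is a hypothetical helper, and it assumes that `MODE=verify` exits non-zero when the fix is incomplete, which this page does not state.

```shell
# lab_finish LAB: verify the fix, then tear the lab down regardless of the
# result. Returns the verify exit status. Hypothetical helper.
# Set MAKE=echo to preview the commands without running them.
lab_finish() {
  lab=$1
  m=${MAKE:-make}
  status=0
  $m lab LAB="$lab" MODE=verify || status=$?
  # Teardown runs even when verify failed; a teardown error is only reported.
  $m lab LAB="$lab" MODE=teardown || echo "teardown failed for $lab" >&2
  return "$status"
}

# Example:
#   lab_finish lab-runtime-01
```

Previewing with `MAKE=echo lab_finish lab-runtime-01` prints the two `make` argument lists in the order they would run.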

4. Guided investigation loop

make deploy-all                  # Ensure stack is running
make incident YES=1              # Trigger a random incident
make investigate                 # See step-by-step investigation plan
# ... use kubectl/helm/grafana to investigate ...
make hint                        # Get a hint (if stuck)
make hint HINT=2                 # Deeper hint
make explain                     # Record what you found
make incident-resolve            # Clear the incident
make undeploy-all                # (optional) tear down
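For the "use kubectl/helm/grafana to investigate" step, a first look at the cluster might be sketched as follows. `triage` is a hypothetical helper, not something the investigation engine provides, and the default namespace `grokdevops` is an assumption taken from the chaos script example on this page.

```shell
# triage [NAMESPACE]: print a quick overview of pods, recent events, and
# deployments. Hypothetical helper; the namespace default is an assumption.
# Set KUBECTL=echo to preview the commands without a cluster.
triage() {
  ns=${1:-grokdevops}
  k=${KUBECTL:-kubectl}
  $k get pods -n "$ns" -o wide                       # pod status and node placement
  $k get events -n "$ns" --sort-by=.lastTimestamp    # most recent events last
  $k get deployments -n "$ns"                        # desired vs. ready replicas
}

# Example:
#   triage              # default namespace
#   triage monitoring   # any other namespace name is up to you
```

Running it before `make hint` gives a baseline to compare against whatever the hint points to.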

5. Practice with chaos scripts

make chaos LIST=1                                          # See scripts
training/interactive/chaos/scripts/kill_pods.sh --dry-run              # Preview
training/interactive/chaos/scripts/kill_pods.sh --yes --namespace grokdevops
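The preview-then-run pattern above can be made harder to skip with a small wrapper. This is a sketch only: `chaos_run` and the `CONFIRM` variable are hypothetical, and the wrapper uses just the two invocations shown above (`--dry-run`, and `--yes --namespace NAME`) without assuming any other flags.

```shell
# chaos_run SCRIPT [NAMESPACE]: always run the dry-run preview first, and only
# run for real when CONFIRM=yes is set. Hypothetical wrapper, not a repo tool.
chaos_run() {
  script=$1
  ns=${2:-grokdevops}
  "$script" --dry-run || return 1
  if [ "${CONFIRM:-no}" != "yes" ]; then
    echo "Preview only; set CONFIRM=yes to run against namespace $ns" >&2
    return 0
  fi
  "$script" --yes --namespace "$ns"
}

# Example:
#   chaos_run training/interactive/chaos/scripts/kill_pods.sh
#   CONFIRM=yes chaos_run training/interactive/chaos/scripts/kill_pods.sh grokdevops
```

Keeping the confirmation in an environment variable means the destructive call never happens by default, including when the command is recalled from shell history.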

6. Run incident challenges

make incident                    # Preview a random incident (dry-run)
make incident YES=1              # Inject a random incident
make incident-status             # Check status + elapsed time
make incident-forensics          # Capture evidence bundle
# ... diagnose and fix the issue ...
make incident-resolve            # Mark as resolved, record time

# Challenge mode (time-boxed)
make challenge YES=1 MINUTES=10  # Inject + start timer
make incident-list               # See all 18 scenarios
make scoreboard                  # View your performance history

7. Study runbooks and interview scenarios

make runbook LIST=1    # List all runbooks
ls training/library/runbooks/  # Browse directly
ls training/library/interview-scenarios/
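When browsing directly, searching by symptom is often quicker than reading file names. A minimal sketch, assuming the runbooks are plain-text files under the directory listed above; `find_runbooks` is a hypothetical helper and relies on `grep -r`, which GNU and BSD grep both support.

```shell
# find_runbooks KEYWORD [DIR]: list files that mention KEYWORD, ignoring case.
# Hypothetical helper; the default DIR is the runbook path shown above.
find_runbooks() {
  keyword=$1
  dir=${2:-training/library/runbooks}
  grep -ril -- "$keyword" "$dir" | sort
}

# Example (run from the repo root):
#   find_runbooks crashloop
#   find_runbooks "image pull" training/library/interview-scenarios
```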

8. Tear everything down

make undeploy-all   # Remove all deployed resources

Directory Structure

training/
├── README.md                          # You are here
├── START_HERE.md                      # 10-minute warm-up + orientation
├── catalog.md                         # Asset registry inventory (maintainer view)
├── kubectl-debugging-cheatsheet.md    # Dense command reference
├── interactive/                       # Hands-on, executable content
│   ├── exercises/                     # 250 break/fix exercises across 5 tracks
│   ├── runtime-labs/                  # 8 hands-on labs (break -> fix -> verify)
│   ├── incidents/                     # Incident simulator (18 scenarios)
│   ├── investigation/                 # Guided investigation engine (playbooks, hints)
│   ├── chaos/                         # 7 safe, reversible chaos scripts
│   ├── assessments/                   # Scorecards and self-assessment
│   ├── knowledge/                     # Flashcards and spaced-repetition data
│   └── registry/                      # Canonical asset registry
├── library/                           # Reference and study material
│   ├── portal/                        # Index of indexes (topics, paths, artifacts)
│   ├── runbooks/                      # Incident response runbooks
│   ├── interview-scenarios/           # DevOps interview prep scenarios
│   ├── drills/                        # Muscle-memory exercises
│   ├── skillchecks/                   # Self-assessment skill checks
│   ├── cheatsheets/                   # Quick-reference cheat sheets
│   ├── topics/                        # Deep-dive topic packs
│   ├── scenarios/                     # Multi-step troubleshooting scenarios
│   ├── solutions/                     # Hint ladders + answer keys for labs
│   ├── curriculum/                    # Structured learning paths
│   │   ├── tracks/                    # 10 skill-based tracks
│   │   ├── levels/                    # 5 progressive levels
│   │   └── coverage/                  # Coverage map and gaps
│   ├── domains/                       # Domain-specific content
│   ├── guides/                        # CI pipeline, DevOps roadmap guides
│   └── reference/                     # Knowledge architecture + lookup material
│       └── knowledge-architecture/    # Concept/failure/command intelligence

Prerequisites

A running k3s cluster (or any other Kubernetes cluster)
kubectl and helm installed and configured
This repo cloned locally
For runtime labs: make deploy-all completed successfully
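The tool prerequisites can be checked up front with a short preflight, sketched below. `preflight` is a hypothetical helper; it only confirms that each command is on PATH and does not check versions or cluster access.

```shell
# preflight CMD...: report every command that is not on PATH.
# Returns 0 when all are present, 1 otherwise. Hypothetical helper.
preflight() {
  missing=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: check the tools, then confirm the cluster is reachable.
#   preflight kubectl helm make && kubectl cluster-info
```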