Drill: Build an Event Timeline for Debugging

Goal

Use kubectl get events and kubectl describe to build a chronological timeline of cluster events for incident investigation.

Setup

  • kubectl configured with cluster access
  • A namespace with recent activity or issues

Commands

Show events in a namespace sorted by time:

kubectl get events -n <namespace> --sort-by='.lastTimestamp'
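Some events (especially those served through the newer events.k8s.io API) do not always populate lastTimestamp, which makes the sort above misorder them. A reasonable fallback, if you see blank LAST SEEN values, is sorting by creation time:

```shell
# Fallback sort key for events whose lastTimestamp is empty
kubectl get events -n <namespace> --sort-by='.metadata.creationTimestamp'
```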

Show events for a specific resource:

kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>
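Field selectors combine with commas, so the resource filter composes with the other filters in this drill. For example, time-sorted Warnings for one pod (the pod name is a placeholder):

```shell
kubectl get events -n <namespace> \
  --field-selector involvedObject.name=<pod-name>,type=Warning \
  --sort-by='.lastTimestamp'
```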

Filter events by type (Warning only):

kubectl get events -n <namespace> --field-selector type=Warning
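Events also support a reason field selector, which narrows a Warning search to a single failure mode. FailedScheduling here is just an example reason:

```shell
# Only Warnings with a specific reason
kubectl get events -n <namespace> --field-selector type=Warning,reason=FailedScheduling
```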

Use describe for a resource's event history:

kubectl describe pod <pod-name> -n <namespace>
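Describe output is long, but Events: is its last section, so a sed range prints just the event history:

```shell
# Print from the Events: header to end of output
kubectl describe pod <pod-name> -n <namespace> | sed -n '/^Events:/,$p'
```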

Show events across all namespaces:

kubectl get events -A --sort-by='.lastTimestamp' | tail -30

Watch events in real time:

kubectl get events -n <namespace> -w
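To skip the initial listing and see only events that arrive after you start watching, kubectl get also accepts --watch-only:

```shell
# Watch without replaying the current event list first
kubectl get events -n <namespace> --watch-only
```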

Show events with custom columns for cleaner output:

kubectl get events -n <namespace> -o custom-columns='TIME:.lastTimestamp,TYPE:.type,REASON:.reason,OBJECT:.involvedObject.name,MESSAGE:.message' --sort-by='.lastTimestamp'
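That flag list is tedious to retype during an incident. A small convenience, assuming a bash or zsh shell (kevents is a made-up alias name, not a kubectl feature):

```shell
# Hypothetical alias wrapping the custom-columns invocation above
alias kevents="kubectl get events --sort-by='.lastTimestamp' \
  -o custom-columns='TIME:.lastTimestamp,TYPE:.type,REASON:.reason,OBJECT:.involvedObject.name,MESSAGE:.message'"

kevents -n <namespace>
```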

Check events related to a node:

kubectl get events --field-selector involvedObject.kind=Node,involvedObject.name=<node-name>
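kubectl describe node shows the same node events alongside conditions and capacity, which can be quicker than composing a field selector:

```shell
# Events: is the last section of describe node output as well
kubectl describe node <node-name> | sed -n '/^Events:/,$p'
```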

What to Look For

  • Event sequence: Scheduled -> Pulling -> Pulled -> Created -> Started is healthy
  • FailedScheduling events reveal resource or affinity constraints
  • BackOff events show containers crashing repeatedly
  • Unhealthy events from liveness probes precede container restarts; Unhealthy events from readiness probes instead remove the pod from Service endpoints
  • Events have a default TTL of 1 hour; old events may be gone
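The healthy sequence in the first bullet can be checked mechanically. A minimal sketch, where reasons.txt stands in for real, time-sorted output from kubectl get events -o custom-columns=REASON:.reason:

```shell
# reasons.txt stands in for time-sorted REASON output from kubectl get events
printf '%s\n' Scheduled Pulling Pulled Created Started > reasons.txt

# Report any step of the healthy startup sequence that never appeared
for step in Scheduled Pulling Pulled Created Started; do
  grep -qx "$step" reasons.txt || echo "missing: $step"
done
```

With a healthy pod the loop prints nothing; a gap (for example, no Pulled event) points at where startup stalled.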

Common Mistakes

  • Relying on events that have expired (default retention is ~1 hour)
  • Not checking node-level events when pods fail across multiple nodes
  • Ignoring the count field that shows how many times an event repeated
  • Not correlating events across the Pod, ReplicaSet, and Deployment levels together
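On the last point: Pods created through a Deployment embed its name (<deployment>-<replicaset-hash>-<pod-suffix>), so a plain grep on the Deployment name usually pulls in events from all three levels at once:

```shell
# Matches the Deployment, its ReplicaSets, and their Pods by name prefix
kubectl get events -n <namespace> --sort-by='.lastTimestamp' | grep '<deployment-name>'
```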

Cleanup

No cleanup needed. These are read-only commands.