Questions to Determine
- What are the current node conditions (MemoryPressure, DiskPressure, PIDPressure)?
- Which pods are consuming the most memory on the node?
- Do the affected pods have memory requests and limits defined?
- What are the kubelet eviction thresholds configured on this node?
- Are evicted pods being rescheduled back to the same overloaded node?
- Is there a single pod causing the memory spike, or is it cumulative overcommitment?
- What is the total memory requested vs. allocatable on the node?
- Are there LimitRange or ResourceQuota policies in the affected namespaces?
- Is the cluster autoscaler enabled and are there pending pods?
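Most of these questions can be answered from the command line. The following is a sketch of one way to do so with `kubectl`, assuming `<node-name>` and `<namespace>` are placeholders for your actual node and namespace, and that metrics-server is installed (required for `kubectl top`):

```shell
# Node conditions (MemoryPressure, DiskPressure, PIDPressure), plus the
# "Allocated resources" section comparing requests against allocatable:
kubectl describe node <node-name>

# Top memory consumers across all namespaces, highest first:
kubectl top pod --all-namespaces --sort-by=memory

# Requests and limits for every container scheduled on the node:
kubectl get pods --all-namespaces \
  --field-selector spec.nodeName=<node-name> \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.containers[*].resources}{"\n"}{end}'

# Kubelet eviction thresholds (look for evictionHard / evictionSoft):
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz"

# LimitRange and ResourceQuota policies in an affected namespace:
kubectl get limitrange,resourcequota -n <namespace>

# Pending pods, which signal that the cluster autoscaler (if enabled)
# may need to add capacity:
kubectl get pods --all-namespaces --field-selector status.phase=Pending
```

Correlating the `describe node` output (total requests vs. allocatable) with the per-pod requests and limits usually distinguishes a single runaway pod from cumulative overcommitment.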