Questions to Determine
- What are the current node conditions (MemoryPressure, DiskPressure, PIDPressure)?
- Which pods are consuming the most memory on the node?
- Do the affected pods have memory requests and limits defined?
- What are the kubelet eviction thresholds configured on this node?
- Are evicted pods being rescheduled back to the same overloaded node?
- Is there a single pod causing the memory spike, or is it cumulative overcommitment?
- What is the total memory requested vs. allocatable on the node?
- Are there LimitRange or ResourceQuota policies in the affected namespaces?
- Is the cluster autoscaler enabled and are there pending pods?
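Most of these questions can be answered from the command line. The following is a sketch of one way to do so with `kubectl`, assuming `<node-name>` and `<namespace>` are placeholders for your actual node and namespace, and that metrics-server is installed (required for `kubectl top`):

```shell
# Node conditions (MemoryPressure, DiskPressure, PIDPressure), plus the
# "Allocated resources" section comparing requests against allocatable:
kubectl describe node <node-name>

# Top memory consumers across all namespaces, highest first:
kubectl top pod --all-namespaces --sort-by=memory

# Requests and limits for every container scheduled on the node:
kubectl get pods --all-namespaces \
  --field-selector spec.nodeName=<node-name> \
  -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name}{"\t"}{.spec.containers[*].resources}{"\n"}{end}'

# Kubelet eviction thresholds (look for evictionHard / evictionSoft):
kubectl get --raw "/api/v1/nodes/<node-name>/proxy/configz"

# LimitRange and ResourceQuota policies in an affected namespace:
kubectl get limitrange,resourcequota -n <namespace>

# Pending pods, which signal that the cluster autoscaler (if enabled)
# may need to add capacity:
kubectl get pods --all-namespaces --field-selector status.phase=Pending
```

Correlating the `describe node` output (total requests vs. allocatable) with the per-pod requests and limits usually distinguishes a single runaway pod from cumulative overcommitment.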