Solution
Triage
- Review the error output from the failed drain command to identify blocking pods.
- List all pods on the node and categorize them as DaemonSet or non-DaemonSet pods.
- Identify which DaemonSets are running on the node.
- Check whether any DaemonSet pods use local storage.
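The triage steps above can be sketched as two small shell helpers around kubectl. The function names are hypothetical and chosen only for illustration. The custom-columns layout is one possible way to show each pod's owner and emptyDir usage, and the rendering of the EMPTYDIR column may vary by kubectl version.

```shell
#!/usr/bin/env bash
# Triage helpers (illustrative names, not part of the original runbook).

# List every pod scheduled on the given node, across all namespaces.
list_node_pods() {
  kubectl get pods --all-namespaces -o wide \
    --field-selector "spec.nodeName=$1"
}

# Show each pod's owner (kind and name) and any emptyDir volumes it declares.
# Pods whose OWNER_KIND is DaemonSet are the ones that block a plain drain.
list_pod_owners_and_volumes() {
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=$1" \
    -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,OWNER_KIND:.metadata.ownerReferences[0].kind,OWNER_NAME:.metadata.ownerReferences[0].name,EMPTYDIR:.spec.volumes[*].emptyDir'
}
```

For the node in this scenario, the calls would be list_node_pods node-5.internal and list_pod_owners_and_volumes node-5.internal.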
Root Cause
kubectl drain by default refuses to delete pods managed by DaemonSets. This is by design: DaemonSet pods are tied to nodes, and evicting them would just cause the DaemonSet controller to immediately recreate them on the same node. The drain command requires explicit acknowledgment via --ignore-daemonsets to skip these pods.
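The engineer ran kubectl drain node-5.internal without the flag, so the command failed before evicting any pods. Because drain cordons the node before it checks pods, the node is left marked unschedulable even though nothing was evicted.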
Fix
- Run the drain with the correct flags: --ignore-daemonsets, plus --delete-emptydir-data if any pods on the node use emptyDir volumes.
- If Fluentd uses a buffer directory, allow time for graceful shutdown by setting an appropriate --timeout.
- Verify that all non-DaemonSet pods have been evicted. Only DaemonSet pods should remain.
- Once the node is fully drained, proceed with decommissioning (remove it from the cloud provider, then delete the Node object). Deleting the Node object triggers DaemonSet pod cleanup.
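A minimal sketch of the fix sequence above, assuming bash, kubectl 1.20 or later (for --delete-emptydir-data), and awk. The function names and the 300s default timeout are assumptions for illustration; pick a timeout that fits your workloads.

```shell
#!/usr/bin/env bash
# Drain, verify, delete. Names and the default timeout are illustrative.

# Drain the node, skipping DaemonSet pods and accepting emptyDir data loss.
drain_node() {
  local node="$1" timeout="${2:-300s}"
  kubectl drain "$node" \
    --ignore-daemonsets \
    --delete-emptydir-data \
    --timeout="$timeout"
}

# Print namespace/name of pods still on the node that are NOT owned by a
# DaemonSet. Empty output means only DaemonSet pods remain.
remaining_non_daemonset_pods() {
  local node="$1"
  kubectl get pods --all-namespaces \
    --field-selector "spec.nodeName=$node" \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{" "}{.metadata.ownerReferences[0].kind}{"\n"}{end}' \
    | awk '$2 != "DaemonSet" { print $1 }'
}

# Delete the Node object once the drain has been verified.
# Removing the instance from the cloud provider is provider-specific
# and intentionally not shown here.
delete_node() {
  kubectl delete node "$1"
}
```

A typical order would be drain_node node-5.internal, then remaining_non_daemonset_pods node-5.internal, and delete_node node-5.internal only when the second command prints nothing. Static (mirror) pods, if the node runs any, are also skipped by drain and will appear in that list.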
Rollback / Safety
- If the drain needs to be aborted, uncordon the node with kubectl uncordon node-5.internal so it can accept pods again.
- Verify that evicted workloads are running healthy on other nodes before deleting the node.
- If Fluentd has a persistent buffer, ensure log delivery is caught up before removing the node.
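The rollback and safety checks above might look like the following sketch. The function names are hypothetical. Listing Pending pods is only one rough signal that evicted workloads failed to reschedule; a pod in the Running phase is not necessarily Ready, so application-level health checks still apply.

```shell
#!/usr/bin/env bash
# Rollback and safety helpers (illustrative names).

# Abort the drain: mark the node schedulable again.
abort_drain() {
  kubectl uncordon "$1"
}

# List pods stuck in Pending anywhere in the cluster. Evicted workloads
# that could not be placed on another node will show up here.
pending_pods() {
  kubectl get pods --all-namespaces -o wide \
    --field-selector status.phase=Pending
}
```

For this scenario, abort_drain node-5.internal restores scheduling, and pending_pods should print no workload pods before the node is deleted.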
Common Traps
- Forgetting --ignore-daemonsets in automation. Any drain script or CI pipeline must include this flag; otherwise it will fail on every node.
- Confusing --ignore-daemonsets with deleting DaemonSet pods. The flag tells drain to skip them, not to evict them. They remain running.
- Using --force instead of --ignore-daemonsets. --force handles pods not managed by any controller (bare pods). It does not address DaemonSet pods. You typically need both flags for a clean drain.
- Not accounting for --delete-emptydir-data. If pods use emptyDir volumes, drain will refuse without this flag. Fluentd commonly uses emptyDir for its buffer.
- Assuming DaemonSet pods vanish after drain. They persist until the Node object is deleted or the DaemonSet is modified.