Grading Checklist
- Identifies that 2 CoreDNS replicas are insufficient for 200+ pods generating heavy DNS traffic.
- Explains the `ndots:5` problem: lookups for external names like `api.stripe.com` (2 dots < 5) cause the pod's resolver to try every search domain first.
- Calculates the query amplification: a single lookup for `api.stripe.com` generates up to 8 queries against the search path (4 search suffixes × 2 for A+AAAA) before the name is tried as-is.
- Recommends scaling CoreDNS replicas using HPA or increasing the replica count.
- Suggests deploying NodeLocal DNSCache (node-level caching daemonset) to reduce load on CoreDNS.
- Recommends adding a trailing dot to external FQDNs in application config (e.g., `api.stripe.com.`) to bypass search suffixes.
- Suggests overriding `dnsConfig` in the pod spec to reduce ndots (e.g., `ndots: 2`) for pods making many external DNS calls.
- Mentions checking CoreDNS resource limits and increasing them if CoreDNS is throttled.
- Recommends enabling CoreDNS metrics (Prometheus) to monitor query rates and latency.
- Notes that the `autopath` plugin in CoreDNS can optimize search domain resolution.
- Warns against disabling the search domain list entirely, as it breaks internal service discovery.
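
The amplification arithmetic in the checklist can be checked with a small model. The sketch below is a simplified, glibc-style approximation of search-list expansion, not the actual resolver code. The function names `candidate_names` and `queries_for_external` are hypothetical, and the four-entry search list is an assumption: three cluster suffixes for a pod in the `default` namespace plus one made-up node-inherited domain, `example.internal`. It assumes the name resolves only in its absolute form, so every search-suffix attempt fails first.

```python
# Simplified model of resolv.conf search-list expansion (an assumption-laden
# sketch, not the real resolver implementation).

# Hypothetical search list: three cluster suffixes plus one node-inherited domain.
SEARCH = [
    "default.svc.cluster.local",
    "svc.cluster.local",
    "cluster.local",
    "example.internal",
]


def candidate_names(name, search, ndots=5):
    """Return the fully qualified names that would be tried, in order."""
    if name.endswith("."):
        # A trailing dot marks the name as absolute: no search suffixes apply.
        return [name]
    expanded = [f"{name}.{suffix}." for suffix in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        # Enough dots: the name is tried as-is before the search list.
        return [absolute] + expanded
    # Too few dots: the search list is tried first, then the name as-is.
    return expanded + [absolute]


def queries_for_external(name, search, ndots=5, record_types=("A", "AAAA")):
    """Count queries sent until the absolute (external) name is reached."""
    names = candidate_names(name, search, ndots)
    absolute = name if name.endswith(".") else name + "."
    tried = names[: names.index(absolute) + 1]
    return len(tried) * len(record_types)


if __name__ == "__main__":
    print(queries_for_external("api.stripe.com", SEARCH))           # 10
    print(queries_for_external("api.stripe.com.", SEARCH))          # 2
    print(queries_for_external("api.stripe.com", SEARCH, ndots=2))  # 2
```

Under these assumptions the default configuration sends 10 queries: the 8 failing search-suffix queries counted in the checklist, plus the A/AAAA pair that actually resolves. Both the trailing dot and `ndots: 2` reduce that to the final pair.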