Pattern: ndots:5 Query Amplification

ID: FP-036 | Family: Configuration Landmine | Frequency: Common | Blast Radius: Multi-Service | Detection Difficulty: Subtle

The Shape

Kubernetes defaults pod DNS configuration to ndots:5, meaning any name with fewer than 5 dots goes through search-domain expansion before the literal name is tried. A lookup for api.example.com (2 dots) first tries api.example.com.default.svc.cluster.local, then api.example.com.svc.cluster.local, then api.example.com.cluster.local, and only then api.example.com: 4 DNS queries for what should be 1 (more if the resolver issues separate A and AAAA queries per name). CoreDNS receives roughly 4× the expected query volume. Under load, this amplification overwhelms CoreDNS and causes DNS resolution failures for all pods.
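The expansion order can be sketched with a short simulation. This is a simplified model of the resolver's search logic, not real resolver code; an actual resolver stops at the first successful answer and also consults /etc/hosts:

```python
# Simplified model of glibc-style search-list expansion under ndots.
# (Illustrative only; real resolvers stop at the first name that resolves.)
def candidate_names(name, search_domains, ndots=5):
    """Return the DNS names tried, in order, for a given lookup."""
    if name.endswith("."):           # trailing dot: already fully qualified
        return [name]
    if name.count(".") >= ndots:     # enough dots: literal name tried first
        return [name] + [f"{name}.{d}" for d in search_domains]
    # fewer dots than ndots: search-list variants first, literal name last
    return [f"{name}.{d}" for d in search_domains] + [name]

# Default search list for a pod in the "default" namespace
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

print(candidate_names("api.example.com", search))   # 4 names for 1 lookup
print(candidate_names("api.example.com.", search))  # trailing dot: 1 name
```

Note that the trailing-dot form short-circuits expansion entirely, which is why it appears below as the immediate fix.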

How You'll See It

In Kubernetes

# In a pod, trace network syscalls during an external DNS lookup:
$ strace -e trace=network getent hosts api.example.com 2>&1 | grep sendto
# Shows 4 sendto() calls (one per candidate name) instead of 1

CoreDNS metrics show coredns_dns_request_duration_seconds spiking. Pods report intermittent dial tcp: lookup api.example.com: no such host errors, not consistently, because some queries succeed before CoreDNS is overloaded.

In Linux/Infrastructure

The ndots option itself comes from the resolver library (resolv.conf), not Kubernetes; Kubernetes just defaults it to 5 instead of the usual 1. The same pattern exists on any Linux host whose /etc/resolv.conf lists multiple search domains: every name below the ndots threshold is tried against each search domain before the literal name, multiplying DNS queries by the number of search domains.
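A minimal illustration with hypothetical domains: on a host with a resolv.conf like the one below, every lookup below the ndots threshold generates up to three extra queries, one per search entry, before the literal name is tried:

```
# /etc/resolv.conf (hypothetical values)
search corp.example.com dc1.example.com example.com
nameserver 10.0.0.2
options ndots:1
```

With ndots:1 here, only dotless names (e.g. a bare hostname) are expanded; Kubernetes' ndots:5 extends that expansion to almost every realistic name.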

In CI/CD

CI jobs running in Kubernetes pods make many external API calls (package registries, notification services). Each call generates 4–8 DNS queries instead of 1. CoreDNS is the bottleneck during parallel CI builds.

The Tell

CoreDNS's coredns_dns_requests_total (coredns_dns_request_count_total in older releases) is 4–6× higher than the number of application-level DNS lookups. DNS failures are intermittent and appear under load, not at low traffic. strace on a pod shows multiple consecutive DNS queries for variants of the same hostname before the literal name is tried.
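One way to quantify the tell, assuming CoreDNS's prometheus plugin is enabled: the search-list variants of external names (api.example.com.cluster.local and friends) all resolve to NXDOMAIN, so a high NXDOMAIN share of total responses is consistent with ndots amplification:

```
# Fraction of CoreDNS responses that are NXDOMAIN (sketch; exact metric
# names depend on your CoreDNS version)
sum(rate(coredns_dns_responses_total{rcode="NXDOMAIN"}[5m]))
  / sum(rate(coredns_dns_responses_total[5m]))
```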

Common Misdiagnosis

Looks Like                        | But Actually            | How to Tell the Difference
CoreDNS overloaded (underpowered) | ndots amplification     | Adding CoreDNS replicas reduces symptoms but does not remove the 4x amplification
Network instability               | DNS query amplification | DNS failures are intermittent and load-correlated, not random
External DNS outage               | Local CoreDNS overload  | The upstream public resolver is healthy while CoreDNS metrics show overload

The Fix (Generic)

  1. Immediate: Use an FQDN with a trailing dot in application configs (api.example.com.); the trailing dot tells the resolver the name is already fully qualified, so no search-domain expansion happens.
  2. Short-term: In the pod spec, set dnsConfig.options: [{name: ndots, value: "1"}] for pods that primarily make external DNS lookups.
  3. Long-term: Rely on CoreDNS caching (built in), and tune ndots per deployment based on whether the service primarily calls internal names (needs ndots:5 for short service names) or external names (needs ndots:1).
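The short-term fix from the list above looks like this in a Deployment manifest (illustrative fragment; the dnsConfig field is standard Kubernetes, the deployment name is made up):

```yaml
# deployment fragment: cap search expansion for an external-heavy workload
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payment-service   # hypothetical name
spec:
  template:
    spec:
      dnsConfig:
        options:
          - name: ndots
            value: "1"    # value must be a quoted string
```

With ndots:1, names containing at least one dot (virtually all external hostnames) are tried literally first; dotless in-cluster service names still go through the search list.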

Real-World Examples

  • Example 1: Microservice making 1,000 external API calls/min. With ndots:5: 4,000–5,000 CoreDNS queries/min. CoreDNS at 2 replicas was overwhelmed. Reducing ndots to 1 for that deployment cut DNS queries by 75%.
  • Example 2: Black Friday: 10× normal traffic. External payment API lookups amplified by ndots:5. CoreDNS overloaded; payment DNS resolution failures. Intermittent payment failures for 12 minutes until CoreDNS was scaled up.

War Story

Payment service was failing intermittently, maybe 2% of requests getting DNS failures. We scaled CoreDNS from 2 to 5 replicas: improved but not fixed. Then someone ran strace on the payment pod and showed us: every lookup of api.stripe.com (2 dots) was triggering 4 DNS queries (the 3 search-list variants plus the literal name). At our request rate, the payment service alone was generating 12,000 DNS queries/min from "3,000 actual lookups." We added ndots: "1" to the payment deployment's dnsConfig. DNS query rate dropped 75%. CoreDNS load dropped immediately. We reverted to 2 CoreDNS replicas, which were now sufficient.

Cross-References