Pattern: ndots:5 Query Amplification
ID: FP-036 | Family: Configuration Landmine | Frequency: Common | Blast Radius: Multi-Service | Detection Difficulty: Subtle
The Shape
Kubernetes sets ndots:5 in pod DNS configuration by default, so any name with fewer
than 5 dots goes through search-domain expansion before the literal name is tried. For a
pod in the default namespace, a lookup for api.example.com (2 dots) first tries
api.example.com.default.svc.cluster.local, then api.example.com.svc.cluster.local, then
api.example.com.cluster.local, and only then api.example.com: 4 DNS queries for what
should be 1. CoreDNS receives 4× as many queries as the application's lookup count
suggests. Under load, this amplification can overwhelm CoreDNS and cause DNS resolution
failures for all pods.
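
The expansion order described above can be sketched in a few lines of Python. This is a simplified model of glibc-style resolver behaviour, not the resolver's actual code: the function names are hypothetical, the search list is the one implied by the example (a pod in the default namespace), and other resolvers such as musl may order or skip candidates differently.

```python
from typing import List, Set

# Search list implied by the expansion order in the text (pod in "default").
POD_SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]


def candidate_names(name: str, search: List[str], ndots: int) -> List[str]:
    """Return the names a glibc-style resolver would try, in order."""
    if name.endswith("."):
        # Trailing dot: already fully qualified, so the search list is skipped.
        return [name]
    expanded = [f"{name}.{domain}" for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: the literal name goes first, the search list is a fallback.
        return [name] + expanded
    # Too few dots: every search domain is tried before the literal name.
    return expanded + [name]


def queries_until_answer(name: str, search: List[str], ndots: int,
                         existing: Set[str]) -> int:
    """Count queries sent up to and including the first candidate that exists."""
    candidates = candidate_names(name, search, ndots)
    for sent, candidate in enumerate(candidates, start=1):
        if candidate.rstrip(".") in existing:
            return sent
    return len(candidates)


if __name__ == "__main__":
    existing = {"api.example.com"}  # only the real external name resolves
    print(queries_until_answer("api.example.com", POD_SEARCH, 5, existing))   # 4
    print(queries_until_answer("api.example.com", POD_SEARCH, 1, existing))   # 1
    print(queries_until_answer("api.example.com.", POD_SEARCH, 5, existing))  # 1
```

The model counts one query per candidate name. A resolver that asks for both A and AAAA records would send two per candidate, which this sketch does not attempt to capture.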
How You'll See It
In Kubernetes
# In a pod, for an external DNS lookup:
$ strace -e trace=network getent hosts api.example.com 2>&1 | grep sendto
# Shows 4 sendto() calls instead of 1
coredns_dns_request_duration_seconds spikes. Pods report intermittent
dial tcp: lookup api.example.com: no such host errors — not consistently, because some
queries succeed before CoreDNS is overloaded.
In Linux/Infrastructure
ndots is a standard resolv.conf resolver option, not a Kubernetes-specific one; Kubernetes
only raises it from the resolver default of 1 to 5. The same pattern exists on any host
whose /etc/resolv.conf lists multiple search domains: every name with fewer dots than
ndots is tried against each search domain before the literal name, adding one DNS query
per search domain.
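
To check a given host or pod, the two relevant settings can be read out of resolv.conf directly. The following is a minimal sketch, assuming the file uses the usual `search` and `options ndots:N` directives; the function names and the sample nameserver address are illustrative, not taken from any real cluster.

```python
from typing import List, Tuple


def parse_resolv_conf(text: str) -> Tuple[List[str], int]:
    """Return (search domains, ndots) from the contents of a resolv.conf file."""
    search: List[str] = []
    ndots = 1  # resolver default when no ndots option is present
    for raw in text.splitlines():
        fields = raw.split()
        if not fields or fields[0][0] in "#;":
            continue  # blank line or comment
        if fields[0] == "search":
            search = fields[1:]  # a later search line replaces an earlier one
        elif fields[0] == "options":
            for opt in fields[1:]:
                if opt.startswith("ndots:"):
                    ndots = int(opt.split(":", 1)[1])
    return search, ndots


def queries_before_literal(name: str, search: List[str], ndots: int) -> int:
    """Number of search-domain queries sent before the literal name is tried."""
    if name.endswith(".") or name.count(".") >= ndots:
        return 0
    return len(search)


# Illustrative pod-style file; the nameserver IP is a placeholder.
SAMPLE = """\
nameserver 10.96.0.10
search default.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
"""

if __name__ == "__main__":
    search, ndots = parse_resolv_conf(SAMPLE)
    print(ndots, queries_before_literal("api.example.com", search, ndots))  # 5 3
```

In practice the input would come from reading `/etc/resolv.conf` inside the pod or on the host in question.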
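In CI/CD
CI jobs running in Kubernetes pods make many external API calls (package registries, notification services). Each call generates 4–8 DNS queries instead of 1; the higher figure typically appears when the resolver asks for both A and AAAA records for every candidate name. CoreDNS becomes the bottleneck during parallel CI builds.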
The Tell
- CoreDNS `request_count_total` is 4–6× higher than the number of application-level DNS lookups.
- Intermittent DNS failures under load, not under low traffic.
- `strace` on a pod shows multiple consecutive DNS queries for variants of the same hostname before the actual query.
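
The "variants of the same hostname" signal can also be checked from a list of observed query names, however they were collected (for example from resolver or DNS server logs). The sketch below is one possible approach under that assumption: it strips any known search domain from each name and reports base names that were queried in more than one variant. The function names and the sample names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def strip_search_suffix(qname: str, search: List[str]) -> str:
    """Remove a trailing search domain from a query name, if present."""
    name = qname.rstrip(".").lower()
    # Longest domain first so "default.svc.cluster.local" wins over "cluster.local".
    for domain in sorted(search, key=len, reverse=True):
        suffix = "." + domain
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def expansion_suspects(qnames: Iterable[str], search: List[str]) -> Dict[str, int]:
    """Map base name -> number of distinct variants, for names seen in >1 variant."""
    variants = defaultdict(set)
    for qname in qnames:
        variants[strip_search_suffix(qname, search)].add(qname.rstrip(".").lower())
    return {base: len(seen) for base, seen in variants.items() if len(seen) > 1}


if __name__ == "__main__":
    search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
    observed = [
        "api.example.com.default.svc.cluster.local.",
        "api.example.com.svc.cluster.local.",
        "api.example.com.cluster.local.",
        "api.example.com.",
        "kubernetes.default.svc.cluster.local.",
    ]
    print(expansion_suspects(observed, search))  # {'api.example.com': 4}
```

A base name that shows up once per search domain plus once literally is a strong hint that the pod's ndots setting, not the application, is generating the extra traffic.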
Common Misdiagnosis
| Looks Like | But Actually | How to Tell the Difference |
|---|---|---|
| CoreDNS overloaded (underpowered) | ndots amplification | Adding more CoreDNS replicas reduces symptoms but doesn't fix the 4x amplification |
| Network instability | DNS query amplification | DNS failures are intermittent and load-correlated; not random |
| External DNS outage | Local CoreDNS overload | External DNS (public resolver) has no issues; CoreDNS metrics show overload |
The Fix (Generic)
- Immediate: Use FQDN (trailing dot) in application configs: `api.example.com.` A trailing dot tells the resolver "this is already fully qualified; don't expand."
- Short-term: In pod spec, set `dnsConfig.options: [{name: ndots, value: "1"}]` for pods that primarily make external DNS lookups.
- Long-term: Use CoreDNS caching (already built in); tune `ndots` per deployment based on whether the service primarily calls internal (needs ndots:5) or external (needs ndots:1) names.
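
The first two fixes can be expressed as small helpers. This is a sketch, not a definitive rollout procedure: `ensure_fqdn` and `ndots_patch` are hypothetical names, and the patch assumes the workload is a Deployment, so the pod-level `dnsConfig` lives under `spec.template.spec`. It also assumes the client library in use accepts a hostname with a trailing dot.

```python
import json


def ensure_fqdn(host: str) -> str:
    """Append a trailing dot so the resolver skips search-domain expansion."""
    return host if host.endswith(".") else host + "."


def ndots_patch(ndots: int) -> dict:
    """Build a Deployment patch body that sets the pod-level ndots option."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "dnsConfig": {
                        # Kubernetes expects the option value as a string.
                        "options": [{"name": "ndots", "value": str(ndots)}]
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    print(ensure_fqdn("api.example.com"))  # api.example.com.
    print(json.dumps(ndots_patch(1)))
```

The JSON printed by the last line could be passed to `kubectl patch deployment <name> -p '<json>'`, where `<name>` stands in for the deployment being changed.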
Real-World Examples
- Example 1: Microservice making 1,000 external API calls/min. With ndots:5: 4,000–5,000 CoreDNS queries/min. CoreDNS at 2 replicas was overwhelmed. Reducing ndots to 1 for that deployment cut DNS queries by 75%.
- Example 2: Black Friday: 10× normal traffic. External payment API lookups amplified by ndots:5. CoreDNS overloaded; payment DNS resolution failures. Intermittent payment failures for 12 minutes until CoreDNS was scaled up.
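
The numbers in Example 1 can be reproduced with a short calculation. The sketch assumes three search domains, one query per candidate name (no A/AAAA doubling), and an external name that only resolves in its literal form; the function name is hypothetical. Under those assumptions, the upper end of the 4,000–5,000 range would correspond to a fourth search domain, for instance one inherited from the node, though the source does not say so.

```python
def dns_queries_per_min(lookups_per_min: int, search_domains: int,
                        ndots: int, dots_in_name: int) -> int:
    """Queries per minute reaching the DNS server for one external name."""
    # Below the ndots threshold, every search domain is tried before the literal.
    per_lookup = search_domains + 1 if dots_in_name < ndots else 1
    return lookups_per_min * per_lookup


if __name__ == "__main__":
    # api.example.com-style name: 2 dots, 1,000 lookups/min, 3 search domains.
    before = dns_queries_per_min(1000, 3, ndots=5, dots_in_name=2)
    after = dns_queries_per_min(1000, 3, ndots=1, dots_in_name=2)
    reduction = 1 - after / before
    print(before, after, f"{reduction:.0%}")  # 4000 1000 75%
```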
War Story
Payment service was failing intermittently: maybe 2% of requests were getting DNS failures. We scaled CoreDNS from 2 to 5 replicas: improved but not fixed. Then someone ran `strace` on the payment pod and showed us: every lookup of `api.stripe.com` (2 dots) was triggering 4 DNS queries (the 3 search variants plus the actual one). At our request rate, the payment service alone was generating 12,000 DNS queries/min from "3,000 actual lookups." We added `ndots: "1"` to the payment deployment's dnsConfig. DNS query rate dropped 75%. CoreDNS load dropped immediately. We reverted to 2 CoreDNS replicas, which was now sufficient.
Cross-References
- Topic Packs: k8s-ops, dns-ops
- Case Studies: kubernetes_ops/coredns-timeout-pod-dns/
- Footguns: k8s-ops/footguns.md — "Setting ndots:5 for external domains"
- Related Patterns: FP-021 (retry amplification — same multiplication pattern)