Email Infrastructure — Street-Level Ops

Quick Diagnosis Commands

# Check all email DNS records for a domain
dig MX example.com
dig TXT example.com | grep spf
dig TXT mail._domainkey.example.com
dig TXT _dmarc.example.com
dig -x <sending-ip>   # PTR / rDNS check

# Test SMTP connection manually
telnet mail.example.com 25
# or:
openssl s_client -connect mail.example.com:587 -starttls smtp

# Send test message
swaks --to test@recipient.com \
      --from noreply@example.com \
      --server mail.example.com \
      --port 587 --tls \
      --auth LOGIN --auth-user noreply@example.com

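# Check blocklist status (query through your own resolver; Spamhaus refuses public resolvers)
dig +short 10.113.0.203.zen.spamhaus.org   # reversed IP for 203.0.113.10
# Returns nothing = not listed. 127.0.0.x = listed. 127.255.255.x = query error, not a listing.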

# Inspect email headers (Gmail)
# Open message → "..." → "Show original" → copy to MxToolbox Header Analyzer

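# Test SPF (requires pyspf: pip install pyspf)
python3 -c "import spf; print(spf.check2(i='203.0.113.10', s='user@example.com', h='mail.example.com'))"
# check2 returns a (result, explanation) tuple; result is e.g. 'pass', 'fail', or 'permerror'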
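The blocklist check above has two fiddly parts: reversing the octets and reading the answer. A minimal sketch of both follows. The function names `reverse_ipv4` and `classify_dnsbl_answer` are hypothetical, and the `127.255.255.x` error range is the Spamhaus convention; other lists may use different codes.

```shell
#!/bin/sh
# Hypothetical helpers for DNSBL checks; the names are illustrative only.

# Reverse the octets of an IPv4 address for use in a DNSBL query name.
reverse_ipv4() {
    echo "$1" | awk -F. 'NF == 4 { print $4 "." $3 "." $2 "." $1 }'
}

# Interpret the text returned by: dig +short <reversed-ip>.<zone>
classify_dnsbl_answer() {
    case "$1" in
        "")            echo "not-listed" ;;
        127.255.255.*) echo "query-error" ;;  # Spamhaus error range, e.g. public resolver
        127.0.0.*)     echo "listed" ;;
        *)             echo "unexpected" ;;
    esac
}
```

Usage, assuming `dig` is available: `classify_dnsbl_answer "$(dig +short "$(reverse_ipv4 203.0.113.10).zen.spamhaus.org")"`.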

Gotcha: SPF PermError Due to Exceeding 10-Lookup Limit

Under the hood: The 10-lookup limit exists because SPF is evaluated by every receiving mail server for every inbound message. Without a cap, a malicious sender could craft an SPF record that triggers thousands of recursive DNS lookups, turning the receiver's mail server into a DNS amplification tool.

Symptom: Email goes to spam or is rejected with SPF PermError. SPF record looks correct.

Rule: Every include:, a, mx, ptr, and exists mechanism, plus the redirect= modifier, costs one DNS lookup, and lookups inside nested includes count toward the same total. SPF evaluation fails with PermError if the total exceeds 10. ip4:, ip6:, and all are free.

# Trace your SPF lookup chain
dig TXT example.com | grep spf
# v=spf1 include:mailchimp.net include:sendgrid.net include:_spf.google.com mx -all
# That's already 4 lookups before counting nested includes

# Check nested includes
dig TXT _spf.google.com
# v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ~all
# 3 more lookups = 7 total so far

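# Let an SPF evaluator count for you: pyspf enforces the 10-lookup limit
pip install pyspf
python3 -c "import spf; print(spf.check2(i='203.0.113.10', s='user@example.com', h='mail.example.com'))"
# A 'permerror' result means the record is over the limit or otherwise invalid
# On OpenSMTPD hosts, `echo example.com | smtpctl spf walk` expands the include chain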

# Fix options:
# 1. Replace include: with direct ip4: ranges where possible
# 2. Use a service that provides a single flattened SPF include (e.g., PowerDMARC)
# 3. Remove unused include: entries (audit who actually sends as your domain)

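# Example: flatten SendGrid
# Before: include:sendgrid.net (1 lookup plus any nested ones)
# After: ip4:167.89.0.0/17 ip4:198.37.144.0/20 ip4:198.21.0.0/21  (0 lookups)
# These ranges are illustrative. Copy the provider's current record and re-check it
# regularly, because flattened ranges go stale when the provider changes its IPs.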
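Counting nested lookups by hand gets error-prone past two levels. The sketch below walks the include chain recursively and prints the total. `lookup_spf` and `count_spf_lookups` are hypothetical names, and the sketch makes simplifying assumptions: lowercase mechanisms, one SPF record per domain, and no handling of terms split across TXT string chunks.

```shell
#!/bin/sh
# Fetch the SPF record for a domain. Redefine this function to test offline.
lookup_spf() {
    dig +short TXT "$1" | tr -d '"' | grep -i '^v=spf1'
}

# Print the number of DNS-querying terms reachable from a domain's SPF record.
# The body is a subshell "( ... )" so variables stay local to each recursion level.
count_spf_lookups() (
    domain=$1
    depth=${2:-0}
    total=0
    # Guard against include loops; a real evaluator returns PermError long before this.
    [ "$depth" -gt 10 ] && { echo 0; exit 0; }
    set -f                                  # no glob expansion while splitting terms
    for term in $(lookup_spf "$domain"); do
        term=${term#[+?~-]}                 # drop an optional qualifier
        case "$term" in
            include:*)  target=${term#include:} ;;
            redirect=*) target=${term#redirect=} ;;
            a|a:*|a/*|mx|mx:*|mx/*|ptr|ptr:*|exists:*)
                        total=$((total + 1)); continue ;;
            *)          continue ;;         # ip4:, ip6:, all, v=spf1: no lookup
        esac
        nested=$(count_spf_lookups "$target" $((depth + 1)))
        total=$((total + 1 + nested))
    done
    echo "$total"
)
```

Usage: `count_spf_lookups example.com`. A result above 10 means receivers will return PermError for the record.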

Gotcha: DKIM Signature Fails After Forwarding

Symptom: Mailing list or email forwarding causes DKIM to fail. DMARC then fails because both SPF and DKIM fail.

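Rule: Plain forwarding usually breaks SPF, because the forwarder's IP is not in your record, but leaves DKIM intact. Mailing lists and gateways that modify signed headers or the body (subject tags, footers, a rewritten From:) invalidate the DKIM signature as well. This is expected: the fix is ARC on the intermediary, not debugging DKIM.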

# Confirm DKIM is valid for direct delivery (not forwarding path)
# Check the Authentication-Results header in a directly-delivered message:
# dkim=pass header.d=example.com  ← direct delivery, signature valid

# For forwarded messages, check for ARC headers:
# ARC-Authentication-Results: i=1; mx.example.com; dkim=pass; dmarc=pass
# If ARC headers are present and the receiver trusts the sealer, Gmail/M365 can
# accept the message even though DKIM fails

# ARC is sealed by the intermediary (forwarder or list server), not the original sender.
# OpenDKIM does not implement ARC. On a host that forwards mail, run a separate
# ARC signer such as OpenARC (milter) or rspamd's arc module.

# If your DMARC is set to reject and forwarded messages are being rejected,
# use p=quarantine (not reject) until your ARC implementation is in place
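To see at a glance whether a forwarded message failed DKIM but carried a valid ARC chain, it helps to pull the verdicts out of the receiver's Authentication-Results header. The sketch below does that for a saved `.eml` file. The function name `auth_verdicts` is hypothetical, and it assumes the receiver records the ARC verdict as `arc=` in that header, as Gmail does.

```shell
#!/bin/sh
# Print the spf/dkim/dmarc/arc verdicts from a message's Authentication-Results headers.
auth_verdicts() {
    # 1. Unfold continuation lines so each header sits on one line; stop at the body.
    # 2. Keep only Authentication-Results (not ARC-Authentication-Results).
    # 3. Drop parenthesized comments, which can repeat verdicts from earlier hops.
    awk '
        /^[ \t]/ { buf = buf " " $0; next }
        { if (buf != "") print buf; buf = $0 }
        /^$/ { exit }
        END { if (buf != "") print buf }
    ' "$1" | grep -i '^Authentication-Results:' \
           | sed 's/([^)]*)//g' \
           | grep -oiE '(spf|dkim|dmarc|arc)=[a-z]+' | sort -u
}
```

Usage: `auth_verdicts message.eml`. Output such as `arc=pass` next to `dkim=fail` points at a modifying intermediary rather than a signing problem on your side.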

Gotcha: DMARC Aggregate Reports Not Arriving

Symptom: You configured rua=mailto:dmarc@example.com but never receive reports.

Rule: If the rua address is on a different domain than the From: domain, the destination domain must publish an authorization record.

# If From: is @example.com but reports go to @reporting-service.com:
dig TXT example.com._report._dmarc.reporting-service.com
# Must return: v=DMARC1;  (authorization record)

# For same-domain reporting, no extra record needed:
_dmarc.example.com TXT "v=DMARC1; p=none; rua=mailto:dmarc@example.com"
# ← this works without extra records

# Test DMARC report sending manually
# Send a test message that fails SPF+DKIM from an outside IP
# Wait 24 hours — most ISPs batch reports daily

# Check your spam folder — DMARC reports often get caught by spam filters
# Add dmarc@example.com to your allowlist

# Use a DMARC reporting service (Postmark, Valimail, Dmarcian)
# They parse XML and present it as a dashboard
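The authorization record name is easy to get backwards, so here is a small sketch that builds it from the policy domain and the rua address. `rua_authorization_name` is a hypothetical helper, and it uses a simplification: it treats the rua domain as internal only when it equals the policy domain or is a subdomain of it, rather than comparing organizational domains.

```shell
#!/bin/sh
# Print the TXT name the report receiver must publish, or nothing when the
# rua mailbox is inside the policy domain and no authorization is needed.
rua_authorization_name() (
    policy_domain=$1
    rua_domain=${2##*@}                      # strip "mailto:user@"
    case ".$rua_domain" in
        *".$policy_domain") ;;               # same domain or a subdomain: nothing to publish
        *) echo "${policy_domain}._report._dmarc.${rua_domain}" ;;
    esac
)
```

Usage: `dig TXT "$(rua_authorization_name example.com mailto:dmarc@reporting-service.com)" +short` should return a record starting with `v=DMARC1`.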

Pattern: Full Pre-Send Authentication Check

Before going live with a new email sending path, verify every layer:

#!/bin/bash
DOMAIN="example.com"
SENDING_IP="203.0.113.10"
DKIM_SELECTOR="mail"

echo "=== MX Records ==="
dig MX $DOMAIN +short

echo ""
echo "=== SPF Record ==="
dig TXT $DOMAIN +short | grep spf

echo ""
echo "=== SPF Check for $SENDING_IP ==="
python3 -c "import spf; r,t=spf.check2(i='$SENDING_IP', s='user@$DOMAIN', h='mail.$DOMAIN'); print(f'Result: {r} ({t})')"

echo ""
echo "=== DKIM Record ==="
dig TXT ${DKIM_SELECTOR}._domainkey.$DOMAIN +short | head -c 200

echo ""
echo "=== DMARC Record ==="
dig TXT _dmarc.$DOMAIN +short

echo ""
echo "=== PTR Record ==="
dig -x $SENDING_IP +short

echo ""
echo "=== Spamhaus Check ==="
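REVERSED=$(echo "$SENDING_IP" | awk -F. '{print $4"."$3"."$2"."$1}')
LISTED=$(dig +short "${REVERSED}.zen.spamhaus.org")
case "$LISTED" in
    "")            echo "Clean" ;;
    # 127.255.255.x is a Spamhaus error code (e.g. query via a public resolver), not a listing
    127.255.255.*) echo "Query error: $LISTED (use a non-public resolver)" ;;
    *)             echo "LISTED: $LISTED" ;;
esac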

Scenario: Email from New Service Going to Gmail Spam

Default trap: New sending IPs have no reputation with Gmail. Even with perfect SPF/DKIM/DMARC, mail from a brand-new IP often lands in spam until the IP builds positive engagement signals. "IP warming" means starting with small volumes (50-100 emails/day) to engaged recipients and gradually increasing over 2-4 weeks.
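A warm-up plan is easier to follow when the daily caps are written down in advance. The sketch below prints one possible schedule. The function name `warmup_schedule` is hypothetical, and the constant daily growth factor is an assumption: the guidance above gives only the starting volume and the overall timeframe, not a growth curve.

```shell
#!/bin/sh
# Print a daily send cap that grows by a constant factor from START to TARGET
# over DAYS days (DAYS must be at least 2). Geometric growth is an assumption.
warmup_schedule() {
    awk -v start="$1" -v target="$2" -v days="$3" 'BEGIN {
        ratio = exp(log(target / start) / (days - 1))   # constant daily growth factor
        for (d = 1; d <= days; d++)
            printf "day %d: %d\n", d, int(start * ratio ^ (d - 1) + 0.5)
    }'
}
```

Usage: `warmup_schedule 50 10000 21` prints 21 lines, from 50 messages on day 1 to 10000 on day 21. Hold or reduce the cap if spam placement or complaint rates rise.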

You've deployed a new application that sends transactional email. All messages go to Gmail's spam folder.

# Step 1: Get the raw headers from the spam message
# Gmail: "..." → "Show original" → download .eml

# Step 2: Check authentication results
grep -A 5 "Authentication-Results" message.eml
# Look for: dkim=fail, spf=fail, dmarc=fail

# Step 3: Diagnose specific failures
# If dkim=fail:
dig TXT mail._domainkey.example.com  # Is DKIM record published?
# Check that the sending application is actually signing (headers contain DKIM-Signature?)
grep "DKIM-Signature" message.eml

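# If spf=fail:
# What IP is the message coming from? Gmail records it as client-ip= in Received-SPF
grep -i -A 2 "^Received-SPF:" message.eml
grep "Received:" message.eml | head -3
# Is that IP in the SPF record of the envelope sender (Return-Path) domain?
dig TXT example.com | grep spf
# Add the IP: ip4:<new-service-ip>/32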

# Step 4: Check Google Postmaster Tools (postmaster.google.com)
# Domain reputation: High/Medium/Low/Bad?
# If "Low" or "Bad" — IP warming needed

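# Step 5: Does the "From:" domain align with what was authenticated?
# From: noreply@app.example.com needs a DKIM d= domain or a Return-Path domain
# under example.com (relaxed alignment), or exactly app.example.com if adkim=s / aspf=s.
# SPF is evaluated on the Return-Path domain, so that domain needs the SPF record.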

# Step 6: Check for spam triggers in content
# - Image-only email (no text) → high spam score
# - URL shorteners → high spam score
# - "Click here" links → high spam score
# - Missing List-Unsubscribe header on marketing mail

# Step 7: Verify the sending IP has a PTR record
# Gmail rejects or throttles mail from IPs without a PTR that resolves back to the same IP
dig -x <sending-ip> +short
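The alignment question in Step 5 can be checked mechanically once you have the From: domain and the DKIM `d=` or Return-Path domain from the headers. The sketch below tests relaxed alignment. Both function names are hypothetical, and `org_domain` is a deliberate simplification: it takes the last two labels, which is wrong for suffixes such as `co.uk` (a real check needs the Public Suffix List).

```shell
#!/bin/sh
# Naive organizational domain: the last two labels, lowercased.
# Assumption: no multi-label public suffixes (co.uk, com.au, ...).
org_domain() {
    echo "$1" | tr '[:upper:]' '[:lower:]' \
              | awk -F. '{ if (NF >= 2) print $(NF-1) "." $NF; else print $0 }'
}

# relaxed_aligned FROM_DOMAIN AUTH_DOMAIN
# Exit status 0 when both share an organizational domain (DMARC relaxed alignment).
relaxed_aligned() {
    [ "$(org_domain "$1")" = "$(org_domain "$2")" ]
}
```

Usage: `relaxed_aligned app.example.com example.com && echo aligned`. If the DKIM `d=` value is the ESP's own domain instead of yours, the signature can pass and DMARC still fails.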

Scenario: Domain Being Spoofed — Phishing Emails Claiming to Be from Us

Remember: DMARC p=none means "monitor but take no action." It does NOT stop spoofed email from being delivered. The progression is none (monitor) -> quarantine (spam folder) -> reject (block). Most domains should aim for p=reject but only after verifying all legitimate senders pass SPF or DKIM.

Your domain reputation is being damaged. Users are getting phishing emails claiming to be from @example.com.

# Step 1: Check current DMARC policy
dig TXT _dmarc.example.com
# If p=none — attackers' spoofed mail gets delivered with no action

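# Step 2: Enumerate all legitimate sending sources (parse DMARC aggregate reports)
parsedmarc dmarc-report-*.xml.gz | jq -r '.aggregate_reports[].records[] | [.source.ip_address, .count, .alignment.spf, .alignment.dkim] | @tsv'
# Columns: source IP, message count, SPF aligned, DKIM aligned
# (field names follow parsedmarc's JSON output; verify against your version)
# Or use dmarcian/Postmark dashboard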

# Step 3: Move toward p=reject (follow the staged rollout)
# Week 1: p=none; rua=mailto:dmarc@example.com  (monitoring)
# Week 3: p=quarantine; pct=10
# Week 5: p=quarantine; pct=100
# Week 7: p=reject; pct=100

# Step 4: Ensure SPF covers all legitimate senders
# Check DMARC reports for source IPs passing/failing SPF

# Step 5: Protect subdomains too
_dmarc.example.com TXT "v=DMARC1; p=reject; sp=reject; ..."
# sp=reject covers *.example.com subdomains

# Step 6: For subdomains you don't send email from:
_dmarc.noreply.example.com TXT "v=DMARC1; p=reject;"
noreply.example.com TXT "v=spf1 -all"    # no one authorized to send as this subdomain
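Each stage of the rollout in Step 3 is a small edit to one TXT record, which is easy to mistype. A minimal sketch that builds the record string for a given stage follows. `dmarc_record` is a hypothetical helper; it omits `pct` when it is 100, since that is the default.

```shell
#!/bin/sh
# dmarc_record POLICY [PCT] [RUA_MAILBOX]
# Print a DMARC TXT value; fail on an unknown policy.
dmarc_record() (
    policy=$1
    pct=${2:-100}
    rua=${3:-}
    case "$policy" in
        none|quarantine|reject) ;;
        *) echo "invalid policy: $policy" >&2; exit 1 ;;
    esac
    record="v=DMARC1; p=$policy"
    if [ "$pct" -lt 100 ]; then record="$record; pct=$pct"; fi
    if [ -n "$rua" ]; then record="$record; rua=mailto:$rua"; fi
    echo "$record"
)
```

Usage: `dmarc_record quarantine 10 dmarc@example.com` prints the Week 3 value, ready to paste into the `_dmarc.example.com` TXT record.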

Emergency: Mail Server on Blocklist — Outbound Email Blocked

All outbound email is being rejected by major providers with 550 blocked or similar.

# Step 1: Identify which blocklists you're on
# (SORBS, dnsbl.sorbs.net, was shut down in 2024; drop it from older scripts)
REVERSED_IP=$(echo "203.0.113.10" | awk -F. '{print $4"."$3"."$2"."$1}')
for rbl in zen.spamhaus.org bl.spamcop.net b.barracudacentral.org; do
    result=$(dig +short "${REVERSED_IP}.${rbl}" 2>/dev/null)
    case "$result" in
        "")            echo "Clean: $rbl" ;;
        127.255.255.*) echo "Query error on $rbl: $result (try a non-public resolver)" ;;
        *)             echo "LISTED on $rbl: $result" ;;
    esac
done

# Step 2: Submit removal requests
# Spamhaus: https://www.spamhaus.org/lookup/
# SpamCop: https://www.spamcop.net/bl.shtml
# Barracuda: https://www.barracudacentral.org/rbl/removal-request
# Note: some require fixing the issue first before removal is granted

# Step 3: Check if the IP is shared (cloud/VPS)
# If using AWS EC2 elastic IP, request removal from AWS via:
aws support create-case --subject "Remove IP from email blocklist" ...
# Or use Amazon SES to avoid managing IP reputation yourself

# Step 4: Identify why you were listed (check logs for spam indicators)
grep "bounce\|undeliver\|reject" /var/log/mail.log | tail -100
# High bounce rates, spam complaints, or open relay are common causes

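# Step 5: While blocked, switch to a transactional service temporarily
# Configure Postfix to relay via SES/SendGrid:
# /etc/postfix/main.cf:
relayhost = [email-smtp.us-east-1.amazonaws.com]:587
smtp_sasl_auth_enable = yes
# Postfix's default (noplaintext, noanonymous) blocks the PLAIN/LOGIN auth these relays use
smtp_sasl_security_options = noanonymous
smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd
smtp_tls_security_level = encrypt
# /etc/postfix/sasl_passwd holds "[host]:port user:password"; run postmap on it, then reload Postfix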

Useful One-Liners

# Full domain email authentication check (one-liner)
for record in "" "mail._domainkey" "_dmarc"; do
    echo "=== ${record:-SPF} ==="; dig TXT "${record:+${record}.}example.com" +short
done

# Check all mail-related DNS records at once (dig needs one name/type pair per query)
dig +noall +answer example.com MX example.com TXT _dmarc.example.com TXT mail._domainkey.example.com TXT

# Test SMTP auth on port 587 (-crlf sends the CRLF line endings SMTP expects)
openssl s_client -connect mail.example.com:587 -starttls smtp -crlf <<'EOF'
EHLO test.local
AUTH LOGIN
EOF

# Parse postfix mail log for delivery failures, grouped by DSN code and status
grep -oE 'dsn=[0-9.]+, status=(bounced|deferred)' /var/log/mail.log | sort | uniq -c | sort -rn | head -20

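# Check DKIM key length (2048-bit minimum recommended)
dig TXT mail._domainkey.example.com +short | tr -d '" ' | grep -oE 'p=[A-Za-z0-9+/]+=*' | cut -c3- | base64 -d 2>/dev/null | openssl pkey -pubin -inform DER -noout -text | head -1
# Expect a line ending in "Public-Key: (2048 bit)" or larger. The decoded key is DER-wrapped,
# so its byte count (294 bytes for 2048-bit RSA) is not the key size.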

# Count SPF DNS lookups manually (top-level record only; nested includes add more)
dig TXT example.com +short | grep spf | tr -d '"' | tr ' ' '\n' | grep -cE '^[+?~-]?(include:|exists:|redirect=|(a|mx|ptr)([:/]|$))'

# Watch Postfix mail queue (quote the pipeline so watch reruns all of it)
watch -n 5 'mailq | tail -5'

# Flush Postfix deferred queue (after fixing blocklist issue)
postqueue -f