MongoDB Operations — Street-Level Ops

Quick Diagnosis Commands

// First look — always start here
rs.status()
db.serverStatus().connections
db.serverStatus().opcounters

// What's running and slow?
db.currentOp({ secs_running: { $gt: 2 } })

// Replication lag on secondaries
rs.printSecondaryReplicationInfo()  // rs.printSlaveReplicationInfo() on pre-4.4 shells
// Output: member, syncedTo, X secs (Y hrs) behind the primary

// Collection sizes and index sizes
db.orders.stats({ scale: 1048576 })  // MB

// WiredTiger cache pressure
db.serverStatus().wiredTiger.cache["bytes currently in the cache"]
db.serverStatus().wiredTiger.cache["maximum bytes configured"]
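To interpret those two numbers: a sketch of a fill-ratio check. The 80%/95% thresholds are WiredTiger's default eviction_target and eviction_trigger, at which background eviction ramps up and application threads start doing eviction work themselves:

```javascript
// Sketch: cache pressure from the two serverStatus numbers above.
// Thresholds are WiredTiger defaults (eviction_target=80, eviction_trigger=95).
function cacheFillPct(bytesInCache, maxBytesConfigured) {
  return (bytesInCache / maxBytesConfigured) * 100;
}

function cachePressure(bytesInCache, maxBytesConfigured) {
  const pct = cacheFillPct(bytesInCache, maxBytesConfigured);
  if (pct >= 95) return "critical";   // application threads throttled by eviction
  if (pct >= 80) return "elevated";   // background eviction active
  return "ok";
}
```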

# External monitoring
mongostat --uri="mongodb://user:pass@host:27017" 1
mongotop --uri="mongodb://user:pass@host:27017" 5

# Check mongod process
systemctl status mongod
journalctl -u mongod -n 100

# Tail MongoDB log for slow ops (4.4+ writes structured logs: one JSON document per line)
tail -f /var/log/mongodb/mongod.log | grep '"durationMillis"' | \
  python3 -c "import sys,json; [print(json.dumps(json.loads(l), indent=2)) for l in sys.stdin]"

Gotcha: COLLSCAN on a Hot Collection

explain("executionStats") shows "stage": "COLLSCAN" and totalDocsExamined is in the millions while nReturned is 10. The query is scanning the entire collection.

Rule: COLLSCAN on a large collection will saturate I/O and block other operations due to WiredTiger read pressure. Create the index.
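Note that a COLLSCAN can sit below a SORT or FETCH stage, so inspect the whole plan tree rather than only the top stage. A minimal sketch of a tree walk (helper names are mine; the inputStage/inputStages fields match explain() output):

```javascript
// Sketch: recursively collect every stage name from an
// explain("executionStats") plan tree, so a nested COLLSCAN is not missed.
function collectStages(stage, out = []) {
  if (!stage) return out;
  out.push(stage.stage);
  if (stage.inputStage) collectStages(stage.inputStage, out);
  if (Array.isArray(stage.inputStages)) {
    stage.inputStages.forEach(s => collectStages(s, out));
  }
  return out;
}

function hasCollScan(executionStages) {
  return collectStages(executionStages).includes("COLLSCAN");
}
```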

// Identify which queries have no index
db.system.profile.find({
  "planSummary": { $regex: "COLLSCAN" },
  "millis": { $gt: 100 }
}).sort({ millis: -1 }).limit(10)

// Create the index (since MongoDB 4.2 index builds hold exclusive locks only briefly;
// the background option is deprecated and ignored in 4.2+)
db.orders.createIndex({ customer_id: 1, created_at: -1 })

// Monitor index build progress
db.currentOp({ "msg": { $exists: true } })
// Look for: "Index Build: scanning collection" and "Index Build: inserting keys"

Gotcha: Election Loop — Primary Keeps Stepping Down

The primary steps down every few minutes. Application sees intermittent write failures.

Rule: Elections are caused by: network partitions, high load causing heartbeat timeouts, or explicit rs.stepDown(). A primary that can't reach a majority steps down.

// Step 1: Check rs.status() for the pattern
rs.status()
// Look for: stateStr changing between PRIMARY/SECONDARY
// Look for: "health" field flapping on members
// Look for: "lastHeartbeatMessage" containing error text
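The manual "Look for" checks above can be condensed into one pass over rs.status() output. A sketch (the helper name is mine; the member fields are standard rs.status() output):

```javascript
// Sketch: flag replica set members that are unhealthy or reporting
// heartbeat errors, from an rs.status()-shaped object.
function flagUnhealthyMembers(rsStatus) {
  return rsStatus.members
    .filter(m => m.health === 0 || (m.lastHeartbeatMessage || "") !== "")
    .map(m => ({
      name: m.name,
      state: m.stateStr,
      msg: m.lastHeartbeatMessage || ""
    }));
}
```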

// Step 2: Check election cause in mongod logs
// grep for "ELECTION" in logs
grep -i "ELECTION\|stepdown\|PRIMARY\|became primary" /var/log/mongodb/mongod.log | tail -50

# Step 3: Check network between replica set members
ping -c 10 mongo2
nc -zv mongo2 27017

// Step 4: Check for load issues on the primary
db.serverStatus().globalLock.currentQueue   // blocked operations
db.serverStatus().wiredTiger.cache          // memory pressure

// If election is due to oplog write throughput, increase oplog
// Requires changing the config and rolling restart (complex — plan carefully)

Gotcha: Oplog Window Too Small — Secondary Can't Resync

One-liner: run rs.printReplicationInfo() on the primary regularly. If the oplog window is shorter than your longest planned maintenance window, you are one maintenance event away from a full resync (which can take hours on large datasets and saturate network and disk).

A secondary was down for maintenance for 18 hours. The primary's oplog only has 12 hours of history. Now the secondary can't catch up — it needs a full initial sync.

Rule: Oplog window shrinks under write load. Size it for your longest expected maintenance window plus buffer.
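The sizing rule above is plain arithmetic. A minimal sketch (the 1.5x safety factor and the helper name are assumptions, not MongoDB defaults):

```javascript
// Sketch: required oplog size from observed churn.
// writeRateMBPerHour: observed oplog growth, e.g. oplog MB used / hours of window.
// windowHours: longest planned maintenance window.
// safetyFactor: assumed buffer for write-load spikes.
function requiredOplogMB(writeRateMBPerHour, windowHours, safetyFactor = 1.5) {
  return Math.ceil(writeRateMBPerHour * windowHours * safetyFactor);
}

// The incident above: an 18 h outage against a 12 h oplog window.
// If a 24576 MB (24 GB) oplog covered 12 h, churn is ~2048 MB/h:
// requiredOplogMB(2048, 18) -> 55296 MB (~54 GB), vs the 24 GB configured.
```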

// Check oplog window (run on each replica set member)
rs.printReplicationInfo()
// Output: "log length start to end: Xsecs (Y hrs)"

// Resize oplog — requires mongod to be running as a replica set member
// MongoDB 3.6+: online resize
db.adminCommand({ replSetResizeOplog: 1, size: 51200 })  // 50 GB in MB

// Or: set in mongod.conf
// replication:
//   oplogSizeMB: 51200

Pattern: Rolling Restart of a Replica Set

For mongod version upgrades or config changes that require restart.

# 1. Start with secondaries first (never touch primary until all secondaries are healthy)
# On secondary 1:
systemctl stop mongod
# Make config changes
systemctl start mongod
# Wait for it to reach SECONDARY state and catch up
// Wait for secondary to catch up
rs.status()  // stateStr should be SECONDARY, not RECOVERING
rs.printSecondaryReplicationInfo()  // should show 0 seconds behind
# 2. Repeat for secondary 2

# 3. Step down the primary, then restart it
rs.stepDown(120)  // this node will not seek re-election for 120 seconds
// Now connect to the new primary and verify
rs.status()
# 4. Restart the old primary (now secondary)
systemctl stop mongod
# Make config changes
systemctl start mongod

Pattern: Safe mongodump of a Replica Set

Always dump from a secondary to avoid load on the primary. Use --oplog for a consistent point-in-time snapshot.

# Dump from secondary (add readPreference to connection string)
mongodump \
  --uri="mongodb://mongo1,mongo2,mongo3/mydb?replicaSet=rs0&readPreference=secondary" \
  --oplog \
  --out=/backup/$(date +%F_%H%M)

# Verify the dump
ls -lh /backup/$(date +%F_%H%M)/
bsondump --quiet /backup/$(date +%F_%H%M)/mydb/orders.bson | wc -l  # document count; wc -l on raw .bson is meaningless

# Restore (to a different cluster for verification)
mongorestore \
  --uri="mongodb://test-mongo:27017" \
  --oplogReplay \
  --drop \
  /backup/2024-01-15_0300/

Scenario: Write Failures on a Sharded Cluster

Application gets MongoServerError: cannot find a shard for the provided document or writes failing with auth errors.

// Step 1: Check mongos status
// Connect to mongos
sh.status()
// Look for: shards list, chunk distribution, config server connectivity

// Step 2: Check config server replica set
// Connect directly to config server
rs.status()  // should show all members healthy

// Step 3: Find which shard is having issues
use config
db.shards.find()
// Connect to each shard directly and check rs.status()

// Step 4: Check chunk balancer
sh.isBalancerRunning()
sh.getBalancerState()
// If balancer is stuck, stop and restart:
sh.stopBalancer()
sh.startBalancer()

Scenario: Atlas Cluster — Connection Pool Exhausted

Debug clue: Connection pool exhaustion is almost always a leak, not a capacity problem. With db.currentOp({ "$all": true }), look for a large and growing number of connections with active: false that never go away. Common cause: application code opens a new MongoClient per request instead of sharing a single client instance for the application's lifetime.
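The fix is structural: create one client at startup (or lazily) and reuse it for the process lifetime. A sketch of the pattern with a stand-in factory (clientFactory is hypothetical; in real code it would construct the driver's MongoClient):

```javascript
// Sketch of the shared-client pattern. clientFactory stands in for the real
// driver constructor; the point is that it runs exactly once per process,
// no matter how many requests call getClient().
let _client = null;

function getClient(clientFactory) {
  if (_client === null) {
    _client = clientFactory();   // e.g. new MongoClient(uri): connect once
  }
  return _client;                // every caller shares the same pool
}
```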

Application logs: MongoWaitQueueFullError: Too many operations are currently waiting for a connection

# Immediate: check connection pool stats (Atlas Data Explorer or mongosh)
db.serverStatus().connections
# {"current": 490, "available": 10, "totalCreated": 5000}
# -> maxing out the connection limit

# Fix 1: Increase pool size in application (but don't exceed Atlas limits)
# Fix 2: Reduce the number of application instances creating connections
# Fix 3: Use mongos connection pooling (sharded cluster)
# Fix 4: On Atlas: upgrade cluster tier for higher connection limits

# Atlas connection limits by tier (approximate; verify against current Atlas docs):
# M10: 1,500 max connections
# M20: 3,000
# M30: 3,000
# M40: 6,000
# M50: 16,000
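Before raising pool sizes, budget the total: every application instance contributes its full pool. A sketch (the 100-connection reserve for monitoring and ops sessions is an assumption):

```javascript
// Sketch: does appInstances * poolSizePerInstance fit under the cluster's
// connection limit, leaving an assumed reserve for monitoring, mongosh
// sessions, and internal connections?
function fitsConnectionLimit(appInstances, poolSizePerInstance, tierLimit, reserve = 100) {
  const total = appInstances * poolSizePerInstance;
  return { total, fits: total <= tierLimit - reserve };
}
```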

# Check what's holding connections
db.currentOp({ "$all": true })
# Look for connections in "idle" state with long running times — these are connection leaks

Emergency: Replica Set Has No Primary — Majority Down

Two of three nodes are down. The remaining secondary won't accept writes.

// Symptom: rs.status() shows majority members as "unreachable"
// and this node as SECONDARY with "health: 0" for others

// Option 1: Bring the other nodes back online (preferred)
// Fix the underlying issue (network, hardware, process)

// Option 2: Force election on the remaining node (data loss risk if replication was behind)
// Only do this if you've confirmed the other nodes are permanently gone
rs.reconfig(
  { _id: "rs0", members: [{ _id: 0, host: "mongo1:27017" }] },
  { force: true }
)
// WARNING: This can cause a "split brain" if the other nodes come back.
// After forcing: immediately firewall off or shut down the other (formerly down) nodes.

// Option 3: Add an arbiter to restore majority (if you have a 4th machine)
rs.addArb("mongo-arbiter:27017")

Useful One-Liners

// Replica set lag summary
rs.status().members.map(m => ({name: m.name, state: m.stateStr, optime: m.optimeDate}))

// All collections sorted by size
db.getCollectionNames().map(c => {
  var s = db.getCollection(c).stats({scale:1048576}); return {coll:c, mb:s.size, docs:s.count}
}).sort((a,b) => b.mb-a.mb).slice(0,10)

// Find documents with missing index field (for partial index debugging)
db.orders.countDocuments({ customer_id: { $exists: false } })

// Kill all slow queries over 30 seconds (excluding admin/local namespaces)
db.currentOp({ secs_running: { $gt: 30 }, "ns": { $not: /^(admin|local)\./ } }).inprog.forEach(
  op => { print("killing " + op.opid); db.killOp(op.opid); }
)

// Check replication lag numerically
rs.status().members.filter(m => m.stateStr == "SECONDARY").map(m => ({
  name: m.name,
  lagSec: (new Date() - m.optimeDate) / 1000
}))

// Profiler: top slow queries in last hour
db.system.profile.find({ ts: { $gt: new Date(new Date()-3600000) }, millis: { $gt: 100 } })
  .sort({ millis: -1 }).limit(10)
  .map(p => ({ ns: p.ns, op: p.op, ms: p.millis, query: p.query || p.command }))

// Index coverage check — which indexes haven't been used
// Unused indexes waste RAM (index data lives in WiredTiger cache) and slow writes
db.orders.aggregate([{ $indexStats: {} }]).toArray().filter(i => i.accesses.ops == 0)