Quiz: MongoDB Operations
6 questions
L1 (3 questions)
1. What is a MongoDB replica set and what happens when the primary goes down?
Show answer
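The failover condition in the answer below boils down to majority arithmetic. A minimal sketch in plain JavaScript (illustrative only, not MongoDB's election code; the member counts are made up):

```javascript
// Illustrative sketch of replica-set election arithmetic, not
// MongoDB's implementation: a primary can only be elected while a
// strict majority of voting members is reachable.

function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

function canElectPrimary(votingMembers, reachable) {
  return reachable >= majority(votingMembers);
}

console.log(majority(3));            // 2 votes needed in a 3-member set
console.log(canElectPrimary(3, 2));  // true: survives one node failure
console.log(canElectPrimary(4, 2));  // false: 2 of 4 is not a majority,
                                     // which is why an arbiter helps
```

Note that adding a fourth data-bearing node does not raise fault tolerance, which is why odd-sized sets (or an arbiter) are the usual recommendation.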
A replica set is a group of mongod instances (typically 3+) maintaining the same dataset. One is primary (accepts writes); the others are secondaries that replicate by applying the primary's oplog. When the primary goes down, the secondaries hold an election using Raft-like consensus, and a new primary is elected within 10-12 seconds. During the election, writes fail, but reads can continue if readPreference is set to secondaryPreferred. An arbiter can break ties in even-numbered sets but holds no data.

2. What is the MongoDB oplog and why is its size important for operations?
Show answer
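The sizing rule in the answer below is simple division. A sketch using the answer's own numbers (a back-of-envelope estimate, not an exact tool; real oplog consumption varies with workload):

```javascript
// Back-of-envelope oplog window: roughly how long a secondary can be
// offline before it falls off the end of the oplog and needs a full
// resync.

function oplogWindowHours(oplogGB, writeGBPerHour) {
  return oplogGB / writeGBPerHour;
}

console.log(oplogWindowHours(50, 1));  // 50 hours, as in the example
console.log(oplogWindowHours(50, 5));  // 10 hours under 5x the write load
```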
The oplog (operations log) is a capped collection in the local database that records all write operations. Secondaries tail the oplog to replicate changes. If a secondary falls behind by more than the oplog window (the time span between the oldest and newest entries), it cannot catch up and requires a full resync. Size the oplog based on write volume: a 50GB oplog on a system doing 1GB/hour of writes gives a ~50-hour window. Monitor oplog lag (rs.printReplicationInfo()) and increase the size if secondaries frequently lag.

3. What is the WiredTiger storage engine's journaling and how does it affect crash recovery?
Show answer
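The checkpoint-plus-journal recovery described below can be pictured as a replay loop (a toy model only; WiredTiger's on-disk formats are far more involved):

```javascript
// Toy model of checkpoint + journal recovery, not WiredTiger's real
// format: the data files reflect the last checkpoint, the journal
// records every write since then, and replaying it restores
// committed writes.

function recover(checkpointState, journal) {
  const state = { ...checkpointState };
  for (const op of journal) {                    // replay in order
    if (op.type === "set") state[op.key] = op.value;
    else if (op.type === "delete") delete state[op.key];
  }
  return state;
}

const checkpoint = { a: 1 };                     // state at last checkpoint
const journal = [                                // writes since checkpoint
  { type: "set", key: "b", value: 2 },
  { type: "delete", key: "a" },
];
console.log(recover(checkpoint, journal));       // { b: 2 }
```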
WiredTiger writes changes to an in-memory cache, then periodically flushes to disk via checkpoints (every 60 seconds by default). The journal (write-ahead log) records all changes between checkpoints. On crash, MongoDB replays the journal from the last checkpoint to recover committed writes. The journal is synced to disk every 100ms by default (configurable via storage.journal.commitIntervalMs), so with journaling enabled you can lose at most ~100ms of acknowledged writes; writes with j: true are not acknowledged until journaled. Disable journaling only for ephemeral data where loss is acceptable.

L2 (3 questions)
1. How does MongoDB sharding work and what are the trade-offs between range-based and hash-based shard keys?
Show answer
Sharding distributes data across multiple replica sets (shards) based on a shard key. Range-based sharding keeps sequential values on the same shard, which is good for range queries but causes hotspots if writes are sequential (e.g., timestamps). Hash-based sharding distributes writes evenly but makes range queries scatter across all shards. Choose range for read-heavy workloads with range queries; choose hash for write-heavy workloads needing even distribution. A compound shard key can balance both. The shard key is hard to change once set (resharding only became possible in MongoDB 5.0 and is expensive), so choose carefully.

2. How do you diagnose and fix slow queries in MongoDB?
Show answer
1. Enable the profiler: db.setProfilingLevel(1, {slowms: 100}) to log queries >100ms.
2. Check the system.profile collection for slow queries.
3. Use explain('executionStats') on slow queries — look for COLLSCAN (full collection scan), high docsExamined vs docsReturned ratio.
4. Create appropriate indexes (compound indexes matching query predicates + sort).
5. Check for index intersection vs compound index (compound is almost always better).
6. Monitor with mongostat and mongotop for real-time metrics.
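Steps 3 and 4 can be partially automated with a helper that scans explain output for the red flags above (the sample document and the 10x ratio threshold are assumptions, not MongoDB defaults):

```javascript
// Flag common problems in db.collection.explain("executionStats")
// output. Field names (executionStages, totalDocsExamined, nReturned)
// follow MongoDB's executionStats document; the 10x examined/returned
// threshold is an arbitrary rule of thumb.

function diagnose(executionStats, maxRatio = 10) {
  const problems = [];
  if (executionStats.executionStages.stage === "COLLSCAN") {
    problems.push("COLLSCAN: no index used; add one matching the predicate");
  }
  const examined = executionStats.totalDocsExamined;
  const returned = executionStats.nReturned;
  if (returned > 0 && examined / returned > maxRatio) {
    problems.push(`examined ${examined} docs to return ${returned}: ` +
                  "the index is not selective enough");
  }
  return problems;
}

// A made-up executionStats document for a query with no usable index.
const sample = {
  executionStages: { stage: "COLLSCAN" },
  totalDocsExamined: 50000,
  nReturned: 12,
};
diagnose(sample).forEach((p) => console.log(p));
```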
3. What are MongoDB read concern and write concern, and how do they affect consistency and durability?