Quiz: SQL

6 questions

L1 (3 questions)

1. A query that was fast last week is now slow. EXPLAIN shows a sequential scan instead of an index scan. What are the most common causes?

Answer:

1. Table statistics are stale: run ANALYZE.
2. The table has grown, and the planner now estimates that a sequential scan is cheaper than an index scan for a large result set.
3. Index bloat or corruption.
4. The query predicate changed, or it now applies a function to the indexed column, which prevents index use.
5. work_mem or other planner settings changed.

Always check EXPLAIN ANALYZE to compare estimated vs. actual row counts. *Common mistake:* dropping and recreating the index immediately. This causes downtime and may not fix the root cause if the statistics or the query itself are the problem.
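A minimal diagnostic sketch (the table and column names here are illustrative, not from the quiz):

```sql
-- Refresh planner statistics for the slow table.
ANALYZE orders;

-- Compare the planner's row estimates against actual rows returned.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders WHERE customer_id = 42;

-- If the predicate wraps the column in a function, a plain index is unusable;
-- an expression index restores index scans:
CREATE INDEX idx_orders_created_date ON orders ((created_at::date));
```

If the estimated and actual row counts in the plan diverge wildly, stale statistics were likely the culprit.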

2. What is the difference between a deadlock and a lock wait timeout, and how do you diagnose each in PostgreSQL?

Answer: A deadlock is a circular dependency between transactions (A waits for B while B waits for A); PostgreSQL detects it and aborts one of them. A lock wait timeout is a single transaction waiting too long for a lock held by another. Diagnose with pg_stat_activity (waiting queries), pg_locks (held and requested locks), and the server log with log_lock_waits = on. *Common mistake:* increasing lock_timeout to avoid deadlocks; lock_timeout only bounds how long a transaction waits for a lock, it does nothing about circular dependencies.
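A minimal blocking-chain query, assuming PostgreSQL 9.6+ where pg_blocking_pids() is available:

```sql
-- Show which session is blocking which.
SELECT blocked.pid     AS blocked_pid,
       blocked.query   AS blocked_query,
       blocking.pid    AS blocking_pid,
       blocking.query  AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid))
WHERE blocked.wait_event_type = 'Lock';
```

Run this while the system is stuck; an empty result means nothing is currently waiting on a lock.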

3. What are the ACID properties of a database transaction and why does each matter?

Answer: Atomicity: all operations succeed or all roll back (no partial updates). Consistency: transactions move the database between valid states (constraints are enforced). Isolation: concurrent transactions do not interfere (controlled by isolation levels). Durability: committed data survives crashes (via the write-ahead/redo log). Together they prevent data corruption, partial failures, and lost writes. *Common mistake:* assuming ACID only applies to SQL databases; NoSQL databases such as MongoDB also support ACID transactions (since 4.0).
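Atomicity in one concrete sketch (the accounts table is hypothetical): both sides of a transfer commit together or not at all:

```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
-- If either UPDATE fails (e.g. a CHECK (balance >= 0) constraint fires),
-- the whole transaction rolls back and neither row changes.
COMMIT;
```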

L2 (3 questions)

1. You need to add a NOT NULL column with a default value to a table with 100M rows in PostgreSQL. How do you do this without locking the table for minutes?

Answer: In PostgreSQL 11+, ALTER TABLE ADD COLUMN ... DEFAULT ... NOT NULL is effectively instant: the default is stored in the catalog (pg_attribute) rather than written to every row, and rows are filled in lazily on read/update. For older versions: add the column as nullable, set the default, backfill in batches (PostgreSQL has no UPDATE ... LIMIT, so batch by primary-key range or ctid), then set NOT NULL. *Common mistake:* creating a new table with the column, copying the data, and renaming; this is the most disruptive approach and unnecessary in modern PostgreSQL.
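A sketch of both patterns (table and column names are illustrative):

```sql
-- PostgreSQL 11+: metadata-only change, no table rewrite.
ALTER TABLE events ADD COLUMN source text NOT NULL DEFAULT 'unknown';

-- Pre-11: add nullable, backfill in batches, then add the constraint.
ALTER TABLE events ADD COLUMN source text;
ALTER TABLE events ALTER COLUMN source SET DEFAULT 'unknown';

-- Repeat until it reports "UPDATE 0" (ctid batching, since UPDATE has no LIMIT):
UPDATE events SET source = 'unknown'
WHERE ctid IN (SELECT ctid FROM events WHERE source IS NULL LIMIT 10000);

ALTER TABLE events ALTER COLUMN source SET NOT NULL;
```

Batching keeps each UPDATE's lock window and WAL volume small, so replication lag and bloat stay manageable.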

2. Explain the difference between READ COMMITTED and REPEATABLE READ isolation levels with a concrete example.

Answer: READ COMMITTED: each statement sees the latest committed data. If you SELECT, then another transaction commits an INSERT, a second SELECT in your transaction sees the new row (a non-repeatable read). REPEATABLE READ: the transaction sees a single snapshot, so the second SELECT returns the same rows as the first even if other transactions committed changes in between. Trade-off: REPEATABLE READ can cause serialization failures that need retry logic. *Common mistake:* assuming REPEATABLE READ prevents all anomalies; the standard still permits phantom reads at that level in some databases (though PostgreSQL's implementation prevents them).
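The example as two interleaved sessions (the orders table is hypothetical; note that PostgreSQL takes the snapshot at the transaction's first query, not at BEGIN):

```sql
-- Session A
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM orders;   -- suppose this returns 10

-- Session B (separate connection, autocommit)
INSERT INTO orders DEFAULT VALUES;

-- Session A, same transaction
SELECT count(*) FROM orders;   -- still 10 under REPEATABLE READ;
COMMIT;                        -- READ COMMITTED would see the new row
```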

3. What is connection pool exhaustion and how do you diagnose and fix it?

Answer: Connection pool exhaustion is when all connections in the pool are checked out and new requests queue or fail. Diagnose: monitor pool metrics (active/idle/waiting counts) and check for long-running transactions holding connections (pg_stat_activity with state = 'idle in transaction'). Fix: increase the pool size cautiously, set statement_timeout and idle_in_transaction_session_timeout, and fix application code that holds connections across slow operations. *Common mistake:* setting max_connections very high on the database; this wastes memory (each connection costs on the order of 10 MB) and merely shifts the bottleneck to the database.
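A diagnostic sketch for the "idle in transaction" case:

```sql
-- Find connections holding a transaction open while doing nothing.
SELECT pid, usename, now() - xact_start AS txn_age,
       left(query, 60) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;

-- Have the server reclaim such sessions automatically (shown per-session;
-- in production set it in postgresql.conf or per role):
SET idle_in_transaction_session_timeout = '60s';
```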