Db Transactions¶

18 cards — 🟢 5 easy | 🟡 8 medium | 🔴 5 hard

🟢 Easy (5)¶

1. What do the four letters in ACID stand for and what does each mean?

Show answer

Atomicity (all or nothing), Consistency (data moves between valid states), Isolation (concurrent transactions don't interfere), Durability (committed data survives crashes).

2. What is the default isolation level in PostgreSQL?

Show answer

Read Committed. This means a transaction can only see data committed before each statement (no dirty reads), but may see different results for the same query if another transaction commits between statements (non-repeatable reads).

3. How do you set a statement timeout in PostgreSQL to prevent runaway queries?

Show answer

SET statement_timeout = '30s'; for per-session, or configure statement_timeout in postgresql.conf (in milliseconds) for a global default.

4. What does Durability mean in practice and how does PostgreSQL guarantee it?

Show answer

Durability means once a transaction is committed, the data survives even if the server crashes immediately after. PostgreSQL guarantees this by writing all changes to the WAL (Write-Ahead Log) and fsyncing the WAL to disk before returning the commit acknowledgment. On recovery, any committed but unwritten data pages are replayed from WAL.

5. What happens if you run statements without BEGIN in PostgreSQL?

Show answer

PostgreSQL runs each statement in an implicit auto-commit transaction — the statement is automatically committed on success or rolled back on failure. Wrapping multiple statements in BEGIN...COMMIT groups them into a single atomic transaction where either all succeed or all are rolled back together.

🟡 Medium (8)¶

1. List the four SQL isolation levels from weakest to strongest.

Show answer

Read Uncommitted (allows dirty reads), Read Committed (no dirty reads), Repeatable Read (no non-repeatable reads), Serializable (no phantom reads — full isolation). PostgreSQL treats Read Uncommitted as Read Committed.

2. What happens when PostgreSQL detects a deadlock?

Show answer

PostgreSQL automatically detects deadlocks (checking every deadlock_timeout interval, default 1s) and terminates the youngest transaction involved, allowing the other to proceed. The terminated transaction receives an error that the application should handle with a retry.

3. How do you find transactions running longer than 5 minutes in PostgreSQL?

Show answer

SELECT pid, now() - xact_start AS duration, query, state FROM pg_stat_activity WHERE state != 'idle' AND xact_start < now() - interval '5 minutes' ORDER BY duration DESC;

4. Why are long-running transactions dangerous in PostgreSQL?

Show answer

They hold locks that block other transactions (including DDL like ALTER TABLE), prevent VACUUM from reclaiming dead tuples (causing table and index bloat), increase MVCC overhead, and can exhaust connection pool resources.

5. What is the practical difference between Read Committed and Serializable isolation?

Show answer

Read Committed sees only data committed before each individual statement — two identical SELECTs in the same transaction can return different results if another transaction commits between them. Serializable guarantees the transaction sees a frozen snapshot and behaves as if all transactions ran one at a time. Serializable detects conflicts and aborts one transaction, requiring retry logic.

6. What are dirty reads, non-repeatable reads, and phantom reads?

Show answer

Dirty read: reading uncommitted data from another transaction (not possible in PostgreSQL). Non-repeatable read: re-reading a row and getting different values because another transaction committed an update. Phantom read: re-running a query and getting different rows because another transaction committed an insert or delete matching the WHERE clause.

7. What are savepoints and how do they help inside a transaction?

Show answer

A savepoint marks a point within a transaction you can roll back to without aborting the entire transaction: SAVEPOINT sp1; ... ROLLBACK TO sp1; This is useful for retrying a portion of work (e.g., an insert that might violate a unique constraint) while keeping earlier work in the same transaction intact.

8. How should connection pools handle transactions?

Show answer

Connection pools (PgBouncer, HikariCP) must ensure a transaction runs entirely on a single connection — mid-transaction connection switching corrupts state. In PgBouncer, use transaction pooling mode (not statement mode) and avoid SET commands that persist beyond the transaction. Always close transactions promptly to return connections to the pool; idle-in-transaction connections waste pool capacity.

🔴 Hard (5)¶

1. How does MySQL's default isolation level differ from PostgreSQL's, and what is the practical impact?

Show answer

MySQL defaults to REPEATABLE READ, while PostgreSQL defaults to READ COMMITTED. MySQL uses gap locks at Repeatable Read to prevent phantom reads, which can increase lock contention. PostgreSQL's MVCC-based Repeatable Read avoids extra locking but may produce serialization failures that require application retries.

2. When would you use SERIALIZABLE isolation and what is the operational cost?

Show answer

Use Serializable for transactions that must appear to execute sequentially (e.g., financial transfers, inventory reservations). The cost is higher abort rates due to serialization conflicts, increased CPU overhead from dependency tracking, and the need for application-level retry logic. PostgreSQL implements it via Serializable Snapshot Isolation (SSI).

3. How do you safely terminate a specific backend in PostgreSQL, and what is the difference between pg_cancel_backend and pg_terminate_backend?

Show answer

pg_cancel_backend(pid) cancels the current query but keeps the connection alive (equivalent to sending SIGINT). pg_terminate_backend(pid) kills the entire backend process and closes the connection (equivalent to SIGTERM). Use cancel first for a gentle stop; use terminate when the backend is stuck or unresponsive.

4. What is two-phase commit (2PC) and when is it needed?

Show answer

2PC coordinates a transaction across multiple databases or services. Phase 1 (prepare): all participants confirm they can commit. Phase 2 (commit): the coordinator tells all to commit. If any participant fails to prepare, all abort. In PostgreSQL, use PREPARE TRANSACTION 'id' and COMMIT PREPARED 'id'. Needed for distributed transactions, but adds latency and complexity — prefer saga patterns where possible.

5. What is transaction ID wraparound in PostgreSQL and why is it dangerous?

Show answer

PostgreSQL uses 32-bit transaction IDs (XIDs), which wrap around after ~4 billion transactions. If VACUUM does not freeze old row versions in time, the database must shut down to prevent data loss (it would interpret old committed rows as being in the future). Monitor with: SELECT datname, age(datfrozenxid) FROM pg_database; Alert when age exceeds 500 million.