
MySQL Operations: The Database You Inherited


Topics: InnoDB internals, replication (async/semi-sync/group), GTID, slow query analysis, EXPLAIN plans, connection management, backup strategies, Percona Toolkit, performance_schema, operational nightmares
Level: L1–L2 (Foundations → Operations)
Time: 60–90 minutes
Prerequisites: None (everything is explained from scratch)


The Mission

You just joined a team. There's a MySQL primary with two replicas. The on-call channel has three open alerts:

  1. Seconds_Behind_Master on replica-2 is climbing — it was 30 seconds an hour ago, now it's 180 seconds.
  2. The application team reports stale data — users update their profile but see old values for "a few seconds."
  3. A developer wants to add a column to the orders table (200 million rows) and is asking when the next maintenance window is.

Nobody on the current team set up this database. The previous DBA left six months ago. The runbook says "see Confluence" and the Confluence page says "TODO."

This lesson gives you what you need to walk into this situation and not make it worse.


First: What Are You Looking At?

Before touching anything, take stock. Connect to the primary and get the lay of the land.

# Is MySQL even running?
mysqladmin -u root -p ping
# mysqld is alive

# What version?
mysql -u root -p -e "SELECT VERSION();"
# 8.0.36-0ubuntu0.22.04.1

# How big is each database?
mysql -u root -p -e "
SELECT table_schema AS db,
       ROUND(SUM(data_length + index_length) / 1024 / 1024 / 1024, 2) AS size_gb
FROM information_schema.tables
GROUP BY table_schema ORDER BY size_gb DESC;"
# production: 142 GB, analytics: 29 GB, sessions: 3 GB

# How big is the orders table specifically?
mysql -u root -p -e "
SELECT table_name, table_rows,
       ROUND(data_length/1024/1024/1024,2) AS data_gb,
       ROUND(index_length/1024/1024/1024,2) AS index_gb
FROM information_schema.tables
WHERE table_schema='production' ORDER BY data_length DESC LIMIT 5;"
# orders: 198M rows, 38 GB data, 12 GB indexes

200 million rows, 38 GB of data, 12 GB of indexes. An ALTER TABLE on this will be a serious conversation. But first — the alerts.

Name Origin: MySQL was created by Michael "Monty" Widenius and David Axmark in 1995. "My" is the name of Monty's daughter. When Oracle acquired MySQL via Sun Microsystems in 2010, Monty forked it as MariaDB — named after his younger daughter, Maria. InnoDB, the storage engine that powers nearly everything in MySQL today, was developed by Innobase Oy, a Finnish company. The name stands for "Inno DataBase."


Alert 1: Replication Lag Is Climbing

What's Actually Happening

On the lagging replica, run:

mysql -e "SHOW SLAVE STATUS\G" | grep -E "Running|Seconds|Error|Master_Log"
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Seconds_Behind_Master: 247
Master_Log_File: mysql-bin.000087
Read_Master_Log_Pos: 892341567
Relay_Master_Log_File: mysql-bin.000087
Exec_Master_Log_Pos: 441209832
Last_SQL_Error:

OK. Both threads are running (good — we're not broken, just slow). The I/O thread has received position 892M but the SQL thread has only applied up to 441M. That's 451 MB of relay log that the replica still needs to replay.

Let's break down those threads, because understanding them is the key to diagnosing every replication problem you'll ever see:

| Thread | Job | Analogy |
| --- | --- | --- |
| I/O thread | Connects to primary, downloads binary log events into local relay logs | The mail carrier delivering letters |
| SQL thread | Reads relay logs and applies the changes to the replica's data | You actually reading and acting on the mail |

When Exec_Master_Log_Pos falls behind Read_Master_Log_Pos, the mail is piling up faster than you can read it.
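The backlog math is simple enough to sketch. A minimal helper (the function name is mine; the positions are the ones from the SHOW SLAVE STATUS output above):

```python
def relay_backlog_bytes(read_pos, exec_pos):
    """Bytes of relay log received from the primary but not yet applied.

    Only meaningful while Master_Log_File == Relay_Master_Log_File,
    i.e. both threads are working within the same binlog file.
    """
    return read_pos - exec_pos

# Positions from the SHOW SLAVE STATUS output above
backlog = relay_backlog_bytes(892341567, 441209832)
print(f"{backlog / 1e6:.0f} MB behind")  # → 451 MB behind
```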

Under the Hood: By default in MySQL 5.6 and earlier, the SQL thread is single-threaded. One thread replaying all changes, sequentially. If the primary has 16 cores blasting writes in parallel, the replica is trying to keep up with one hand tied behind its back. MySQL 5.7+ introduced parallel replication, but it's not enabled by default.

Fix: Enable Parallel Replication

-- On the lagging replica
STOP SLAVE SQL_THREAD;
SET GLOBAL slave_parallel_workers = 4;
SET GLOBAL slave_parallel_type = 'LOGICAL_CLOCK';
START SLAVE SQL_THREAD;

-- Verify workers are active
SELECT * FROM performance_schema.replication_applier_status_by_worker\G

| Parameter | What it does |
| --- | --- |
| slave_parallel_workers | Number of parallel SQL apply threads. Start with 4, tune up if lag persists. |
| slave_parallel_type | LOGICAL_CLOCK groups transactions that were committed in parallel on the primary. DATABASE only parallelizes across databases (useless if you have one database). |

Persist this in my.cnf so it survives restarts:

[mysqld]
slave_parallel_workers = 4
slave_parallel_type = LOGICAL_CLOCK

Gotcha: Seconds_Behind_Master is not real-time lag. It measures the timestamp difference between the SQL thread's current event and the clock. If the I/O thread is caught up but the SQL thread is processing a 2-hour-old event, the number is 7200 — even if the SQL thread is actually churning through events rapidly and will catch up in minutes. For accurate production monitoring, use pt-heartbeat: it inserts a timestamp on the primary every second and measures actual delivery time on replicas.

pt-heartbeat --database=percona --update --daemonize  # on primary
pt-heartbeat --database=percona --monitor             # on replica: "0.02s"
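pt-heartbeat's measurement reduces to a tiny piece of arithmetic. A sketch (the helper is mine, not pt-heartbeat's actual interface or schema):

```python
def replication_lag_seconds(last_heartbeat_ts, now):
    # Lag = wall-clock "now" minus the newest primary-written timestamp
    # that replication has delivered to the replica's heartbeat table.
    return max(0.0, now - last_heartbeat_ts)

print(round(replication_lag_seconds(100.00, 100.02), 2))  # → 0.02, like the "0.02s" above
```

Because the heartbeat row is written every second regardless of workload, this stays accurate even when the binlog carries hours-old events.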

Flashcard Check: Replication

| Question | Answer (cover this column) |
| --- | --- |
| What do the I/O thread and SQL thread do? | I/O downloads binary log events to relay logs. SQL applies relay log events to replica data. |
| Why is Seconds_Behind_Master misleading? | It measures event timestamp lag, not real-time delivery. A burst of old events can show high lag even while catching up quickly. |
| What does slave_parallel_type = 'LOGICAL_CLOCK' do? | Groups transactions that were committed together on the primary so the replica can replay them in parallel. |
| What does Seconds_Behind_Master: NULL mean? | The SQL thread is not running. This is worse than a number — replication is stopped. |

Alert 2: Why the Application Sees Stale Data

The profile-update complaint maps directly to replication lag. Here's the pattern:

1. User updates profile  → writes go to PRIMARY
2. User refreshes page   → read goes to REPLICA (still 3 seconds behind)
3. User sees old data    → panic, files support ticket

This is the read-your-own-writes problem, and it's the #1 reason application teams distrust replicas. Three solutions, from simple to bulletproof:

Option 1: Route critical reads to the primary. After a user writes, route their next few reads to the primary. Most ORMs and proxy layers support this. Quick, but increases primary load.
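In miniature, Option 1 looks like this. A sketch with hypothetical names — real applications implement it inside the ORM or proxy layer:

```python
import time

STICKY_WINDOW_S = 5.0   # pin reads to the primary for this long after a write
_last_write_at = {}

def record_write(user_id, now=None):
    # Call this whenever the user's session performs a write.
    _last_write_at[user_id] = time.time() if now is None else now

def choose_backend(user_id, now=None):
    # Route to the primary while the user's own write may still be
    # replicating; otherwise spread load onto the replicas.
    now = time.time() if now is None else now
    recently_wrote = now - _last_write_at.get(user_id, float("-inf")) < STICKY_WINDOW_S
    return "primary" if recently_wrote else "replica"

record_write("u1", now=1000.0)
print(choose_backend("u1", now=1002.0))  # primary: within the sticky window
print(choose_backend("u1", now=1010.0))  # replica: window expired
```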

Option 2: Use ProxySQL with query routing.

-- ProxySQL admin (port 6032): route writes to HG1 (primary), reads to HG2 (replicas)
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup)
  VALUES (1, 1, '^SELECT.*FOR UPDATE', 1);  -- locking reads must hit the primary
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup)
  VALUES (2, 1, '^SELECT', 2);
INSERT INTO mysql_query_rules (rule_id, active, match_pattern, destination_hostgroup)
  VALUES (3, 1, '.*', 1);
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL QUERY RULES TO DISK;

Option 3: Semi-synchronous replication. The primary waits for at least one replica to ACK before returning success to the client.

-- On primary: enable semi-sync
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms before falling back to async
-- On replicas: enable semi-sync, then restart replication
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE; START SLAVE;

| Mode | Write latency | Data loss risk | Use when |
| --- | --- | --- | --- |
| Async | Lowest | Reads may be stale, failover may lose data | Analytics replicas |
| Semi-sync | Slightly higher | Near-zero (one replica has the data) | User-facing read replicas |
| Group Replication | Highest (Paxos consensus) | Zero | Multi-primary or auto-failover |

War Story: A SaaS company routed all reads to replicas to reduce primary load. Replication lag averaged 200ms — barely noticeable. Then a bulk import job spiked lag to 45 seconds. During those 45 seconds, the support dashboard showed customers' subscription status as "inactive" because it was reading from replicas that hadn't yet processed the renewal transactions. The billing team manually cancelled accounts that were actually paid. The fix: semi-sync for the billing replica, async for the analytics replica. Different replicas can have different replication modes depending on their criticality.


The InnoDB Buffer Pool: The Cache You Need to Understand

The replication investigation surfaced a clue: the replica is doing a lot of disk I/O. Let's check the buffer pool.

-- On the lagging replica
SHOW ENGINE INNODB STATUS\G

Look for the BUFFER POOL AND MEMORY section — zero free buffers means the pool is completely full. Check the hit ratio:

SELECT (1 - (
  (SELECT variable_value FROM performance_schema.global_status
   WHERE variable_name = 'Innodb_buffer_pool_reads') /
  (SELECT variable_value FROM performance_schema.global_status
   WHERE variable_name = 'Innodb_buffer_pool_read_requests')
)) * 100 AS hit_rate;
-- hit_rate: 94.32

94%. That should be above 99%. Almost 6% of reads are hitting disk instead of cache.

Mental Model: Think of the buffer pool like a desk. Your data files are filing cabinets across the room. Every time you need something that's not on your desk, you have to walk across the room, open a cabinet, and bring it back. A 94% hit rate means 6 out of every 100 lookups require that walk. At thousands of lookups per second, that's thousands of disk reads per second — and spinning disks or even SSDs are orders of magnitude slower than RAM.
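The hit-rate query above boils down to one formula. Re-deriving the 94.32 with illustrative counter values (the counts here are made up to match the result):

```python
def hit_rate_pct(disk_reads, read_requests):
    # Innodb_buffer_pool_reads         = logical reads that had to go to disk
    # Innodb_buffer_pool_read_requests = all logical page reads
    return (1 - disk_reads / read_requests) * 100

print(round(hit_rate_pct(568_000, 10_000_000), 2))  # → 94.32
```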

Tuning the Buffer Pool

The general rule: on a dedicated database server, set the buffer pool to 70–80% of total RAM. This replica has 4 GB total, and the buffer pool is only 2 GB — for a 142 GB database.

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
innodb_buffer_pool_size = 12G          # 70-80% of RAM (assuming 16 GB server)
innodb_buffer_pool_instances = 8       # reduces mutex contention; 1 per GiB up to 8-16
innodb_buffer_pool_chunk_size = 128M   # pool size must be a multiple of chunk_size * instances
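The sizing arithmetic, as a sketch. One wrinkle worth encoding: the pool size must land on a multiple of chunk_size × instances, or InnoDB rounds it up for you (the function name and 75% default are mine):

```python
GiB = 1024 ** 3
MiB = 1024 ** 2

def recommended_pool_bytes(ram_bytes, fraction=0.75):
    """70-80% of RAM on a dedicated server, rounded down to the chunk grid."""
    chunk = 128 * MiB          # innodb_buffer_pool_chunk_size
    instances = 8              # innodb_buffer_pool_instances
    grid = chunk * instances   # pool size must be a multiple of this
    target = int(ram_bytes * fraction)
    return (target // grid) * grid

print(recommended_pool_bytes(16 * GiB) // GiB)  # → 12 (GiB) on a 16 GB server
```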

In MySQL 8.0, you can resize the buffer pool dynamically:

SET GLOBAL innodb_buffer_pool_size = 12884901888;  -- 12 GB in bytes
-- Watch the resize progress:
SHOW STATUS LIKE 'Innodb_buffer_pool_resize_status';

Trivia: InnoDB was not MySQL's default storage engine until version 5.5 in 2010. For the first 15 years, the default was MyISAM — which had no transactions, no row-level locking, and no crash recovery. The switch to InnoDB as the default was one of the most consequential changes in MySQL history. If you ever encounter a *.MYD or *.MYI file in a MySQL data directory, that's a MyISAM table. Convert it.


Under the Hood: How InnoDB Actually Writes Data

When your application writes a row, InnoDB: (1) writes to the redo log (sequential, fast), (2) updates the page in the buffer pool (RAM), (3) returns success. Dirty pages flush to disk later in the background. If MySQL crashes before the flush, the redo log replays the changes on startup. Same principle as PostgreSQL's WAL and filesystem journals.

# Redo log configuration (MySQL 8.0.30+ supersedes the first two with innodb_redo_log_capacity)
innodb_log_file_size = 1G            # larger = fewer checkpoints = better write throughput
innodb_log_files_in_group = 2        # total redo capacity = size * files_in_group
innodb_flush_log_at_trx_commit = 1   # the ACID knob

That last parameter is crucial:

| Value | Behavior | Risk |
| --- | --- | --- |
| 1 | Flush redo log to disk on every commit | Full ACID. No data loss. |
| 2 | Write to OS cache on commit, flush every second | Up to 1 second of data loss on OS crash |
| 0 | Write and flush every second | Up to 1 second of data loss on MySQL crash |

Gotcha: Setting innodb_flush_log_at_trx_commit = 0 or 2 for "performance" during a bulk load and forgetting to revert it is a classic footgun. The server runs for months, everyone forgets, and then a crash loses committed transactions. If you change this, add a monitoring alert that checks the value hourly.

The Doublewrite Buffer

InnoDB pages are 16 KB but disk sectors are 512 bytes or 4 KB. A crash mid-write can leave a page half-old, half-new (a "torn page"). The doublewrite buffer writes pages to a contiguous area first, fsyncs, then writes them to the actual tablespace location. On crash recovery, InnoDB checks the doublewrite copy for intact pages.

Under the Hood: On copy-on-write filesystems (ZFS, Btrfs), the doublewrite buffer is redundant — the filesystem already guarantees atomic writes. Disable it with innodb_doublewrite = 0 on ZFS for a performance win.


Alert 3: The ALTER TABLE Problem

Back to the developer who wants to add a column to the 200-million-row orders table.

What Happens If You Just Run ALTER TABLE

ALTER TABLE orders ADD COLUMN fulfillment_notes TEXT NULL;

On MySQL 5.6 and earlier, this would:

  1. Create a new, empty copy of the table with the new schema
  2. Copy every row (200 million of them) to the new table
  3. Swap the old table for the new one
  4. During all of this, the table is locked — no reads, no writes

Even on MySQL 8.0, ALTERs acquire a metadata lock at start and end. If a long-running query holds a conflicting lock, the ALTER waits — and every new query queues behind the ALTER. One slow SELECT can cause a pile-up that takes down the application.

Gotcha: The metadata lock pile-up is more dangerous than the ALTER itself. Always check SHOW PROCESSLIST for long-running queries before starting any ALTER. Kill long queries first, then run the ALTER.

MySQL 8.0 Instant DDL

Some operations are instant in MySQL 8.0 — they modify only metadata, not data:

-- Adding a column at the end of the table (instant in 8.0+)
ALTER TABLE orders ADD COLUMN fulfillment_notes TEXT NULL, ALGORITHM=INSTANT;
-- Takes milliseconds regardless of table size

-- But rearranging columns, changing types, adding indexes — not instant:
ALTER TABLE orders ADD INDEX idx_status (status), ALGORITHM=INPLACE, LOCK=NONE;
-- This rebuilds the index online but still takes time proportional to table size

pt-online-schema-change: The Real Solution

For anything that's not instant, use Percona's pt-online-schema-change:

# Dry run first — always
pt-online-schema-change \
  --alter "ADD COLUMN fulfillment_notes TEXT NULL" \
  --host=primary.db.example.com \
  --user=root --password=secret \
  --dry-run \
  D=production,t=orders

# Execute
pt-online-schema-change \
  --alter "ADD COLUMN fulfillment_notes TEXT NULL" \
  --host=primary.db.example.com \
  --user=root --password=secret \
  --chunk-size=1000 \
  --max-load "Threads_running=25" \
  --critical-load "Threads_running=50" \
  --execute \
  D=production,t=orders

It creates a shadow table with the new schema, copies rows in chunks, installs triggers to replay live changes, then atomically swaps the tables. Zero downtime.

| Parameter | What it does |
| --- | --- |
| --chunk-size | How many rows to copy per batch. Smaller = less impact, slower overall. |
| --max-load | Pause if this metric exceeds the threshold. Prevents overloading the server. |
| --critical-load | Abort entirely if this metric is exceeded. Emergency brake. |
| --dry-run | Show what would happen without doing it. Always run this first. |
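The tool's copy loop, reduced to its control flow. Everything here is a stand-in for pt-osc's real logic, including the load-check callback:

```python
def copy_table(total_rows, chunk_size, threads_running_fn,
               max_load=25, critical_load=50):
    """Copy rows in chunks, pausing under load and aborting when overloaded."""
    copied = 0
    while copied < total_rows:
        load = threads_running_fn()          # e.g. current Threads_running
        if load >= critical_load:
            raise RuntimeError("critical-load exceeded, aborting")  # emergency brake
        if load >= max_load:
            continue                         # pause: the real tool sleeps and re-checks
        copied += min(chunk_size, total_rows - copied)
    return copied

loads = iter([10, 30, 12, 8])  # the second check trips max-load, so that pass copies nothing
print(copy_table(2500, 1000, lambda: next(loads)))  # → 2500
```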

Trivia: GitHub built their own tool, gh-ost, because pt-online-schema-change uses triggers, and those triggers caused replication lag spikes on GitHub's massive MySQL fleet. gh-ost reads the binary log directly instead of using triggers, giving it a lighter footprint on the primary. If you're running large-scale MySQL with tight replication budgets, gh-ost is worth evaluating.

Flashcard Check: Schema Changes

| Question | Answer (cover this column) |
| --- | --- |
| Why is ALTER TABLE dangerous on large tables? | It acquires a metadata lock that blocks all queries. Even "online" ALTERs grab a brief lock at start and end that can cause pile-ups. |
| What does ALGORITHM=INSTANT do in MySQL 8.0? | Modifies only table metadata, no data copying. Takes milliseconds. Only works for some operations (e.g., adding a nullable column at the end). |
| How does pt-online-schema-change avoid locking? | Creates a shadow table, copies data in chunks, uses triggers to replay live changes, then atomically swaps tables. |
| What's the difference between pt-osc and gh-ost? | pt-osc uses triggers (adds overhead to every write). gh-ost reads the binary log instead (lighter on the primary, heavier on the network). |

Slow Queries: Finding What's Killing Performance

While investigating the replication lag, you noticed the primary's CPU is running hot. Time to find the offending queries.

Enable and Analyze the Slow Query Log

# /etc/mysql/mysql.conf.d/mysqld.cnf
[mysqld]
slow_query_log = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time = 1
log_queries_not_using_indexes = 1

Or enable at runtime without a restart:

SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0.5;  -- catch queries over 500ms

Feed the log to pt-query-digest:

pt-query-digest /var/log/mysql/slow.log --since '1h ago'
# Rank 1: 48.2% of response time, 4217 calls, avg 211ms — that's your target
# Rank 2: 28.3% of response time, 89123 calls, avg 5.9ms — high volume, low per-call

Reading EXPLAIN Output

The digest shows the normalized query. Let's EXPLAIN it:

EXPLAIN SELECT o.*, u.email, u.name
FROM orders o JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending' AND o.created_at > '2026-03-01'\G
-- table: o | type: ALL | key: NULL | rows: 198442316 | Extra: Using where
-- table: u | type: eq_ref | key: PRIMARY | rows: 1

type: ALL on the orders table. Full table scan — 198 million rows examined to find pending orders from March. No index on (status, created_at).

Remember: The EXPLAIN danger signs mnemonic: "ATF" — ALL, Temporary, Filesort. type=ALL means full table scan. Using temporary means MySQL created a temp table. Using filesort means an extra sorting pass. Any of these on a large table is a performance emergency.

Fix it:

-- Composite index: status first (equality), then created_at (range)
CREATE INDEX idx_orders_status_created ON orders(status, created_at);

After adding the index, re-run the EXPLAIN:

-- type: range | key: idx_orders_status_created | rows: 23891 | filtered: 100.00

From 198 million rows to 24 thousand. An 8,300x improvement.

Under the Hood: Composite indexes follow the leftmost prefix rule. An index on (status, created_at) serves queries filtering on (status) or (status, created_at), but NOT queries filtering on only (created_at). Think of it like a phone book sorted by last name, then first name — you can look up "Smith" or "Smith, Alice" but not "Alice" without scanning the whole book.
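The rule can be expressed as a small predicate. A simplified sketch — it ignores the subtlety that a range condition stops use of later key parts:

```python
def index_serves(index_cols, where_cols):
    """Leftmost-prefix rule: the query's filter columns must cover a prefix
    of the index, starting from its first column."""
    prefix_len = 0
    for col in index_cols:
        if col in where_cols:
            prefix_len += 1
        else:
            break  # prefix broken: later index columns are unusable
    return prefix_len > 0

idx = ["status", "created_at"]
print(index_serves(idx, {"status"}))                # True
print(index_serves(idx, {"status", "created_at"}))  # True
print(index_serves(idx, {"created_at"}))            # False: not a leftmost prefix
```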


Connection Management: The "Too Many Connections" Bomb

While you're optimizing queries, the application team reports intermittent ERROR 1040 (HY000): Too many connections.

SHOW GLOBAL VARIABLES LIKE 'max_connections';
-- max_connections: 151 (the DEFAULT)

SHOW GLOBAL STATUS LIKE 'Threads_connected';
-- Threads_connected: 148

SHOW GLOBAL STATUS LIKE 'Max_used_connections';
-- Max_used_connections: 151

You're at the ceiling. The default is 151 — shockingly low for a production system.

-- Immediate fix
SET GLOBAL max_connections = 500;

-- Also set wait_timeout to kill idle connections
-- (GLOBAL timeouts apply only to connections opened after the change)
SET GLOBAL wait_timeout = 300;          -- disconnect idle connections after 5 minutes
SET GLOBAL interactive_timeout = 300;

Persist in my.cnf:

[mysqld]
max_connections = 500
wait_timeout = 300
interactive_timeout = 300

Gotcha: Each MySQL connection consumes memory — roughly 1–10 MB per thread depending on sort_buffer_size, join_buffer_size, and other per-session variables. Setting max_connections = 10000 without tuning these buffers means 10,000 * 10 MB = 100 GB of potential memory usage. If you need thousands of connections, use a connection pooler like ProxySQL between the app and MySQL.
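The gotcha's arithmetic, spelled out. The 10 MiB per connection is the illustrative worst case from above, not a MySQL default:

```python
MiB = 1024 ** 2
GiB = 1024 ** 3

def worst_case_memory_gib(max_connections, per_conn_mib):
    # Potential memory if every connection allocates its full
    # per-session buffers (sort, join, read buffers, ...) at once.
    return max_connections * per_conn_mib * MiB / GiB

print(worst_case_memory_gib(10_000, 10))  # → 97.65625 GiB: why poolers exist
```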

Check who's using them all:

SELECT user, db, command, COUNT(*) AS count
FROM information_schema.processlist
GROUP BY user, db, command ORDER BY count DESC;
-- appuser / production / Sleep: 112  ← sleeping connections, app isn't closing them
-- appuser / production / Query: 23

112 sleeping connections. The application is holding connections open and not releasing them. This is a connection pool misconfiguration on the app side — not a MySQL problem.


Backups: The Three Options

You check the crontab: mysqldump --all-databases | gzip > /backup/full.sql.gz — no --single-transaction, no --routines, no --triggers. This backup is locking the entire database every night at 3 AM.

The Three Backup Strategies

1. mysqldump — logical backup

# The right way (InnoDB tables only)
mysqldump --single-transaction --routines --triggers --events \
  --set-gtid-purged=ON \
  --all-databases \
  -u root -p | gzip > /backup/full_$(date +%F).sql.gz

| Flag | Why |
| --- | --- |
| --single-transaction | Uses MVCC snapshot — no locking for InnoDB tables |
| --routines --triggers --events | Include stored procs, triggers, scheduled events |
| --set-gtid-purged=ON | Includes GTID info for replica bootstrapping |

Pros: Portable, human-readable, works across versions. Cons: Slow to backup and restore large databases. A 142 GB database takes hours.

2. Percona XtraBackup — physical hot backup

# Full backup (runs while MySQL is live, minimal impact)
xtrabackup --backup --user=backup --password=secret --target-dir=/backup/full

# Prepare (apply redo logs — REQUIRED before restore)
xtrabackup --prepare --target-dir=/backup/full

# Restore: stop MySQL, copy files, fix ownership, start MySQL
systemctl stop mysql && rm -rf /var/lib/mysql/*
xtrabackup --copy-back --target-dir=/backup/full
chown -R mysql:mysql /var/lib/mysql && systemctl start mysql

Pros: Fast, minimal impact, supports incrementals. Cons: Must restore to same MySQL major version. Not human-readable.

3. Binary log (binlog) — continuous, point-in-time

Binary logs record every change. Combined with a base backup, they enable point-in-time recovery — "undo the accidental DELETE":

# Replay binlog events from a specific time window
mysqlbinlog --start-datetime="2026-03-23 14:00:00" \
            --stop-datetime="2026-03-23 14:30:00" \
            /var/lib/mysql/mysql-bin.000087 | mysql -u root -p

| Strategy | Speed (backup) | Speed (restore) | Granularity | Best for |
| --- | --- | --- | --- | --- |
| mysqldump | Slow | Slow | Full or per-table | Small databases, cross-version migration |
| XtraBackup | Fast | Fast | Full or incremental | Large databases, fast recovery |
| Binlog PITR | Continuous | Depends on range | Per-transaction | "Oops, someone ran DELETE without WHERE" |

Gotcha: A backup that's never been tested is not a backup. It's a hope. Schedule monthly restore tests to a throwaway instance. Verify checksums. Run pt-table-checksum against the restored copy. Discover problems on your schedule, not during an incident.


GTID Replication: Why It Matters

The setup you inherited uses traditional binlog-position replication — tracking by filename and byte offset. If you need to repoint a replica during failover, you have to manually calculate the correct file and position. GTIDs fix this.

# /etc/mysql/mysql.conf.d/mysqld.cnf (on ALL nodes)
[mysqld]
gtid_mode = ON
enforce_gtid_consistency = ON
log_slave_updates = ON
binlog_format = ROW

-- Replica setup with GTIDs — no file/position needed
CHANGE MASTER TO
  MASTER_HOST='primary.db.example.com',
  MASTER_USER='replication',
  MASTER_PASSWORD='secret',
  MASTER_AUTO_POSITION=1;
START SLAVE;

With GTIDs, a replica says "I have transactions 1 through 847291 — send me 847292 and onward." No filenames, no byte offsets, no manual calculation during failover.
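That exchange can be sketched as interval arithmetic. A deliberate simplification: real GTID sets are per-source-UUID lists of intervals, collapsed here to a single (start, end) interval:

```python
def missing_range(primary_executed, replica_executed):
    """What the primary must stream, given each side's executed (start, end)."""
    p_start, p_end = primary_executed
    _, r_end = replica_executed
    if r_end >= p_end:
        return None            # replica is caught up
    return (r_end + 1, p_end)  # "send me everything after my last transaction"

print(missing_range((1, 850_000), (1, 847_291)))  # → (847292, 850000)
```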

Trivia: Before GTIDs (introduced in MySQL 5.6, 2013), changing a replica's primary required manually calculating log positions — so error-prone it spawned an entire ecosystem of tools (MHA, MMM, Orchestrator) just to manage failover safely.


Monitoring: The Essential Dashboard

-- Top queries by total time (last hour)
SELECT LEFT(digest_text, 100) AS query, count_star AS calls,
       ROUND(sum_timer_wait / 1e12, 2) AS total_sec,
       ROUND(avg_timer_wait / 1e12, 4) AS avg_sec
FROM performance_schema.events_statements_summary_by_digest
WHERE last_seen > NOW() - INTERVAL 1 HOUR
ORDER BY sum_timer_wait DESC LIMIT 10;
# Key metrics at a glance
mysqladmin -u root -p extended-status | grep -E \
  "Threads_connected|Threads_running|Slow_queries|Innodb_buffer_pool_read_requests|Innodb_buffer_pool_reads"

SHOW ENGINE INNODB STATUS\G — the Swiss Army knife. Key sections:

| Section | What to look for |
| --- | --- |
| SEMAPHORES | Long waits = internal contention |
| LATEST DETECTED DEADLOCK | Two transactions wanted the same locks |
| BUFFER POOL AND MEMORY | Free buffers = 0 means buffer pool is full |
| LOG | Sequence number far ahead of checkpoint = redo log too small |

| Metric | Healthy | Alert |
| --- | --- | --- |
| Buffer pool hit rate | > 99% | < 95% |
| Threads_running | < 20 | > 50 |
| Seconds_Behind_Master | 0 | > 30 |
| Innodb_row_lock_waits | Low | Sudden spike |

Operational Nightmare: Disk Full on Binlog

It's 2 AM. The pager fires: disk space on the primary is at 98%.

df -h /var/lib/mysql
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/sda1       200G  196G  4.0G  98% /var/lib/mysql

du -sh /var/lib/mysql/mysql-bin.* | tail -5
# 1.1G  /var/lib/mysql/mysql-bin.000083
# 1.1G  /var/lib/mysql/mysql-bin.000084
# 1.1G  /var/lib/mysql/mysql-bin.000085
# 1.1G  /var/lib/mysql/mysql-bin.000086
# 524M  /var/lib/mysql/mysql-bin.000087

Binary logs are eating the disk. But you can't just delete them — replicas may still need them.

-- Check what the replicas have consumed
SHOW SLAVE HOSTS;
-- Then check each replica's position:
-- Look at Master_Log_File in SHOW SLAVE STATUS on each replica

-- If all replicas are past mysql-bin.000083:
PURGE BINARY LOGS TO 'mysql-bin.000084';

-- Set automatic expiry to prevent recurrence (MySQL 8.0+)
SET PERSIST binlog_expire_logs_seconds = 259200;  -- 3 days; PERSIST sets it now and survives restarts

Gotcha: If you purge binary logs that a replica still needs, that replica is toast. It will fail with "Could not find first log file name in binary log index file." The only fix is rebuilding the replica from a fresh backup. Always check replica positions before purging.

This scenario connects directly to disk monitoring. If you're not alerting on disk usage at 80%, you'll hit 98% at 2 AM. Set up monitoring before it becomes an incident.


Exercises

Exercise 1: Quick Win (2 minutes) Connect to a MySQL instance and run:

SHOW GLOBAL STATUS LIKE 'Threads%';
SHOW GLOBAL VARIABLES LIKE 'max_connections';
Calculate: what percentage of your connection capacity is currently used?

Solution `Threads_connected / max_connections * 100`. If this is consistently above 80%, either increase `max_connections` or deploy a connection pooler.
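With the numbers seen earlier in this lesson (Threads_connected = 148, max_connections = 151):

```python
def capacity_used_pct(threads_connected, max_connections):
    return threads_connected / max_connections * 100

print(round(capacity_used_pct(148, 151), 1))  # → 98.0, dangerously close to the ceiling
```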

Exercise 2: EXPLAIN Challenge (10 minutes) Pick one of your application's most common queries. Run EXPLAIN on it. Look for type = ALL, Using filesort, or Using temporary. Propose a composite index — what columns, in what order?

Guidance Equality conditions first, range conditions last. `WHERE status = 'active' AND created_at > '2026-01-01'` benefits from `INDEX(status, created_at)`, not `INDEX(created_at, status)`.

Exercise 3: Judgment Call (15 minutes) A developer needs to add a NOT NULL column with a default value to a 500-million-row table in production. The table receives 2,000 writes per second. What's your plan?

Approach

  1. Check MySQL version. If 8.0+, test whether `ALTER TABLE ... ADD COLUMN ... DEFAULT ..., ALGORITHM=INSTANT` works for this specific operation. Instant DDL takes milliseconds.
  2. If not instant-eligible, use `pt-online-schema-change` or `gh-ost`.
  3. Run `--dry-run` first to estimate duration and disk usage (the shadow table temporarily doubles disk needs).
  4. Set `--max-load` thresholds based on the current `Threads_running` baseline.
  5. Schedule during the lowest-traffic period, even though the tool is online — it still adds load.
  6. Monitor replication lag during the operation — the trigger overhead on each write can spike lag on replicas.

Cheat Sheet

| Task | Command |
| --- | --- |
| Is MySQL alive? | `mysqladmin -u root -p ping` |
| Version | `SELECT VERSION();` |
| Current queries | `SHOW FULL PROCESSLIST;` |
| Kill a query | `KILL QUERY <id>;` (query only) or `KILL <id>;` (connection) |
| Replication status | `SHOW SLAVE STATUS\G` |
| Buffer pool hit rate | `1 - (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests)` |
| Table sizes | `SELECT table_name, ROUND((data_length+index_length)/1024/1024,1) AS mb FROM information_schema.tables ORDER BY mb DESC;` |
| Slow query log | `pt-query-digest /var/log/mysql/slow.log --since '1h ago'` |
| Non-blocking schema change | `pt-online-schema-change --alter "..." --execute D=db,t=table` |
| Safe backup (InnoDB) | `mysqldump --single-transaction --routines --triggers --events` |
| Fast backup (large DB) | `xtrabackup --backup --target-dir=/backup/full` |
| Check indexes on a table | `SHOW INDEX FROM db.table\G` |
| Find blocking transactions | Query `sys.innodb_lock_waits` (8.0) or `information_schema.innodb_lock_waits` (5.7) joined with `innodb_trx` |
| Purge old binlogs | `PURGE BINARY LOGS TO 'mysql-bin.000080';` (after checking replicas) |
| Real replication lag | `pt-heartbeat --database=percona --monitor` |
| Dynamic config (MySQL 8.0) | `SET PERSIST variable_name = value;` |

Takeaways

  • Replication lag is a symptom, not a diagnosis. Check whether the SQL thread is single-threaded, the buffer pool is undersized, or the primary is generating more writes than the replica can replay. Fix the root cause.

  • Never ALTER TABLE a large table directly in production. Use pt-online-schema-change, gh-ost, or MySQL 8.0 instant DDL. The metadata lock pile-up can be worse than the ALTER itself.

  • The buffer pool hit rate is the single most important InnoDB metric. Below 99% means you're doing unnecessary disk I/O. Set it to 70–80% of RAM on dedicated servers.

  • Seconds_Behind_Master: NULL is not "no lag" — it's "replication is stopped." Monitor both the value and the Slave_SQL_Running state. Use pt-heartbeat for accurate lag.

  • A backup you've never restored is a wish, not a backup. Test restores monthly. mysqldump for small databases, XtraBackup for large ones, binlog PITR for "undo the accidental DELETE."

  • Disk full on binlog is the most preventable 2 AM page. Set binlog_expire_logs_seconds, monitor disk at 80%, and always check replica positions before purging.


Related Lessons

  • The Disk That Filled Up — Logging, rotation, inodes, docker storage drivers. The binlog disk-full scenario here is one instance of a broader pattern.
  • The Database That Wouldn't Start — InnoDB crash recovery, force recovery levels, emergency dumps. What happens when the replication lag alert is the least of your problems.
  • The Backup Nobody Tested — Backup strategies, restore testing, RPO/RTO tradeoffs. Goes deeper on the backup verification practices touched on here.
  • Connection Refused — When the problem isn't "too many connections" but "can't connect at all." Systematic layer-by-layer diagnosis.
  • RAID: Why Your Disks Will Fail — The storage layer underneath InnoDB. Buffer pool hit rates matter more when the disks underneath are degraded.