Solution

Triage

Read the soft lockup messages and stack traces:
```
dmesg -T | grep -A 20 "soft lockup"
```

Check THP status:

```
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
```
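
Both files list every allowed value and mark the active one with square brackets, for example: always madvise [never]. A minimal sketch for extracting just the active value follows; the helper name thp_active_value is made up for this example.

```shell
# Print the active (bracketed) value from a THP sysfs line read on stdin.
# Example input: "always madvise [never]"  ->  prints "never"
thp_active_value() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report both settings if the standard sysfs files are readable on this host.
thp_dir=/sys/kernel/mm/transparent_hugepage
for f in enabled defrag; do
    if [ -r "$thp_dir/$f" ]; then
        printf '%s: %s\n' "$f" "$(thp_active_value < "$thp_dir/$f")"
    fi
done
```

On the affected system, both lines would be expected to report always.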

Check dirty page settings:

```
sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_expire_centisecs
```

Check I/O and CPU:
```
iostat -xz 1 5
mpstat -P ALL 1 5
```

Root Cause

The dmesg stack trace shows the CPU stuck in compact_zone_order(), part of the kernel's memory compaction path. Compaction here is driven by transparent hugepages (THP), which are enabled by default on this system.

When PostgreSQL performs heavy writes, the page cache fills with dirty pages and free memory becomes fragmented. With THP's defrag mode set to always, an allocation that cannot find a free 2 MB hugepage stalls while the kernel reclaims and compacts memory in the context of the allocating task. Under heavy load, this compaction can hold a CPU in kernel mode for tens of seconds, long enough to trip the soft lockup watchdog (the threshold is twice kernel.watchdog_thresh, 20 seconds by default).
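
One way to confirm that allocations are stalling on compaction is to watch the compact_stall counter in /proc/vmstat grow while the workload runs. The sketch below assumes a kernel built with compaction and THP support, so that these counters are present. The function names are made up for this example.

```shell
# Print selected compaction/THP counters from a vmstat-format file
# (defaults to /proc/vmstat).
compaction_counters() {
    awk '$1 ~ /^(compact_stall|compact_fail|thp_fault_alloc|thp_fault_fallback)$/ { print $1, $2 }' \
        "${1:-/proc/vmstat}"
}

# Print how much compact_stall grew between two saved snapshots.
compact_stall_delta() {
    before=$(awk '$1 == "compact_stall" { print $2 }' "$1")
    after=$(awk '$1 == "compact_stall" { print $2 }' "$2")
    echo $(( ${after:-0} - ${before:-0} ))
}

# Usage (not run here):
#   cat /proc/vmstat > /tmp/vmstat.before
#   sleep 10
#   cat /proc/vmstat > /tmp/vmstat.after
#   compact_stall_delta /tmp/vmstat.before /tmp/vmstat.after
```

A delta that keeps climbing during write bursts is consistent with the compaction stalls described above. A delta near zero suggests looking elsewhere.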

Contributing factors:

- THP enabled with defrag=always
- High vm.dirty_ratio (40%) causing large dirty page flushes
- Kernel 5.4.0 base has known THP compaction stall issues fixed in later patches
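
To put the dirty_ratio figure in absolute terms, the percentage can be converted into a dirty-page ceiling. The sketch below uses a hypothetical host with 64 GiB of dirtyable memory; the real figure is based on free plus reclaimable memory rather than total RAM, so treat the result as an approximation. The function name is made up for this example.

```shell
# dirty_limit_mib RATIO_PERCENT DIRTYABLE_KIB
# Prints the approximate dirty-page ceiling in MiB (integer arithmetic).
dirty_limit_mib() {
    echo $(( $2 * $1 / 100 / 1024 ))
}

# Hypothetical host: 64 GiB dirtyable memory = 67108864 KiB.
dirty_limit_mib 40 67108864   # vm.dirty_ratio=40 -> 26214 (about 25.6 GiB)
dirty_limit_mib 10 67108864   # vm.dirty_ratio=10 -> 6553  (about 6.4 GiB)
```

Under these assumptions, a ratio of 40 lets roughly 25 GiB of dirty pages accumulate before writers are throttled, which is why the resulting flushes are so large.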

Fix

Immediate (stop the lockups):

Disable THP at runtime, as root (the usual recommendation for database servers):

```
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
```

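Persist across reboots by adding the same commands to /etc/rc.local, or by creating a systemd unit and enabling it with systemctl daemon-reload followed by systemctl enable --now disable-thp.service: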

```
# /etc/systemd/system/disable-thp.service
[Unit]
Description=Disable Transparent Huge Pages
Before=postgresql.service

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled && echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target
```

Tune dirty page settings to reduce flush storms:

```
sysctl -w vm.dirty_ratio=10
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.dirty_expire_centisecs=500
```

Persist these settings in /etc/sysctl.d/99-database.conf so they survive a reboot.
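
A minimal sketch of writing that drop-in file is shown below. The function name is made up for this example, and the optional path argument exists only so the file can be generated somewhere other than /etc for review.

```shell
# write_dirty_sysctl_conf [PATH]
# Writes the three dirty-page settings to a sysctl.d drop-in file.
write_dirty_sysctl_conf() {
    conf=${1:-/etc/sysctl.d/99-database.conf}
    cat > "$conf" <<'EOF'
# Smaller, more frequent writeback to avoid flush storms
vm.dirty_ratio = 10
vm.dirty_background_ratio = 5
vm.dirty_expire_centisecs = 500
EOF
}

# Usage, as root (not run here):
#   write_dirty_sysctl_conf
#   sysctl --system    # reload all sysctl configuration files
```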

Medium-term:

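Update the kernel to the latest 5.4 build available from the distribution. On Ubuntu 20.04 with the GA kernel (assumed here from the 5.4.0 version and the use of apt-get), the linux-image-generic metapackage tracks the newest 5.4.0 build:

```
apt-get update && apt-get install linux-image-generic
```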

Rollback / Safety

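- Disabling THP is low-risk and is widely recommended for PostgreSQL, Redis, MongoDB (releases before 8.0), and most other databases. To revert, write the previous values back to the same sysfs files.
- Dirty page tuning may increase I/O frequency, but writeback happens in smaller, less disruptive batches.
- Kernel updates require a reboot. Schedule one during a maintenance window with database failover in place.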
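
Rolling back the THP change requires knowing the values that were active before it was applied. One possible approach, sketched here with a made-up function name and an illustrative output path, is to record them as ready-to-run restore commands before touching anything.

```shell
# snapshot_thp [SYSFS_DIR]
# Prints shell commands that would restore the currently active THP settings.
snapshot_thp() {
    dir=${1:-/sys/kernel/mm/transparent_hugepage}
    for f in enabled defrag; do
        # The active value is the one shown in square brackets.
        val=$(sed -n 's/.*\[\([^]]*\)].*/\1/p' "$dir/$f")
        printf 'echo %s > %s/%s\n' "$val" "$dir" "$f"
    done
}

# Before applying the fix, as root (not run here):
#   snapshot_thp > /root/thp-rollback.sh
# To roll back later:
#   sh /root/thp-rollback.sh
```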

Common Traps

- Increasing kernel.watchdog_thresh to suppress the warning. This hides the symptom without fixing the cause: the CPU is still stuck.
- Assuming soft lockups are hardware failures. They can indicate hardware issues, but software causes (THP compaction, I/O stalls, spinlock contention) are far more common.
- Not disabling THP on database servers. This is one of the most common sources of performance and stability problems, and most database tuning guides call it out.
- Confusing a soft lockup with a hard lockup. Soft lockup = a CPU runs in kernel mode for too long without letting other tasks run, but still services interrupts. Hard lockup = a CPU stops servicing interrupts altogether (code looping with interrupts disabled, or a hardware fault) and is detected by the NMI watchdog.
- Not reading the stack trace. The stack trace tells you exactly where the CPU is stuck: compact_zone = memory compaction (usually THP), blk_mq_ = block I/O, rwsem_ = lock contention.
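
The symbol-to-cause mapping above can be sketched as a small filter for a first pass over a trace. The function name and the wording of the hints are made up for this example. Only the three prefixes listed above are recognized and the first match wins, so it does not replace reading the full trace.

```shell
# classify_lockup: read a kernel stack trace on stdin and print a coarse hint
# based on the symbol prefixes it contains.
classify_lockup() {
    trace=$(cat)
    case $trace in
        *compact_zone*) echo "memory compaction (check THP defrag)" ;;
        *blk_mq_*)      echo "block I/O path" ;;
        *rwsem_*)       echo "rwsem lock contention" ;;
        *)              echo "unclassified: read the full trace" ;;
    esac
}

# Usage (not run here):
#   dmesg -T | grep -A 20 "soft lockup" | classify_lockup
```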