Symptoms
- The database server `db-primary-01` intermittently becomes unresponsive for 10-30 seconds.
- During these pauses, SSH sessions freeze and database queries time out.
- `dmesg` shows `BUG: soft lockup - CPU#2 stuck for 22s!` messages.
- The soft lockup messages appear 2-3 times per day, correlated with heavy write workloads.
- The server has 8 CPUs and 64 GB of RAM, and runs a PostgreSQL database.
- The kernel version is 5.4.0 (not the latest patch level for this series).
- Monitoring shows CPU#2 spends extended periods in kernel mode during the lockup events.
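
To confirm how often the lockups occur and whether they always hit the same CPU, the kernel log can be summarized per CPU. The following is a minimal sketch, not part of the original scenario: the function name `summarize_lockups`, the sample log lines, and the process name and PID in them are all hypothetical. It assumes only the message format quoted above.

```python
import re
from collections import Counter

# Matches the soft lockup message quoted in the symptoms, e.g.
# "BUG: soft lockup - CPU#2 stuck for 22s!"
# search() is used below, so any timestamp or prefix before it is ignored.
LOCKUP_RE = re.compile(r"BUG: soft lockup - CPU#(\d+) stuck for (\d+)s!")


def summarize_lockups(log_text):
    """Return {cpu: (count, longest_stall_seconds)} for soft lockup lines."""
    counts = Counter()
    longest = {}
    for line in log_text.splitlines():
        match = LOCKUP_RE.search(line)
        if match is None:
            continue  # not a soft lockup line
        cpu = int(match.group(1))
        seconds = int(match.group(2))
        counts[cpu] += 1
        longest[cpu] = max(longest.get(cpu, 0), seconds)
    return {cpu: (counts[cpu], longest[cpu]) for cpu in counts}


# Hypothetical sample input; timestamps, process name, and PID are made up.
SAMPLE = """\
[ 1201.123456] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [postgres:4242]
[ 1201.123999] Modules linked in: example_module
[ 9310.654321] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [postgres:4242]
"""

if __name__ == "__main__":
    for cpu, (count, worst) in sorted(summarize_lockups(SAMPLE).items()):
        print(f"CPU#{cpu}: {count} lockups, longest {worst}s")
```

On the affected host, the input could come from `dmesg` or `journalctl -k` instead of the sample string. The line prefix differs between kernel versions, which is why the pattern searches within each line rather than anchoring to its start.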