Solution¶
Triage¶
- Check the mount options:
- Check for stuck processes:
- Test NFS server connectivity:
- Check NFS client state:
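The triage checks above could be run roughly as follows. This is a sketch, not the original commands: the helper names (`mount_options`, `filter_d_state`, `triage`) are hypothetical, and it assumes a Linux host with procps `ps`, GNU `timeout`, `rpcinfo`, and `nfsstat` installed. Each network probe is wrapped in `timeout` so that triage does not hang on the same outage.

```shell
#!/bin/sh
# Mount path and server name are taken from the incident description.
MOUNT_POINT="/mnt/shared-data"
NFS_SERVER="nfs.internal"

# Print the options of one mount from a mounts table on stdin
# (fields: device mountpoint fstype options ...). Reading /proc/mounts
# does not contact the NFS server.
mount_options() {
    awk -v mp="$1" '$2 == mp { print $4 }'
}

# Keep the header line plus every process whose state starts with D.
# Expects `ps -eo pid,stat,wchan:32,comm` style input on stdin.
filter_d_state() {
    awk 'NR == 1 || $2 ~ /^D/'
}

triage() {
    echo "== mount options =="
    mount_options "$MOUNT_POINT" < /proc/mounts

    echo "== processes in D state =="
    ps -eo pid,stat,wchan:32,comm | filter_d_state

    echo "== server connectivity =="
    timeout 5 ping -c 3 "$NFS_SERVER" || echo "ping failed or timed out"
    timeout 5 rpcinfo -t "$NFS_SERVER" nfs || echo "NFS RPC service unreachable"

    echo "== NFS client state =="
    timeout 5 nfsstat -m || echo "nfsstat failed or timed out"
}
```

Calling `triage` prints all four checks in order. Avoid commands that stat the mount itself (for example `ls /mnt/shared-data`), since they would block like any other process.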
Root Cause¶
The NFS mount at /mnt/shared-data uses the default hard mount option. With a hard mount, when the NFS server becomes unreachable, the NFS client retries the request indefinitely. Any process that performs a filesystem operation on the mount point enters uninterruptible sleep (D state) while the kernel retries the NFS RPC.
The NFS server nfs.internal is unreachable because of a network switch failure. The hard mount ensures no data corruption (requests will eventually succeed when the server returns), but it blocks all processes that touch the mount path.
Fix¶
Immediate (unblock the system):
- Try a force unmount with umount -f. This may fail if processes have open file handles.
- If the force unmount fails, use a lazy unmount with umount -l. This detaches the mount from the filesystem namespace. New processes will no longer see or hang on the mount point. Already-stuck processes remain in D state until the server returns or a reboot clears them.
- Kill stuck processes if possible. Note that a process in true uninterruptible sleep does not act on SIGKILL until the kernel wakes it. On kernels 2.6.25+ many NFS waits are killable, so SIGKILL is worth trying, but a process that stays in D state can only be cleared by the kernel.
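The immediate steps above might look like the following sketch. The function names `unblock_mount` and `d_state_pids` are hypothetical, the commands need root, and the 30-second bound on the force unmount is an arbitrary illustrative value.

```shell
#!/bin/sh
MOUNT_POINT="/mnt/shared-data"

# Try a force unmount first, then fall back to a lazy unmount.
# Returns 0 if either one succeeded.
unblock_mount() {
    if timeout 30 umount -f "$1"; then
        echo "force unmount of $1 succeeded"
    elif umount -l "$1"; then
        echo "lazy unmount of $1 succeeded; stuck processes may remain"
    else
        echo "could not unmount $1" >&2
        return 1
    fi
}

# Print the PIDs of D-state processes, given `ps -eo pid,stat`
# style input (with a header line) on stdin.
d_state_pids() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1 }'
}
```

A possible sequence is `unblock_mount "$MOUNT_POINT"` followed by `ps -eo pid,stat | d_state_pids`. Review the PID list before sending `kill -9` to anything, because D state is not specific to NFS and may include processes waiting on unrelated I/O.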
When the NFS server is restored:
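- Remount with safer options, such as soft with an explicit timeo.
- Update /etc/fstab so that the same options, together with _netdev, persist across reboots.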
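As an illustration of the two restore steps, the sketch below remounts with bounded retries and prints a matching fstab entry. The export path `/export/shared-data` is hypothetical (the source does not name the export), and `timeo=50,retrans=3` are example values, not a recommendation from the original. `timeo` is expressed in tenths of a second.

```shell
#!/bin/sh
NFS_SERVER="nfs.internal"
EXPORT_PATH="/export/shared-data"   # hypothetical export path
MOUNT_POINT="/mnt/shared-data"
# soft: give up and return an error after the retries are exhausted.
# timeo=50: wait 5 seconds per attempt; retrans=3: retry three times.
SAFE_OPTS="soft,timeo=50,retrans=3"

# Remount the share with the bounded-retry options (requires root).
remount_safe() {
    mount -t nfs -o "$SAFE_OPTS" "$NFS_SERVER:$EXPORT_PATH" "$MOUNT_POINT"
}

# Print the corresponding /etc/fstab line. _netdev marks the mount as
# network-dependent so it is not attempted before networking is up.
fstab_entry() {
    printf '%s:%s %s nfs %s,_netdev 0 0\n' \
        "$NFS_SERVER" "$EXPORT_PATH" "$MOUNT_POINT" "$SAFE_OPTS"
}
```

`fstab_entry` only prints the line. Compare it against the existing entry and edit `/etc/fstab` by hand rather than appending blindly, so that the old hard-mount entry is replaced instead of duplicated.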
Rollback / Safety¶
- Lazy unmount makes the mount invisible but does not free kernel resources until all open files are closed.
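- Soft mounts return EIO errors to applications when the server is unreachable. Applications must handle these errors.
- If data integrity is critical, keep hard but set timeo explicitly (intr is ignored on kernels 2.6.25+), or consider softreval on newer kernels, which lets the client keep serving cached paths and attributes while the server is unreachable.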
Common Traps¶
- Expecting kill -9 to clear a D-state process. A process in uninterruptible sleep does not act on signals until the kernel wakes it, which happens when the I/O completes or is aborted (for example by a successful force unmount).
- Using df -h to diagnose. The df command itself will hang because it stats all filesystems, including the stuck NFS mount. Use df -h -x nfs -x nfs4 instead (-x is the short form of --exclude-type).
- Assuming force unmount always works. umount -f often fails when processes have open files. umount -l is the pragmatic solution.
- Not using _netdev in fstab. Without it, the system may hang on boot if the NFS server is unavailable during mount.
- Forgetting that intr is deprecated. On Linux kernels 2.6.25+, intr is a no-op, and only SIGKILL can interrupt a pending NFS operation. Use a soft mount if operations should time out on their own.
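To avoid the df trap in practice, a wrapper along these lines may help. It is a sketch assuming GNU coreutils `df` and `timeout`, and the names `nfs_mount_points` and `safe_df` are hypothetical.

```shell
#!/bin/sh
# Print mount points whose filesystem type is nfs or nfs4, given a
# /proc/mounts-style table on stdin. This shows which paths to avoid
# touching while the server is down.
nfs_mount_points() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }'
}

# Disk usage with NFS filesystems excluded. timeout is a backstop:
# it sends SIGTERM after 10 seconds and SIGKILL 2 seconds later.
safe_df() {
    timeout -k 2 10 df -h -x nfs -x nfs4 "$@"
}
```

Running `nfs_mount_points < /proc/mounts` lists the NFS paths without contacting the server, and `safe_df` reports on everything else.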