
Package Management Footguns

Mistakes that cause fleet drift, broken servers, or security gaps.


1. Running ad-hoc apt upgrade on production

You SSH into a prod server and run apt upgrade -y. Alongside a new kernel, it pulls in a new glibc. Your Java app segfaults because the JVM's native code was built and tested against the old glibc, and the new kernel needs a reboot nobody scheduled. You just took down the API with an unplanned change.

Fix: Never run ad-hoc upgrades on production. Use automation (Ansible, unattended-upgrades) with a tested, pinned package set. Promote through dev, staging, then prod, with 24-48h of bake time between stages.

Gotcha: apt upgrade -y with no hold list can pull in kernel, glibc, openssl, or systemd updates. Each of these can change runtime behavior. A glibc update can break dynamically linked binaries, and even statically linked ones that still load NSS modules at runtime. A systemd update can change default unit behavior. An openssl update can change TLS defaults and break connections to legacy services.
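One way to see what an upgrade would touch before it runs is to filter a simulated run for the sensitive packages listed above. This is a minimal sketch: it assumes the "Inst <package> ..." lines that apt-get -s prints, and the function name risky_upgrades and its prefix list are illustrative, not a complete inventory.

```shell
# Print packages from a simulated upgrade whose names match sensitive prefixes.
# Reads `apt-get -s upgrade` output on stdin. The prefix list is an assumption;
# extend it with whatever your services link against.
risky_upgrades() {
  awk '$1 == "Inst" && $2 ~ /^(linux-image|libc6|libssl|openssl|systemd)/ { print $2 }'
}

# Usage (read-only, -s only simulates):
#   apt-get -s upgrade | risky_upgrades
```

Empty output means none of the listed prefixes are in the pending upgrade set. It says nothing about other packages.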


2. Using --nogpgcheck or --allow-unauthenticated

Your repo is giving a GPG error, so you add --nogpgcheck to get the install done. Now a compromised mirror can push arbitrary binaries to your fleet and you will not know.

Fix: Fix the GPG key. On RPM systems, import the correct key with rpm --import. On Debian and Ubuntu, install it under /usr/share/keyrings/ and reference it with signed-by. Treat every --nogpgcheck or --allow-unauthenticated in your automation as a security finding.

Under the hood: GPG verification ensures the package was signed by the repo maintainer and has not been tampered with. Without it, a compromised mirror can serve a modified binary with the same filename and version but with a backdoor. Supply-chain attacks are not theoretical: the 2024 xz backdoor (CVE-2024-3094) was inserted into a legitimate source tarball. That attack happened upstream of the signature, so GPG checks would not have caught it, but it shows how much damage one tampered artifact can do.
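To find where signature checks are already switched off, you can grep your automation for the flags above. A rough sketch: the function name find_unverified_installs is made up for illustration, and the config-file patterns (gpgcheck=0 in yum/dnf repo files, trusted=yes in apt source entries) are additions beyond the two flags named in this section.

```shell
# List every line under a directory that disables package signature checks.
# Covers the dnf/yum and apt command-line flags plus their config equivalents.
# Exit status is non-zero when nothing is found (grep's convention).
find_unverified_installs() {
  grep -rnE -e '--nogpgcheck|--allow-unauthenticated|gpgcheck *= *0|trusted=yes' "${1:-.}"
}

# Usage:
#   find_unverified_installs /path/to/ansible-repo
```

Each hit is printed as file:line:content, which is enough to open a finding against it.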


3. Forgetting apt-mark hold on critical packages

Unattended-upgrades runs overnight and upgrades your database client library. The new version has a breaking change. Monday morning, every service that talks to Postgres is throwing connection errors.

Fix: Hold critical packages: apt-mark hold libpq5. On RHEL, use dnf versionlock add (provided by the versionlock plugin). Monitor /var/log/unattended-upgrades/ for held-package warnings.


4. Using dpkg -i without fixing dependencies

You install a .deb with dpkg -i. dpkg unpacks it, then fails to configure it because dependencies are missing, and the error is easy to miss in a long script. The package is left unpacked but unconfigured. Nothing works and apt refuses to install anything else until the broken state is resolved.

Fix: Use apt install ./package.deb instead — it resolves dependencies. If you already used dpkg -i, run apt --fix-broken install immediately.
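To check whether a host is already in this state, you can filter dpkg's status abbreviations for packages that never finished installing. A sketch, assuming the two-letter codes shown by dpkg -l (second letter U for unpacked, F for half-configured, H for half-installed). The function name broken_packages is illustrative. dpkg --audit reports similar information in prose form.

```shell
# Print packages that dpkg left unpacked (U), half-configured (F) or
# half-installed (H). Reads "<status-abbrev> <package>" lines on stdin.
broken_packages() {
  awk 'substr($1, 2, 1) ~ /[UFH]/ { print $2 }'
}

# Usage:
#   dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' | broken_packages
```

Any output means apt --fix-broken install has work to do.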


5. Deleting /var/lib/dpkg/lock to "fix" a stuck lock

An apt process is already running, so your install fails on the lock. You delete the lock file to force it through. Now two processes write to the dpkg database at the same time and you have a corrupted database.

Fix: Find the process holding the lock with lsof /var/lib/dpkg/lock-frontend. Wait for it to finish or kill it cleanly. Never delete lock files directly.
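In automation, waiting for the lock is usually safer than failing or forcing. A sketch, assuming an apt new enough to support the DPkg::Lock::Timeout option (the apt 2.x series does). The run helper and its DRY_RUN variable are conventions of this example, not apt features, and the 300-second timeout is arbitrary.

```shell
# DRY_RUN=1 prints the command instead of running it (a convention of this
# sketch, not an apt feature).
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Install packages, waiting up to 300 seconds for the dpkg lock instead of
# failing immediately when another apt or dpkg process holds it.
install_waiting_for_lock() {
  run apt-get -o DPkg::Lock::Timeout=300 install -y "$@"
}

# Usage:
#   install_waiting_for_lock curl jq
```

If the timeout expires, apt-get still exits with an error, so the caller can alert instead of deleting anything.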


6. Not cleaning package cache in Docker layers

Your Dockerfile runs apt-get install but never cleans up. Each image can carry hundreds of megabytes of cached .deb files and package index lists nobody will ever use. Your container registry bill grows and image pulls take forever.

Fix: Always clean in the same RUN layer: apt-get install -y pkg && apt-get clean && rm -rf /var/lib/apt/lists/*. Separate layers mean the cache stays in the image.

Remember: Docker images are stacks of read-only layers presented through a union filesystem. Each RUN instruction creates a new layer. Deleting a file in a later layer does not reclaim space, because the file still exists in the earlier layer. The cleanup must happen in the same RUN command (same layer) as the install to actually reduce image size.
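A quick way to spot offending layers is to scan a Dockerfile for RUN instructions that install with apt but never remove the package lists in the same instruction. This is a heuristic sketch only: the function name lint_apt_layers is made up, and it pattern-matches text rather than parsing Dockerfile syntax, so exec-form RUN or unusual quoting can fool it.

```shell
# Print each RUN instruction in a Dockerfile that installs with apt but does
# not remove /var/lib/apt/lists in the same instruction. Backslash-continued
# lines are joined into one instruction before matching.
lint_apt_layers() {
  awk '
    {
      buf = buf $0
      # Line ends with a backslash: keep accumulating the instruction.
      if (buf ~ /\\[ \t]*$/) { sub(/\\[ \t]*$/, " ", buf); next }
      is_install = (buf ~ /^[ \t]*RUN/ && buf ~ /apt(-get)? .*install/)
      cleans_up = (buf ~ /rm -rf \/var\/lib\/apt\/lists/)
      if (is_install && !cleans_up) { print buf }
      buf = ""
    }
  ' "${1:-Dockerfile}"
}

# Usage:
#   lint_apt_layers path/to/Dockerfile
```

No output means every apt install found by the pattern cleans up in its own layer.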


7. Adding repos without signed-by on modern Ubuntu

You follow a 2019 tutorial and use apt-key add. On Ubuntu 22.04+ this is deprecated and prints a warning, and newer apt releases have removed apt-key entirely. Worse, the key goes into the global trust store where it can sign packages from any repo, not just the one you intended.

Fix: Use signed-by in the sources list pointing to a keyring file in /usr/share/keyrings/. Each repo gets its own scoped key.
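The scoped setup can be sketched as a small helper that emits the sources line. The function name scoped_repo_line, the repo name "example", and the repo.example.com URLs are all placeholders. Only the signed-by option and the /usr/share/keyrings/ location come from the text above.

```shell
# Build a one-line sources.list entry whose key is scoped to this repo only.
# Arguments: repo name, base URL, suite, component (all caller-supplied).
scoped_repo_line() {
  printf 'deb [signed-by=/usr/share/keyrings/%s.gpg] %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Usage with a hypothetical vendor repo (URLs are placeholders):
#   curl -fsSL https://repo.example.com/key.asc | gpg --dearmor -o /usr/share/keyrings/example.gpg
#   scoped_repo_line example https://repo.example.com/apt stable main > /etc/apt/sources.list.d/example.list
```

Because the key is referenced only from this entry, apt will not accept it as a signer for any other repository.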


8. Ignoring apt-mark showhold during upgrades

You held a package six months ago and forgot. Unattended-upgrades quietly keeps back every upgrade it cannot resolve without touching the held package. Security patches for a dozen packages never land.

Fix: Periodically audit holds: apt-mark showhold. On RHEL: dnf versionlock list. Set a calendar reminder to review holds quarterly.
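A calendar reminder is easy to ignore, so the audit can also be scripted: compare current holds against a list of holds someone has reviewed. A sketch, assuming a reviewed-holds file with one package name per line. That file, its path, and the function name unreviewed_holds are hypothetical.

```shell
# Print held packages that are missing from a reviewed-holds file.
# stdin: one package name per line (the format apt-mark showhold prints).
# $1: allowlist file, one package name per line (hypothetical; keep it in
#     version control so every hold has an owner and a reason).
unreviewed_holds() {
  grep -vxF -f "$1"
}

# Usage:
#   apt-mark showhold | unreviewed_holds /etc/apt/reviewed-holds.txt
```

The -x flag matches whole lines only, so an entry for libpq5 does not silently approve a hold on libpq5-dev.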


9. Running rpm --rebuilddb without a backup

Your RPM database is corrupt. You run rpm --rebuilddb to fix it. The rebuild itself encounters corrupt headers and now you have lost package tracking entirely. You cannot tell what is installed.

Fix: Always back up first: cp -a /var/lib/rpm/ /var/lib/rpm.bak. Then rebuild. If the rebuild fails, you can restore from the backup and try other recovery options.
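The backup-then-rebuild order can be enforced with a small wrapper. A sketch: the function name backup_rpmdb is illustrative, and the timestamp suffix is an addition to the path in the text, there so that a second run does not copy into an existing /var/lib/rpm.bak directory.

```shell
# Copy the RPM database to a timestamped directory and print the backup path.
# Defaults match the paths in the text; both can be overridden by arguments.
backup_rpmdb() {
  local src="${1:-/var/lib/rpm}"
  local dest="${2:-/var/lib/rpm.bak}.$(date +%Y%m%d%H%M%S)"
  cp -a "$src" "$dest" && echo "$dest"
}

# Usage: the rebuild only runs if the backup succeeded.
#   backup_rpmdb && rpm --rebuilddb
```

Keep the printed path. It is what you restore from if the rebuild makes things worse.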


10. Fleet drift from inconsistent repo configurations

Half your fleet points to the main Ubuntu mirror, the other half points to a stale internal mirror. Packages diverge. When you deploy, some servers have OpenSSL 3.0.7 and others have 3.0.9. The app works on staging but crashes on the three servers still on the old version.

Fix: Manage repo configs with Ansible or Puppet. Pin to a snapshot mirror (Aptly, Pulp, or vendor repos with dated snapshots). Audit with fleet-wide dpkg-query diffs.

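Debug clue: Quick fleet drift check: ansible all -m shell -a "dpkg-query -W openssl | cut -f2" | grep -v '>>$' | sort | uniq -c. The grep drops Ansible's per-host header lines so only version strings are counted. If you see multiple versions, you have drift. For RPM: ansible all -m shell -a "rpm -q openssl" | grep -v '>>$' | sort | uniq -c.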
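For a scheduled check, the same idea can be wrapped so it returns an exit code instead of a table to eyeball. A sketch, assuming Ansible's default ad-hoc output, where each host's result follows a header line such as "web1 | CHANGED | rc=0 >>". The function names versions_only and has_drift are illustrative.

```shell
# Strip Ansible's "host | CHANGED | rc=0 >>" header lines and blank lines,
# leaving one version string per host.
versions_only() {
  grep -v -e ' | .* >>$' -e '^$'
}

# Succeeds (exit 0) when more than one distinct version is present on stdin.
has_drift() {
  versions_only | sort -u | awk 'END { exit !(NR > 1) }'
}

# Usage:
#   ansible all -m shell -a "dpkg-query -W openssl | cut -f2" | has_drift && echo "drift detected"
```

Unreachable hosts produce different output and are not handled here. Check Ansible's own exit status for those.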