Package Management Footguns

Mistakes that cause fleet drift, broken servers, or security gaps.

1. Running `apt upgrade` interactively on production

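You SSH into a prod server and run apt upgrade -y. Along with everything else, the upgrade pulls in a new kernel and a new glibc. Running services keep the old libraries mapped until they restart, the new kernel does not take effect until a reboot, and when your Java app starts misbehaving you cannot tell which of the dozens of changed packages caused it. You end up taking the API down for an unplanned reboot.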

Fix: Never run ad-hoc upgrades on production. Use automation (Ansible, unattended-upgrades) with a tested, pinned package set. Stage changes through dev, then staging, then prod, with 24-48h of bake time at each stage.

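Gotcha: apt upgrade -y with no hold list can pull in kernel, glibc, openssl, or systemd updates, and each of these can change runtime behavior. A glibc update can break long-running processes that are not restarted, and even statically linked binaries that do NSS lookups (for example via getaddrinfo), because those load the system's NSS modules at runtime. A systemd update can change default unit behavior. An openssl update can change TLS defaults and break connections to legacy services.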
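Before an upgrade reaches production, a dry run can show whether it touches one of these packages. The following is a minimal sketch, not a definitive check: it assumes the "Inst <package> ..." lines that apt-get -s upgrade prints, and the function name risky_upgrades and its watch list are illustrative choices, not part of apt.

```shell
# Print packages from a simulated upgrade that match a watch list.
# Reads `apt-get -s upgrade` output on stdin. Lines starting with "Inst"
# name the packages apt would install or upgrade.
risky_upgrades() {
    # Hypothetical watch list: adjust to what your services depend on.
    pattern='^(linux-image-.*|libc6|openssl|libssl[0-9].*|systemd)$'
    # `|| true` keeps the exit status 0 when nothing matches.
    awk '$1 == "Inst" { print $2 }' | grep -E "$pattern" || true
}

# Usage on a real host (-s only simulates, nothing is changed):
#   apt-get -s upgrade | risky_upgrades
```

An empty result means the pending upgrade does not touch the watch list. Any output is a reason to route the change through staging first.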

2. Using `--nogpgcheck` or `--allow-unauthenticated`

Your repo is giving a GPG error, so you add --nogpgcheck (dnf/yum) or --allow-unauthenticated (apt) to get the install done. Now a compromised mirror can push arbitrary binaries to your fleet and you will not know.

Fix: Fix the GPG key instead. On RPM systems, import the correct key with rpm --import. On Debian/Ubuntu, install it to /usr/share/keyrings/ and reference it with signed-by. Treat every --nogpgcheck or --allow-unauthenticated in your automation as a security finding.

Under the hood: GPG verification ensures the package was signed by the repo maintainer and has not been tampered with. Without it, a compromised mirror can serve a modified binary (same filename, same version, but with a backdoor). Supply-chain attacks are not theoretical: the 2024 xz backdoor (CVE-2024-3094) shipped in legitimately signed upstream release tarballs. Signature checks would not have caught that one, so treat them as a necessary baseline rather than a complete defense.
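To treat disabled signature checks as findings, you can scan your automation for them. This is a rough sketch under stated assumptions: the function name find_unverified_installs is hypothetical, and the pattern covers only the common switches (--nogpgcheck, --allow-unauthenticated, --allow-insecure-repositories, gpgcheck=0 in .repo files, trusted=yes in apt sources), so it will miss anything spelled differently.

```shell
# List every line under a directory that disables package signature checks.
# Output format is grep's usual file:line:text.
find_unverified_installs() {
    dir=${1:-.}
    # -e is required because the pattern itself starts with "--".
    grep -rnE -e '--nogpgcheck|--allow-unauthenticated|--allow-insecure-repositories|gpgcheck[ ]*=[ ]*0|trusted=yes' "$dir" || true
}

# Usage, for example in CI against a (hypothetical) playbook directory:
#   find_unverified_installs ./ansible
```

Failing the build when this prints anything keeps a temporary workaround from becoming permanent.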

3. Forgetting `apt-mark hold` on critical packages

Unattended-upgrades runs overnight and upgrades your database client library. The new version has a breaking change. Monday morning, every service that talks to Postgres is throwing connection errors.

Fix: Hold critical packages: apt-mark hold libpq5. On RHEL, use dnf versionlock add with the package name (the command is provided by the dnf versionlock plugin). Monitor /var/log/unattended-upgrades/ for held-package warnings.

4. Using `dpkg -i` without fixing dependencies

You install a .deb with dpkg -i. dpkg unpacks the package but cannot configure it because dependencies are missing, and the non-zero exit is easy to miss in a script that does not check it. The package is left unpacked but unconfigured, and apt refuses to install anything else until the broken state is resolved.

Fix: Use apt install ./package.deb instead, which resolves dependencies from your configured repos. If you already used dpkg -i, run apt --fix-broken install immediately.
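To see whether a host is already in this state, you can filter dpkg's status abbreviations. A minimal sketch, assuming the dpkg-query field ${db:Status-Abbrev}, whose second character is the current state (U unpacked, F half-configured, H half-installed). The function name broken_packages is illustrative.

```shell
# Print packages that are unpacked, half-configured, or half-installed.
# Reads "<status-abbrev> <package>" lines on stdin.
broken_packages() {
    awk 'substr($1, 2, 1) ~ /[UFH]/ { print $2 }'
}

# Usage on a real host:
#   dpkg-query -W -f='${db:Status-Abbrev} ${Package}\n' | broken_packages
# dpkg --audit reports similar problems in a more verbose form.
```

Packages in state rc (removed, config files remain) are deliberately ignored, since that state does not block apt.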

5. Deleting `/var/lib/dpkg/lock` to "fix" a stuck lock

Another apt or dpkg process (often unattended-upgrades) holds the lock, so your install refuses to start. You delete the lock file to force it through. Now two processes write to the dpkg database at the same time and you have corrupted it.

Fix: Find the process holding the lock with lsof /var/lib/dpkg/lock-frontend (or fuser -v on the same path). Wait for it to finish, or stop it cleanly if it is genuinely hung. Never delete lock files directly.
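In scripts, waiting is usually better than forcing. The sketch below polls with fuser until nothing has the lock file open, and gives up after a timeout. The function name wait_for_dpkg_lock and the 300-second default are assumptions for illustration. It treats "some process has the file open" as "locked", which is a conservative approximation.

```shell
# Wait until no process has the dpkg frontend lock open, or time out.
# Usage: wait_for_dpkg_lock [lock-path] [timeout-seconds]
wait_for_dpkg_lock() (
    lock=${1:-/var/lib/dpkg/lock-frontend}
    timeout=${2:-300}
    waited=0
    command -v fuser >/dev/null 2>&1 || { echo "fuser not found" >&2; exit 2; }
    # fuser exits 0 while at least one process has the file open.
    while fuser "$lock" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "lock still held after ${timeout}s: $lock" >&2
            exit 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
)

# Usage (as root):
#   wait_for_dpkg_lock && apt-get install -y somepkg
```

Recent apt versions can also wait on their own via the DPkg::Lock::Timeout option, for example apt-get -o DPkg::Lock::Timeout=120 install -y somepkg.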

6. Not cleaning package cache in Docker layers

Your Dockerfile runs apt install but never cleans up. Each image carries the downloaded package index and cached .deb files nobody will ever use, which can add tens to hundreds of megabytes. Registry storage grows and image pulls slow down.

Fix: Always clean in the same RUN layer: apt-get update && apt-get install -y pkg && apt-get clean && rm -rf /var/lib/apt/lists/*. If the cleanup runs in a separate layer, the cache stays in the image.

Remember: Docker images are stacks of read-only layers presented through a union filesystem. Each RUN instruction creates a new layer. Deleting a file in a later layer does not reclaim space, because the file still exists in the earlier layer. The cleanup must happen in the same RUN command (same layer) as the install to actually reduce image size.
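One way to enforce the same-layer rule is a small lint over your Dockerfiles. This is a hedged sketch, not a real Dockerfile parser: lint_apt_layers is a hypothetical helper that joins backslash-continued lines and flags any RUN instruction that installs with apt or apt-get without also removing /var/lib/apt/lists in that instruction.

```shell
# Flag RUN instructions that apt-install without cleaning in the same layer.
# Usage: lint_apt_layers path/to/Dockerfile
lint_apt_layers() {
    awk '
        { line = line $0 }
        # Line ends with a backslash: drop it and keep accumulating.
        /\\[ \t]*$/ { sub(/\\[ \t]*$/, " ", line); next }
        {
            if (line ~ /^[ \t]*RUN[ \t]/ &&
                line ~ /apt(-get)?[ \t]+(-[^ \t]+[ \t]+)*install/ &&
                line !~ /rm -rf \/var\/lib\/apt\/lists/)
                print "uncleaned apt layer: " line
            line = ""
        }
    ' "$1"
}
```

It only checks for the list cleanup, and it does not understand heredocs or shell-form edge cases, so treat a clean result as a hint rather than proof.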

7. Adding repos without `signed-by` on modern Ubuntu

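You follow a 2019 tutorial and use apt-key add. apt-key is deprecated: on Ubuntu 22.04 it still runs but prints a warning, and newer apt releases drop it, so the tutorial's commands eventually stop working. Worse, the key goes into the global trust store, where it can sign packages from any repo, not just the one you intended.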

Fix: Use signed-by in the sources list pointing to a keyring file in /usr/share/keyrings/. Each repo gets its own scoped key.
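As an illustration of a scoped key, the sketch below renders a one-line sources entry that names its keyring explicitly. The helper render_apt_source, the repo URL https://repo.example.com/apt, and the file names are all hypothetical placeholders.

```shell
# Print a one-line apt source entry scoped to a single keyring file.
# Usage: render_apt_source KEYRING URL SUITE COMPONENT...
render_apt_source() (
    keyring=$1
    url=$2
    suite=$3
    shift 3
    printf 'deb [signed-by=%s] %s %s %s\n' "$keyring" "$url" "$suite" "$*"
)

# Usage on a real host (as root), with placeholder names:
#   curl -fsSL https://repo.example.com/key.asc \
#     | gpg --dearmor -o /usr/share/keyrings/example.gpg
#   render_apt_source /usr/share/keyrings/example.gpg \
#     https://repo.example.com/apt stable main \
#     > /etc/apt/sources.list.d/example.list
```

With this entry, apt accepts packages from that repo only if they are signed by the key in that one file.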

8. Ignoring `apt-mark showhold` during upgrades

You held a package six months ago and forgot. Unattended-upgrades keeps back the held package and every upgrade that depends on a newer version of it, and it does so without alerting you. Security patches for a dozen packages never land.

Fix: Periodically audit holds: apt-mark showhold. On RHEL: dnf versionlock list. Set a calendar reminder to review holds quarterly.
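The audit is easier to automate if the intended holds live in a file under version control. A minimal sketch, assuming a plain text file with one package name per line (the file and the function name audit_holds are hypothetical): it compares that list with the output of apt-mark showhold and reports both forgotten and missing holds.

```shell
# Compare actual holds (stdin, one package per line) with an expected list.
# Prints "MISSING pkg" for expected holds that are not set and
# "UNEXPECTED pkg" for holds that nobody has documented.
audit_holds() (
    expected_file=$1
    tmpdir=$(mktemp -d)
    # Ignore comments and blank lines in the expected list.
    grep -v -e '^#' -e '^$' "$expected_file" | sort -u > "$tmpdir/expected"
    sort -u > "$tmpdir/actual"
    comm -23 "$tmpdir/expected" "$tmpdir/actual" | sed 's/^/MISSING /'
    comm -13 "$tmpdir/expected" "$tmpdir/actual" | sed 's/^/UNEXPECTED /'
    rm -rf "$tmpdir"
)

# Usage on a real host, with a placeholder path:
#   apt-mark showhold | audit_holds /etc/apt/expected-holds.txt
```

No output means the host matches the documented holds.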

9. Running `rpm --rebuilddb` without a backup

Your RPM database is corrupt. You run rpm --rebuilddb to fix it. The rebuild itself encounters corrupt headers and now you have lost package tracking entirely. You cannot tell what is installed.

Fix: Always back up first: cp -a /var/lib/rpm/ /var/lib/rpm.bak. Then rebuild. If the rebuild fails, you can restore from the backup and try other recovery options.
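A small wrapper can make the backup step hard to skip. This is a sketch under assumptions: backup_dir is a hypothetical helper that copies a directory to a timestamped sibling and prints the backup path, so the rebuild only runs if the copy succeeded.

```shell
# Copy a directory to "<dir>.bak-<timestamp>" and print the backup path.
# Usage: backup_dir /path/to/dir
backup_dir() (
    src=${1%/}
    dest="$src.bak-$(date +%Y%m%d%H%M%S)"
    [ -d "$src" ] || { echo "not a directory: $src" >&2; exit 1; }
    [ ! -e "$dest" ] || { echo "backup already exists: $dest" >&2; exit 1; }
    # "$src/." copies the directory contents even if $src is a symlink.
    cp -a "$src/." "$dest" && printf '%s\n' "$dest"
)

# Usage (as root):
#   backup=$(backup_dir /var/lib/rpm) && rpm --rebuilddb
#   echo "previous database kept in $backup"
```

The timestamp keeps repeated attempts from overwriting the first, still-untouched copy.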

10. Fleet drift from inconsistent repo configurations

Half your fleet points to the main Ubuntu mirror, the other half points to a stale internal mirror. Packages diverge. When you deploy, some servers have OpenSSL 3.0.7 and others have 3.0.9. The app works on staging but crashes on the three servers still on the old version.

Fix: Manage repo configs with Ansible or Puppet. Pin to a snapshot mirror (Aptly, Pulp, or vendor repos with dated snapshots). Audit with fleet-wide dpkg-query diffs.

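Debug clue: Quick fleet drift check: ansible all -o -m shell -a "dpkg-query -W openssl | cut -f2" | awk '{print $NF}' | sort | uniq -c. The -o flag puts each host's result on one line, and awk keeps only the version so that hostnames do not make every line unique. If you see multiple versions, you have drift. For RPM: ansible all -o -m shell -a "rpm -q openssl" | awk '{print $NF}' | sort | uniq -c.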
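The same check can be turned into something a scheduled job can fail on. A minimal sketch, assuming each input line ends with the package version (as ansible's one-line -o output does when the command prints only the version). The function name drift_report is illustrative.

```shell
# Summarize versions from lines whose last field is a version string.
# Prints a count per version and exits non-zero if more than one is seen.
drift_report() (
    summary=$(awk '{ print $NF }' | sort | uniq -c | sort -rn)
    printf '%s\n' "$summary"
    count=$(printf '%s\n' "$summary" | wc -l)
    [ "$((count))" -le 1 ]
)

# Usage against a fleet:
#   ansible all -o -m shell -a "dpkg-query -W openssl | cut -f2" | drift_report \
#     || echo "openssl versions differ across the fleet"
```

Hosts that fail or are unreachable produce lines that do not end in a version, so they show up as extra "versions" and should be investigated separately.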
