
Python Debugging Footguns

Mistakes that hide bugs, waste debugging time, or cause silent failures in production.


1. Bare except clauses swallow real errors

A bare except: catches everything, including KeyboardInterrupt and SystemExit (which inherit from BaseException), so even Ctrl-C is swallowed. except Exception: at least lets those two through, but it still hides actual programming errors (TypeError, AttributeError) along with the failures you meant to handle. Your code "works" but is fundamentally broken.

# Terrible: hides every possible error
try:
    result = process_data(payload)
except:
    pass  # "It's fine."

# Also bad: catches too much
try:
    user = db.get_user(user_id)
    send_email(user.email, body)
except Exception as e:
    logger.error(f"Something went wrong: {e}")
    # Was it a DB connection error? A missing email field? A DNS failure?
    # You'll never know because you caught everything in one bucket.

# Better: catch specific exceptions
try:
    user = db.get_user(user_id)
except DatabaseConnectionError:
    logger.error("DB connection failed", exc_info=True)
    raise
except UserNotFoundError:
    logger.warning(f"User {user_id} not found")
    return None

Fix: Catch the narrowest exception possible. Always log exc_info=True for unexpected exceptions. Never use bare except: in production code.
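A workable middle-ground pattern, sketched here with a stand-in process_data (the real function and its error types are assumptions): one narrow handler for the failure you expect, plus a last-resort handler that logs the full traceback and re-raises so nothing disappears.

```python
import logging

logger = logging.getLogger(__name__)

def process_data(payload):
    # Stand-in for real work; raises ValueError on bad input
    return int(payload)

def handle(payload):
    try:
        return process_data(payload)
    except ValueError:
        # Expected and recoverable: log it and move on
        logger.warning("Bad payload: %r", payload)
        return None
    except Exception:
        # Unexpected: record the full traceback, then re-raise so it stays visible
        logger.error("Unhandled error for %r", payload, exc_info=True)
        raise
```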


2. Mutable default arguments cause cross-call contamination

A default argument is evaluated once at function definition time, not on each call. A mutable default (list, dict, set) is shared across all calls. This creates baffling bugs where a function "remembers" data from previous invocations.

# Bug: results accumulate across calls
def add_item(item, items=[]):
    items.append(item)
    return items

print(add_item("a"))  # ['a']
print(add_item("b"))  # ['a', 'b'] — WHAT?!

# Fix: use None as sentinel
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

Why this is a debugging footgun: The bug doesn't show up in unit tests that test each call in isolation. It appears in production when the function is called repeatedly in the same process. It looks like a caching bug or a memory leak.
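When you suspect this bug, you can confirm the sharing directly in a REPL or debugger: the one default list lives on the function object itself.

```python
def add_item(item, items=[]):  # the [] is created once, at def time
    items.append(item)
    return items

add_item("a")
add_item("b")

# The shared default is stored on the function object
print(add_item.__defaults__)  # (['a', 'b'],)
```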


3. Import side effects cause loading-order nightmares

Modules that execute significant logic at import time (connecting to databases, starting threads, reading config files) create fragile, order-dependent loading. Tests that import the module trigger production side effects. Circular imports crash with mysterious ImportError or AttributeError: module has no attribute.

# config.py — runs on import, not on use
import os
import redis

# This connection attempt happens when ANY module imports config
REDIS_CLIENT = redis.Redis(host=os.environ["REDIS_HOST"])  # Crashes if env var missing

# Better: lazy initialization
_redis_client = None

def get_redis():
    global _redis_client
    if _redis_client is None:
        _redis_client = redis.Redis(host=os.environ["REDIS_HOST"])
    return _redis_client

Fix: Defer expensive operations to function calls. Keep module-level code limited to constants, type definitions, and simple assignments.
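The same lazy pattern can be written more compactly with functools.lru_cache, which also removes the module-level global. Sketched here with a dummy factory standing in for the real connection:

```python
from functools import lru_cache

setup_calls = []

@lru_cache(maxsize=None)
def get_client():
    # Expensive setup runs on the first call only, never at import time
    setup_calls.append(1)
    return object()  # stand-in for a real redis.Redis(...) connection
```

Every caller gets the same instance, and importing the module costs nothing.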


4. Circular imports produce baffling AttributeError

Module A imports from module B, which imports from module A. Python doesn't crash immediately — it gives you a partially initialized module. You get AttributeError: module 'foo' has no attribute 'bar' even though bar is clearly defined in foo.py.

# models.py
from validators import validate_user  # validators imports models → circular

class User:
    def save(self):
        validate_user(self)

# validators.py
from models import User  # models imports validators → circular

def validate_user(user):
    if not isinstance(user, User):
        raise TypeError("Expected User")

Fix: Move the import inside the function, use TYPE_CHECKING for type hints only, or restructure to break the cycle:

# validators.py — import inside function to break cycle
def validate_user(user):
    from models import User  # Lazy import
    if not isinstance(user, User):
        raise TypeError("Expected User")
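When the import is needed only for type hints, typing.TYPE_CHECKING avoids the runtime import entirely. The runtime check then needs a different form; this sketch substitutes an attribute check for isinstance, which is one reasonable choice, not the only one:

```python
# validators.py — no runtime import of models at all
from __future__ import annotations  # annotations stay as strings at runtime

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from models import User  # seen only by type checkers, never executed

def validate_user(user: User) -> None:
    # isinstance would require the runtime import, so check structure instead
    if not hasattr(user, "save"):
        raise TypeError("Expected User")
```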

5. Print output disappears in Docker (unbuffered stdout)

Python buffers stdout by default. In Docker, there's no TTY, so output is fully buffered. Your print() debugging statements produce nothing in docker logs until the buffer fills or the process exits. You think your code isn't running. It is — you just can't see it.

# Your Dockerfile
CMD ["python", "app.py"]
# print() output is invisible in docker logs for minutes

# Fix: disable buffering
CMD ["python", "-u", "app.py"]
# Or set the environment variable:
ENV PYTHONUNBUFFERED=1

Fix: Always set PYTHONUNBUFFERED=1 in Docker images that run Python, or pass python -u. Better still, use logging instead of print: the default handler writes to stderr, which is never block-buffered, so records appear in docker logs immediately.


6. pdb in production threads causes deadlocks

You leave a breakpoint() or import pdb; pdb.set_trace() in code that runs in a thread pool or async worker. The debugger tries to read from stdin, which doesn't exist in production. The thread hangs forever, holding whatever locks or resources it had. Other threads waiting on those resources also hang. Eventually the whole service is frozen.

# This will hang the worker thread in production
def process_task(task):
    breakpoint()  # Waiting for stdin that doesn't exist
    return task.result

# Gunicorn worker, Celery task, asyncio coroutine — all will hang

Fix: Never commit breakpoint() or pdb.set_trace() to production code. Use a pre-commit hook to catch it:

# .pre-commit-config.yaml
- repo: https://github.com/pre-commit/pre-commit-hooks
  hooks:
  - id: debug-statements  # Catches pdb, breakpoint, ipdb, etc.

For production debugging, use debugpy with a remote attach pattern, or rely on logging.
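As a second line of defense, CPython 3.7+ (PEP 553) consults the PYTHONBREAKPOINT environment variable: setting it to 0 turns every breakpoint() call into a no-op, so a stray one that slips past review cannot hang a worker.

```dockerfile
# Dockerfile — defang any breakpoint() that reaches production
ENV PYTHONBREAKPOINT=0
```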


7. Confusing is vs == creates intermittent bugs

is checks identity (same object in memory). == checks equality (same value). CPython interns small integers (-5 through 256) and many short, identifier-like strings, so is sometimes works by accident in tests but fails in production with other values.

# Works by accident (CPython interns small ints)
>>> a = 256
>>> b = 256
>>> a is b
True

# Fails with larger values
>>> a = 257
>>> b = 257
>>> a is b
False  # Different objects, same value

# Common bug in real code
def check_status(code):
    if code is 200:  # WRONG: should be ==
        return "OK"
    # Works when code is a plain int (CPython interns 200); fails if code
    # arrives as a float, a NumPy integer, or on another interpreter

Fix: Use is only for None, True, False, and sentinel objects. Use == for all value comparisons. Python 3.8+ emits SyntaxWarning for is with literals.
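The sentinel case deserves a sketch, because it is the one place identity comparison is exactly right: a private object() lets you distinguish "argument omitted" from "argument explicitly None".

```python
_MISSING = object()  # unique object; an `is` check can't collide with user data

def get_setting(settings, key, default=_MISSING):
    value = settings.get(key, _MISSING)
    if value is _MISSING:      # `is` is correct here: one unique identity
        if default is _MISSING:
            raise KeyError(key)
        return default
    return value
```

Unlike `settings.get(key, default)`, this correctly returns a stored None instead of treating it as missing.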


8. Silent exception swallowing in threads

Exceptions in threads don't propagate to the main thread. If a thread crashes, the main thread has no idea. The thread dies silently and work stops being processed. You notice hours later when a queue backs up.

import threading

def worker():
    raise RuntimeError("Database connection failed")

t = threading.Thread(target=worker)
t.start()
t.join()
# No exception raised here. The thread is dead. Main thread continues happily.

Fix: Use concurrent.futures.ThreadPoolExecutor which captures exceptions in Future objects, or wrap thread targets with exception handling that logs and re-raises:

from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as pool:
    future = pool.submit(worker)
    result = future.result()  # Raises the RuntimeError here
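If you must use raw threading.Thread, Python 3.8+ lets you install a process-wide threading.excepthook so crashed threads at least get recorded; a minimal sketch:

```python
import threading

failures = []

def record_thread_exception(args):
    # args carries exc_type, exc_value, exc_traceback, and the thread (3.8+)
    failures.append((args.thread.name, args.exc_value))

threading.excepthook = record_thread_exception

def worker():
    raise RuntimeError("Database connection failed")

t = threading.Thread(target=worker, name="worker-1")
t.start()
t.join()
# failures now holds the crash instead of it vanishing silently
```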

9. GIL misconceptions lead to wrong debugging conclusions

You profile a multi-threaded CPU-bound Python program and see threads aren't running in parallel. You conclude threading is broken. It's not — the GIL prevents true parallel execution of Python bytecode. But I/O-bound code (network, disk, sleep) does release the GIL and runs concurrently. Debugging the wrong problem wastes hours.

# CPU-bound: GIL means threads run sequentially
# This is NOT faster with threads — it may be slower due to GIL contention
def cpu_work():
    return sum(i * i for i in range(10_000_000))

# I/O-bound: GIL is released during I/O waits
# Threads DO help here
def io_work():
    response = requests.get("https://api.example.com")
    return response.json()

Fix: For CPU-bound parallelism, use multiprocessing or ProcessPoolExecutor. For I/O-bound concurrency, threads or asyncio work fine. Profile first to determine which category your bottleneck falls into.
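For the CPU-bound case, ProcessPoolExecutor sidesteps the GIL by running work in separate processes; a minimal sketch (the __main__ guard matters on platforms that spawn workers by re-importing the module):

```python
from concurrent.futures import ProcessPoolExecutor

def cpu_work(n):
    # Pure-Python CPU work: holds the GIL, so threads wouldn't parallelize it
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # Each chunk runs in its own process with its own GIL
        totals = list(pool.map(cpu_work, [100_000, 200_000, 300_000]))
        print(totals)
```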