
Portal | Level: L1: Foundations | Topics: Python Debugging, Python Automation | Domain: DevOps & Tooling

Python Debugging - Primer

Why This Matters

Every DevOps engineer writes Python: deployment scripts, monitoring integrations, API clients, configuration generators. When those scripts fail in production at 3 AM, you need debugging skills that go beyond adding print() statements. Understanding Python's debugging tools — from pdb to profilers to production tracing — is the difference between a 10-minute fix and a 4-hour guessing game.

pdb: The Built-in Debugger

Under the hood: The breakpoint() builtin (PEP 553, Python 3.7) calls sys.breakpointhook(), which defaults to pdb.set_trace(). You can swap the debugger entirely via the PYTHONBREAKPOINT environment variable — set it to ipdb.set_trace, pudb.set_trace, or 0 to disable all breakpoints. This makes breakpoint() a universal hook, not just a pdb shortcut.

Python ships with pdb, a full interactive debugger. Since Python 3.7, breakpoint() is the preferred way to enter it.

Entering the Debugger

# Modern (Python 3.7+)
def process_data(records):
    for record in records:
        breakpoint()  # Drops into pdb here
        transform(record)

# Legacy
import pdb; pdb.set_trace()

# From the command line (debug from the start)
$ python -m pdb script.py

# Post-mortem debugging (after an exception)
$ python -m pdb -c continue script.py
# Runs the script to completion; when an uncaught exception is raised,
# you're dropped into pdb at the crash site

Essential pdb Commands

Command      Short   What It Does
next         n       Execute the next line (step over function calls)
step         s       Step into a function call
continue     c       Continue execution until the next breakpoint
list         l       Show source code around the current line
longlist     ll      Show the entire current function
p expr       -       Print the value of an expression
pp expr      -       Pretty-print a value (useful for dicts, lists)
where        w       Show the call stack (where am I?)
up           u       Move up one frame in the call stack
down         d       Move down one frame in the call stack
break        b       Set a breakpoint (b 42 = line 42, b func = at function)
clear        cl      Clear breakpoints
return       r       Continue until the current function returns
quit         q       Quit the debugger
!statement   -       Execute a Python statement (useful when variable names shadow commands)

Common pdb Workflow

# You're debugging a data processing pipeline
def process_batch(items):
    results = []
    for item in items:
        breakpoint()
        # In pdb:
        # p item           — inspect the current item
        # p len(results)   — check progress
        # p item.keys()    — see what fields exist
        # !item['status'] = 'fixed'  — mutate data on the fly
        # c                — continue to next iteration
        result = transform(item)
        results.append(result)
    return results

Conditional Breakpoints

# Break only when a condition is met
def process_order(order):
    # In pdb, set a breakpoint first, then attach a condition to it:
    #   b process_order
    #   condition 1 order.total > 10000
    # Or guard the breakpoint programmatically:
    if order.total > 10000:
        breakpoint()

Enhanced Debuggers: ipdb and pdb++

ipdb

ipdb adds IPython features to pdb: tab completion, syntax highlighting, better tracebacks.

$ pip install ipdb
import ipdb; ipdb.set_trace()

# Or set it as the default breakpoint handler
$ PYTHONBREAKPOINT=ipdb.set_trace python script.py

pdb++ (pdbpp)

pdbpp is a drop-in replacement that enhances pdb with sticky mode (shows code continuously), syntax highlighting, and tab completion.

$ pip install pdbpp
# Now 'breakpoint()' automatically uses pdb++ instead of pdb

Key pdb++ features:

  • Sticky mode: the sticky command shows a continuously updated code listing
  • Smart command parsing: typing foo prints the variable foo instead of requiring p foo
  • Better tab completion: completes variable names, attributes, and methods

Remote Debugging with debugpy

For debugging code running in Docker containers, remote servers, or as background services, debugpy (Microsoft's Debug Adapter Protocol server) lets you attach VS Code or any DAP client.

# Add to your application startup
import debugpy
debugpy.listen(("0.0.0.0", 5678))
print("Waiting for debugger attach...")
debugpy.wait_for_client()  # Optional: pause until a debugger connects

# Expose the debug port in Docker
$ docker run -p 5678:5678 myapp

# In VS Code launch.json:
{
    "name": "Attach to Remote",
    "type": "python",
    "request": "attach",
    "connect": {"host": "localhost", "port": 5678}
}

For production, start debugpy only when signaled (don't leave it running):

import signal
import debugpy

def enable_debugger(signum, frame):
    debugpy.listen(("127.0.0.1", 5678))
    print("Debugger listening on port 5678")

signal.signal(signal.SIGUSR1, enable_debugger)
# Send SIGUSR1 to enable: kill -USR1 <pid>

The logging Module

For production debugging, logging is your primary tool. It's built-in, configurable, and doesn't require interactive access.

Logging Levels

Level      Value   When to Use
DEBUG      10      Detailed diagnostic info (variable values, flow tracing)
INFO       20      Confirmation that things are working as expected
WARNING    30      Something unexpected but not broken (deprecation, retry)
ERROR      40      An operation failed but the app continues
CRITICAL   50      The application cannot continue
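
The threshold behavior can be verified in a few lines — a minimal sketch using a hypothetical "demo" logger writing to an in-memory stream:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("demo")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.WARNING)    # records below WARNING (30) are dropped

logger.debug("not shown")           # 10 < 30 → filtered out
logger.info("not shown")            # 20 < 30 → filtered out
logger.warning("disk at 90%")       # 30 >= 30 → emitted

print(stream.getvalue().strip())    # → WARNING disk at 90%
```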

Production Logging Pattern

import logging

# Configure once at application startup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%S%z",
)

logger = logging.getLogger(__name__)

def process_request(request_id, payload):
    logger.info("Processing request %s", request_id)
    try:
        result = do_work(payload)
        logger.debug("Result for %s: %r", request_id, result)
        return result
    except ValidationError as e:
        logger.warning("Validation failed for %s: %s", request_id, e)
        raise
    except Exception:
        logger.exception("Unexpected error processing %s", request_id)
        # logger.exception() auto-includes the traceback
        raise

Structured Logging (JSON)

For production systems shipping logs to ELK/Loki/Datadog:

import json
import logging

class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            log_entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(log_entry)

handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logging.root.addHandler(handler)

Or use the python-json-logger package for a ready-made solution.

Tracebacks and Exception Inspection

traceback Module

import traceback

try:
    risky_operation()
except Exception:
    # Print traceback without re-raising
    traceback.print_exc()

    # Capture traceback as a string (for logging, alerting)
    tb_str = traceback.format_exc()
    logger.error("Operation failed:\n%s", tb_str)

sys.exc_info()

import sys

try:
    risky_operation()
except Exception:
    exc_type, exc_value, exc_tb = sys.exc_info()
    # exc_type: <class 'ValueError'>
    # exc_value: the exception instance
    # exc_tb: traceback object (for programmatic inspection)
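
The traceback object can be walked programmatically via the traceback module — a sketch with a hypothetical risky_operation that always raises:

```python
import sys
import traceback

def risky_operation():
    # Hypothetical stand-in: always fails for the demo
    raise ValueError("bad input")

try:
    risky_operation()
except Exception:
    _, exc_value, exc_tb = sys.exc_info()
    # extract_tb turns the traceback into a list of FrameSummary objects
    frames = traceback.extract_tb(exc_tb)
    for frame in frames:
        print(f"{frame.filename}:{frame.lineno} in {frame.name}")
    innermost = frames[-1]  # the frame where the exception was raised
```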

Exception Chaining (Python 3)

try:
    config = load_config()
except FileNotFoundError as e:
    raise RuntimeError("Cannot start without config") from e
    # The traceback shows both exceptions:
    # FileNotFoundError: config.yaml not found
    # The above exception was the direct cause of:
    # RuntimeError: Cannot start without config
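
The chain is inspectable at runtime through `__cause__` — a sketch with a hypothetical load_config that always fails:

```python
def load_config():
    # Hypothetical: always fails so the chaining is visible
    raise FileNotFoundError("config.yaml not found")

# 'raise ... from e' records an explicit cause
# ('raise ... from None' would suppress the chain instead)
try:
    try:
        load_config()
    except FileNotFoundError as e:
        raise RuntimeError("Cannot start without config") from e
except RuntimeError as err:
    cause = err.__cause__  # the original FileNotFoundError
    chained = isinstance(cause, FileNotFoundError)
```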

assert and debug

# Assertions are removed when Python runs with -O (optimize)
assert isinstance(data, dict), f"Expected dict, got {type(data)}"

# __debug__ is True normally, False with -O
if __debug__:
    validate_expensive_invariant(data)

# In production, run with optimization to skip asserts:
$ python -O app.py
# Never use assert for input validation in production code

warnings Module

import warnings

# Issue a deprecation warning
def old_api():
    warnings.warn("old_api() is deprecated, use new_api()", DeprecationWarning, stacklevel=2)
    return new_api()

# Control warning behavior
warnings.filterwarnings("error", category=DeprecationWarning)  # Turn warnings into exceptions
warnings.filterwarnings("ignore", message=".*experimental.*")   # Silence specific warnings

# From command line
$ python -W error::DeprecationWarning script.py    # Warnings become errors
$ python -W ignore script.py                        # Silence all warnings
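
In tests, warnings can be captured locally without touching the global filters, using warnings.catch_warnings — a sketch around the hypothetical old_api above:

```python
import warnings

def old_api():
    # Hypothetical deprecated function, as in the section above
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42

# Record warnings in a list; global filter state is restored on exit
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure nothing is suppressed
    result = old_api()

first = caught[0]  # a WarningMessage with .category, .message, .filename
```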

faulthandler: Debugging Segfaults and Hangs

Debug clue: If a Python process suddenly exits with no traceback, check dmesg for segfault or killed entries. A missing traceback usually means a C extension crashed (segfault), the OOM killer fired, or the process received a signal like SIGKILL.

When Python crashes with a segfault (common with C extensions), the normal traceback is lost. faulthandler dumps the Python stack trace on crash.

# Enable via environment variable (simplest)
$ PYTHONFAULTHANDLER=1 python app.py

# Or enable in code
import faulthandler
faulthandler.enable()

# Dump traceback on signal (debugging hangs)
import faulthandler
import signal
faulthandler.register(signal.SIGUSR1)
# Now: kill -USR1 <pid> prints stack trace to stderr without stopping the process
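
faulthandler can also act as a hang watchdog: dump_traceback_later schedules a stack dump if the process is still running after a timeout. A sketch that deliberately lets the timer fire, writing to a temp file instead of stderr:

```python
import faulthandler
import tempfile
import time

# Schedule a stack dump in 0.2s, then "hang" past the deadline
with tempfile.NamedTemporaryFile(mode="w+") as f:
    faulthandler.dump_traceback_later(0.2, file=f)
    time.sleep(0.5)                           # watchdog fires during this sleep
    faulthandler.cancel_dump_traceback_later()
    f.seek(0)
    dump = f.read()

# The dump starts with "Timeout (...)!" followed by every thread's stack
fired = "Timeout" in dump
```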

Profiling: Finding Where Time Goes

cProfile

# Profile an entire script
$ python -m cProfile -s cumulative script.py

# Top output columns:
# ncalls    — number of calls
# tottime   — time in this function (excluding subcalls)
# cumtime   — cumulative time (including subcalls)
# percall   — per-call time

# Profile a specific function
import cProfile

cProfile.run('process_batch(data)', sort='cumulative')

# Save profile for analysis
$ python -m cProfile -o profile.out script.py
$ python -m pstats profile.out
# In pstats: sort cumulative, stats 20
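
The same ranking can be scripted with the pstats API instead of the interactive viewer — a sketch profiling a hypothetical busy() function:

```python
import cProfile
import io
import pstats

def busy():
    # Hypothetical hot function for the demo
    return sum(i * i for i in range(10_000))

pr = cProfile.Profile()
pr.enable()
busy()
pr.disable()

# Scripted equivalent of 'sort cumulative' / 'stats 20' in pstats
buf = io.StringIO()
stats = pstats.Stats(pr, stream=buf)
stats.sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
```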

line_profiler (Line-by-Line)

$ pip install line_profiler

# Decorate the function you want to profile
@profile
def slow_function():
    data = load_data()        # 0.1s
    processed = transform(data)  # 3.2s  <-- bottleneck found
    save_results(processed)   # 0.3s

$ kernprof -l -v script.py

Memory Profiling with tracemalloc

import tracemalloc

tracemalloc.start()

# ... your code runs ...

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.statistics('lineno')

print("Top 10 memory allocations:")
for stat in top_stats[:10]:
    print(stat)
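
For leak hunting, comparing two snapshots is usually more useful than one — compare_to ranks allocations by net growth between them. A sketch with a hypothetical growing structure:

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.take_snapshot()

# Hypothetical leak: a structure that keeps growing (~1 MB here)
hoard = [bytes(1_000) for _ in range(1_000)]

after = tracemalloc.take_snapshot()
top = after.compare_to(before, "lineno")  # diff ranked by net growth

for stat in top[:3]:
    print(stat)

biggest = top[0]  # the line that allocated the most new memory
```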

objgraph: Finding Memory Leaks

$ pip install objgraph
import objgraph

# What types have the most instances?
objgraph.show_most_common_types(limit=10)

# What's holding a reference to this object? (why isn't it GC'd?)
objgraph.show_backrefs(objgraph.by_type('MyClass')[0], filename='refs.png')

# How many new objects since last check?
objgraph.show_growth(limit=10)

strace on Python Processes

When Python itself seems stuck and the debugger can't help, strace shows what system calls the process is making:

# Attach to a running Python process
$ strace -p <pid> -e trace=network,read,write -f
# -f follows child threads
# -e filters to specific syscall categories

# Common findings:
# - Stuck on read() from a socket → waiting for a network response (DNS? API? DB?)
# - Stuck on futex() → waiting on a lock (GIL contention? threading deadlock?)
# - Stuck on poll() with timeout → event loop waiting (normal for idle async)
# - Repeated open()/stat() on missing file → config or import path issue

# Trace a Python script from the start
$ strace -o trace.log -f python script.py
$ grep -cE 'open(at)?\(' trace.log    # How many file opens? (modern glibc issues openat)

The trace Module

Built-in tracing of Python execution — shows every line as it runs:

# Trace all executed lines
$ python -m trace --trace script.py

# Count line executions (find hot paths)
$ python -m trace --count script.py
# Creates .cover files showing execution counts per line

# List functions called
$ python -m trace --listfuncs script.py

Debugging in Production (Without Interactive Access)

In production, you rarely have interactive debugger access. Your toolkit:

  1. Logging with appropriate levels and structured output
  2. Metrics (Prometheus counters for error rates, latencies, queue depths)
  3. Tracing (OpenTelemetry spans for request flow across services)
  4. faulthandler for segfault stack traces
  5. py-spy for sampling profiler that attaches to running processes without restart
  6. Signal handlers that dump state on SIGUSR1/SIGUSR2
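
Item 6 can be sketched with sys._current_frames(), which the standard library exposes for exactly this kind of state dump (a hypothetical helper, not a full signal handler):

```python
import sys
import traceback

def dump_all_threads():
    # Render every live thread's stack — what a SIGUSR1 handler might log
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append(f"--- thread {thread_id} ---\n")
        lines.extend(traceback.format_stack(frame))
    return "".join(lines)

report = dump_all_threads()
print(report)
```

Wiring this into `signal.signal(signal.SIGUSR1, ...)` follows the same pattern shown for faulthandler earlier.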
# py-spy: attach to a running Python process (no restart needed)
$ pip install py-spy

# Live top-like view of where time is spent
$ py-spy top --pid <pid>

# Record a flame graph
$ py-spy record -o profile.svg --pid <pid> --duration 30

# Dump current stack traces of all threads
$ py-spy dump --pid <pid>

py-spy reads process memory directly — it works on processes you didn't instrument, including those running in Docker containers (use --pid of the container's PID 1 from the host namespace).

Gotcha: py-spy requires SYS_PTRACE capability, which Docker drops by default. Run with docker run --cap-add SYS_PTRACE or set ptrace_scope on the host: sysctl kernel.yama.ptrace_scope=0. In Kubernetes, add the capability to the pod's securityContext.


Wiki Navigation

Prerequisites

  • Perl Flashcards (CLI) (flashcard_deck, L1) — Python Automation
  • Python Async & Concurrency (Topic Pack, L2) — Python Automation
  • Python Drills (Drill, L0) — Python Automation
  • Python Exercises (Quest Ladder) (CLI) (Exercise Set, L0) — Python Automation
  • Python Flashcards (CLI) (flashcard_deck, L1) — Python Automation
  • Python Packaging (Topic Pack, L2) — Python Automation
  • Python for Infrastructure (Topic Pack, L1) — Python Automation
  • Skillcheck: Python Automation (Assessment, L0) — Python Automation
  • Software Development Flashcards (CLI) (flashcard_deck, L1) — Python Automation