Python Interview Cheatsheet

When to use Python over Bash: JSON/YAML parsing | APIs with auth/retries | data structures beyond arrays | error recovery | >100 lines | parallel execution | tests
  • Bash = text-stream glue (pipes, one-liners, package installs) | Python = data-structure processor (logic, APIs, fan-out)
  • 100-line rule: if your Bash script has data structures or error recovery, it's already Python in disguise


Core Language

Types: str int float bool None | Explicit conversion: int("8080") fails loudly on bad input (good)
  • Falsy: 0, 0.0, "", [], {}, set(), None, False — everything else truthy | if items: = "is this non-empty?"
  • is None not == None (identity vs equality)
  • f-strings: f"host:{host} port:{port}" | f"{val:.1f}%" formatting

Data structures:

| Type  | Literal   | Key trait           | Use for                     |
|-------|-----------|---------------------|-----------------------------|
| list  | [1,2,3]   | Mutable, ordered    | Collections, iteration      |
| tuple | (1,2,3)   | Immutable           | Multiple returns, dict keys |
| dict  | {"k":"v"} | Key→value mapping   | Config, structured data     |
| set   | {1,2,3}   | O(1) lookup, unique | Membership tests, dedup     |
  • dict.get(key, default) avoids KeyError | Counter replaces sort|uniq -c | defaultdict(list) auto-initializes
  • Comprehensions: [x for x in items if cond] (list) | {k: v for ...} (dict) | Keep readable or use a loop
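A quick sketch of the bullets above, using a few illustrative log lines (the lines themselves are made up):

```python
from collections import Counter, defaultdict

lines = ["GET /health", "GET /health", "POST /login"]

counts = Counter(lines)                  # replaces `sort | uniq -c`
top = counts.most_common(1)              # [("GET /health", 2)]

by_method = defaultdict(list)            # missing keys auto-initialize to []
for line in lines:
    method, path = line.split()
    by_method[method].append(path)

get_paths = [l.split()[1] for l in lines if l.startswith("GET")]  # list comprehension
status = {"GET": len(by_method.get("GET", []))}                   # dict.get avoids KeyError
```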

Functions: Named args + defaults | Return any type (tuples for multiple values) | Local scope by default
  • Type hints: def check(host: str, port: int = 22) -> bool: — readability + tooling, no runtime cost
  • Dataclasses: @dataclass class Host: name: str; port: int = 22 — named fields, defaults, printable, fewer bugs
  • Mutable default trap: def f(tags=[]) shares one list across ALL calls → use tags: list | None = None
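A minimal sketch of the dataclass pattern and the mutable-default fix (class and function names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    port: int = 22
    tags: list = field(default_factory=list)   # fresh list per instance, never shared

def add_tag(tag, tags=None):     # safe replacement for the def f(tags=[]) trap
    if tags is None:
        tags = []                # new list on every call
    tags.append(tag)
    return tags

a, b = Host("web1"), Host("web2")
a.tags.append("prod")
# b.tags is still [] — no shared mutable default
```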

Control flow: if/elif/else (no brackets, colon + indent) | for x in items: | for i, x in enumerate(items):
  • zip(list_a, list_b) = parallel iteration | range(10) = 0-9 | while cond: | break/continue
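The iteration helpers above in one sketch (hostnames and ports are illustrative):

```python
hosts = ["web1", "web2", "web3"]
ports = [80, 443, 8080]

indexed = []
for i, host in enumerate(hosts):           # index + value together
    indexed.append(f"{i}:{host}")

pairs = list(zip(hosts, ports))            # parallel iteration over two lists
evens = [n for n in range(10) if n % 2 == 0]   # range(10) yields 0-9
```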


Files and Paths

from pathlib import Path

p = Path("/etc/nginx") / "conf.d" / "app.conf"   # / operator builds paths
p.exists()  p.is_file()  p.is_dir()               # checks
p.read_text()  p.write_text(content)               # read/write
p.parent  p.name  p.stem  p.suffix                 # components
list(Path(".").rglob("*.yaml"))                     # recursive glob
  • with open(f) as fh: guarantees close even on exceptions | "w" truncates immediately | "a" appends
  • Streaming: for line in open(bigfile): = constant memory | f.readlines() = loads EVERYTHING (OOM on GB files)
  • Atomic writes: tempfile.mkstemp() → write → Path.rename() (atomic on same filesystem)
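A minimal sketch of the atomic-write recipe above (using os.replace, which atomically overwrites on the same filesystem; the function name is illustrative):

```python
import os
import tempfile
from pathlib import Path

def atomic_write(path: Path, content: str) -> None:
    """Write to a temp file in the target directory, then atomically swap it in."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(content)
        os.replace(tmp, path)      # atomic rename on the same filesystem
    except BaseException:
        os.unlink(tmp)             # don't leave temp files behind on failure
        raise
```

Writing to the target's own directory matters: rename is only atomic when source and destination share a filesystem.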

Error Handling

try:
    result = risky_operation()
except FileNotFoundError:
    handle_missing()              # specific exception
except (ConnectionError, TimeoutError) as e:
    log.warning("failed: %s", e)  # multiple types
else:
    use(result)                   # only if no exception
finally:
    cleanup()                     # always runs
  • Catch specific exceptions — bare except: or except Exception: pass hides real bugs
  • Common: FileNotFoundError PermissionError ValueError KeyError ConnectionError TimeoutError
  • No set -e equivalent — Python's try/except is per-operation, specific, and recoverable (better)

subprocess

import json
import subprocess

result = subprocess.run(
    ["kubectl", "get", "pods", "-o", "json"],   # list args, NOT string
    check=True,           # raise CalledProcessError on non-zero exit
    text=True,            # return strings, not bytes
    capture_output=True,  # capture stdout/stderr
    timeout=30,           # don't hang forever
)
pods = json.loads(result.stdout)
  • Never shell=True with untrusted input — shell injection vulnerability
  • Use subprocess for: systemctl, iptables, docker, git (no Python equivalent)
  • Don't use for: file reading (open), HTTP (requests), JSON (json), string ops (native Python)

HTTP/APIs

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[500,502,503,504])
session.mount("http://", HTTPAdapter(max_retries=retry))
session.mount("https://", HTTPAdapter(max_retries=retry))

resp = session.get("http://api.internal/health", timeout=(5, 30))
resp.raise_for_status()
data = resp.json()
  • Always timeout=(connect, read) — no timeout = hangs forever, cron piles up, zombies accumulate
  • Sessions reuse TCP connections (1 TLS handshake instead of N) | Only auto-retry GET/HEAD (idempotent)

Config Formats

| Format | Module           | Gotcha                                  |
|--------|------------------|-----------------------------------------|
| JSON   | json (stdlib)    | Strict, no comments                     |
| YAML   | yaml.safe_load() | Never yaml.load() — code execution vuln |
| TOML   | tomllib (3.11+)  | Read-only in stdlib                     |
| CSV    | csv (stdlib)     |                                         |
| INI    | configparser     |                                         |

Config precedence pattern: defaults < config file < env vars < CLI flags (matches kubectl/aws/terraform)
  • os.environ.get("KEY", "default") | os.environ["REQUIRED"] raises KeyError if missing
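A sketch of the precedence pattern, where later dict merges win; APP_REGION and the default values are hypothetical:

```python
import os

DEFAULTS = {"region": "us-east-1", "timeout": 30}

def load_config(file_cfg: dict, cli_cfg: dict) -> dict:
    """Merge layers left to right: defaults < config file < env vars < CLI flags."""
    env_cfg = {}
    region = os.environ.get("APP_REGION")   # hypothetical env var name
    if region is not None:
        env_cfg["region"] = region
    return {**DEFAULTS, **file_cfg, **env_cfg, **cli_cfg}
```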


AWS, SSH, Concurrency

boto3: boto3.client('ec2', region_name='us-east-1') | Creds: explicit → env → profile → instance role (never hardcode)
  • Always paginate: paginator = client.get_paginator('describe_instances') — without it you silently miss results
  • Error handling: except ClientError as e: → check e.response['Error']['Code']
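A hedged sketch of the pagination pattern. The client is injected so the loop stays testable; the boto3 call names (get_paginator, describe_instances) are real, everything else is illustrative:

```python
def running_instance_ids(ec2_client) -> list:
    """Collect instance IDs across EVERY page of describe_instances."""
    ids = []
    paginator = ec2_client.get_paginator("describe_instances")
    for page in paginator.paginate():           # iterates all pages, not just the first
        for reservation in page["Reservations"]:
            for inst in reservation["Instances"]:
                ids.append(inst["InstanceId"])
    return ids

if __name__ == "__main__":
    import boto3
    from botocore.exceptions import ClientError
    client = boto3.client("ec2", region_name="us-east-1")
    try:
        print(running_instance_ids(client))
    except ClientError as e:
        print("AWS error:", e.response["Error"]["Code"])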

paramiko (SSH): Always try/finally: client.close() — unclosed connections leak file descriptors
  • Production: use RejectPolicy() not AutoAddPolicy() | Set timeout=10 on connect
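A sketch of the try/finally pattern with the client injected so the close logic is testable. The paramiko names (SSHClient, RejectPolicy, exec_command) are real; the host and username are hypothetical:

```python
def run_remote(ssh_client, command: str) -> str:
    """Run one command over an open SSH client, always closing the connection."""
    try:
        _stdin, stdout, _stderr = ssh_client.exec_command(command, timeout=10)
        return stdout.read().decode()
    finally:
        ssh_client.close()        # runs even on exception: no leaked file descriptors

if __name__ == "__main__":
    import paramiko
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.RejectPolicy())  # refuse unknown hosts
    client.connect("host.internal", username="deploy", timeout=10)  # hypothetical host
    print(run_remote(client, "uptime"))
```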

Concurrency: ThreadPoolExecutor(max_workers=20) for I/O-bound fan-out (HTTP, SSH, files)
  • GIL released during I/O — threading works great for infra | ProcessPoolExecutor for CPU-bound (rare)
  • as_completed(futures) processes results as they arrive | Cap max_workers — don't DDoS your own infra
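The fan-out pattern in a runnable sketch; check() is a stand-in for a real probe (HTTP GET, SSH command), and the hostnames are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check(host: str) -> tuple:
    # stand-in for a real I/O-bound probe
    return host, "ok"

hosts = [f"web{i}" for i in range(5)]
results = {}
with ThreadPoolExecutor(max_workers=20) as pool:      # cap workers: don't DDoS yourself
    futures = {pool.submit(check, h): h for h in hosts}
    for fut in as_completed(futures):                 # handle results as they finish
        host, status = fut.result()
        results[host] = status
```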


Packaging and Testing

Virtual environments: python3 -m venv .venv → source .venv/bin/activate → pip install | Never pollute system Python
  • Outside venv: python3 | Inside venv: python is fine (PEP 394)
  • pip freeze > requirements.txt | pip-compile for reproducible deps | pyproject.toml for packages

Testing: pytest | tmp_path for filesystem | monkeypatch for env vars | Mock at boundaries (HTTP, subprocess)
  • Dry-run / --check mode in CLI tools = interview gold
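A pytest-style sketch of the tmp_path fixture (a per-test temporary directory pytest injects automatically); the function under test and the config file are illustrative:

```python
import json
from pathlib import Path

def load_port(path: Path) -> int:
    """Function under test: read a port from a JSON config file."""
    return json.loads(path.read_text())["port"]

def test_load_port(tmp_path):                 # tmp_path: built-in pytest fixture
    cfg = tmp_path / "cfg.json"
    cfg.write_text('{"port": 8080}')
    assert load_port(cfg) == 8080
```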


Footguns

| Footgun                    | Consequence                        | Fix                          |
|----------------------------|------------------------------------|------------------------------|
| No timeout on requests     | Script hangs forever, cron piles up| timeout=(5, 30) always       |
| shell=True + user input    | Shell injection                    | Pass args as list            |
| yaml.load()                | Arbitrary code execution           | yaml.safe_load()             |
| No AWS pagination          | Silently miss 80% of results       | Always use paginators        |
| except Exception: pass     | Hides real bugs                    | Catch specific exceptions    |
| def f(tags=[])             | Shared mutable default             | Use None + create fresh      |
| f.readlines() on big files | OOM (3x file size in RAM)          | for line in f: (streaming)   |
| Logging secrets            | Credentials in CI logs             | Redact before logging        |
| os.system()                | No capture, no safety              | subprocess.run()             |

Stdlib worth naming

pathlib subprocess json csv tomllib configparser logging argparse collections concurrent.futures re hashlib tempfile shutil socket datetime

Third-party: requests boto3 paramiko jinja2 pyyaml click pytest


30-second answer

"I use Bash for thin glue and Python when I need data structures, APIs, retries, testing, or maintainable logic. My defaults are pathlib for files, requests with timeouts and retries, subprocess without shell=True, structured logging, atomic writes for config, and ThreadPoolExecutor for I/O-bound fan-out. I always paginate AWS calls and always set HTTP timeouts. I frame Python as operator-grade automation, not just syntax knowledge."