# Python Interview Cheatsheet
When to use Python over Bash: JSON/YAML parsing | APIs with auth/retries | data structures beyond arrays | error recovery | >100 lines | parallel execution | tests
- Bash = text-stream glue (pipes, one-liners, package installs) | Python = data-structure processor (logic, APIs, fan-out)
- 100-line rule: if your Bash script has data structures or error recovery, it's already Python in disguise
## Core Language
Types: `str` `int` `float` `bool` `None` | Explicit conversion: `int("8080")` fails loudly on bad input (good)
- Falsy: 0, 0.0, "", [], {}, set(), None, False — everything else truthy | if items: = "is this non-empty?"
- is None not == None (identity vs equality) | f-strings: f"host:{host} port:{port}" | f"{val:.1f}%" formatting
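The truthiness, identity, and f-string points above as a runnable sketch (values are illustrative):

```python
host, port, val = "web1", 8080, 99.456

banner = f"host:{host} port:{port}"   # f-string interpolation
pct = f"{val:.1f}%"                   # format spec: one decimal place

empty_ok = bool([]) or bool({}) or bool("")  # all empty containers are falsy
missing = None
is_unset = missing is None            # identity check, not == None
```

`is None` is the idiomatic test because `==` can be overridden by a class, while `is` always compares identity.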
Data structures:
| Type | Literal | Key trait | Use for |
|---|---|---|---|
| list | `[1,2,3]` | Mutable, ordered | Collections, iteration |
| tuple | `(1,2,3)` | Immutable | Multiple returns, dict keys |
| dict | `{"k":"v"}` | Key→value mapping | Config, structured data |
| set | `{1,2,3}` | O(1) lookup, unique | Membership tests, dedup |
- `dict.get(key, default)` avoids KeyError | `Counter` replaces `sort | uniq -c` | `defaultdict(list)` auto-initializes
- Comprehensions: `[x for x in items if cond]` (list) | `{k: v for ...}` (dict) | Keep readable or use a loop
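A quick sketch of how these replace shell idioms (the log lines are hypothetical):

```python
from collections import Counter, defaultdict

lines = ["GET /health", "GET /health", "POST /deploy"]

# Counter replaces `sort | uniq -c`
counts = Counter(lines)

# defaultdict(list) auto-initializes missing keys -- no "if key not in d" dance
by_method = defaultdict(list)
for line in lines:
    method, path = line.split()
    by_method[method].append(path)

# Comprehension: filter + transform in one readable line
posts = [l for l in lines if l.startswith("POST")]
```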
Functions: Named args + defaults | Return any type (tuples for multiple values) | Local scope by default
- Type hints: def check(host: str, port: int = 22) -> bool: — readability + tooling, no runtime cost
- Dataclasses: `@dataclass` on `class Host: name: str; port: int = 22` — named fields, defaults, printable, fewer bugs
- Mutable default trap: def f(tags=[]) shares list across ALL calls → use tags: list | None = None
Control flow: if/elif/else (no brackets, colon + indent) | for x in items: | for i, x in enumerate(items):
- zip(list_a, list_b) = parallel iteration | range(10) = 0-9 | while cond: | break/continue
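The dataclass and mutable-default points above, as a minimal sketch:

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    name: str
    port: int = 22
    tags: list[str] = field(default_factory=list)  # fresh list per instance

h1 = Host("web1")
h2 = Host("db1", port=5432)
h1.tags.append("prod")      # h2.tags stays empty -- no shared state

# The trap, for contrast: the default list is created ONCE, at def time
def f(tags=[]):
    tags.append("x")
    return tags

f(); f()                    # the same list has now accumulated two "x" entries
```

`field(default_factory=list)` is the dataclass equivalent of the `tags: list | None = None` fix for plain functions.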
## Files and Paths
```python
from pathlib import Path

p = Path("/etc/nginx") / "conf.d" / "app.conf"  # / operator builds paths
p.exists(); p.is_file(); p.is_dir()             # checks
p.read_text(); p.write_text(content)            # read/write
p.parent; p.name; p.stem; p.suffix              # components
list(Path(".").rglob("*.yaml"))                 # recursive glob
```
- `with open(f) as fh:` guarantees close even on exceptions | `"w"` truncates immediately | `"a"` appends
- Streaming: `for line in open(bigfile):` = constant memory | `f.readlines()` = loads EVERYTHING (OOM on GB files)
- Atomic writes: `tempfile.mkstemp()` → write → `Path.rename()` (atomic on same filesystem)
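The atomic-write pattern above, sketched with stdlib only (the helper name is mine; `os.replace` is used instead of `Path.rename` because it overwrites an existing target on every platform):

```python
import os
import tempfile
from pathlib import Path

def atomic_write(path: Path, content: str) -> None:
    # Temp file must live in the SAME directory: rename is only atomic
    # within one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent)
    try:
        with os.fdopen(fd, "w") as fh:
            fh.write(content)
        os.replace(tmp, path)  # readers see old or new content, never half
    except BaseException:
        os.unlink(tmp)         # don't leave temp files behind on failure
        raise
```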
## Error Handling
```python
try:
    result = risky_operation()
except FileNotFoundError:
    handle_missing()                 # specific exception
except (ConnectionError, TimeoutError) as e:
    log.warning("failed: %s", e)     # multiple types
else:
    use(result)                      # only if no exception
finally:
    cleanup()                        # always runs
```
- Catch specific exceptions — bare `except:` or `except Exception: pass` hides real bugs
- Common: `FileNotFoundError` `PermissionError` `ValueError` `KeyError` `ConnectionError` `TimeoutError`
- No `set -e` equivalent — Python's `try/except` is per-operation, specific, and recoverable (better)
## subprocess
```python
import json
import subprocess

result = subprocess.run(
    ["kubectl", "get", "pods", "-o", "json"],  # list args, NOT a string
    check=True,           # raise CalledProcessError on non-zero exit
    text=True,            # return strings, not bytes
    capture_output=True,  # capture stdout/stderr
    timeout=30,           # don't hang forever
)
pods = json.loads(result.stdout)
```
- Never `shell=True` with untrusted input — shell injection vulnerability
- Use subprocess for: `systemctl`, `iptables`, `docker`, `git` (no Python equivalent)
- Don't use for: file reading (`open`), HTTP (`requests`), JSON (`json`), string ops (native Python)
## HTTP/APIs
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

session = requests.Session()
retry = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
session.mount("http://", HTTPAdapter(max_retries=retry))
session.mount("https://", HTTPAdapter(max_retries=retry))
resp = session.get("http://api.internal/health", timeout=(5, 30))
resp.raise_for_status()
data = resp.json()
```
- Always `timeout=(connect, read)` — no timeout = hangs forever, cron piles up, zombies accumulate
- Sessions reuse TCP connections (1 TLS handshake instead of N) | Only auto-retry GET/HEAD (idempotent)
## Config Formats
| Format | Module | Gotcha |
|---|---|---|
| JSON | `json` (stdlib) | Strict, no comments |
| YAML | `yaml.safe_load()` | Never `yaml.load()` — code execution vuln |
| TOML | `tomllib` (3.11+) | Read-only in stdlib |
| CSV | `csv` (stdlib) | — |
| INI | `configparser` | — |
Config precedence pattern: defaults < config file < env vars < CLI flags (matches kubectl/aws/terraform)
- os.environ.get("KEY", "default") | os.environ["REQUIRED"] raises KeyError if missing
## AWS, SSH, Concurrency
boto3: boto3.client('ec2', region_name='us-east-1') | Creds: explicit → env → profile → instance role (never hardcode)
- Always paginate: paginator = client.get_paginator('describe_instances') — without it you silently miss results
- Error handling: except ClientError as e: → check e.response['Error']['Code']
paramiko (SSH): Always try/finally: client.close() — unclosed connections leak file descriptors
- Production: use RejectPolicy() not AutoAddPolicy() | Set timeout=10 on connect
Concurrency: ThreadPoolExecutor(max_workers=20) for I/O-bound fan-out (HTTP, SSH, files)
- GIL released during I/O — threading works great for infra | ProcessPoolExecutor for CPU-bound (rare)
- as_completed(futures) processes results as they arrive | Cap max_workers — don't DDoS your own infra
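The fan-out pattern above, sketched with a stand-in check function (a real one would do an HTTP or SSH probe; host names are made up):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def check_host(host: str) -> tuple[str, bool]:
    # Stand-in for an I/O-bound probe; the GIL is released while waiting on I/O
    return host, not host.startswith("bad")

hosts = ["web1", "web2", "bad3", "db1"]
results = {}
with ThreadPoolExecutor(max_workers=20) as pool:  # cap workers
    futures = {pool.submit(check_host, h): h for h in hosts}
    for fut in as_completed(futures):             # handle results as they arrive
        host, ok = fut.result()
        results[host] = ok
```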
## Packaging and Testing
Virtual environments: python3 -m venv .venv → source .venv/bin/activate → pip install | Never pollute system Python
- Outside venv: python3 | Inside venv: python is fine (PEP 394)
- pip freeze > requirements.txt | pip-compile for reproducible deps | pyproject.toml for packages
Testing: pytest | tmp_path for filesystem | monkeypatch for env vars | Mock at boundaries (HTTP, subprocess)
- Dry-run / --check mode in CLI tools = interview gold
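A sketch of those fixtures in use — `read_port` is a hypothetical function under test; pytest injects `tmp_path` and `monkeypatch` by argument name, no imports needed:

```python
import os
from pathlib import Path

def read_port(path: Path, default: int = 8080) -> int:
    # Function under test: read a port number from a one-line config file
    if not path.exists():
        return default
    return int(path.read_text().strip())

def test_reads_port(tmp_path):              # tmp_path: fresh temp dir per test
    cfg = tmp_path / "port.conf"
    cfg.write_text("9090")
    assert read_port(cfg) == 9090

def test_missing_file_uses_default(tmp_path):
    assert read_port(tmp_path / "absent.conf") == 8080

def test_env_override(monkeypatch):         # monkeypatch reverts after the test
    monkeypatch.setenv("APP_ENV", "test")
    assert os.environ["APP_ENV"] == "test"
```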
## Footguns
| Footgun | Consequence | Fix |
|---|---|---|
| No timeout on requests | Script hangs forever, cron piles up | `timeout=(5, 30)` always |
| `shell=True` + user input | Shell injection | Pass args as list |
| `yaml.load()` | Arbitrary code execution | `yaml.safe_load()` |
| No AWS pagination | Silently miss 80% of results | Always use paginators |
| `except Exception: pass` | Hides real bugs | Catch specific exceptions |
| `def f(tags=[])` | Shared mutable default | Use `None` + create fresh |
| `f.readlines()` on big files | OOM (3x file size in RAM) | `for line in f:` (streaming) |
| Logging secrets | Credentials in CI logs | no_log/redaction |
| `os.system()` | No capture, no safety | `subprocess.run()` |
## Stdlib worth naming
`pathlib` `subprocess` `json` `csv` `tomllib` `configparser` `logging` `argparse` `collections` `concurrent.futures` `re` `hashlib` `tempfile` `shutil` `socket` `datetime`
Third-party: `requests` `boto3` `paramiko` `jinja2` `pyyaml` `click` `pytest`
## 30-second answer
"I use Bash for thin glue and Python when I need data structures, APIs, retries, testing, or maintainable logic. My defaults are pathlib for files, requests with timeouts and retries, subprocess without shell=True, structured logging, atomic writes for config, and ThreadPoolExecutor for I/O-bound fan-out. I always paginate AWS calls and always set HTTP timeouts. I frame Python as operator-grade automation, not just syntax knowledge."