Career Engineering for Ops People - Street-Level Ops
What experienced ops engineers know about navigating the job market that career advice blogs miss entirely.
Quick Diagnosis Commands
# Audit your own public presence
# Google yourself — what does a hiring manager see?
# Check your GitHub contribution graph
# Check your LinkedIn headline (this is your billboard)
# Tools for job market research
# levels.fyi — salary data by company, role, level
# LinkedIn Jobs — filter: Remote, DevOps/SRE/Platform, last 24 hours
# Indeed — broader market, more traditional companies
# Glassdoor — interview questions + salary ranges
# Blind — anonymous tech worker forums
# Resume quality check
# Run your resume through an ATS simulator:
# jobscan.co — compare resume against job description
# Check PDF renders correctly (no broken formatting)
# Confirm links are clickable (GitHub, LinkedIn, portfolio)
Gotcha: The ATS Black Hole
You applied to 50 jobs online. Zero callbacks. Your resume is getting filtered by Applicant Tracking Systems before a human ever sees it. ATS software looks for keyword matches, and ops resumes written in natural language often fail.
Fix: Mirror the exact terminology from the job description.
Remember: An ATS is a keyword matcher, not a context reader. Mnemonic: Match the JD, not your ego. If the job description says "Terraform," write "Terraform," not "IaC tooling," "declarative infrastructure," or "infrastructure as code tooling." If it says "Kubernetes," don't write "container orchestration platform." Use both the acronym and the full name: "Amazon Web Services (AWS)." Keep formatting simple: ATS parsers often choke on tables, columns, headers in sidebars, and fancy templates.
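A naive keyword matcher of the kind described above can be sketched in a few lines of Python. This is only an illustration of why paraphrased terms get missed: the function name `keyword_coverage`, the term list, and the resume text are all hypothetical, and real ATS products may match differently.

```python
import re


def keyword_coverage(job_terms, resume_text):
    """Return (found, missing) lists of job-description terms in the resume.

    Matching is case-insensitive and on whole terms only. This mimics a
    naive keyword matcher; it does not model any specific ATS product.
    """
    found, missing = [], []
    for term in job_terms:
        # Whole-term match so that "Go" does not match "Google".
        pattern = r"(?<![A-Za-z0-9])" + re.escape(term) + r"(?![A-Za-z0-9])"
        if re.search(pattern, resume_text, flags=re.IGNORECASE):
            found.append(term)
        else:
            missing.append(term)
    return found, missing


# Hypothetical job-description terms and resume text, for illustration only.
JOB_TERMS = ["Terraform", "Kubernetes", "AWS", "CI/CD"]
RESUME = (
    "Built declarative infrastructure with IaC tooling. "
    "Ran a container orchestration platform on Amazon Web Services (AWS)."
)

found, missing = keyword_coverage(JOB_TERMS, RESUME)
print("found:", found)
print("missing:", missing)
```

With this sample text only "AWS" is found. The resume describes Terraform and Kubernetes work, but never uses the words, so a literal matcher reports them as missing.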
Gotcha: "DevOps Engineer" Means Something Different Everywhere
You accept a "Senior DevOps Engineer" role expecting to build platforms. You arrive and find out you're writing Jenkins pipelines and babysitting a legacy Java app. The title matched but the work didn't.
Fix: During interviews, ask specific questions:
"What does a typical week look like for this role?"
"What's the ratio of project work to interrupt-driven work?"
"What's the on-call rotation? How many incidents per month?"
"What's your deployment frequency? How automated is it?"
"What infrastructure do you run? Cloud, on-prem, hybrid?"
"Who owns the Kubernetes clusters? Who owns the CI/CD pipeline?"
"What does the team's tech stack look like today vs. where
you want it in a year?"
Gotcha: Overvaluing Certifications, Undervaluing Portfolio
You spent six months getting three certifications but have no public work to show. Meanwhile, someone with zero certs but a detailed homelab blog and active GitHub contributions gets the offer. Certs are table stakes; they prove you can pass a test. A portfolio proves you can do the work.
Fix: Balance both. Get the CKA or RHCE — they open doors. But also build a public homelab repo, write a blog post about a real problem you solved, or contribute a fix to an open-source project. The combination is stronger than either alone.
Gotcha: Bombing the "Tell Me About an Outage" Question
You describe the outage but forget to mention what you specifically did. Or you describe what the team did without clarifying your individual contribution. Or you describe fixing it but not preventing recurrence.
Fix: Use this structure:
1. Context: "We ran 200 production services on Kubernetes..."
2. The incident: "At 2am, our monitoring showed a 40% spike
in 5xx errors across all services..."
3. YOUR role: "I was the primary on-call. I identified the
root cause as a DNS resolution failure caused by CoreDNS
pod evictions during a node scaling event..."
4. Resolution: "I manually scaled CoreDNS, added PodDisruptionBudgets,
and implemented DNS caching at the application layer..."
5. Prevention: "I wrote a postmortem, added alerting for CoreDNS
pod count, and built a chaos test that simulates DNS failures
during scaling events..."
6. Result: "Zero DNS-related incidents in the following 6 months"
Gotcha: Salary Anchoring Trap
The recruiter asks, "What are you currently making?" or "What are your salary expectations?" early in the process. You name a number. Now you're anchored: the offer will likely land 10-15% above your stated number, even if the market rate for the role is 40% higher.
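To put rough numbers on that anchoring effect, here is a small arithmetic sketch. The stated figure of 100,000 is a hypothetical placeholder; the 10-15% bump and the 40% market premium are the percentages from the paragraph above, and the function name `anchored_offer_gap` is illustrative.

```python
def anchored_offer_gap(stated, market_premium, bump):
    """Compare an offer anchored to your stated number with the market rate.

    stated:         the figure you gave the recruiter
    market_premium: how far the market rate sits above that figure (0.40 = 40%)
    bump:           how far above your figure the offer lands (0.15 = 15%)
    """
    offer = stated * (1 + bump)
    market = stated * (1 + market_premium)
    return offer, market, market - offer


# Hypothetical stated figure; percentages taken from the text above.
for bump in (0.10, 0.15):
    offer, market, gap = anchored_offer_gap(100_000, 0.40, bump)
    print(
        f"bump {bump:.0%}: offer {offer:,.0f}, "
        f"market {market:,.0f}, left on the table {gap:,.0f}"
    )
```

Under these assumptions, naming 100,000 leaves roughly 25,000 to 30,000 per year between the anchored offer and the market rate.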
Fix: Deflect early salary questions:
"I'd prefer to learn more about the role and team first.
I'm sure we can find a number that works for both sides."
"I'm looking at opportunities in the $X-$Y range, but I'm
flexible depending on total compensation and growth opportunities."
"What's the budgeted range for this role?"
Pattern: The 30-60-90 Day Plan (Interview Differentiator)
When interviewing for senior+ roles, prepare a 30-60-90 day plan based on what you've learned about the company during the interview process.
First 30 days: Listen and learn
├── Map the current infrastructure and deployment pipeline
├── Shadow on-call rotation, read recent postmortems
├── Identify the team's biggest pain points
├── Set up local dev environment, make a small contribution
└── Build relationships with adjacent teams (dev, security, product)
Days 30-60: Quick wins
├── Fix one well-understood pain point (the "trust builder")
├── Improve documentation for the area you've been learning
├── Propose improvements based on what you've observed
├── Start contributing to on-call rotation
└── Share knowledge from your previous experience (brown bags, docs)
Days 60-90: Strategic impact
├── Own a meaningful project from planning to delivery
├── Introduce a practice the team is missing (SLOs, chaos testing, etc.)
├── Mentor a junior team member
├── Present findings and proposals to leadership
└── Set goals for the next quarter with your manager
Presenting this in the final interview round signals senior-level thinking.
Gotcha: Do not present a 30-60-90 plan that prescribes specific technical solutions before you have joined. It signals arrogance, not preparation. Focus on learning activities, relationship-building, and discovery. The specific fixes should emerge from what you learn, not from what you assumed.
Pattern: Resume Tailoring Workflow
Don't mass-apply with one resume. Build a system:
1. Read the job description carefully
2. Highlight 5-7 key requirements
3. For each requirement, find your strongest matching experience
4. Rewrite your bullet points to mirror their language
5. Adjust your summary to align with their stack
Time per application: 20-30 minutes
Hit rate: 5-10x higher than generic applications
One-liner: Five tailored applications beat fifty generic ones. Every time.
Example tailoring:
JD says: "Experience with Kubernetes in production environments"
Before: "Worked with container orchestration tools"
After: "Operated 15-node Kubernetes clusters serving 200 RPS
with automated scaling, rolling deployments, and
PodDisruptionBudgets for zero-downtime upgrades"
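Step 3 of the workflow (finding your strongest matching experience for each requirement) can be roughed out as a word-overlap check. This is a minimal sketch, not a real matching tool: the helper names `tokens` and `best_bullet` and the stopword list are assumptions, and the sample strings are the JD line and the two bullets from the example above.

```python
import re

# Small illustrative stopword list; extend it for real use.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokens(text):
    """Lowercase alphanumeric word tokens, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def best_bullet(requirement, bullets):
    """Return (bullet, shared_terms) for the bullet sharing the most tokens.

    On a tie, the earliest bullet in the list wins.
    """
    req = tokens(requirement)
    scored = [(bullet, req & tokens(bullet)) for bullet in bullets]
    return max(scored, key=lambda pair: len(pair[1]))


REQUIREMENT = "Experience with Kubernetes in production environments"
BULLETS = [
    "Worked with container orchestration tools",
    "Operated 15-node Kubernetes clusters serving 200 RPS with automated "
    "scaling, rolling deployments, and PodDisruptionBudgets for "
    "zero-downtime upgrades",
]

bullet, shared = best_bullet(REQUIREMENT, BULLETS)
print("best match:", bullet)
print("shared terms:", sorted(shared))
```

The "Before" bullet shares no terms with the requirement, while the "After" bullet shares "kubernetes". Word overlap is a crude signal, so treat the result as a prompt for rewriting, not a score.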
Pattern: Networking (Human, Not TCP/IP)
Low-effort, high-return networking strategies:
├── Join Slack/Discord communities
│ ├── Kubernetes Slack (#kubernetes-users, #sig-cluster-lifecycle)
│ ├── DevOps Chat (devopschat.co)
│ ├── Rands Leadership Slack (for management-track)
│ └── Local tech community Slack/Discord
│
├── Attend meetups (virtual counts)
│ ├── Local Kubernetes/DevOps/Linux meetups
│ ├── CNCF community events
│ └── Offer to give a talk (20-minute lightning talks are easy entry)
│
├── LinkedIn presence
│ ├── Post about problems you solved (short, technical)
│ ├── Comment on others' posts with substantive additions
│ ├── Connect with people you meet at events
│ └── Recruiters are active here — make your profile findable
│
└── Open source
    ├── File issues with good reproduction steps
    ├── Fix documentation (lowest barrier to entry)
    ├── Submit small bug fixes
    └── Even one PR to a known project is a talking point
Pattern: Translating Military Experience
Military and government ops experience is extremely valuable but poorly understood by civilian hiring managers.
Military Term              → Civilian Translation
──────────────────────────────────────────────────
NIPR/SIPR                  → Segmented network environments with strict access controls
IA (Information Assurance) → Security compliance and auditing
STIG compliance            → CIS Benchmark / security hardening (mention STIG too; DoD contractors know it)
ACAS/Nessus scanning       → Vulnerability scanning and remediation
PKI/CAC                    → Certificate-based authentication
DISA                       → Security compliance authority
COMSEC                     → Encryption key management
Watch floor / NOC          → Network Operations Center
Trouble ticket / Remedy    → Incident management / ITSM
PCS / deployment           → Relocations / temporary assignments
NCO / team lead            → Technical lead, first-line supervisor
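The table above can double as a lookup for reviewing resume bullets. The sketch below flags military terms and suggests the civilian wording without rewriting anything, since some terms (STIG, for example) are worth keeping alongside the translation. The function name `flag_jargon` and the sample bullet are hypothetical; the three dictionary rows are copied from the table.

```python
import re

# A few rows from the table above; civilian wording is copied from it.
TRANSLATIONS = {
    "STIG compliance": "CIS Benchmark / security hardening",
    "COMSEC": "Encryption key management",
    "ACAS/Nessus scanning": "Vulnerability scanning and remediation",
}


def flag_jargon(bullet, translations=TRANSLATIONS):
    """Return (term, suggestion) pairs for military terms found in a bullet.

    Matching is case-insensitive; nothing is replaced automatically.
    """
    hits = []
    for term, civilian in translations.items():
        if re.search(re.escape(term), bullet, flags=re.IGNORECASE):
            hits.append((term, civilian))
    return hits


# Hypothetical resume bullet, for illustration only.
for term, suggestion in flag_jargon(
    "Maintained STIG compliance and COMSEC for 40 workstations"
):
    print(f"{term} -> consider adding: {suggestion}")
```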
Emergency: You Got Laid Off
First 48 hours:
1. File for unemployment (don't wait)
2. Secure your references (email 3-5 former colleagues NOW,
before the news spreads and people scatter)
3. Update LinkedIn ("Open to Work" banner — yes, use it)
4. Download your work from company systems before access is cut
(your own notes, sanitized architecture diagrams, nothing proprietary)
5. Check COBRA/insurance continuation options
First 2 weeks:
1. Update resume (use this time to really polish it)
2. Rebuild your homelab to practice current tech
3. Start the CKA or other cert you've been putting off
4. Reach out to your network: "I'm looking — here's what I do"
5. Apply to 5-10 well-matched roles (quality over quantity)
6. Set a daily routine: 2 hours job search, 2 hours skill building
Mental health:
- A layoff is not a performance judgment (especially in tech)
- Take a few days before job hunting if you can afford to
- Stay connected with former colleagues (they're your network)
- Exercise, sleep, and routine matter more than you think
Emergency: Counter-Offer Situation
You have an external offer. Your current employer counter-offers with a raise and title bump.
Before accepting the counter-offer, consider:
1. Why did you start looking in the first place?
   - Money alone? Counter-offer might fix it
   - Growth, culture, management? Counter-offer won't fix it
2. A commonly cited figure: 50%+ of people who accept counter-offers
leave within 12 months anyway
3. You've now signaled you're a flight risk — future promotions
and layoff decisions may be affected
4. The raise they're offering was available before — they just
didn't give it to you until you had leverage
When to accept: you genuinely love the team and work, and
money was the only issue.
When to decline: you were looking because of management,
growth ceiling, or culture. Money is a bandage, not a fix.