Thinking Out Loud: Kubernetes RBAC
A senior SRE's internal monologue while working through a real RBAC task. This isn't a tutorial — it's a window into how experienced engineers actually think.
The Situation
A developer from the data team opens an urgent ticket: "I can't deploy my CronJob to the analytics namespace. I get forbidden errors. I have admin access in staging — why not in production?" I'm the platform engineer who manages cluster access.
The Monologue
Let me start by understanding exactly what error they're getting. "Forbidden" in Kubernetes is almost always RBAC. Let me verify.
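A check like this is typically done by impersonating the user with `kubectl auth can-i`. The sketch below wraps it in a small function so the identity is explicit; the helper name `can_i_as` is made up for illustration, and the user and group names are the ones that appear in this walkthrough. One assumption worth stating: `--as` on its own impersonates the user with no group memberships, so `--as-group` is passed to pick up permissions that arrive through a group binding.

```shell
# can_i_as USER GROUP NAMESPACE VERB RESOURCE
# Hypothetical helper: asks the API server whether USER (as a member of GROUP)
# may perform VERB on RESOURCE in NAMESPACE. kubectl prints "yes" or "no".
can_i_as() {
  kubectl auth can-i "$4" "$5" \
    --namespace "$3" \
    --as "$1" \
    --as-group "$2"
}

# Example call (the caller needs impersonation rights on the cluster):
# can_i_as sarah@company.com data-team analytics create cronjobs.batch
```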
The answer is `no`. Okay, so Sarah doesn't have permission to create CronJobs in the analytics namespace. Let me see what permissions she DOES have.
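Listing everything an identity is allowed to do in a namespace can be sketched with `kubectl auth can-i --list`. The helper name `list_access_as` is hypothetical, and the group is again passed explicitly on the assumption that her access comes from a group binding.

```shell
# list_access_as USER GROUP NAMESPACE
# Hypothetical helper: prints the resources and verbs the impersonated
# identity is allowed in NAMESPACE.
list_access_as() {
  kubectl auth can-i --list \
    --namespace "$3" \
    --as "$1" \
    --as-group "$2"
}

# Example call:
# list_access_as sarah@company.com data-team analytics
```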
She can get, list, and watch pods, services, and configmaps. She can get logs. But she can't create deployments, cronjobs, or any workload resources. This looks like a viewer role.
Let me find the actual RoleBinding that gives her access.
```shell
kubectl get rolebindings -n analytics -o custom-columns='NAME:.metadata.name,ROLE:.roleRef.name,SUBJECTS:.subjects[*].name'
```
There's an analytics-viewers binding that maps the namespace-viewer ClusterRole to the data-team group. Sarah is in the data-team group. Let me check what the namespace-viewer role actually allows.
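One way to read a ClusterRole's rules at a glance is a jsonpath query that prints one rule per line. This is a sketch, not necessarily the command used here; `kubectl describe clusterrole namespace-viewer` would show the same information in a different layout. The helper name `show_clusterrole_rules` is hypothetical.

```shell
# show_clusterrole_rules NAME
# Hypothetical helper: prints one line per rule in the ClusterRole,
# as tab-separated apiGroups, resources, and verbs.
show_clusterrole_rules() {
  kubectl get clusterrole "$1" \
    -o jsonpath='{range .rules[*]}{.apiGroups}{"\t"}{.resources}{"\t"}{.verbs}{"\n"}{end}'
}

# Example call:
# show_clusterrole_rules namespace-viewer
```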
Read-only access to core resources. That's correct for production — we intentionally give read-only access by default. The issue is that Sarah needs more permissions for the analytics namespace specifically.
Mental Model: RBAC Layer Cake
Kubernetes RBAC stacks in layers: Roles and ClusterRoles define permissions, RoleBindings grant them within a namespace, and ClusterRoleBindings grant them cluster-wide. RBAC is purely additive (there are no deny rules), so a user's effective permissions are the UNION of all their bindings. Never edit a shared ClusterRole for a namespace-specific need; create a namespace-scoped Role instead.
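To make the layering concrete, here is a sketch of what a binding like analytics-viewers plausibly looks like: a RoleBinding that references a ClusterRole, which grants that role's permissions inside one namespace only. The actual manifest in the cluster was not shown, so treat the field values as assumptions reconstructed from the names above; the helper name `render_viewer_binding` is hypothetical.

```shell
# render_viewer_binding NAMESPACE
# Hypothetical helper: prints a RoleBinding that grants the namespace-viewer
# ClusterRole to the data-team group, scoped to NAMESPACE only.
render_viewer_binding() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ${1}-viewers
  namespace: ${1}
subjects:
  - kind: Group
    name: data-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: namespace-viewer
  apiGroup: rbac.authorization.k8s.io
EOF
}

# Example: validate the rendered manifest locally without applying it.
# render_viewer_binding analytics | kubectl apply --dry-run=client -f -
```

The same ClusterRole can be reused by one RoleBinding per namespace, which is why changing the ClusterRole itself affects every namespace that binds it.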
Now, the question is: what permissions does she actually need? She said CronJob, but let me think about what a CronJob actually requires...
A CronJob creates Jobs, which create Pods. The CronJob also needs a ServiceAccount to run as. If I give her permission to create CronJobs but she uses the default ServiceAccount, the CronJob's pods will run with whatever permissions the default SA has. That could be a security issue if the default SA has been given broad permissions.
Let me check what the default SA in analytics can do.
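ServiceAccounts can be impersonated too, using the username form `system:serviceaccount:<namespace>:<name>`. The sketch below builds that username from its parts; the helper name `sa_can_i` is hypothetical.

```shell
# sa_can_i NAMESPACE SERVICEACCOUNT VERB RESOURCE
# Hypothetical helper: checks whether a ServiceAccount may perform VERB on
# RESOURCE in its own namespace.
sa_can_i() {
  kubectl auth can-i "$3" "$4" \
    --namespace "$1" \
    --as "system:serviceaccount:$1:$2"
}

# Example calls:
# sa_can_i analytics default get secrets
# sa_can_i analytics default create pods
```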
Whoa. The default SA can read secrets and create pods. That's way too broad. Someone gave the default SA permissions — probably during initial namespace setup. I need to flag this, but it's a separate issue.
For now, let me focus on Sarah's access. I'll create a Role that gives her CronJob management but nothing dangerous.
```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cronjob-manager
  namespace: analytics
rules:
  - apiGroups: ["batch"]
    resources: ["cronjobs", "jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
EOF
```
Wait, I almost forgot — she'll also need to create ConfigMaps and Secrets for the CronJob config, and maybe manage ServiceAccounts. But I should NOT just throw all that in. Let me ask what her CronJob actually does.
Actually, let me re-read her ticket... she says "deploy my CronJob" and attached a YAML. Let me look at what resources the YAML references.
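A quick way to see which ConfigMaps and Secrets a manifest points at is a plain text search for the usual reference fields. This is a rough sketch, not a YAML parser, and it complements rather than replaces a dry run. The helper name `list_config_refs` and the filename `cronjob.yaml` are hypothetical; the ticket's attachment name is not given.

```shell
# list_config_refs FILE
# Hypothetical helper: prints the numbered lines of a manifest that mention
# ConfigMap or Secret references (env, envFrom, and volume fields).
list_config_refs() {
  grep -nE 'configMapKeyRef|configMapRef|configMap:|secretKeyRef|secretRef|secretName' "$1"
}

# Example, assuming the attached manifest was saved as cronjob.yaml:
# kubectl apply -f cronjob.yaml -n analytics --dry-run=server
# list_config_refs cronjob.yaml
```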
The dry-run shows she needs one permission: create on cronjobs.batch. The CronJob references a ConfigMap etl-config and a Secret db-credentials, and both already exist in the namespace, so she doesn't need to create them. The pods consume them at runtime (mounted or injected by the kubelet, or read through the API under the pod's ServiceAccount), which is a workload concern, not part of her RBAC.
Mental Model: User Permissions vs Workload Permissions
A user's RBAC bindings control what that user can do through the API, kubectl included. The ServiceAccount's bindings control what a pod can do against the API at runtime. These are two separate security planes: a user might have permission to create a CronJob, but the CronJob's pods run with a completely different set of permissions via their ServiceAccount. Always think about both planes.
Okay, the Role I defined is sufficient. Let me bind it.
```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sarah-cronjob-manager
  namespace: analytics
subjects:
  - kind: User
    name: sarah@company.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: cronjob-manager
  apiGroup: rbac.authorization.k8s.io
EOF
```
Actually — should I bind to Sarah specifically or to the data-team group? If other data team members need the same access, binding to the group is better. But I should start narrow and widen later. Sarah's the one who asked, and I shouldn't assume the whole team needs deploy access to production.
Let me verify it works.
The answer is now `yes`. Let me test the full flow with her YAML.
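Testing the full flow as the user can be sketched with a server-side dry run under impersonation. A server-side dry run sends the request through authorization and admission without persisting anything, whereas a client-side dry run never reaches the API server's RBAC checks. The helper name `dry_run_as` and the filename `cronjob.yaml` are hypothetical.

```shell
# dry_run_as USER NAMESPACE FILE
# Hypothetical helper: submits FILE as USER with a server-side dry run, so
# RBAC and admission are evaluated but nothing is stored.
dry_run_as() {
  kubectl apply -f "$3" \
    --namespace "$2" \
    --dry-run=server \
    --as "$1"
}

# Example call:
# dry_run_as sarah@company.com analytics cronjob.yaml
```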
The output is `cronjob.batch/etl-daily-report created (dry run)`. Good. Now one more thing: I should check that she can't escalate. Can she modify the ServiceAccount or create new ones?
```shell
kubectl auth can-i create serviceaccounts -n analytics --as=sarah@company.com
kubectl auth can-i bind roles -n analytics --as=sarah@company.com
```
No to both. Good. She can create CronJobs but can't create ServiceAccounts or grant herself more access through bindings. The CronJob will run as the default SA which... brings me back to that earlier issue. The default SA has too many permissions. I need to create a dedicated SA for analytics workloads and lock down the default.
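The two escalation checks can be widened into a short loop over the verbs that usually matter for privilege escalation. This is a sketch: the helper name `check_escalation` and the exact list of pairs are assumptions, not a complete audit.

```shell
# check_escalation USER NAMESPACE
# Hypothetical helper: prints "<verb> <resource>: yes|no" for a handful of
# permissions that would let a user widen their own access.
check_escalation() {
  for pair in "create serviceaccounts" "create roles" "create rolebindings" \
              "bind roles" "escalate roles"; do
    printf '%s: ' "$pair"
    # $pair is left unquoted on purpose so it splits into VERB and RESOURCE.
    # can-i exits non-zero on "no"; "|| true" keeps the loop going under set -e.
    kubectl auth can-i $pair --namespace "$2" --as "$1" || true
  done
}

# Example call:
# check_escalation sarah@company.com analytics
```

Every line should read `no` for a user who is only meant to manage CronJobs.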
Let me at least document the security issue I found and file a ticket. For today, Sarah can deploy her CronJob, but the broader default-SA issue needs a proper fix.
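For the follow-up ticket, the dedicated ServiceAccount could start from something like the sketch below. The name `analytics-etl` and the helper `render_workload_sa` are hypothetical, and disabling token automount is an assumption that only holds if the workload never calls the Kubernetes API.

```shell
# render_workload_sa NAMESPACE NAME
# Hypothetical helper: prints a ServiceAccount manifest with no automounted
# API token, as a starting point for a dedicated workload identity.
render_workload_sa() {
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: ${2}
  namespace: ${1}
# Assumption: the job reads its config from mounted volumes or env vars and
# does not call the Kubernetes API, so it needs no token. Confirm first.
automountServiceAccountToken: false
EOF
}

# Example call:
# render_workload_sa analytics analytics-etl | kubectl apply -f -
```

A CronJob would opt in by setting `spec.jobTemplate.spec.template.spec.serviceAccountName` to the new account. Locking down the default SA is a separate step: find and remove the bindings that grant it permissions.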
What Made This Senior-Level
| Junior Would... | Senior Does... | Why |
|---|---|---|
| Grant admin ClusterRoleBinding to fix the permission error quickly | Create a scoped Role with exactly the needed permissions | Least privilege: admin access in production is a security incident waiting to happen |
| Only think about the user's kubectl permissions | Check both the user's RBAC and the workload's ServiceAccount permissions | These are two separate security planes — fixing one while the other is misconfigured creates a false sense of security |
| Bind the role to the entire team group | Bind to the specific user, then widen based on actual need | Start narrow, document the decision, and widen when there's an actual request |
| Not notice the overprivileged default ServiceAccount | Flag it as a security issue and file a follow-up ticket | Senior engineers have a habit of noting security issues they find while working on other things |
Key Heuristics Used
- RBAC Layer Cake: ClusterRoles define what, RoleBindings define where and who. Never edit cluster-wide roles for namespace-specific needs.
- User vs Workload Permissions: A user's kubectl permissions and a pod's ServiceAccount permissions are two separate security planes — always check both.
- Start Narrow, Widen Later: Grant the minimum permissions needed for the specific request, then expand based on documented need.
Cross-References
- Primer: RBAC concepts (Roles, ClusterRoles, RoleBindings, and ServiceAccounts)
- Street Ops: `kubectl auth can-i` and RBAC debugging commands
- Footguns: Overprivileged default ServiceAccounts and the `admin` ClusterRoleBinding trap