Thinking Out Loud: Kubernetes RBAC

A senior SRE's internal monologue while working through a real RBAC task. This isn't a tutorial — it's a window into how experienced engineers actually think.

The Situation

A developer from the data team opens an urgent ticket: "I can't deploy my CronJob to the analytics namespace. I get forbidden errors. I have admin access in staging — why not in production?" I'm the platform engineer who manages cluster access.

The Monologue

Let me start by understanding exactly what error they're getting. "Forbidden" in Kubernetes is almost always RBAC. Let me verify.

kubectl auth can-i create cronjobs.batch -n analytics --as=sarah@company.com

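no. Okay, so Sarah doesn't have permission to create CronJobs in the analytics namespace. Let me see what permissions she DOES have. Impersonating with --as alone doesn't carry her group memberships, so I'll pass her data-team group as well.

kubectl auth can-i --list -n analytics --as=sarah@company.com --as-group=data-team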

She can get, list, and watch pods, services, and configmaps. She can get logs. But she can't create deployments, cronjobs, or any workload resources. This looks like a viewer role.
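
When I want the same picture as a quick yes/no matrix instead of the full --list output, a small wrapper around kubectl auth can-i helps. This is a sketch, not a standard tool: probe_access is a hypothetical helper name, and the list of checks is my own pick of workload-related verbs. It passes --as-group because impersonation with --as alone does not include the user's group memberships.

```shell
# probe_access USER GROUP NAMESPACE
# Prints one "verb resource -> answer" line per check.
# Runs in a subshell so its variables do not leak into the caller.
probe_access() (
  user=$1 group=$2 ns=$3
  for check in "create cronjobs.batch" "create jobs.batch" \
               "create deployments.apps" "get pods" "get secrets"; do
    # Split "verb resource" into $1 and $2 on purpose.
    set -- $check
    answer=$(kubectl auth can-i "$1" "$2" -n "$ns" \
               --as="$user" --as-group="$group" 2>/dev/null)
    printf '%s %s -> %s\n' "$1" "$2" "${answer:-error}"
  done
)
```

Usage would look like: probe_access sarah@company.com data-team analytics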

Let me find the actual RoleBinding that gives her access.

kubectl get rolebindings -n analytics -o custom-columns='NAME:.metadata.name,ROLE:.roleRef.name,SUBJECTS:.subjects[*].name'

There's an analytics-viewers binding that maps the namespace-viewer ClusterRole to the data-team group. Sarah is in the data-team group. Let me check what the namespace-viewer role actually allows.

kubectl get clusterrole namespace-viewer -o yaml

Read-only access to core resources. That's correct for production — we intentionally give read-only access by default. The issue is that Sarah needs more permissions for the analytics namespace specifically.

Mental Model: RBAC Layer Cake

Kubernetes RBAC stacks: ClusterRoles define permissions, RoleBindings scope them to a namespace, and ClusterRoleBindings scope them cluster-wide. A user's effective permissions are the UNION of all their bindings. Never edit ClusterRoles for namespace-specific needs — create a namespace-scoped Role instead.
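
To make the layer cake concrete, here is roughly what the analytics-viewers binding would look like: a RoleBinding in one namespace pointing at a cluster-wide ClusterRole. The manifest is reconstructed from the names in the monologue, not copied from the cluster, and render_viewer_binding is a hypothetical helper name.

```shell
# render_viewer_binding NAMESPACE
# Prints a RoleBinding that scopes the namespace-viewer ClusterRole
# to a single namespace for the data-team group.
render_viewer_binding() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: analytics-viewers
  namespace: $1
subjects:
  - kind: Group
    name: data-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole        # defined once, cluster-wide
  name: namespace-viewer   # granted here only inside the namespace above
  apiGroup: rbac.authorization.k8s.io
EOF
}
```

Piping it through a client-side dry run is a cheap way to validate the shape: render_viewer_binding analytics | kubectl apply --dry-run=client -f -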

Now, the question is: what permissions does she actually need? She said CronJob, but let me think about what a CronJob actually requires...

A CronJob creates Jobs, which create Pods, and those Pods run as a ServiceAccount. If I give her permission to create CronJobs but she uses the default ServiceAccount, the CronJob's pods will run with whatever permissions the default SA has. That could be a security issue if the default SA has been given broad permissions.

Let me check what the default SA in analytics can do.

kubectl auth can-i --list -n analytics --as=system:serviceaccount:analytics:default

Whoa. The default SA can read secrets and create pods. That's way too broad. Someone gave the default SA permissions — probably during initial namespace setup. I need to flag this, but it's a separate issue.
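
For the follow-up ticket it helps to know which bindings grant those permissions. A rough sketch, assuming the grant comes from a RoleBinding in the same namespace that names the ServiceAccount directly: bindings_for_sa is a hypothetical helper, and it will miss ClusterRoleBindings and grants made through groups such as system:serviceaccounts.

```shell
# bindings_for_sa NAMESPACE SERVICEACCOUNT
# Lists RoleBindings whose subjects include the given ServiceAccount.
# Output columns: binding name, roleRef kind/name, subjects.
bindings_for_sa() (
  ns=$1 sa=$2
  tmpl='{range .items[*]}{.metadata.name}{"\t"}{.roleRef.kind}/{.roleRef.name}{"\t"}{range .subjects[*]}{.kind}:{.namespace}/{.name}{" "}{end}{"\n"}{end}'
  # The trailing space in the pattern avoids matching names like "default-foo".
  kubectl get rolebindings -n "$ns" -o jsonpath="$tmpl" \
    | grep -F "ServiceAccount:$ns/$sa "
)
```

Usage would look like: bindings_for_sa analytics default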

For now, let me focus on Sarah's access. I'll create a Role that gives her CronJob management but nothing dangerous.

cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: cronjob-manager
  namespace: analytics
rules:
  - apiGroups: ["batch"]
    resources: ["cronjobs", "jobs"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
EOF

Wait, I almost forgot — she'll also need to create ConfigMaps and Secrets for the CronJob config, and maybe manage ServiceAccounts. But I should NOT just throw all that in. Let me ask what her CronJob actually does.

Actually, let me re-read her ticket... she says "deploy my CronJob" and attached a YAML. Let me look at what resources the YAML references.

kubectl apply -f /tmp/sarah-cronjob.yaml --dry-run=server -n analytics --as=sarah@company.com 2>&1

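The dry-run is rejected on exactly one thing: create cronjobs.batch. Reading the YAML, the CronJob references a ConfigMap etl-config and a Secret db-credentials. Those already exist in the namespace, so she doesn't need to create them. The pods consume them at runtime, which is a workload concern, not her RBAC. If they are mounted as volumes or injected as environment variables, the kubelet resolves them; the pod's ServiceAccount only needs read access if the job fetches them through the API itself.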

Mental Model: User Permissions vs Workload Permissions

RBAC bindings on a user control what that user can do through the API, typically via kubectl. RBAC bindings on a ServiceAccount control what a pod can do against the API at runtime. These are two separate security planes. A user might have permission to create a CronJob while the CronJob's pods run with a completely different set of permissions via their ServiceAccount. Always think about both planes.
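
The two planes meet inside the CronJob manifest. Sarah's actual YAML isn't reproduced here, so the following is only a sketch of what such a manifest might look like: the schedule, container name, image, and the use of envFrom are assumptions; only the names etl-daily-report, etl-config, and db-credentials come from the ticket. render_cronjob is a hypothetical helper name.

```shell
# render_cronjob
# Prints an illustrative CronJob manifest.
render_cronjob() {
  cat <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etl-daily-report
  namespace: analytics
spec:
  schedule: "0 6 * * *"              # placeholder schedule
  jobTemplate:
    spec:
      template:
        spec:
          # Workload plane: the identity the pods use against the API.
          serviceAccountName: default
          restartPolicy: OnFailure
          containers:
            - name: etl                                # placeholder
              image: registry.example.com/etl:latest   # placeholder
              envFrom:
                - configMapRef:
                    name: etl-config
                - secretRef:
                    name: db-credentials
EOF
}
```

Creating this object is the user plane (create cronjobs.batch). The serviceAccountName line is where the workload plane gets decided.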

Okay, the Role I defined is sufficient. Let me bind it.

cat <<'EOF' | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: sarah-cronjob-manager
  namespace: analytics
subjects:
  - kind: User
    name: sarah@company.com
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: cronjob-manager
  apiGroup: rbac.authorization.k8s.io
EOF

Actually — should I bind to Sarah specifically or to the data-team group? If other data team members need the same access, binding to the group is better. But I should start narrow and widen later. Sarah's the one who asked, and I shouldn't assume the whole team needs deploy access to production.
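
If the access does need to widen later, only the subject of the binding changes; the Role stays the same. A minimal sketch of that, where render_cronjob_binding is a hypothetical helper and data-team-cronjob-manager is a made-up binding name:

```shell
# render_cronjob_binding BINDING_NAME SUBJECT_KIND SUBJECT_NAME
# Prints a RoleBinding to the cronjob-manager Role for a User or a Group.
render_cronjob_binding() {
  cat <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: $1
  namespace: analytics
subjects:
  - kind: $2
    name: $3
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: cronjob-manager
  apiGroup: rbac.authorization.k8s.io
EOF
}
```

Today's narrow binding corresponds to: render_cronjob_binding sarah-cronjob-manager User sarah@company.com
A later team-wide one might be: render_cronjob_binding data-team-cronjob-manager Group data-team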

Let me verify it works.

kubectl auth can-i create cronjobs.batch -n analytics --as=sarah@company.com

yes. Let me test the full flow with her YAML.

kubectl apply -f /tmp/sarah-cronjob.yaml --dry-run=server -n analytics --as=sarah@company.com

cronjob.batch/etl-daily-report created (dry run). Good. Now one more thing — I should check that she can't escalate. Can she modify the ServiceAccount or create new ones?

kubectl auth can-i create serviceaccounts -n analytics --as=sarah@company.com
kubectl auth can-i bind roles -n analytics --as=sarah@company.com

No to both. Good. She can create CronJobs but can't create ServiceAccounts or grant herself more permissions. She can still point a CronJob at any existing ServiceAccount in the namespace, though, and by default the CronJob will run as the default SA which... brings me back to that earlier issue. The default SA has too many permissions. I need to create a dedicated SA for analytics workloads and lock down the default.
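
The escalation check can be widened into a small loop. This is a sketch under my own assumptions: check_escalation is a hypothetical helper, and the list of risky verbs is a starting point rather than a complete audit. It relies on kubectl auth can-i exiting 0 for "yes" and non-zero for "no".

```shell
# check_escalation USER NAMESPACE
# Prints a line for every risky permission the user holds.
# Returns 1 if any were found, 0 otherwise.
check_escalation() (
  user=$1 ns=$2 risky=0
  for check in "create serviceaccounts" "create rolebindings" \
               "bind roles" "escalate roles" \
               "impersonate serviceaccounts" "get secrets"; do
    # Split "verb resource" into $1 and $2 on purpose.
    set -- $check
    if kubectl auth can-i "$1" "$2" -n "$ns" --as="$user" >/dev/null 2>&1; then
      echo "RISK: $user can $1 $2 in $ns"
      risky=1
    fi
  done
  exit "$risky"   # exits the subshell, which becomes the function's status
)
```

Usage would look like: check_escalation sarah@company.com analytics && echo "no escalation paths found"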

Let me at least document the security issue I found and file a ticket. For today, Sarah can deploy her CronJob, but the broader default-SA issue needs a proper fix.
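
For the ticket, one possible shape of the fix is a dedicated ServiceAccount with no API token mounted. This is a sketch, assuming the ETL job never calls the Kubernetes API itself: analytics-etl is a hypothetical name and render_dedicated_sa a hypothetical helper. automountServiceAccountToken is a standard ServiceAccount field.

```shell
# render_dedicated_sa NAME NAMESPACE
# Prints a ServiceAccount that does not mount an API token into pods.
render_dedicated_sa() {
  cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: $1
  namespace: $2
# Pods using this SA get no API credentials unless they opt in.
automountServiceAccountToken: false
EOF
}
```

It could be applied with: render_dedicated_sa analytics-etl analytics | kubectl apply -f -
The CronJob would then set serviceAccountName to the new SA. Locking down the default SA is separate work: the bindings that grant it secrets and pod access still have to be found and removed.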

What Made This Senior-Level

| Junior Would... | Senior Does... | Why |
|---|---|---|
| Grant admin ClusterRoleBinding to fix the permission error quickly | Create a scoped Role with exactly the needed permissions | Least privilege: admin access in production is a security incident waiting to happen |
| Only think about the user's kubectl permissions | Check both the user's RBAC and the workload's ServiceAccount permissions | These are two separate security planes; fixing one while the other is misconfigured creates a false sense of security |
| Bind the role to the entire team group | Bind to the specific user, then widen based on actual need | Start narrow, document the decision, and widen when there's an actual request |
| Not notice the overprivileged default ServiceAccount | Flag it as a security issue and file a follow-up ticket | Senior engineers have a habit of noting security issues they find while working on other things |

Key Heuristics Used

  1. RBAC Layer Cake: ClusterRoles define what, RoleBindings define where and who. Never edit cluster-wide roles for namespace-specific needs.
  2. User vs Workload Permissions: A user's kubectl permissions and a pod's ServiceAccount permissions are two separate security planes — always check both.
  3. Start Narrow, Widen Later: Grant the minimum permissions needed for the specific request, then expand based on documented need.

Cross-References

  • Primer: RBAC concepts (Roles, ClusterRoles, RoleBindings, and ServiceAccounts)
  • Street Ops: kubectl auth can-i and RBAC debugging commands
  • Footguns: Overprivileged default ServiceAccounts and the admin ClusterRoleBinding trap