Kubernetes RBAC Design Patterns: Least Privilege Without Paralysing Developers
Problem
RBAC sprawl in multi-team Kubernetes clusters grows past 100 role bindings within months. The core tension is between security and developer productivity. Teams either over-grant permissions (giving cluster-admin to CI/CD pipelines, granting wildcard verbs to developer roles) or under-grant them (developers cannot view logs, exec into pods, or port-forward for debugging, so they file tickets and wait).
The specific challenges:
- `cluster-admin` is the default escape hatch. When a CI/CD pipeline fails because it lacks permissions, the quickest fix is `cluster-admin`. Once granted, nobody removes it. Within 6 months, 30-50% of service accounts in a typical cluster have `cluster-admin` or equivalently broad permissions.
- Namespace-scoped vs. cluster-scoped is confusing. Roles are namespace-scoped. ClusterRoles are cluster-scoped. But a ClusterRole can be bound within a namespace via a RoleBinding. This flexibility creates inconsistency: some teams use ClusterRoles for everything, others duplicate Roles across namespaces.
- Service account tokens are over-mounted. By default, every pod gets a service account token that can query the API server. Most workloads never need this, but the token is there, ready for an attacker to use.
- Auditing permissions is manual. There is no built-in tool to answer “who can delete pods in production?” or “which service accounts have access to secrets?” without scripting against the RBAC API.
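The auditing gap is the worst of these in practice: without tooling, answering an access question means enumerating bindings and then probing subjects one at a time. A rough sketch of that manual workflow (the service account name is purely illustrative):

```shell
# No single query answers "who can delete pods in production?".
# First enumerate candidate bindings by hand:
kubectl get rolebindings,clusterrolebindings -A -o wide

# Then probe each subject individually:
kubectl auth can-i delete pods -n production \
  --as=system:serviceaccount:production:ci-deployer
```

Every new team or namespace multiplies the number of probes, which is why Step 5 below scripts this.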
This article provides a complete RBAC design: namespace-scoped roles for workloads, composable ClusterRoles via aggregation, per-workload service accounts, impersonation for safe debugging, an audit script to detect over-permissive bindings, and a break-glass emergency access pattern.
Target systems: Kubernetes 1.29+ with RBAC enabled (default since 1.6).
Threat Model
- Adversary: Compromised CI/CD pipeline, malicious insider, or attacker with stolen credentials (service account token, user certificate, OIDC token).
- Access level: Varies. Could be a pod with an auto-mounted service account token, a developer with kubectl access, or a CI/CD system with broad deployment permissions.
- Objective: Privilege escalation (create a pod with a `cluster-admin` service account), lateral movement (access secrets in other namespaces), data exfiltration (read secrets, configmaps, or pod logs), and cluster disruption (delete deployments, modify RBAC to lock out administrators).
- Blast radius: With `cluster-admin`, the entire cluster including all namespaces, secrets, and control plane resources. With properly scoped RBAC, limited to the specific namespace and resource types granted.
Configuration
Step 1: Namespace-Scoped Roles for Developer Teams
Never use ClusterRoles directly for workload-level access. Define namespace-scoped Roles for each team.
Developer role (can view and debug, cannot modify infrastructure):
# developer-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: team-alpha
rules:
# View workloads
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
  verbs: ["get", "list", "watch"]
# View pods, logs, and exec for debugging
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  verbs: ["create"]
# View services and endpoints
- apiGroups: [""]
  resources: ["services", "endpoints"]
  verbs: ["get", "list", "watch"]
# View configmaps (not secrets)
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "list", "watch"]
# View events for troubleshooting
- apiGroups: [""]
  resources: ["events"]
  verbs: ["get", "list", "watch"]
Deployer role (CI/CD pipelines):
# deployer-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer
  namespace: team-alpha
rules:
# Manage deployments and rollbacks
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: ["apps"]
  resources: ["deployments/rollback"]
  verbs: ["create"]
# Manage services
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
# Manage configmaps and secrets (needed for deployment)
- apiGroups: [""]
  resources: ["configmaps", "secrets"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
# View pods for deployment status
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
# Manage horizontal pod autoscalers
- apiGroups: ["autoscaling"]
  resources: ["horizontalpodautoscalers"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
Bind the roles:
# developer-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
- kind: Group
  name: "team-alpha-developers"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io
---
# deployer-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployer-binding
  namespace: team-alpha
subjects:
- kind: ServiceAccount
  name: ci-deployer
  namespace: team-alpha
roleRef:
  kind: Role
  name: deployer
  apiGroup: rbac.authorization.k8s.io
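A quick sanity check after applying the bindings is to list the CI service account's effective permissions; this verification sketch uses the same impersonation mechanism Step 4 covers in depth:

```shell
# Everything the CI service account can do in team-alpha.
# The output should match the deployer Role and nothing more:
kubectl auth can-i --list -n team-alpha \
  --as=system:serviceaccount:team-alpha:ci-deployer
```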
Step 2: ClusterRole Aggregation for Composable Permissions
Instead of duplicating roles across namespaces, use ClusterRole aggregation to compose permissions from smaller building blocks.
# Base view role (aggregated into composite roles)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: workload-viewer
  labels:
    rbac.systemhardening.com/aggregate-to-developer: "true"
    rbac.systemhardening.com/aggregate-to-deployer: "true"
rules:
- apiGroups: ["apps"]
  resources: ["deployments", "replicasets", "statefulsets"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods", "pods/log", "services", "endpoints", "events"]
  verbs: ["get", "list", "watch"]
---
# Debug role (only for developers)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-debugger
  labels:
    rbac.systemhardening.com/aggregate-to-developer: "true"
rules:
- apiGroups: [""]
  resources: ["pods/exec", "pods/portforward"]
  verbs: ["create"]
---
# Aggregated developer ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: aggregated-developer
aggregationRule:
  clusterRoleSelectors:
  - matchLabels:
      rbac.systemhardening.com/aggregate-to-developer: "true"
rules: [] # Rules are automatically filled by the controller
Then bind the aggregated ClusterRole at the namespace level:
# Bind at namespace scope (not cluster scope)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
- kind: Group
  name: "team-alpha-developers"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: aggregated-developer
  apiGroup: rbac.authorization.k8s.io
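The aggregation controller fills in the composite role's rules asynchronously, so it is worth confirming the rules actually arrived before handing the role to a team:

```shell
# An empty rules list here usually means an aggregation label
# does not match the selector (see Failure Modes below):
kubectl get clusterrole aggregated-developer -o jsonpath='{.rules}'
```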
Step 3: Per-Workload Service Accounts with Disabled Auto-Mounting
Create a dedicated service account for each workload. Disable token auto-mounting on every service account unless the workload explicitly needs API server access.
# service-account-no-token.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-app
  namespace: production
automountServiceAccountToken: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      serviceAccountName: web-app
      automountServiceAccountToken: false
      containers:
      - name: web
        image: registry.example.com/web-app:1.4.2
For workloads that need API access (controllers, operators):
# service-account-with-token.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: config-reloader
  namespace: production
automountServiceAccountToken: true # Explicitly opt in
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-reloader
  namespace: production
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: config-reloader
  namespace: production
subjects:
- kind: ServiceAccount
  name: config-reloader
  namespace: production
roleRef:
  kind: Role
  name: config-reloader
  apiGroup: rbac.authorization.k8s.io
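Both halves of this step are easy to verify against a running cluster; the label selector below assumes the `app: web-app` label from the Deployment above:

```shell
# The web-app pod should have no projected token volume
# (the auto-generated volume is usually named "kube-api-access-..."):
kubectl get pod -n production -l app=web-app \
  -o jsonpath='{.items[0].spec.volumes[*].name}'

# The reloader should only be able to get/watch configmaps:
kubectl auth can-i --list -n production \
  --as=system:serviceaccount:production:config-reloader
```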
Step 4: Impersonation for Safe Debugging
Instead of giving developers direct access to production, use impersonation to let platform engineers act as a developer for troubleshooting.
# impersonation-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonate-developers
rules:
# Impersonating a user plus their groups requires permission on both
# resources. Restrict groups to the known team groups; restrict users
# with resourceNames too if your username set is stable enough to list.
- apiGroups: [""]
  resources: ["users"]
  verbs: ["impersonate"]
- apiGroups: [""]
  resources: ["groups"]
  verbs: ["impersonate"]
  resourceNames:
  - "team-alpha-developers"
  - "team-beta-developers"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-impersonate
subjects:
- kind: Group
  name: "platform-engineers"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: impersonate-developers
  apiGroup: rbac.authorization.k8s.io
# Platform engineer verifies what a developer can see:
kubectl get pods -n team-alpha --as=developer@example.com \
  --as-group=team-alpha-developers

# Test whether a service account has too many permissions
# (ci-deployer and its Role were defined in team-alpha in Step 1):
kubectl auth can-i delete pods -n team-alpha \
  --as=system:serviceaccount:team-alpha:ci-deployer
# Expected: "no"
kubectl auth can-i create deployments -n team-alpha \
  --as=system:serviceaccount:team-alpha:ci-deployer
# Expected: "yes"
Step 5: RBAC Audit Script
#!/bin/bash
# rbac-audit.sh
# Detect over-permissive RBAC bindings in the cluster.

echo "=== Cluster-Admin Bindings ==="
echo "These service accounts or users have full cluster access:"
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name == "cluster-admin") |
    .subjects[]? | "\(.kind): \(.name) (namespace: \(.namespace // "cluster-wide"))"'
echo ""

echo "=== Wildcard Verb Rules ==="
echo "Roles with '*' verbs (equivalent to full access on those resources):"
kubectl get clusterroles -o json | \
  jq -r '.items[] | select(.rules[]?.verbs[]? == "*") |
    "\(.metadata.name): \(.rules[] | select(.verbs[] == "*") |
      "resources=\(.resources // ["*"] | join(",")) verbs=\(.verbs | join(","))")"' | \
  sort -u
echo ""

echo "=== Service Accounts with Secrets Access ==="
echo "Service accounts that can read secrets (potential credential theft):"
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  kubectl auth can-i get secrets -n "$ns" \
    --as=system:serviceaccount:"$ns":default 2>/dev/null | \
    grep -q "yes" && echo "  default SA in $ns can read secrets"
done
echo ""

echo "=== Pods with Auto-Mounted Tokens ==="
echo "Pods that have service account tokens mounted (may not need API access):"
kubectl get pods -A -o json | \
  jq -r '.items[] | select(
      .spec.automountServiceAccountToken != false and
      (.spec.containers[].volumeMounts[]?.mountPath? // "" |
        contains("/var/run/secrets"))
    ) | "\(.metadata.namespace)/\(.metadata.name) (SA: \(.spec.serviceAccountName))"' | \
  head -20
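The jq filters can be exercised offline before pointing the script at a live cluster. This sketch runs the script's first filter against a fabricated two-item ClusterRoleBinding list (assumes `jq` is installed); only the `cluster-admin` subject should survive:

```shell
#!/bin/bash
# Fake API response: one cluster-admin binding plus one harmless binding.
sample='{"items":[
  {"roleRef":{"name":"cluster-admin"},
   "subjects":[{"kind":"ServiceAccount","name":"ci","namespace":"ci-system"}]},
  {"roleRef":{"name":"view"},
   "subjects":[{"kind":"Group","name":"devs"}]}
]}'

# Same filter as the audit script: print only cluster-admin subjects.
echo "$sample" | jq -r '.items[] | select(.roleRef.name == "cluster-admin") |
  .subjects[]? | "\(.kind): \(.name) (namespace: \(.namespace // "cluster-wide"))"'
# Prints: ServiceAccount: ci (namespace: ci-system)
```

Testing filters on canned JSON like this also catches regressions when the script is updated for new RBAC structures.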
Step 6: Break-Glass Emergency Access
For incidents where normal RBAC is too restrictive, provide a controlled escalation path with full audit logging.
# break-glass-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: break-glass-admin
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]
---
# Time-limited binding (create during incident, delete after)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: break-glass-incident-2026-04-22
  annotations:
    incident: "INC-12345"
    created-by: "oncall-sre@example.com"
    expires: "2026-04-22T18:00:00Z"
subjects:
- kind: User
  name: "oncall-sre@example.com"
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: break-glass-admin
  apiGroup: rbac.authorization.k8s.io
# Automate creation and cleanup:
# Grant break-glass access (requires approval from second SRE)
kubectl create clusterrolebinding "break-glass-$(date +%s)" \
  --clusterrole=break-glass-admin \
  --user="oncall-sre@example.com"

# Set a reminder to revoke (or use a CronJob). Note that `at` executes
# the piped text as a command, so wrap the reminder in something runnable:
echo 'wall "REVOKE BREAK-GLASS ACCESS"' | at now + 4 hours

# Revoke after incident:
kubectl delete clusterrolebinding break-glass-incident-2026-04-22
Pair this with audit logging (see Article #23) so all actions taken during break-glass access are recorded.
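The cleanup step can be scripted rather than remembered. This is a sketch of the expiry check such a cleanup CronJob would run (the `is_expired` helper is hypothetical; GNU `date` is assumed), comparing a binding's `expires` annotation with the current time:

```shell
#!/bin/bash
# Succeed when the RFC 3339 timestamp in $1 is in the past (GNU date).
is_expired() {
  local exp_epoch
  exp_epoch=$(date -u -d "$1" +%s) || return 2
  [ "$(date -u +%s)" -ge "$exp_epoch" ]
}

# A cleanup job would fetch the annotation from each break-glass binding:
#   kubectl get clusterrolebinding "$name" \
#     -o jsonpath='{.metadata.annotations.expires}'
# and delete the binding once this check succeeds.
if is_expired "2020-01-01T00:00:00Z"; then
  echo "binding expired: revoke"
fi
# Prints: binding expired: revoke
```

Running the check from a CronJob every few minutes bounds how long a forgotten break-glass binding can survive.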
Expected Behaviour
After implementing this RBAC design:
- Developers can view pods, logs, exec into containers, and port-forward in their team namespace
- Developers cannot access other team namespaces or cluster-scoped resources
- CI/CD pipelines can deploy workloads and manage secrets in their assigned namespace only
- Pods without explicit API server needs have no mounted service account token
- `rbac-audit.sh` reports zero unexpected `cluster-admin` bindings
- `kubectl auth can-i --list --as=system:serviceaccount:production:web-app` returns minimal permissions
- Break-glass access is audited and time-limited
Trade-offs
| Control | Impact | Risk | Mitigation |
|---|---|---|---|
| Namespace-scoped roles only | Prevents accidental cross-namespace access | More roles to manage (one set per namespace) | Use ClusterRole aggregation bound at namespace level to reduce duplication |
| Disable auto-mount on all service accounts | Eliminates unused API tokens from pods | Workloads that need API access break until explicitly configured | Document which workloads need API access. Use admission policy to enforce the annotation |
| No wildcard verbs | Forces explicit permission grants | Initial setup takes longer; new resource types need role updates | Use aggregation labels so new permissions compose automatically |
| Break-glass pattern | Controlled escalation during incidents | Risk of forgetting to revoke access | Automate expiry with a CronJob that deletes old break-glass bindings. Alert if a binding exists for more than 4 hours |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| CI/CD pipeline lacks deploy permissions | Deployment pipeline fails with a "forbidden" error | Pipeline logs show 403 errors; `kubectl auth can-i` confirms the missing permission | Add the specific verb/resource to the deployer Role. Never escalate to cluster-admin |
| Developer cannot debug pods | Developer reports "forbidden" on exec or port-forward | `kubectl auth can-i create pods/exec --as=developer@example.com -n team-alpha` returns "no" | Add `pods/exec` and `pods/portforward` to the developer Role |
| Service account token missing for controller | Controller pod fails to authenticate to the API server | Pod logs show "unauthorized" or "forbidden"; the pod cannot list/watch resources | Set `automountServiceAccountToken: true` on the specific service account and pod spec |
| Aggregation label missing on new ClusterRole | New permissions do not appear in the aggregated role | `kubectl get clusterrole aggregated-developer -o yaml` does not include the expected rules | Add the correct aggregation label to the new ClusterRole |
| Break-glass binding not revoked | Over-privileged access persists after the incident | `rbac-audit.sh` reports an unexpected cluster-admin binding; security review catches it | Delete the binding immediately. Add a CronJob to clean up stale break-glass bindings |
When to Consider a Managed Alternative
Transition point: When RBAC management spans 5+ teams, 10+ namespaces, and 50+ role bindings, manual maintenance and auditing become unreliable. Role drift (permissions added during debugging and never removed) accumulates. Auditing “who can access what” requires custom scripts that break when RBAC structures change.
Recommended providers:
- Sysdig (#122): RBAC visualization showing which users and service accounts have access to which resources. Detects over-permissive bindings and unused permissions. Provides recommendations for least-privilege role definitions based on observed API call patterns.
What you still control: The role definitions, team-to-namespace mapping, and break-glass process remain your decisions. Managed tools help you visualize, audit, and detect drift, but the access model is yours to design.
Premium content pack: RBAC template pack organized by team structure (single-team, multi-team, platform-team-plus-app-teams). Includes Kyverno policies to enforce “every namespace must have a deployer role,” “no wildcard verbs,” and “no cluster-admin bindings outside kube-system.” Includes the complete RBAC audit script with Prometheus metric export.