Kubernetes RBAC Design Patterns: Least Privilege Without Paralysing Developers

Problem

RBAC sprawl in multi-team Kubernetes clusters grows past 100 role bindings within months. The core tension is between security and developer productivity. Teams either over-grant permissions (giving cluster-admin to CI/CD pipelines, granting wildcard verbs to developer roles) or under-grant them (developers cannot view logs, exec into pods, or port-forward for debugging, so they file tickets and wait).

The specific challenges:

  • cluster-admin is the default escape hatch. When a CI/CD pipeline fails because it lacks permissions, the quickest fix is cluster-admin. Once granted, nobody removes it. Within 6 months, 30-50% of service accounts in a typical cluster have cluster-admin or equivalent broad permissions.
  • Namespace-scoped vs. cluster-scoped is confusing. Roles are namespace-scoped. ClusterRoles are cluster-scoped. But ClusterRoles can be bound to a namespace via RoleBinding. This flexibility creates inconsistency: some teams use ClusterRoles for everything, others duplicate Roles across namespaces.
  • Service account tokens are over-mounted. By default, every pod gets a service account token that can query the API server. Most workloads never need this, but the token is there, ready for an attacker to use.
  • Auditing permissions is manual. There is no built-in tool to answer “who can delete pods in production?” or “which service accounts have access to secrets?” without scripting against the RBAC API.
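
The auto-mount problem in particular is easy to underestimate. From inside any pod that received the default token, the API server is one HTTPS call away. The paths below are the standard projected-token mount locations; whether the call succeeds depends entirely on what RBAC grants that pod's service account:

```shell
# Run from a shell inside a pod with an auto-mounted token
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl -s --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  -H "Authorization: Bearer $TOKEN" \
  "https://kubernetes.default.svc/api/v1/namespaces/default/pods"
# A 403 response is the goal state; a pod listing means the token is over-privileged.
```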

This article provides a complete RBAC design: namespace-scoped roles for workloads, composable ClusterRoles via aggregation, per-workload service accounts, impersonation for safe debugging, an audit script to detect over-permissive bindings, and a break-glass emergency access pattern.

Target systems: Kubernetes 1.29+ with RBAC enabled (default since 1.6).

Threat Model

  • Adversary: Compromised CI/CD pipeline, malicious insider, or attacker with stolen credentials (service account token, user certificate, OIDC token).
  • Access level: Varies. Could be a pod with an auto-mounted service account token, a developer with kubectl access, or a CI/CD system with broad deployment permissions.
  • Objective: Privilege escalation (create a pod with cluster-admin service account), lateral movement (access secrets in other namespaces), data exfiltration (read secrets, configmaps, or pod logs), and cluster disruption (delete deployments, modify RBAC to lock out administrators).
  • Blast radius: With cluster-admin, the entire cluster including all namespaces, secrets, and control plane resources. With properly scoped RBAC, limited to the specific namespace and resource types granted.

Configuration

Step 1: Namespace-Scoped Roles for Developer Teams

Never grant workload-level access with a ClusterRoleBinding; the cluster-wide grant is almost never what a team needs. Define namespace-scoped Roles for each team (or bind a shared ClusterRole into a single namespace with a RoleBinding, as in Step 2).

Developer role (can view and debug, cannot modify infrastructure):

# developer-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: developer
  namespace: team-alpha
rules:
  # View workloads
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets", "daemonsets"]
    verbs: ["get", "list", "watch"]
  # View pods, logs, and exec for debugging
  - apiGroups: [""]
    resources: ["pods", "pods/log"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods/exec", "pods/portforward"]
    verbs: ["create"]
  # View services and endpoints
  - apiGroups: [""]
    resources: ["services", "endpoints"]
    verbs: ["get", "list", "watch"]
  # View configmaps (not secrets)
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]
  # View events for troubleshooting
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["get", "list", "watch"]

Deployer role (CI/CD pipelines):

# deployer-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: deployer
  namespace: team-alpha
rules:
  # Manage deployments and rollbacks
  - apiGroups: ["apps"]
    resources: ["deployments"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  - apiGroups: ["apps"]
    resources: ["deployments/rollback"]
    verbs: ["create"]
  # Manage services
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  # Manage configmaps and secrets (needed for deployment)
  - apiGroups: [""]
    resources: ["configmaps", "secrets"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
  # View pods for deployment status
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
  # Manage horizontal pod autoscalers
  - apiGroups: ["autoscaling"]
    resources: ["horizontalpodautoscalers"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]

Bind the roles:

# developer-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
  - kind: Group
    name: "team-alpha-developers"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: developer
  apiGroup: rbac.authorization.k8s.io
---
# deployer-binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployer-binding
  namespace: team-alpha
subjects:
  - kind: ServiceAccount
    name: ci-deployer
    namespace: team-alpha
roleRef:
  kind: Role
  name: deployer
  apiGroup: rbac.authorization.k8s.io
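
To spot-check these bindings without waiting for a real pipeline run, a cluster admin can combine kubectl auth can-i with impersonation. The user name after --as is arbitrary here (group membership is what the binding matches):

```shell
kubectl auth can-i list pods -n team-alpha \
  --as=anyone@example.com --as-group=team-alpha-developers
# Expected: "yes"

kubectl auth can-i delete deployments -n team-alpha \
  --as=anyone@example.com --as-group=team-alpha-developers
# Expected: "no"
```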

Step 2: ClusterRole Aggregation for Composable Permissions

Instead of duplicating roles across namespaces, use ClusterRole aggregation to compose permissions from smaller building blocks.

# Base view role (aggregated into composite roles)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: workload-viewer
  labels:
    rbac.systemhardening.com/aggregate-to-developer: "true"
    rbac.systemhardening.com/aggregate-to-deployer: "true"
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "statefulsets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods", "pods/log", "services", "endpoints", "events"]
    verbs: ["get", "list", "watch"]
---
# Debug role (only for developers)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-debugger
  labels:
    rbac.systemhardening.com/aggregate-to-developer: "true"
rules:
  - apiGroups: [""]
    resources: ["pods/exec", "pods/portforward"]
    verbs: ["create"]
---
# Aggregated developer ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: aggregated-developer
aggregationRule:
  clusterRoleSelectors:
    - matchLabels:
        rbac.systemhardening.com/aggregate-to-developer: "true"
rules: []  # Rules are automatically filled by the controller

Then bind the aggregated ClusterRole at the namespace level:

# Bind at namespace scope (not cluster scope)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: developer-binding
  namespace: team-alpha
subjects:
  - kind: Group
    name: "team-alpha-developers"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: aggregated-developer
  apiGroup: rbac.authorization.k8s.io

Step 3: Per-Workload Service Accounts with Disabled Auto-Mounting

Create a dedicated service account for each workload. Disable token auto-mounting on every service account unless the workload explicitly needs API server access.

# service-account-no-token.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: web-app
  namespace: production
automountServiceAccountToken: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      serviceAccountName: web-app
      automountServiceAccountToken: false
      containers:
        - name: web
          image: registry.example.com/web-app:1.4.2

For workloads that need API access (controllers, operators):

# service-account-with-token.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: config-reloader
  namespace: production
automountServiceAccountToken: true  # Explicitly opt in
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: config-reloader
  namespace: production
rules:
  - apiGroups: [""]
    resources: ["configmaps"]
    verbs: ["get", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: config-reloader
  namespace: production
subjects:
  - kind: ServiceAccount
    name: config-reloader
    namespace: production
roleRef:
  kind: Role
  name: config-reloader
  apiGroup: rbac.authorization.k8s.io

Step 4: Impersonation for Safe Debugging

Instead of giving developers direct access to production, use impersonation to let platform engineers act as a developer for troubleshooting.

# impersonation-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: impersonate-developers
rules:
  # kubectl requires a user (--as) even when impersonating a group
  # (--as-group), so allow both, each restricted by resourceNames.
  - apiGroups: [""]
    resources: ["users"]
    verbs: ["impersonate"]
    resourceNames:
      - "developer@example.com"
  - apiGroups: [""]
    resources: ["groups"]
    verbs: ["impersonate"]
    resourceNames:
      - "team-alpha-developers"
      - "team-beta-developers"
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: platform-team-impersonate
subjects:
  - kind: Group
    name: "platform-engineers"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: impersonate-developers
  apiGroup: rbac.authorization.k8s.io

# Platform engineer verifies what a developer can see:
kubectl get pods -n team-alpha --as=developer@example.com \
  --as-group=team-alpha-developers

# Test whether the CI service account is over-permissioned
# (the ci-deployer binding lives in team-alpha):
kubectl auth can-i delete pods -n team-alpha \
  --as=system:serviceaccount:team-alpha:ci-deployer
# Expected: "no"

kubectl auth can-i create deployments -n team-alpha \
  --as=system:serviceaccount:team-alpha:ci-deployer
# Expected: "yes"

Step 5: RBAC Audit Script

#!/bin/bash
# rbac-audit.sh
# Detect over-permissive RBAC bindings in the cluster.

echo "=== Cluster-Admin Bindings ==="
echo "These service accounts or users have full cluster access:"
kubectl get clusterrolebindings -o json | \
  jq -r '.items[] | select(.roleRef.name == "cluster-admin") |
    .subjects[]? | "\(.kind): \(.name) (namespace: \(.namespace // "cluster-wide"))"'

echo ""
echo "=== Wildcard Verb Rules ==="
echo "Roles with '*' verbs (equivalent to full access on those resources):"
kubectl get clusterroles -o json | \
  jq -r '.items[] | .metadata.name as $name | .rules[]? |
    select(any(.verbs[]?; . == "*")) |
    "\($name): resources=\(.resources // ["*"] | join(",")) verbs=\(.verbs | join(","))"' | \
  sort -u

echo ""
echo "=== Service Accounts with Secrets Access ==="
echo "Service accounts that can read secrets (potential credential theft):"
for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
  kubectl auth can-i get secrets -n "$ns" \
    --as=system:serviceaccount:"$ns":default 2>/dev/null | \
    grep -q "yes" && echo "  default SA in $ns can read secrets"
done

echo ""
echo "=== Pods with Auto-Mounted Tokens ==="
echo "Pods that have service account tokens mounted (may not need API access):"
kubectl get pods -A -o json | \
  jq -r '.items[] | select(
    .spec.automountServiceAccountToken != false and
    ([.spec.containers[].volumeMounts[]?.mountPath // ""] |
      any(startswith("/var/run/secrets")))
  ) | "\(.metadata.namespace)/\(.metadata.name) (SA: \(.spec.serviceAccountName))"' | \
  head -20
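
The jq filters are the fragile part of this script, so it helps to sanity-check them offline against a hand-written sample before trusting cluster output. The binding below is invented for the test; only jq is required, no cluster:

```shell
# Minimal fake clusterrolebindings list to exercise the cluster-admin filter
cat > /tmp/sample-crb.json <<'EOF'
{"items": [{"roleRef": {"name": "cluster-admin"},
  "subjects": [{"kind": "ServiceAccount", "name": "ci", "namespace": "dev"}]}]}
EOF
jq -r '.items[] | select(.roleRef.name == "cluster-admin") |
  .subjects[]? | "\(.kind): \(.name) (namespace: \(.namespace // "cluster-wide"))"' \
  /tmp/sample-crb.json
# prints: ServiceAccount: ci (namespace: dev)
```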

Step 6: Break-Glass Emergency Access

For incidents where normal RBAC is too restrictive, provide a controlled escalation path with full audit logging.

# break-glass-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: break-glass-admin
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
---
# Time-limited binding (create during incident, delete after)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: break-glass-incident-2026-04-22
  annotations:
    incident: "INC-12345"
    created-by: "oncall-sre@example.com"
    expires: "2026-04-22T18:00:00Z"
subjects:
  - kind: User
    name: "oncall-sre@example.com"
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: break-glass-admin
  apiGroup: rbac.authorization.k8s.io

# Automate creation and cleanup.
# Grant break-glass access (requires approval from a second SRE):
kubectl create clusterrolebinding "break-glass-$(date +%s)" \
  --clusterrole=break-glass-admin \
  --user="oncall-sre@example.com"

# Schedule automatic revocation (or use a CronJob):
echo "kubectl delete clusterrolebinding break-glass-incident-2026-04-22" | at now + 4 hours

# Revoke after incident:
kubectl delete clusterrolebinding break-glass-incident-2026-04-22

Pair this with audit logging (see Article #23) so all actions taken during break-glass access are recorded.
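
The cleanup half can be automated with a CronJob that deletes any break-glass binding older than the four-hour window. This is a sketch: the rbac-reaper service account and the kubectl image are assumptions you would adapt, and the reaper SA itself needs only get/list/delete on clusterrolebindings:

```yaml
# break-glass-reaper.yaml (sketch; assumes an "rbac-reaper" SA already bound
# to a role allowing get/list/delete on clusterrolebindings)
apiVersion: batch/v1
kind: CronJob
metadata:
  name: break-glass-reaper
  namespace: kube-system
spec:
  schedule: "*/15 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: rbac-reaper
          restartPolicy: Never
          containers:
            - name: reaper
              image: bitnami/kubectl:1.29  # assumed image; pin your own
              command:
                - /bin/sh
                - -c
                - |
                  # Delete break-glass bindings older than 4 hours (needs GNU date)
                  cutoff=$(( $(date +%s) - 14400 ))
                  kubectl get clusterrolebindings -o name | grep '/break-glass-' | \
                  while read -r b; do
                    created=$(kubectl get "$b" -o jsonpath='{.metadata.creationTimestamp}')
                    [ "$(date -d "$created" +%s)" -lt "$cutoff" ] && kubectl delete "$b"
                  done
```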

Expected Behaviour

After implementing this RBAC design:

  • Developers can view pods, logs, exec into containers, and port-forward in their team namespace
  • Developers cannot access other team namespaces or cluster-scoped resources
  • CI/CD pipelines can deploy workloads and manage secrets in their assigned namespace only
  • Pods without explicit API server needs have no mounted service account token
  • rbac-audit.sh reports zero unexpected cluster-admin bindings
  • kubectl auth can-i --list --as=system:serviceaccount:production:web-app returns minimal permissions
  • Break-glass access is audited and time-limited

Trade-offs

Control: Namespace-scoped roles only
  Impact: Prevents accidental cross-namespace access
  Risk: More roles to manage (one set per namespace)
  Mitigation: Use ClusterRole aggregation bound at namespace level to reduce duplication

Control: Disable auto-mount on all service accounts
  Impact: Eliminates unused API tokens from pods
  Risk: Workloads that need API access break until explicitly configured
  Mitigation: Document which workloads need API access. Use an admission policy to enforce the annotation

Control: No wildcard verbs
  Impact: Forces explicit permission grants
  Risk: Initial setup takes longer; new resource types need role updates
  Mitigation: Use aggregation labels so new permissions compose automatically

Control: Break-glass pattern
  Impact: Controlled escalation during incidents
  Risk: Risk of forgetting to revoke access
  Mitigation: Automate expiry with a CronJob that deletes old break-glass bindings. Alert if a binding exists for more than 4 hours
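
The admission-policy mitigations above can be enforced with Kyverno. This is a sketch, assuming Kyverno is installed: the policy name is invented, the exact condition syntax should be checked against your Kyverno version, and break-glass-admin is excluded because it wildcards deliberately:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: deny-wildcard-verbs
spec:
  validationFailureAction: Enforce
  rules:
    - name: no-wildcard-verbs
      match:
        any:
          - resources:
              kinds: ["Role", "ClusterRole"]
      exclude:
        any:
          - resources:
              names: ["break-glass-admin"]  # the one sanctioned wildcard role
      validate:
        message: "Wildcard verbs are not allowed; grant explicit verbs instead."
        deny:
          conditions:
            any:
              - key: "*"
                operator: AnyIn
                value: "{{ request.object.rules[].verbs[] }}"
```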

Failure Modes

Failure: CI/CD pipeline lacks deploy permissions
  Symptom: Deployment pipeline fails with "forbidden" errors
  Detection: Pipeline logs show 403 errors; kubectl auth can-i confirms the missing permission
  Recovery: Add the specific verb/resource to the deployer Role. Never escalate to cluster-admin

Failure: Developer cannot debug pods
  Symptom: Developer reports "forbidden" on exec or port-forward
  Detection: kubectl auth can-i create pods/exec --as=developer@example.com -n team-alpha returns "no"
  Recovery: Add pods/exec and pods/portforward to the developer Role

Failure: Service account token missing for controller
  Symptom: Controller pod fails to authenticate to the API server
  Detection: Pod logs show "unauthorized" or "forbidden"; the pod cannot list/watch resources
  Recovery: Set automountServiceAccountToken: true on the specific service account and pod spec

Failure: Aggregation label missing on new ClusterRole
  Symptom: New permissions do not appear in the aggregated role
  Detection: kubectl get clusterrole aggregated-developer -o yaml does not include the expected rules
  Recovery: Add the correct aggregation label to the new ClusterRole

Failure: Break-glass binding not revoked
  Symptom: Over-privileged access persists after the incident
  Detection: rbac-audit.sh reports an unexpected cluster-admin binding; security review catches it
  Recovery: Delete the binding immediately. Add a CronJob to clean up stale break-glass bindings

When to Consider a Managed Alternative

Transition point: When RBAC management spans 5+ teams, 10+ namespaces, and 50+ role bindings, manual maintenance and auditing become unreliable. Role drift (permissions added during debugging and never removed) accumulates. Auditing “who can access what” requires custom scripts that break when RBAC structures change.

Recommended providers:

  • Sysdig (#122): RBAC visualization showing which users and service accounts have access to which resources. Detects over-permissive bindings and unused permissions. Provides recommendations for least-privilege role definitions based on observed API call patterns.

What you still control: The role definitions, team-to-namespace mapping, and break-glass process remain your decisions. Managed tools help you visualize, audit, and detect drift, but the access model is yours to design.

Premium content pack: RBAC template pack organized by team structure (single-team, multi-team, platform-team-plus-app-teams). Includes Kyverno policies to enforce “every namespace must have a deployer role,” “no wildcard verbs,” and “no cluster-admin bindings outside kube-system.” Includes the complete RBAC audit script with Prometheus metric export.