Hardening the AI Control Plane: Kill Switches, Rate Limits, and Human-in-the-Loop Gates
Problem
AI agents with write access to production systems can execute 100+ infrastructure changes per minute. A malfunctioning agent (bad prompt, hallucinated action, context confusion, or prompt injection) causes more damage in 5 minutes than a human attacker in an hour. The control plane around AI agents is a critical security boundary that most organisations have not built.
Threat Model
- Adversary: Malfunctioning agent (incorrect decisions at machine speed) or attacker manipulating the agent through prompt injection.
- Blast radius: Everything the agent’s credentials can reach, at machine speed. Without rate limits: hundreds of changes per minute. Without a kill switch: minutes of damage before anyone can intervene.
Configuration
Kill Switch Implementation
Target: revoke all agent access within 10 seconds.
#!/bin/bash
# kill-all-agents.sh - Emergency kill for ALL AI agents.
# Execute this when any agent exhibits unexpected behaviour.
set -e
TIMESTAMP=$(date -u +%Y-%m-%dT%H:%M:%SZ)
echo "[$TIMESTAMP] KILL SWITCH ACTIVATED by $(whoami)"
# 1. Network kill (fastest - blocks agent immediately)
echo "Blocking agent network egress..."
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: emergency-agent-kill
namespace: ai-agents
spec:
podSelector: {}
policyTypes:
- Egress
egress: []
EOF
# 2. Scale to zero (stops agent processes)
echo "Scaling agent deployments to zero..."
kubectl scale deployment --all -n ai-agents --replicas=0
# 3. Revoke all Vault leases (invalidates all agent credentials)
echo "Revoking Vault agent credentials..."
vault lease revoke -prefix auth/kubernetes/login/ 2>/dev/null || true
vault lease revoke -prefix secret/data/agents/ 2>/dev/null || true
# 4. Delete agent service accounts (prevents re-authentication)
echo "Removing agent service accounts..."
kubectl delete serviceaccount --all -n ai-agents --ignore-not-found
# 5. Create incident
echo "Creating incident..."
curl -X POST "https://api.incident.io/v2/incidents" \
-H "Authorization: Bearer ${INCIDENT_IO_API_KEY}" \
-H "Content-Type: application/json" \
-d "{
\"name\": \"AI Agent Kill Switch Activated\",
\"summary\": \"Kill switch activated at $TIMESTAMP by $(whoami). All agent access revoked.\",
\"severity\": {\"id\": \"critical\"}
}" 2>/dev/null || echo "Incident creation failed, create manually."
# 6. Preserve audit logs
echo "Agent audit logs preserved in Axiom dataset 'agent-audit'."
echo "Investigation query: source='agent' AND timestamp > '$TIMESTAMP' - 30m"
echo ""
echo "=== KILL SWITCH COMPLETE ==="
echo "All agent access revoked. Agent pods terminated."
echo "Manual intervention required to restore agent access."
Rate Limiting Agent Actions
# OPA/Gatekeeper constraint: limit mutations per service account per minute
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
name: k8sagentmutationratelimit
spec:
crd:
spec:
names:
kind: K8sAgentMutationRateLimit
validation:
openAPIV3Schema:
type: object
properties:
maxMutationsPerMinute:
type: integer
agentServiceAccounts:
type: array
items:
type: string
targets:
- target: admission.k8s.gatekeeper.sh
rego: |
package k8sagentmutationratelimit
violation[{"msg": msg}] {
input.review.userInfo.username == input.parameters.agentServiceAccounts[_]
input.review.operation == "CREATE" # or UPDATE, DELETE
# Rate limiting via external cache (Redis) or admission counter
msg := sprintf("Agent %v exceeded mutation rate limit", [input.review.userInfo.username])
}
Prometheus-based rate monitoring:
# Alert when agent mutation rate exceeds threshold
groups:
- name: agent-rate-limiting
rules:
- alert: AgentMutationRateHigh
expr: >
sum by (user) (
rate(apiserver_request_total{
verb=~"create|update|patch|delete",
user=~"system:serviceaccount:ai-agents:.*"
}[5m])
) > 1 # More than 1 mutation per second sustained over 5 minutes
labels:
severity: warning
annotations:
summary: "Agent {{ $labels.user }} making {{ $value | humanize }}/sec mutations"
runbook: "Investigate agent behaviour. Consider activating kill switch if unexpected."
Human-in-the-Loop Approval Gates
# Kyverno policy: block destructive operations from agent service accounts.
# Agents can create and update but not delete.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: agent-block-destructive
spec:
validationFailureAction: Enforce
rules:
- name: block-agent-deletes
match:
any:
- resources:
kinds: ["*"]
subjects:
- kind: ServiceAccount
name: ai-agent-*
namespace: ai-agents
validate:
message: "AI agents cannot delete resources. Request human approval."
deny:
conditions:
all:
- key: "{{ request.operation }}"
operator: In
value: ["DELETE"]
Blast Radius Containment
# Agent namespace isolation - agents can only operate in their designated target.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: agent-containment
namespace: ai-agents
spec:
podSelector:
matchLabels:
app.kubernetes.io/component: agent
policyTypes:
- Egress
egress:
# DNS only
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: kube-system
ports:
- protocol: UDP
port: 53
# Target namespace only
- to:
- namespaceSelector:
matchLabels:
agent-target: "true"
ports:
- protocol: TCP
port: 443
# Vault for credentials
- to:
- namespaceSelector:
matchLabels:
kubernetes.io/metadata.name: vault
ports:
- protocol: TCP
port: 8200
Behavioural Drift Monitoring
# Alert when agent behaviour deviates from its baseline pattern.
groups:
- name: agent-behaviour
rules:
# Alert if an agent starts accessing resource types it hasn't touched before
- alert: AgentNewResourceType
expr: >
count by (user, resource) (
rate(apiserver_request_total{
user=~"system:serviceaccount:ai-agents:.*"
}[1h]) > 0
)
unless
count by (user, resource) (
rate(apiserver_request_total{
user=~"system:serviceaccount:ai-agents:.*"
}[7d]) > 0
)
labels:
severity: warning
annotations:
summary: "Agent {{ $labels.user }} accessing new resource type: {{ $labels.resource }}"
Expected Behaviour
- Kill switch revokes all agent access within 10 seconds
- Rate limit alerts fire when agent exceeds 1 mutation/second sustained
- DELETE operations by agents are blocked by Kyverno policy
- Agent network egress limited to DNS, Vault, and target namespace only
- Behavioural drift alerts fire when agent accesses new resource types
- Incident automatically created when kill switch activates
Trade-offs
| Control | Impact | Risk | Mitigation |
|---|---|---|---|
| Rate limiting (1 mutation/sec) | Slows agent for legitimate batch operations | Legitimate batch tasks blocked | Per-task rate override mechanism. Document expected rates per agent type. |
| Block all deletes | Agent cannot perform cleanup operations | Manual cleanup needed for agent-managed resources | Human performs deletes. Or: allow deletes on specific resource types (e.g., completed Jobs) with explicit policy exception. |
| Namespace containment | Agent cannot manage cross-namespace workflows | Multi-namespace operations need multiple agents | Deploy one scoped agent per target namespace. |
| Kill switch (nuclear option) | ALL agents stop immediately | Legitimate agent work in progress is interrupted | Kill switch should be the last resort. Rate limiting and containment handle most issues without full shutdown. |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Kill switch script fails | Agent continues operating after activation | Manual check: agent pods still running; API calls still succeeding | Fallback: manually delete the ai-agents namespace: kubectl delete namespace ai-agents. Nuclear option but guaranteed to work. |
| Rate limit bypass | Agent makes mutations faster than detected | Prometheus scrape interval (30s) creates a detection gap | Add admission webhook for real-time rate limiting (not just monitoring). |
| Kyverno policy not enforcing | Agent performs a delete that should be blocked | Audit log shows DELETE by agent service account | Check Kyverno webhook registration. Verify policy is in Enforce mode. |
| Behavioural drift false positive | Alert fires on legitimate new agent capability | Agent team deploys new feature; drift alert triggers | Document expected agent capabilities. Update baseline when new features are deployed. |
When to Consider a Managed Alternative
Building AI agent control plane infrastructure requires Vault (#65) for credentials, admission webhooks for rate limiting, and monitoring for drift detection.
- HCP Vault (#65): Managed Vault for dynamic agent credential lifecycle.
- Sysdig (#122): Runtime monitoring of agent containers with automated response and ML anomaly detection.
- Grafana Cloud (#108) + Grafana OnCall (#178): Alerting and escalation when agent behaviour deviates.
- Incident.io (#175): Automated incident creation on kill switch activation.
Premium content pack: AI control plane policy pack. Vault policies, Kyverno constraints, network policies, Prometheus alert rules, kill switch scripts, and behavioural drift monitoring for AI agent deployments.