Container Escape Detection: Runtime Signals, Kernel Indicators, and Response Automation

Problem

Container escapes are among the highest-impact attacks in Kubernetes. A single compromised pod that escapes its container gains access to the underlying node, and from there to every other pod on that node, to the kubelet's credentials, and potentially to the entire cluster. Detection must catch the escape attempt before it succeeds, because once the attacker has node-level access they can disable the monitoring that would otherwise detect them.

The specific challenges:

  • Escape techniques are kernel-level. Container escapes exploit namespace manipulation (nsenter, unshare), cgroup breakouts, /proc filesystem abuse, and mounted host paths. Detecting these requires kernel-level instrumentation, not application-level logging.
  • Legitimate admin operations look like escapes. nsenter into a container namespace is a normal debugging tool. Mounting host paths is required for log collection and storage. Detection rules must distinguish between authorized admin actions and attack techniques.
  • New escape techniques appear regularly. CVE-2024-21626 (the runc WORKDIR file-descriptor leak, one of the 2024 "Leaky Vessels" vulnerabilities) and CVE-2022-0185 (a Linux kernel fs_context heap overflow) each introduced new escape vectors. Static detection rules that check only for known techniques miss novel exploits.
  • Privileged containers bypass all protections. Containers running with privileged: true or with specific dangerous capabilities (SYS_ADMIN, SYS_PTRACE) can escape trivially. Detection for privileged containers is a different problem than detection for standard containers.
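For the privileged-container case, the practical baseline is flagging these workloads the moment they start. A sketch in Falco's rule format, modelled on the stock "Launch Privileged Container" rule: the container_started macro is inlined here, the spawned_process macro is assumed from Falco's default ruleset (as the rules later in this article also assume), and container.privileged is a standard Falco field.

```yaml
# Sketch: inventory privileged containers at start time. Detection here
# is inventory, not prevention -- a privileged container can escape
# before any rule fires.
- macro: container_started
  condition: >
    (evt.type = container or (spawned_process and proc.vpid = 1))

- rule: Privileged Container Started
  desc: >
    A container started with privileged=true, which can trivially
    escape to the host.
  condition: >
    container_started and container.privileged = true
  output: >
    Privileged container started
    (container=%container.name image=%container.image.repository
     namespace=%k8s.ns.name pod=%k8s.pod.name)
  priority: WARNING
  tags: [container-escape, privileged]
```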

This article covers Falco rules for known escape techniques, Tetragon TracingPolicies for kernel-level detection and blocking, Kubernetes audit log patterns, and automated response.

Target systems: Kubernetes clusters with Falco or Tetragon deployed as DaemonSets. Prometheus + Alertmanager. Cilium for network-level response.

Threat Model

  • Adversary: An attacker who has gained code execution inside a container (through application vulnerability, supply chain compromise, or compromised image). They attempt to break out of the container namespace to reach the host node.
  • Blast radius: A successful container escape gives the attacker root on the node. From there: access to kubelet credentials (can impersonate the node in the cluster), access to all pods on the node (including secrets mounted as volumes), ability to pivot to other nodes via the cluster network, and potential access to the cloud provider metadata service for further privilege escalation.

Configuration

Falco Rules for Known Escape Techniques

# falco-rules-container-escape.yaml
# Rules detecting common container escape techniques.

# Rule 1: nsenter or unshare execution inside a container.
# nsenter allows entering another namespace (escape to host namespace).
# unshare creates new namespaces (can be used to gain capabilities).
- rule: Namespace Manipulation in Container
  desc: >
    Detected nsenter or unshare execution inside a container.
    This is a strong indicator of a container escape attempt.
  condition: >
    spawned_process
    and container
    and (proc.name in (nsenter, unshare))
    and not (k8s.ns.name in (kube-system, monitoring))
  output: >
    Namespace manipulation in container
    (command=%proc.cmdline container=%container.name
     image=%container.image.repository namespace=%k8s.ns.name
     pod=%k8s.pod.name user=%user.name)
  priority: CRITICAL
  tags: [container-escape, namespace]

# Rule 2: mount syscall from a non-init process in a container.
# Mounting filesystems inside a container is unusual and may indicate
# an attempt to mount the host filesystem.
- rule: Unexpected Mount in Container
  desc: >
    A process inside a container executed a mount syscall.
    This may indicate an attempt to mount host filesystems.
  condition: >
    evt.type = mount
    and container
    and proc.vpid != 1
    and not (proc.pname in (mount, umount, systemd))
    and not (k8s.ns.name in (kube-system))
  output: >
    Mount syscall in container
    (command=%proc.cmdline container=%container.name
     image=%container.image.repository namespace=%k8s.ns.name)
  priority: CRITICAL
  tags: [container-escape, mount]

# Rule 3: write to sensitive /proc paths.
# /proc/sysrq-trigger can reboot the host.
# /proc/*/mem can read/write other process memory.
- rule: Write to Sensitive Proc Path
  desc: >
    A container process wrote to a sensitive /proc path that could
    affect the host or other processes.
  condition: >
    open_write
    and container
    and (fd.name startswith /proc/sysrq-trigger
         or fd.name glob /proc/*/mem
         or fd.name startswith /host/proc)
  output: >
    Write to sensitive /proc path
    (file=%fd.name command=%proc.cmdline container=%container.name
     namespace=%k8s.ns.name pod=%k8s.pod.name)
  priority: CRITICAL
  tags: [container-escape, proc]

# Rule 4: access to Docker socket or containerd socket.
# Direct socket access allows creating new privileged containers.
- rule: Container Runtime Socket Access
  desc: >
    A process inside a container accessed the container runtime socket.
    This allows creating new containers with arbitrary privileges.
  condition: >
    (open_read or open_write)
    and container
    and (fd.name in (/var/run/docker.sock, /run/containerd/containerd.sock,
                     /var/run/crio/crio.sock))
  output: >
    Container runtime socket access
    (socket=%fd.name command=%proc.cmdline container=%container.name
     namespace=%k8s.ns.name pod=%k8s.pod.name)
  priority: CRITICAL
  tags: [container-escape, runtime-socket]

# Rule 5: cgroup escape attempt (notify_on_release).
# Classic cgroup v1 escape: write to notify_on_release and release_agent.
- rule: Cgroup Escape Attempt
  desc: >
    A container process wrote to cgroup notify_on_release or release_agent,
    which is the classic cgroup v1 container escape technique.
  condition: >
    open_write
    and container
    and (fd.name contains notify_on_release or fd.name contains release_agent)
  output: >
    Cgroup escape attempt
    (file=%fd.name command=%proc.cmdline container=%container.name
     namespace=%k8s.ns.name pod=%k8s.pod.name)
  priority: CRITICAL
  tags: [container-escape, cgroup]

# Rule 6: access to host filesystem via mounted paths.
# Pods with hostPath mounts may access sensitive host files.
- rule: Sensitive Host Path Access
  desc: >
    A container accessed sensitive files through a host path mount.
  condition: >
    (open_read or open_write)
    and container
    and (fd.name startswith /host/etc/shadow
         or fd.name startswith /host/etc/kubernetes
         or fd.name startswith /host/root/.ssh
         or fd.name startswith /host/var/lib/kubelet)
  output: >
    Sensitive host path access
    (file=%fd.name command=%proc.cmdline container=%container.name
     namespace=%k8s.ns.name pod=%k8s.pod.name)
  priority: CRITICAL
  tags: [container-escape, host-path]

Tetragon TracingPolicies for Real-Time Blocking

Tetragon can block escape attempts at the kernel level, not just detect them:

# tetragon-escape-policy.yaml
# TracingPolicy that kills the process attempting a container escape.
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: container-escape-prevention
spec:
  kprobes:
    # Kill setns calls (the syscall behind nsenter) issued from
    # processes outside the host PID namespace.
    - call: "__x64_sys_setns"
      syscall: true
      selectors:
        - matchNamespaces:
            - namespace: Pid
              operator: NotIn
              values:
                - "host_ns"
          matchActions:
            - action: Sigkill
      args:
        - index: 0
          type: int
        - index: 1
          type: int

    # Block mount syscall from containers (non-init processes).
    - call: "__x64_sys_mount"
      syscall: true
      selectors:
          matchPIDs:
            - operator: NotIn
              followForks: true
              isNamespacePID: true
              values:
                - 1
          matchNamespaces:
            - namespace: Mnt
              operator: NotIn
              values:
                - "host_ns"
          matchActions:
            - action: Sigkill

    # Block writes to cgroup escape paths.
    - call: "security_file_open"
      args:
        - index: 0
          type: "file"
      selectors:
        - matchArgs:
            - index: 0
              operator: Postfix
              values:
                - "notify_on_release"
                - "release_agent"
          matchActions:
            - action: Sigkill

Kubernetes Audit Log Patterns

Detect escape-adjacent activity through the API server:

# Prometheus alerting rules based on Kubernetes audit log events.
# Note: the stock apiserver_audit_event_total metric does not expose
# resource or request_uri labels; these expressions assume audit events
# are exported as a counter with those labels (e.g. via a
# log-to-metrics pipeline).
groups:
  - name: container-escape-audit
    rules:
      # Alert: exec into a pod with suspicious commands.
      - alert: SuspiciousPodExec
        expr: >
          sum by (user, namespace, pod) (
            rate(apiserver_audit_event_total{
              verb="create",
              resource="pods/exec",
              request_uri=~".*command=(nsenter|chroot|mount|unshare).*"
            }[5m])
          ) > 0
        for: 1m
        labels:
          severity: critical
          detection_type: container_escape
        annotations:
          summary: >
            Suspicious exec: {{ $labels.user }} ran escape-related
            command in {{ $labels.namespace }}/{{ $labels.pod }}

      # Alert: pod created with privileged security context.
      - alert: PrivilegedPodCreated
        expr: >
          sum by (user, namespace) (
            rate(apiserver_audit_event_total{
              verb="create",
              resource="pods",
              request_object=~".*privileged.*true.*"
            }[5m])
          ) > 0
        for: 1m
        labels:
          severity: warning
          detection_type: privilege_escalation
        annotations:
          summary: >
            Privileged pod created by {{ $labels.user }}
            in {{ $labels.namespace }}

Automated Response

# Falcosidekick configuration: auto-respond to container escape events.
config:
  kubernetesPolicyReport:
    enabled: true
    minimumpriority: "critical"

  webhook:
    address: "http://response-automation:8080/falco"
    minimumpriority: "critical"

# Response actions for container escape:
# 1. Apply network quarantine to prevent lateral movement.
# 2. Cordon the node (prevent new pods from scheduling).
# 3. Kill the offending pod.
# 4. Page the security team with forensic context.

---
# response-actions.yaml (webhook handler configuration)
actions:
  container_escape:
    rules:
      - "Namespace Manipulation in Container"
      - "Cgroup Escape Attempt"
      - "Container Runtime Socket Access"
    steps:
      # Quarantine first, then cordon, then delete: labelling a pod
      # that has already been force-deleted fails.
      - type: kubectl
        command: "label pod {{ .pod }} -n {{ .namespace }} security.quarantine=true"
      - type: kubectl
        command: "cordon {{ .node }}"
      - type: kubectl
        command: "delete pod {{ .pod }} -n {{ .namespace }} --grace-period=0"
      - type: alert
        severity: critical
        channel: "#security-incidents"

Expected Behaviour

  • nsenter, unshare, and mount execution inside containers triggers a CRITICAL alert within seconds
  • Writes to /proc/sysrq-trigger, cgroup escape paths, and container runtime sockets are detected and blocked
  • Tetragon kills escape processes at the kernel level before the escape completes
  • Suspicious kubectl exec commands are flagged through Kubernetes audit log monitoring
  • Privileged pod creation generates a WARNING alert
  • Automated response kills the pod, quarantines the workload, and cordons the node within 30 seconds
  • False positive rate below 1 per week after excluding kube-system and monitoring namespaces

Trade-offs

  • Tetragon Sigkill on escape attempt. Impact: blocks the escape in real time, before it completes. Risk: a false positive kills a legitimate process. Mitigation: exclude kube-system, monitoring, and other trusted namespaces; test rules in audit mode (action: Post) before enabling Sigkill.
  • Auto-cordon node on escape detection. Impact: prevents the attacker from scheduling new pods on the compromised node. Risk: reduces cluster capacity and may cause scheduling pressure. Mitigation: auto-uncordon after the security team clears the node (within SLA); ensure sufficient capacity to absorb one cordoned node.
  • Deploying both Falco and Tetragon. Impact: Falco provides visibility and alerting; Tetragon provides blocking. Risk: two DaemonSets add resource overhead (100-200 MB RAM per node). Mitigation: use Falco for detection and alerting only, Tetragon for enforcement; do not duplicate rules between them.
  • Excluding kube-system from rules. Impact: reduces false positives from system components. Risk: an attacker could deploy a malicious workload in kube-system. Mitigation: restrict the kube-system namespace with RBAC and admission control; alert on any non-system workload deployed to kube-system.

Failure Modes

  • Falco DaemonSet not running on a node. Symptom: no escape detection on that node. Detection: absent(falco_events_total{node="X"}) or a DaemonSet pod-count mismatch. Recovery: check node taints and tolerations; ensure the Falco DaemonSet tolerates all nodes.
  • Tetragon policy not loaded. Symptom: escape attempts detected by Falco but not blocked. Detection: Tetragon logs show a policy parse error; the escape process is not killed. Recovery: validate the TracingPolicy with kubectl describe tracingpolicy; check Tetragon agent logs for BPF program load errors.
  • Escape technique uses an unknown vector. Symptom: no rule matches; the escape succeeds undetected. Detection: post-incident investigation reveals the new technique. Recovery: subscribe to container security advisories; update rules within 48 hours of a new CVE disclosure; add generic behavioural rules (for example, unexpected capability usage).
  • Auto-response kills a legitimate pod. Symptom: service disruption; the pod restarts in a loop. Detection: pod restart count increases; service health checks fail. Recovery: review the triggering event; add an exception for the specific workload if it legitimately needs the detected behaviour.
  • Node cordoned but never uncordoned. Symptom: cluster capacity shrinks as nodes accumulate cordons. Detection: the number of schedulable nodes decreases. Recovery: set a TTL on cordon actions (auto-uncordon after 4 hours unless the security team extends it); alert on any node cordoned for more than 2 hours.
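
The cordon TTL in the last failure mode can be driven by a periodic job. A sketch of the triage logic in Python; reading cordon timestamps back from an annotation set by the response automation is an assumption, and only the selection step is shown.

```python
from datetime import datetime, timedelta, timezone

# Thresholds from this runbook: auto-uncordon after 4 hours, alert at 2.
CORDON_TTL = timedelta(hours=4)
ALERT_AFTER = timedelta(hours=2)


def triage_cordons(cordoned_at, now):
    """Split cordoned nodes into (uncordon, alert) lists.

    `cordoned_at` maps node name -> datetime of the cordon action,
    e.g. read back from an annotation the response automation sets
    (the annotation name is an implementation detail, not shown here).
    """
    uncordon, alert = [], []
    for node, ts in cordoned_at.items():
        age = now - ts
        if age >= CORDON_TTL:
            uncordon.append(node)
        elif age >= ALERT_AFTER:
            alert.append(node)
    return uncordon, alert
```

A CronJob would call triage_cordons on each run, execute kubectl uncordon for the first list, and page the security team for the second.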

When to Consider a Managed Alternative

Self-managed container escape detection requires Falco and/or Tetragon DaemonSet operation, rule maintenance for new CVEs, automated response infrastructure, and regular rule tuning (6-8 hours/month).

  • Sysdig (#122): Managed Falco rules with automatic updates for new escape techniques. Drift detection that identifies unexpected changes inside containers. Multi-cluster rule management from a single console.
  • Aqua (#123): Runtime protection with container escape prevention built in. Enforcement mode blocks escape attempts without custom rule writing. Integrates vulnerability scanning with runtime detection.

Premium content pack: Container escape Falco rule pack. 20+ rules covering nsenter, unshare, mount, cgroup breakout, proc filesystem abuse, runtime socket access, and host path exploitation. Includes Tetragon TracingPolicies, automated response configurations, and a testing framework for validating detection rules.