Kubernetes / Platform Security Articles

Kubernetes hardening guides covering RBAC, network policies, admission control, secrets management, runtime security, and AI workloads.

Kubernetes Security and Hardening Guides

Advanced 15 min read

Admission Webhook PR Poisoning: How a Merged PR Becomes a Cluster Backdoor

A pull request that modifies a ValidatingWebhookConfiguration or MutatingWebhookConfiguration — or the controller that registers it — can silently remove security controls from an entire cluster. This guide covers the attack patterns, detection with OPA/Kyverno policies, and GitOps controls that prevent webhook configuration from being silently changed.

Advanced 16 min read

Kubernetes CVE Auto-Remediation Operator: Closing the Patch Window Automatically

A Kubernetes operator that watches OSV and NVD CVE feeds, correlates findings against running workload images, and automatically remediates via image-pin updates, DaemonSet rollouts, and node-pool rotations when the EPSS score exceeds a configured threshold, shrinking the patch window from days to minutes.

Advanced 14 min read

Kubernetes Structured Authorization Config: Hardening Multi-Webhook Auth Chains

Kubernetes 1.32 graduated Structured Authorization Configuration to GA (it had been beta since 1.30), enabling multiple authorization webhooks chained with CEL conditions. Misconfigured chains silently bypass RBAC; this guide covers safe chain design, CEL condition hardening, and fail-open/fail-closed reasoning.

advanced 13 min read

Argo Workflows Template Injection via User-Controlled Parameters

Argo Workflows evaluates template expressions using user-supplied workflow parameters; without input validation, an attacker with workflow submission access can inject expressions that execute arbitrary commands in the workflow executor, exfiltrate secrets, or pivot to other cluster workloads.

intermediate 13 min read

EPSS-Driven CVE Patch Prioritization for Kubernetes Workloads

CVSS severity alone cannot prioritize patching when hundreds of CVEs affect your Kubernetes images; the Exploit Prediction Scoring System (EPSS) provides a probability-of-exploitation score that focuses remediation on the CVEs most likely to be actively exploited in the next 30 days.
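
As a rough illustration of the idea in the summary above, the sketch below ranks findings by EPSS probability rather than CVSS severity. The CVE identifiers, the scores, and the 0.1 threshold are hypothetical placeholders, not data from the article or from the EPSS feed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cve_id: str   # placeholder identifier, not a real CVE
    cvss: float   # severity score, 0.0 to 10.0
    epss: float   # estimated probability of exploitation in the next 30 days


def prioritize(findings, epss_threshold=0.1):
    """Keep findings at or above the threshold, most likely exploited first."""
    urgent = [f for f in findings if f.epss >= epss_threshold]
    # CVSS is used only as a tie-breaker between equal EPSS scores.
    return sorted(urgent, key=lambda f: (f.epss, f.cvss), reverse=True)


findings = [
    Finding("CVE-EXAMPLE-A", cvss=9.8, epss=0.02),
    Finding("CVE-EXAMPLE-B", cvss=7.5, epss=0.91),
    Finding("CVE-EXAMPLE-C", cvss=5.3, epss=0.34),
]
queue = prioritize(findings)
print([f.cve_id for f in queue])
```

In this made-up data set the finding with the highest CVSS score drops out of the queue entirely, because its estimated exploitation probability is below the threshold.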

intermediate 13 min read

Automated ingress-nginx Version Management and CVE Response

ingress-nginx has had multiple critical CVEs, including annotation injection attacks, and manual Helm chart version management leaves clusters exposed for weeks. Automate detection of new releases, staged canary rollout, and rollback to reduce patch lag to hours.

advanced 14 min read

Securing Kubernetes Sidecar Injection Against Rogue Container Injection

Mutating webhook sidecar injection — used by Istio, Dapr, and custom platform injectors — can be abused to inject rogue containers or modify existing ones; audit injection logic, enforce webhook TLS, restrict injection to approved namespaces, and validate injector output.

intermediate 14 min read

Security Validation for AI-Generated Kubernetes Manifests

AI assistants generating Kubernetes Deployment, RBAC, and Service YAML reproduce predictable misconfigurations — privileged containers, missing securityContext, broad ClusterRoleBindings; validate with Polaris, kube-score, and Kyverno before admission.

intermediate 14 min read

Hardening the Kubernetes Secrets Store CSI Driver

The Secrets Store CSI Driver mounts external secrets from AWS, Azure, GCP, and Vault into pods via provider plugins; its sync-to-Kubernetes-Secret behaviour, RBAC surface, and provider pod permissions are common misconfiguration sources that expose secrets beyond their intended scope.

intermediate 14 min read

Isolating AI Training Batch Jobs in Kubernetes

AI training jobs on Kubernetes have access to large GPU nodes, model weights, and training datasets; isolate them from production namespaces with dedicated node pools, network policy, and RBAC to prevent cross-job data leakage and lateral movement.

advanced 14 min read

Kubernetes Subresource RBAC Escalation: Restricting exec, portforward, and proxy

RBAC permissions on pods/exec, pods/portforward, pods/log, and nodes/proxy are functionally equivalent to cluster compromise yet routinely over-provisioned; audit who holds these grants and replace them with time-limited JIT access.

advanced 15 min read

Securing the Kubernetes API Aggregation Layer Against Privilege Escalation

Extension API servers registered via API aggregation can intercept credentials, bypass RBAC, and escalate to cluster-admin; harden the aggregation layer with mutual TLS, bounded permissions, and routing controls.

intermediate 12 min read

Kubernetes Node Kernel Patch Velocity: Draining and Replacing Nodes at Speed After a Critical CVE

When a critical kernel LPE like Dirty Frag (CVE-2026-43284/43500) drops with a public PoC, the window between disclosure and exploitation may be hours. Kubernetes clusters running hundreds of nodes need a systematic, automated approach to kernel patching — identifying vulnerable nodes, draining workloads safely, applying patches, and verifying remediation — without days of manual work.

intermediate 11 min read

Azure Workload Identity for AKS: Federated Credential Access to Azure Resources

Azure Workload Identity replaces the deprecated pod identity and works around Managed Identity limitations in AKS by using OIDC federation between the AKS OIDC issuer and Azure AD. Kubernetes pods receive projected service account tokens that can be exchanged for Azure AD access tokens without any stored credentials. This article covers enabling the OIDC issuer on AKS, creating federated credentials, configuring workload identity in pods, and auditing with Azure Monitor.

intermediate 12 min read

Container Image Signing Policy Enforcement: From cosign to Admission Control

Signing container images is only useful if admission control verifies the signature before the image runs. This article covers the end-to-end enforcement pipeline: signing images with cosign in CI, verifying signatures with Kyverno ImageVerify and OPA Gatekeeper, configuring signature transparency with Rekor, handling multi-architecture image indexes, and the key distribution problem in enterprise environments.

advanced 13 min read

ContainerSSH Kubernetes Backend: Hardened Pod-per-Session SSH Access

ContainerSSH's Kubernetes backend launches a dedicated Pod for each SSH connection, giving each session its own process namespace, filesystem, and network identity. The security of this model depends entirely on the Pod spec returned by the config webhook: a misconfigured PodSecurityContext, missing NetworkPolicy, or overly broad RBAC for the ContainerSSH service account can turn an isolation mechanism into a cluster escape path.

intermediate 12 min read

Automating Container Image Patching in Kubernetes with Copa and Kyverno

Running Copa (Copacetic) as a Kubernetes CronJob continuously scans images in a registry and patches those above a vulnerability severity threshold, while Kyverno admission policies block unpatched images from being scheduled. Together they create a closed-loop container patching system that operates independently of application teams and upstream image publishers.

Advanced 15 min read

etcd Compromise: The Blast Radius of Your Kubernetes Backing Store

etcd holds every Kubernetes secret, service account token, and config in base64-encoded plaintext. A direct etcd connection bypasses all RBAC: there are no Kubernetes access controls between an etcd client and the data. Attackers who reach etcd (via node compromise, misconfigured backup access, or an exposed port) can read every secret and forge service account tokens. This article covers the attack paths, what data is exposed, and how to detect and recover.

Advanced 13 min read

External Secrets Operator: Syncing Cloud Secrets Without Storing Them in Kubernetes

The External Secrets Operator (ESO) reconciles secrets from AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, and HashiCorp Vault into Kubernetes Secrets on a defined refresh interval. The Kubernetes Secret is a cache — the authoritative copy lives in the cloud secret store. This article covers ESO's security model, ClusterSecretStore RBAC scoping, detecting sync failures before they become outages, and what happens when cloud credentials are compromised.

Advanced 14 min read

BOLA and BFLA in Kubernetes-Hosted APIs: Object-Level Authorisation Gaps in Multi-Tenant Deployments

Broken Object-Level Authorisation (OWASP API1) and Broken Function-Level Authorisation (OWASP API5) are two of the most prominent vulnerability classes in the OWASP API Security Top 10. In Kubernetes multi-tenant deployments, namespace isolation creates a false sense of per-tenant authorisation, but the application inside the namespace still needs to enforce that Tenant A cannot access Tenant B's resources. This article implements OPA and Kyverno-based enforcement patterns for request-level authorisation.

intermediate 11 min read

Kubernetes Service Account Token Security: Projection, Audience Binding, and Theft Prevention

Kubernetes service account tokens are the primary credential for pod-to-API-server communication and OIDC federation. Long-lived auto-mounted tokens without audience or expiry binding are a persistent source of credential theft risk. This article covers projected service account tokens (TokenRequest API), disabling automounting, audience-bound tokens for OIDC, detecting token theft with audit logs, and migrating from legacy tokens.

Advanced 13 min read

Kyverno Controller Security: Hardening the Policy Engine That Enforces Your Security Policies

Kyverno's admission webhook intercepts every pod creation, secret write, and RBAC change in the cluster. Compromising the Kyverno controller — via a CVE in the controller binary, a misconfigured webhook, or a supply chain attack on the Kyverno image — breaks all policy enforcement silently. This article hardens the Kyverno deployment itself and implements monitoring that detects when Kyverno is bypassed or degraded.

advanced 13 min read

Overlayfs Copy-on-Write Container Escape: CVE-2023-0386 and Writeback Race Mitigations

Overlayfs implements copy-on-write by copying files from the lower (image) layer to the upper (writable) layer on first write. Races in this writeback path and privilege copy semantics have enabled container escapes — CVE-2023-0386 allowed setuid files to be copied with preserved capabilities outside a user namespace. This article covers the overlayfs CoW mechanism, the escape chain, kernel patches, and Kubernetes-level mitigations.

intermediate 11 min read

Sigstore and Cosign: Keyless Container Image Signing and Verification

Sigstore's keyless signing model uses short-lived certificates bound to OIDC identity, recorded in Rekor's transparency log, eliminating long-lived private keys from the supply chain. This article covers cosign keyless signing in GitHub Actions, Rekor log integration, verifying image signatures in admission controllers, and enforcing signature policy with Kyverno and OPA Gatekeeper.

advanced 13 min read

SPIFFE and SPIRE: Cryptographic Workload Identity for Zero Trust Kubernetes

SPIFFE (Secure Production Identity Framework for Everyone) defines a universal workload identity standard using X.509 SVIDs and JWT-SVIDs. SPIRE implements SPIFFE with a Kubernetes-native attestation model, automatic cert rotation, and federation across trust domains. This article covers deploying SPIRE on Kubernetes, configuring workload attestation, federating across clusters, and integrating SPIFFE identity with Envoy and Istio.

Intermediate 13 min read

AI-Generated Kubernetes Operators vs. Maintained Open Source: The CVE Response Gap

An LLM can generate a Kubernetes operator with reconciliation logic, CRD definitions, and RBAC in under an hour. That operator has no maintainer, no security advisory channel, no CVE disclosure process, and no patch history. When a vulnerability is found — in its RBAC grants, its webhook handling, or its dependency chain — there is nobody to call and no patch coming.

Advanced 14 min read

Custom CodeQL Queries for Kubernetes Security: Scanning for RBAC Misconfigs, Pod Security Gaps, and Helm Secrets

The default CodeQL query packs don't cover Kubernetes-specific vulnerabilities — RBAC wildcard rules in Go controller code, unencrypted Kubernetes Secrets in Helm values, privileged container specs baked into application manifests. This guide writes custom CodeQL queries for Kubernetes controllers, operator code, and Helm chart generation that surface misconfigurations at the source code level.

Advanced 12 min read

containerd CVE-2022-23648: Path Traversal That Exposed the Host Filesystem

A crafted OCI image config with an empty Target.Path in a volume mount definition caused containerd to bind-mount the host root filesystem into the container. Every pod on a vulnerable node running any image from an untrusted registry had read access to the complete host filesystem — including kubelet credentials, cloud instance metadata, and secrets from co-located pods.

Advanced 14 min read

Agentic Bot Detection at Kubernetes Ingress: Envoy ext_authz Scoring for LLM-Driven Traffic

OpenAI Operator, Claude Computer Use, Microsoft Copilot Browser, and open-source browser-automation agents generate HTTP traffic that passes every CAPTCHA and mimics human timing. Standard WAF rules and bot score APIs fail. Envoy's ext_authz filter enables a multi-signal scoring pipeline at ingress — before requests reach application pods — combining TLS fingerprint, request graph, and inter-request timing signals.

advanced 14 min read

Kubernetes Dynamic Resource Allocation (DRA) Security Hardening

Securing the DRA API in Kubernetes 1.32+ (beta in 1.32, GA in 1.34): ResourceClaim RBAC, driver trust boundaries, GPU/TPU isolation, and multi-tenant DRA threat model.

advanced 14 min read

Kubernetes In-Place Pod Resize Security: Admission Policy and Resource-Cap Enforcement on 1.33+

In-place pod resize graduated to beta, enabled by default, in Kubernetes 1.33. The new resize subresource changes how resource limits are enforced at runtime: admission webhooks must update, ResourceQuotas behave differently, and a misconfigured cluster lets a tenant escape its original limits. Production hardening guide.

Advanced 14 min read

LLM Agents with kubectl Access: Prompt Injection from Logs and Manifests into Cluster Compromise

LLM SRE and coding agents now ship with Kubernetes API tools equivalent to kubectl. A prompt injection payload embedded in a pod log, ConfigMap, or CRD field is indistinguishable from a legitimate instruction to the agent. When the agent has cluster-admin or namespace-admin RBAC, one injected instruction becomes a cluster-wide compromise.

Advanced 13 min read

MCP Servers in Kubernetes: RBAC Scoping and Network Isolation for Agent Tool Backends

MCP servers deployed as Kubernetes services give AI agents programmatic access to cluster resources, databases, and APIs. An MCP server with cluster-admin RBAC or unrestricted network access becomes a fully capable attack pivot when an agent is prompt-injected. Least-privilege service accounts, NetworkPolicy, and admission control gates reduce the blast radius to the minimum required for the tool's legitimate function.

Intermediate 13 min read

Kubernetes Operator Security Disclosure: Reporting and Responding to Vulnerabilities in Custom Controllers

Kubernetes operators ship to production clusters with elevated RBAC permissions and direct API server access — a vulnerability in an operator can compromise the entire cluster. This guide covers how to report operator vulnerabilities responsibly, how operator maintainers should handle disclosures, CVSS scoring for Kubernetes-specific issues, and what cluster operators should do when a vulnerability is published.

Advanced 14 min read

Post-Quantum Certificate Management in Kubernetes: Migrating Cluster PKI to Hybrid Certificates

Kubernetes control plane PKI, service mesh CAs, SPIFFE SVIDs, and Ingress TLS certificates all rely on RSA or ECDSA — algorithms vulnerable to harvest-now-decrypt-later. This guide maps the Kubernetes certificate landscape, implements hybrid PQC certificates with cert-manager and step-ca, and provides a phased migration roadmap for production clusters.

Advanced 13 min read

runc CVE-2019-5736: Overwriting the Container Runtime from Inside a Container

CVE-2019-5736 allowed a malicious container to overwrite the host runc binary by exploiting /proc/self/exe during container exec. Any kubectl exec or docker exec into an attacker-controlled container gave root on the host. Every container runtime that used runc was affected.

Intermediate 10 min read

Argo CD Secret Extraction via Read-Only Access: CVE-2026-42880

CVE-2026-42880 (CVSS 9.6) lets any read-only Argo CD user extract plaintext Kubernetes Secrets via the Server-Side Diffs API when IncludeMutationWebhook=true is annotated. Patch to v3.3.9, audit annotations, and harden Argo CD RBAC.

Advanced 12 min read

Hardening Kubernetes Against LLM-Automated Container Escapes

The UK AI Security Institute found LLMs escape containers ~50% of the time, 100% with exposed Docker sockets or privileged pods. Eliminate the specific misconfigurations that make automated escape trivial and harden the remaining attack surface against systematic AI exploitation.

Advanced 14 min read

Kubernetes PCI DSS Compliance: Scope Reduction, Network Isolation, and Audit Trails

Running card-processing workloads in Kubernetes requires explicit PCI DSS scope reduction, strict NetworkPolicy isolation, pod-level security controls, and per-node audit logging that satisfies Requirements 1, 2, 7, and 10. This guide maps Kubernetes controls to PCI DSS v4.0 and provides assessor-ready evidence commands.

Advanced 11 min read

gRPC-Go HTTP/2 Path Authorization Bypass: CVE-2026-33186

CVE-2026-33186 (CVSS 9.1) allows attackers to bypass path-based gRPC authorization by omitting the leading slash from the :path pseudo-header. Upgrade to gRPC-Go 1.79.3 and audit authorization interceptors for deny-list patterns.
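
To illustrate why the summary above singles out deny-list patterns, the sketch below models only the path-matching logic. It is not gRPC-Go code (it is written in Python for brevity), and the service and method names are hypothetical.

```python
# Hypothetical method paths, used only for illustration.
ADMIN_ONLY = {"/admin.Service/DeleteAll"}
PUBLIC = {"/public.Service/Ping"}


def deny_list_allows(path: str) -> bool:
    """Fragile check: allow anything that is not an exact deny-list match."""
    return path not in ADMIN_ONLY


def allow_list_allows(path: str) -> bool:
    """Stricter check: reject malformed paths, then allow only known methods."""
    if not path.startswith("/"):
        return False
    return path in PUBLIC


malformed = "admin.Service/DeleteAll"  # leading slash omitted

print(deny_list_allows(malformed))   # the deny list does not recognise it
print(allow_list_allows(malformed))  # the allow list rejects it
```

If a later routing step still resolves the slash-less path to the admin method, the deny-list check has been bypassed, whereas the allow-list check fails closed on anything it does not recognise.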

Advanced 11 min read

ingress-nginx Annotation Injection 2026: CVE-2026-24512 and the New Hardening Controls

CVE-2026-24512 and related April–May 2026 CVEs allow nginx config injection via Ingress annotations, leading to RCE with cluster-wide Secret access. Patch to v1.13.7+, disable configuration-snippet, and enforce annotation allowlisting.

Advanced 12 min read

Kubernetes Incident Response for npm Supply Chain Compromises

If your K8s cluster built or ran containers during the Axios attack window, you need a playbook. Scope affected pods via image provenance, identify exposed credentials, rotate secrets cluster-wide, and use network logs to determine if the RAT reached C2.

advanced 16 min read

Contour Ingress Controller Security

Harden Contour against CVE-2026-41246 Lua code injection via HTTPProxy cookie-rewriting and against xDS credential leakage, and track silent security fixes in Contour's rapid release cycle.

Advanced 11 min read

Kubernetes Defence Against Compromised npm Packages: Lessons from Axios

The Axios supply chain attack hit every CI pipeline running npm install during a 3-hour window. Enforce npm ci --ignore-scripts in Dockerfiles via Kyverno, block build-pod egress, and prevent runtime node_modules mutation in Kubernetes.

Advanced 12 min read

Kubernetes at the IT/OT Boundary: Zero Trust for Industrial Edge

CISA's OT Zero Trust guidance places IT-side infrastructure in a DMZ zone. Learn how to use Kubernetes network policy as ISA/IEC 62443 conduit enforcement, isolate OT-adjacent workloads, and prevent K8s from bridging into OT networks.

Advanced 11 min read

Kubernetes for OT Security Tooling: Deploying Malcolm and Zeek in the SOC

CISA recommends Malcolm for OT network traffic analysis. Deploy it on Kubernetes for reproducible SOC infrastructure — DaemonSet packet capture, persistent storage for 90-day retention, and RBAC-controlled analyst access.

Intermediate 10 min read

Kubernetes SPDY Streaming DoS: Hardening Against CVE-2026-35469

CVE-2026-35469 lets an attacker crash kubelet and kube-apiserver via malformed SPDY frames. Learn how the silent-branch pattern works and how to close the window with version pinning, RBAC restrictions, and streaming endpoint controls.

advanced 17 min read

Cluster API Security for Kubernetes Fleet Management

Secure Cluster API (CAPI) deployments by hardening controller RBAC, provider credentials, bootstrap token lifecycle, and Machine provisioning pipelines.

advanced 15 min read

Kubernetes CSI NFS and SMB Driver Security

Harden Kubernetes CSI drivers for NFS and SMB against CVE-2026-3864/3865 subDir path traversal, malicious volume provisioning, and silent fixes in the fast-moving CSI driver ecosystem.

advanced 16 min read

gRPC-Go HTTP/2 Authorization Bypass Hardening

Harden gRPC-Go services against CVE-2026-33186-class authorization bypass via malformed :path pseudo-headers, and track silent fixes in fast-moving google.golang.org/grpc releases.

advanced 16 min read

ingress-nginx Annotation Injection Hardening

Harden ingress-nginx against annotation-based configuration injection attacks—CVE-2026-3288 class—with admission controls, annotation allowlisting, and upstream release monitoring.

advanced 17 min read

KubeVirt VM Security on Kubernetes

Harden KubeVirt virtual machine workloads with virt-launcher pod security, VM isolation, live migration hardening, and tracking KubeVirt's open source CVE disclosure patterns.

advanced 15 min read

OCI Image Volume Security in Kubernetes

Secure OCI image volumes (KEP-4639) in Kubernetes 1.31+ by hardening image pull credentials, mount path validation, and admission controls—and tracking silent fixes in evolving implementations.

intermediate 12 min read

CoreDNS Security Hardening: Rebinding Protection, Plugin Configuration, and DNSSEC Forwarding

CoreDNS is the authoritative DNS server for Kubernetes service discovery. Misconfigured plugins, missing rebinding protection, and unauthenticated health endpoints expose the cluster to DNS-based attacks. Locking down CoreDNS limits lateral movement and prevents DNS-based data exfiltration.

advanced 16 min read

Karpenter Node Provisioning Security

Harden Karpenter-managed node provisioning by securing NodePools, EC2NodeClass IAM roles, node registration, and instance metadata access.

intermediate 13 min read

kube-bench: CIS Kubernetes Benchmark Automation and Remediation

The CIS Kubernetes Benchmark defines 200+ controls across the API server, etcd, kubelet, and scheduler. kube-bench automates this check and integrates into CI/CD so benchmark regressions are caught before they reach production.

intermediate 12 min read

Kubernetes CronJob Security: Least Privilege, Concurrency Controls, and Credential Isolation

CronJobs run privileged operations on a schedule — database backups, report generation, secret rotation. A CronJob that accumulates permissions over time, leaves credentials in completed pods, or runs with unbounded concurrency creates persistent attack surface. Hardening CronJobs applies the same least-privilege principles as long-running workloads.

intermediate 13 min read

Kubernetes Operator Security: RBAC Scoping, Webhook Hardening, and Privilege Minimisation

Operators run with elevated Kubernetes permissions to manage custom resources. Overpermissive ClusterRoles, insecure admission webhooks, and unvalidated CRD inputs are common attack vectors. Scoping operator permissions to the minimum required limits blast radius from operator compromise.

intermediate 12 min read

Kubernetes Resource Quotas and LimitRanges: Preventing Noisy Neighbour and Denial of Service

Without resource quotas, a single namespace can consume all cluster CPU, memory, and storage — starving other tenants or crashing the control plane. ResourceQuota and LimitRange enforce per-namespace and per-pod resource bounds, making resource exhaustion attacks and accidental runaway workloads containable.

advanced 14 min read

Cilium Network Policy: FQDN Filtering, L7 Policies, and Hubble Observability

Cilium's CiliumNetworkPolicy extends standard Kubernetes NetworkPolicy with DNS-based egress control, HTTP/gRPC L7 rules, and cryptographic identity. Hubble provides flow-level visibility without packet capture.

intermediate 13 min read

Kubernetes OIDC Authentication and kubectl Access Control

Static kubeconfigs with long-lived certificates are the norm but not the standard. OIDC authentication gives kubectl short-lived tokens, group-based RBAC, and a full audit trail tied to real identities.

intermediate 14 min read

Kyverno Policy Development and Testing: Validate, Mutate, and Generate

Kyverno enforces Kubernetes security policy as YAML. Writing effective validate, mutate, and generate policies — and testing them with Chainsaw — turns admission control from a checkpoint into a continuous guardrail.

intermediate 13 min read

Kubernetes Backup Security with Velero: Encryption, RBAC, and Immutable Storage

Velero backups contain every Kubernetes secret, PersistentVolume, and workload configuration. Without encryption and immutable storage, they are a single-shot path to full cluster compromise or ransomware.

advanced 15 min read

cert-manager PKI Hardening: Intermediate CAs, Short-Lived Certificates, and Trust Chain Design

cert-manager manages certificate lifecycle at scale, but its default configuration creates long-lived certs and flat trust hierarchies. Harden the PKI layer your services depend on.

advanced 14 min read

CSI Driver Security: Volume-Mount Hardening, Privileged Drivers, and Inline Ephemeral Volumes

CSI drivers run with broad privileges by design. Their security posture often goes unaudited — until one is the exfil path or the privilege-escalation step.

intermediate 13 min read

External Secrets Operator: Pulling Secrets from KMS, Vault, and Cloud Stores into Kubernetes

Native Kubernetes Secrets are readable by anyone with get permission on Secrets in the namespace. External Secrets Operator pulls from your real secret store on a schedule, with rotation and audit.

intermediate 13 min read

Native Sidecar Containers in Kubernetes 1.29+: Lifecycle, Security, and Mesh Migration

Init containers with restartPolicy: Always (native sidecars), enabled by default since 1.29, fix the long-standing init/main race. Bigger security wins for service-mesh and log-shipper deployments.

advanced 14 min read

Kubernetes RuntimeClass: gVisor and Kata Containers for Production Workload Isolation

RuntimeClass lets you select a sandboxed container runtime per workload. gVisor intercepts syscalls in userspace; Kata Containers run workloads in lightweight VMs. Each changes the threat model.

advanced 16 min read

Confidential Containers on Kubernetes: AMD SEV-SNP, Intel TDX, and the Attestation Flow

Confidential Containers move workload isolation from the kernel to the silicon. Encrypted memory, hardware-attested boot, and a different threat model than user namespaces.

advanced 14 min read

User Namespaces for Pods: UID Remapping, Container Escape Defense, and the GA Path in Kubernetes 1.30+

hostUsers: false remaps Pod UIDs into a per-Pod range. A container running as root sees uid 0 inside; the host sees an unprivileged user. Big hardening win, easy to enable.

intermediate 15 min read

ValidatingAdmissionPolicy with CEL: Native Kubernetes Admission Without Webhooks

VAP replaces webhook admission for the policies you write most often. No Kyverno, no OPA, no network round-trip, no webhook availability risk.

intermediate 17 min read

Gateway API Security Patterns: Multi-Team Routing, ReferenceGrant, and Delegated Trust on Kubernetes

Gateway API replaces Ingress with a multi-role model that separates infrastructure, cluster operator, and application developer concerns. New surface, new threat model.

advanced 26 min read

LLMs on Kubernetes: Understanding the Threat Model and Deploying an LLM Gateway

Kubernetes orchestrates LLM workloads but has no awareness of what those workloads do. An Ollama pod with healthy readiness probes and stable resource usage can still leak secrets, execute prompt injection, and grant models excessive agency over internal services. This article covers the LLM-specific threat model for Kubernetes and implements an LLM gateway as the policy enforcement layer.

intermediate 22 min read

Kubernetes Node Hardening: From OS Configuration to kubelet Lockdown

A Kubernetes node is a Linux machine running kubelet, a container runtime, and your workloads.

advanced 16 min read

GPU Workload Isolation: MIG, MPS, and vGPU Security Boundaries

Multi-tenant GPU sharing without isolation risks data leakage between workloads through shared GPU memory.

intermediate 13 min read

GPU Cost and Security Monitoring: Detecting Abuse and Optimising Spend

GPU compute costs between $2 and $30 per hour per device. A single unauthorised cryptocurrency mining pod running on an A100 for a weekend generates...

intermediate 14 min read

LLM Rate Limiting in Production: Token Budgets, Per-User Quotas, and Abuse Detection

Request-count rate limiting fails for LLM workloads because a single request can consume 100K tokens. Token-based rate limiting with per-user quotas and abuse detection prevents runaway costs and catches prompt injection probing before it escalates.
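
A minimal sketch of the token-budget idea described above, assuming a fixed window that some external scheduler resets. The class name, the user name, and the 150,000-token limit are hypothetical and not taken from the article.

```python
class TokenBudget:
    """Per-user budget counted in LLM tokens rather than in requests."""

    def __init__(self, tokens_per_window: int):
        self.limit = tokens_per_window
        self.used = {}  # user id -> tokens consumed in the current window

    def try_consume(self, user: str, tokens: int) -> bool:
        """Record the usage and return True only if it fits the budget."""
        spent = self.used.get(user, 0)
        if spent + tokens > self.limit:
            return False
        self.used[user] = spent + tokens
        return True

    def reset_window(self) -> None:
        """Start a new window; call this on whatever schedule you choose."""
        self.used.clear()


budget = TokenBudget(tokens_per_window=150_000)
first = budget.try_consume("alice", 100_000)   # one large request fits
second = budget.try_consume("alice", 100_000)  # a second one would overshoot
small = budget.try_consume("alice", 50_000)    # a smaller one still fits
print(first, second, small)
```

A request-count limiter would treat all three calls identically; counting tokens is what lets the second call be rejected while the third is still admitted.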

advanced 22 min read

Runtime Security with Falco on Kubernetes: Rules, Tuning, and Response Automation

Prevention-only security has a binary failure mode: either the control holds and the attacker is stopped, or the control fails and the attacker...

intermediate 22 min read

Kubernetes Network Policies That Actually Work: From Default Deny to Microsegmentation

By default, every pod in a Kubernetes cluster can communicate with every other pod across all namespaces. There are no network boundaries.

intermediate 15 min read

LLM Cost Controls: Budget Enforcement, Token Metering, and Spend Alerting

Without enforced budgets, a single team can exhaust an organization's entire AI spend in days. Token metering with per-team budgets, automatic request rejection at limits, model routing by cost, and chargeback dashboards turn LLM spending from a surprise into a managed line item.

intermediate 18 min read

Kubelet Security Configuration: Authentication, Authorization, and Read-Only Port

The kubelet runs on every node in the cluster with root-level access to the container runtime, all pod specifications, mounted secrets, and the host...

intermediate 20 min read

Kubernetes RBAC Design Patterns: Least Privilege Without Paralysing Developers

RBAC sprawl in multi-team Kubernetes clusters grows past 100 role bindings within months.

intermediate 20 min read

Kubernetes Secrets Management: External Secrets Operator, Vault, and Sealed Secrets

Kubernetes Secrets are base64-encoded, not encrypted. Anyone with RBAC read access to secrets in a namespace can decode every credential stored there.
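
The point that base64 is an encoding rather than encryption can be shown in a few lines. The credential below is a made-up value standing in for an entry in a Secret's data field.

```python
import base64

# Hypothetical credential, encoded the way a Secret's data field stores it.
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Decoding needs no key: anyone who can read the value can recover it.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded, "->", decoded)
```

No key or secret material is involved in the round trip, which is why read access to the Secret object is equivalent to read access to the credential.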

advanced 18 min read

AI Incident Forensics: Reconstructing What an AI System Did, Why, and What Data It Accessed

When a traditional application causes an incident, you examine logs, traces, and database queries to reconstruct what happened.

intermediate 16 min read

Hardening Model Inference Endpoints: Authentication, Rate Limiting, and Input Validation

Model inference endpoints are GPU-backed and expensive, $2-30 per hour per GPU. A single unprotected endpoint exposed to the internet can accumulate...

intermediate 22 min read

Kubernetes Admission Control: From PodSecurity Standards to Custom OPA/Kyverno Policies

Without admission control, any user with deployment permissions can run privileged containers, mount the host filesystem, use the host network, run...

advanced 16 min read

AI Data Leakage Prevention: Input Filtering, Output Scanning, and Audit Trails

AI systems leak data in ways traditional applications do not. A language model trained on customer data can reproduce verbatim customer records in...

intermediate 14 min read

Jupyter Notebook Security: Authentication, Isolation, and Data Protection

JupyterHub is a code execution platform. Every notebook cell is arbitrary code running with whatever permissions the notebook server process has.

intermediate 20 min read

Multi-Tenancy Hardening in Kubernetes: Namespace Isolation, Resource Quotas, and Network Boundaries

Kubernetes namespaces provide logical separation, not security isolation. By default, pods in namespace A can send network traffic to pods in...

advanced 17 min read

Building a Content Filtering Pipeline for LLM Applications: From Raw Input to Safe Output

A single content filter is not a pipeline. Most LLM deployments add one filter (usually on output) and call it done.

advanced 17 min read

AI Red Teaming Methodology: Structured Adversarial Testing for LLM Applications

Traditional security testing (penetration testing, vulnerability scanning) does not cover AI-specific attack surfaces.

intermediate 20 min read

Kubernetes Image Policy Enforcement: Cosign, Notation, and Admission Webhooks

Without image policy enforcement, any container image from any registry can run in a Kubernetes cluster.

advanced 16 min read

Securing RAG Pipelines: Vector Database Access Control, Document Poisoning, and Retrieval Filtering

Retrieval-Augmented Generation (RAG) adds a knowledge base to LLM applications: the model retrieves relevant documents before generating a response.

intermediate 20 min read

Pod Security Context Deep Dive: runAsNonRoot, readOnlyRootFilesystem, and Capabilities

Kubernetes SecurityContext has over 15 configurable fields, but most teams only set runAsNonRoot: true and consider the job done.

intermediate 18 min read

Vector Database Security: Access Control, Embedding Protection, and Query Isolation

Vector databases are the backbone of RAG (Retrieval-Augmented Generation) systems.

intermediate 17 min read

A/B Model Deployment Safety: Canary Rollouts, Traffic Splitting, and Automated Rollback for ML Models

Deploying a new ML model version is not the same as deploying a new application version.

intermediate 22 min read

Kubernetes API Server Hardening: Flags, Authentication, and Audit Logging

The API server is the front door to the Kubernetes cluster. Every kubectl command, every controller reconciliation, every pod scheduling decision,...

intermediate 20 min read

Seccomp Profiles for Production Workloads: Writing, Testing, and Deploying Custom Profiles

The default container runtime allows approximately 300 syscalls. A compromised container can use unshare to create new namespaces, clone to spawn...

intermediate 18 min read

etcd Encryption at Rest: Configuration, Key Rotation, and Performance Impact

Kubernetes Secrets are stored in etcd as base64-encoded plaintext. Base64 is an encoding, not encryption.

advanced 18 min read

Implementing AI Guardrails: Input Validation, Output Filtering, and Safety Classifiers in Production

Deploying an LLM without guardrails is deploying an application where any user can make it say or do anything.

intermediate 21 min read

Hardening Kubernetes Ingress Controllers: NGINX, Traefik, and Envoy Compared

The ingress controller is the internet-facing entry point to a Kubernetes cluster.

advanced 18 min read

LLM Observability in Production: Monitoring Latency, Token Usage, Safety Violations, and Drift

Traditional application monitoring (CPU, memory, HTTP status codes, latency) tells you nothing about what an LLM is doing.

intermediate 16 min read

Hardening Model Serving Frameworks: TorchServe, Triton, and vLLM Security Configuration

Model serving frameworks ship with defaults optimised for development: management APIs exposed on all interfaces without authentication, model files...

advanced 18 min read

Securing Fine-Tuning Pipelines: Data Isolation, Checkpoint Integrity, and Access Control

Fine-tuning pipelines are high-value targets. They consume expensive GPU hours, process proprietary training data, and produce model checkpoints that...

intermediate 18 min read

Hardening the Kubernetes Scheduler: Topology Constraints and Security-Aware Placement

The Kubernetes scheduler places pods on nodes based on resource availability and basic constraints.

intermediate 22 min read

Kubernetes Audit Log Analysis: What to Log, How to Query, and What to Alert On

Kubernetes audit logs record every request to the API server: who made the request, what they asked for, and whether it succeeded.

advanced 14 min read

Securing Model Artifact Pipelines: From Training to Serving

Model files are opaque binaries ranging from 1GB to over 1TB. You cannot code-review a set of weights.

advanced 17 min read

RLHF Data Protection: Securing Human Feedback Loops, Preference Data, and Reward Models

Reinforcement Learning from Human Feedback (RLHF) pipelines introduce unique security surfaces that standard ML training workflows do not have.

intermediate 13 min read

AI API Key Management: Rotation, Scoping, and Abuse Detection

AI services have turned API keys into direct spending controls. A leaked OpenAI or Anthropic key can generate thousands of dollars in charges within...

advanced 16 min read

Prompt Injection Defence in Production: Input Validation, Output Filtering, and Monitoring

Prompt injection is the SQL injection of AI systems: the most common and most damaging attack class against LLM-powered applications.

advanced 15 min read

Network Segmentation for AI Training Infrastructure

AI training clusters frequently share networks with production services. A training job that can reach the production database is one compromised...

intermediate 14 min read

Observability for LLM Applications: Token Usage, Latency Anomalies, and Output Classification

LLM-powered applications have unique observability requirements that standard APM tools do not address: token-based cost tracking (not just request...

intermediate 16 min read

Model Registry Access Control: Versioning, Signing, and Promotion Gates

Model registries are the bridge between training and production. A model pushed to the production registry gets served to users.

intermediate 19 min read

Kubernetes Service Account Token Security: Bound Tokens, Projected Volumes, and OIDC

Every pod in Kubernetes receives a service account token by default. In clusters running older configurations or without explicit hardening, these...