Kubernetes / Platform Security Articles
Kubernetes hardening guides covering RBAC, network policies, admission control, secrets management, runtime security, and AI workloads.
Kubernetes Security and Hardening Guides
Admission Webhook PR Poisoning: How a Merged PR Becomes a Cluster Backdoor
A pull request that modifies a ValidatingWebhookConfiguration or MutatingWebhookConfiguration — or the controller that registers it — can silently remove security controls from an entire cluster. This guide covers the attack patterns, detection with OPA/Kyverno policies, and GitOps controls that prevent webhook configuration from being silently changed.
Kubernetes CVE Auto-Remediation Operator: Closing the Patch Window Automatically
A Kubernetes operator that watches OSV and NVD CVE feeds, correlates findings against running workload images, and automatically remediates via image-pin updates, DaemonSet rollouts, and node-pool rotations when EPSS score exceeds a configured threshold — shrinking the patch window from days to minutes.
Kubernetes Structured Authorization Config: Hardening Multi-Webhook Auth Chains
Kubernetes 1.30 promoted the Structured Authorization Config API to beta (it reached GA in 1.32), enabling multiple authorization webhooks chained with CEL conditions. Misconfigured chains silently bypass RBAC; this guide covers safe chain design, CEL condition hardening, and fail-open/fail-closed reasoning.
Argo Workflows Template Injection via User-Controlled Parameters
Argo Workflows evaluates template expressions using user-supplied workflow parameters; without input validation, an attacker with workflow submission access can inject expressions that execute arbitrary commands in the workflow executor, exfiltrate secrets, or pivot to other cluster workloads.
EPSS-Driven CVE Patch Prioritization for Kubernetes Workloads
CVSS severity alone cannot prioritize patching when hundreds of CVEs affect your Kubernetes images; the Exploit Prediction Scoring System (EPSS) provides a probability-of-exploitation score that focuses remediation on the CVEs most likely to be actively exploited in the next 30 days.
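As a rough illustration of the idea above, the sketch below filters and orders findings by EPSS probability instead of CVSS severity. The `Finding` type, the `prioritize` helper, the 0.1 threshold, and the CVE identifiers and scores are all hypothetical placeholders, not output from any real scanner or feed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    cve_id: str
    cvss: float  # severity score, 0.0-10.0
    epss: float  # estimated probability of exploitation in the next 30 days, 0.0-1.0


def prioritize(findings, epss_threshold=0.1):
    """Keep findings at or above the EPSS threshold, most likely exploited first."""
    urgent = [f for f in findings if f.epss >= epss_threshold]
    # CVSS is used only as a tie-breaker between equal EPSS scores.
    return sorted(urgent, key=lambda f: (f.epss, f.cvss), reverse=True)


# Hypothetical scan results; IDs and scores are made up for illustration.
findings = [
    Finding("CVE-EXAMPLE-0001", cvss=9.8, epss=0.02),
    Finding("CVE-EXAMPLE-0002", cvss=7.5, epss=0.91),
    Finding("CVE-EXAMPLE-0003", cvss=6.1, epss=0.34),
]

queue = prioritize(findings)
for f in queue:
    print(f.cve_id, f.epss)
```

In this made-up data set the CVSS 9.8 finding drops out of the urgent queue because its exploitation probability is low, which is the trade-off the EPSS-driven approach makes deliberately.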
Automated ingress-nginx Version Management and CVE Response
ingress-nginx has had multiple critical CVEs including annotation injection attacks; manual Helm chart version management leaves clusters exposed for weeks; automate detection of new releases, staged canary rollout, and rollback to reduce patch lag to hours.
Securing Kubernetes Sidecar Injection Against Rogue Container Injection
Mutating webhook sidecar injection — used by Istio, Dapr, and custom platform injectors — can be abused to inject rogue containers or modify existing ones; audit injection logic, enforce webhook TLS, restrict injection to approved namespaces, and validate injector output.
Security Validation for AI-Generated Kubernetes Manifests
AI assistants generating Kubernetes Deployment, RBAC, and Service YAML reproduce predictable misconfigurations — privileged containers, missing securityContext, broad ClusterRoleBindings; validate with Polaris, kube-score, and Kyverno before admission.
Hardening the Kubernetes Secrets Store CSI Driver
The Secrets Store CSI Driver mounts external secrets from AWS, Azure, GCP, and Vault into pods via provider plugins; its sync-to-Kubernetes-Secret behaviour, RBAC surface, and provider pod permissions are common misconfiguration sources that expose secrets beyond their intended scope.
Isolating AI Training Batch Jobs in Kubernetes
AI training jobs on Kubernetes have access to large GPU nodes, model weights, and training datasets; isolate them from production namespaces with dedicated node pools, network policy, and RBAC to prevent cross-job data leakage and lateral movement.
Kubernetes Subresource RBAC Escalation: Restricting exec, portforward, and proxy
RBAC permissions on pods/exec, pods/portforward, pods/log, and nodes/proxy are functionally equivalent to cluster compromise yet routinely over-provisioned; audit who holds these grants and replace them with time-limited JIT access.
Securing the Kubernetes API Aggregation Layer Against Privilege Escalation
Extension API servers registered via API aggregation can intercept credentials, bypass RBAC, and escalate to cluster-admin; harden the aggregation layer with mutual TLS, bounded permissions, and routing controls.
Kubernetes Node Kernel Patch Velocity: Draining and Replacing Nodes at Speed After a Critical CVE
When a critical kernel LPE like Dirty Frag (CVE-2026-43284/43500) drops with a public PoC, the window between disclosure and exploitation may be hours. Kubernetes clusters running hundreds of nodes need a systematic, automated approach to kernel patching — identifying vulnerable nodes, draining workloads safely, applying patches, and verifying remediation — without days of manual work.
Azure Workload Identity for AKS: Federated Credential Access to Azure Resources
Azure Workload Identity replaces the deprecated pod identity and works around Managed Identity limitations in AKS by using OIDC federation between the AKS OIDC issuer and Azure AD. Kubernetes pods receive projected service account tokens that can be exchanged for Azure AD access tokens without any stored credentials. This article covers enabling the OIDC issuer on AKS, creating federated credentials, configuring workload identity in pods, and auditing with Azure Monitor.
Container Image Signing Policy Enforcement: From cosign to Admission Control
Signing container images is only useful if admission control verifies the signature before the image runs. This article covers the end-to-end enforcement pipeline: signing images with cosign in CI, verifying signatures with Kyverno ImageVerify and OPA Gatekeeper, configuring signature transparency with Rekor, handling multi-architecture image indexes, and the key distribution problem in enterprise environments.
ContainerSSH Kubernetes Backend: Hardened Pod-per-Session SSH Access
ContainerSSH's Kubernetes backend launches a dedicated Pod for each SSH connection, giving each session its own process namespace, filesystem, and network identity. The security of this model depends entirely on the Pod spec returned by the config webhook: a misconfigured PodSecurityContext, missing NetworkPolicy, or overly broad RBAC for the ContainerSSH service account can turn an isolation mechanism into a cluster escape path.
Automating Container Image Patching in Kubernetes with Copa and Kyverno
Running Copa (Copacetic) as a Kubernetes CronJob continuously scans images in a registry and patches those above a vulnerability severity threshold, while Kyverno admission policies block unpatched images from being scheduled. Together they create a closed-loop container patching system that operates independently of application teams and upstream image publishers.
etcd Compromise: The Blast Radius of Your Kubernetes Backing Store
etcd holds every Kubernetes secret, service account token, and config in base64-encoded plaintext. A direct etcd connection bypasses all RBAC: there are no Kubernetes access controls between an etcd client and the data. Attackers who reach etcd (via node compromise, misconfigured backup access, or an exposed port) can read every secret and forge service account tokens. This article covers the attack paths, what data is exposed, and how to detect and recover.
External Secrets Operator: Syncing Cloud Secrets Without Storing Them in Kubernetes
The External Secrets Operator (ESO) reconciles secrets from AWS Secrets Manager, Azure Key Vault, GCP Secret Manager, and HashiCorp Vault into Kubernetes Secrets on a defined refresh interval. The Kubernetes Secret is a cache — the authoritative copy lives in the cloud secret store. This article covers ESO's security model, ClusterSecretStore RBAC scoping, detecting sync failures before they become outages, and what happens when cloud credentials are compromised.
BOLA and BFLA in Kubernetes-Hosted APIs: Object-Level Authorisation Gaps in Multi-Tenant Deployments
Broken Object-Level Authorisation (OWASP API1) and Broken Function-Level Authorisation (OWASP API5) are two of the most prominent API vulnerability classes. In Kubernetes multi-tenant deployments, namespace isolation creates a false sense of per-tenant authorisation, but the application inside the namespace still needs to enforce that Tenant A cannot access Tenant B's resources. This article implements OPA and Kyverno-based enforcement patterns for request-level authorisation.
Kubernetes Service Account Token Security: Projection, Audience Binding, and Theft Prevention
Kubernetes service account tokens are the primary credential for pod-to-API-server communication and OIDC federation. Long-lived auto-mounted tokens without audience or expiry binding are a persistent source of credential theft risk. This article covers projected service account tokens (TokenRequest API), disabling automounting, audience-bound tokens for OIDC, detecting token theft with audit logs, and migrating from legacy tokens.
Kyverno Controller Security: Hardening the Policy Engine That Enforces Your Security Policies
Kyverno's admission webhook intercepts every pod creation, secret write, and RBAC change in the cluster. Compromising the Kyverno controller — via a CVE in the controller binary, a misconfigured webhook, or a supply chain attack on the Kyverno image — breaks all policy enforcement silently. This article hardens the Kyverno deployment itself and implements monitoring that detects when Kyverno is bypassed or degraded.
Overlayfs Copy-on-Write Container Escape: CVE-2023-0386 and Writeback Race Mitigations
Overlayfs implements copy-on-write by copying files from the lower (image) layer to the upper (writable) layer on first write. Races in this writeback path and privilege copy semantics have enabled container escapes — CVE-2023-0386 allowed setuid files to be copied with preserved capabilities outside a user namespace. This article covers the overlayfs CoW mechanism, the escape chain, kernel patches, and Kubernetes-level mitigations.
Sigstore and Cosign: Keyless Container Image Signing and Verification
Sigstore's keyless signing model uses short-lived certificates bound to OIDC identity, recorded in Rekor's transparency log, eliminating long-lived private keys from the supply chain. This article covers cosign keyless signing in GitHub Actions, Rekor log integration, verifying image signatures in admission controllers, and enforcing signature policy with Kyverno and OPA Gatekeeper.
SPIFFE and SPIRE: Cryptographic Workload Identity for Zero Trust Kubernetes
SPIFFE (Secure Production Identity Framework for Everyone) defines a universal workload identity standard using X.509 SVIDs and JWT-SVIDs. SPIRE implements SPIFFE with a Kubernetes-native attestation model, automatic cert rotation, and federation across trust domains. This article covers deploying SPIRE on Kubernetes, configuring workload attestation, federating across clusters, and integrating SPIFFE identity with Envoy and Istio.
AI-Generated Kubernetes Operators vs. Maintained Open Source: The CVE Response Gap
An LLM can generate a Kubernetes operator with reconciliation logic, CRD definitions, and RBAC in under an hour. That operator has no maintainer, no security advisory channel, no CVE disclosure process, and no patch history. When a vulnerability is found — in its RBAC grants, its webhook handling, or its dependency chain — there is nobody to call and no patch coming.
Custom CodeQL Queries for Kubernetes Security: Scanning for RBAC Misconfigs, Pod Security Gaps, and Helm Secrets
The default CodeQL query packs don't cover Kubernetes-specific vulnerabilities — RBAC wildcard rules in Go controller code, unencrypted Kubernetes Secrets in Helm values, privileged container specs baked into application manifests. This guide writes custom CodeQL queries for Kubernetes controllers, operator code, and Helm chart generation that surface misconfigurations at the source code level.
containerd CVE-2022-23648: Path Traversal That Exposed the Host Filesystem
A crafted OCI image config with an empty Target.Path in a volume mount definition caused containerd to bind-mount the host root filesystem into the container. Every pod on a vulnerable node running any image from an untrusted registry had read access to the complete host filesystem — including kubelet credentials, cloud instance metadata, and secrets from co-located pods.
Agentic Bot Detection at Kubernetes Ingress: Envoy ext_authz Scoring for LLM-Driven Traffic
OpenAI Operator, Claude Computer Use, Microsoft Copilot Browser, and open-source browser-automation agents generate HTTP traffic that passes every CAPTCHA and mimics human timing. Standard WAF rules and bot score APIs fail. Envoy's ext_authz filter enables a multi-signal scoring pipeline at ingress — before requests reach application pods — combining TLS fingerprint, request graph, and inter-request timing signals.
Kubernetes Dynamic Resource Allocation (DRA) Security Hardening
Securing the DRA API (beta in Kubernetes 1.32, GA in 1.34): ResourceClaim RBAC, driver trust boundaries, GPU/TPU isolation, and multi-tenant DRA threat model.
Kubernetes In-Place Pod Resize Security: Admission Policy and Resource-Cap Enforcement on 1.33+
In-place pod resize reached beta, enabled by default, in Kubernetes 1.33. The new resize subresource changes how resource limits are enforced at runtime: admission webhooks must be updated, ResourceQuotas behave differently, and a misconfigured cluster lets a tenant escape its original limits. Production hardening guide.
LLM Agents with kubectl Access: Prompt Injection from Logs and Manifests into Cluster Compromise
LLM SRE and coding agents now ship with Kubernetes API tools equivalent to kubectl. A prompt injection payload embedded in a pod log, ConfigMap, or CRD field is indistinguishable from a legitimate instruction to the agent. When the agent has cluster-admin or namespace-admin RBAC, one injected instruction becomes a cluster-wide compromise.
MCP Servers in Kubernetes: RBAC Scoping and Network Isolation for Agent Tool Backends
MCP servers deployed as Kubernetes services give AI agents programmatic access to cluster resources, databases, and APIs. An MCP server with cluster-admin RBAC or unrestricted network access becomes a fully capable attack pivot when an agent is prompt-injected. Least-privilege service accounts, NetworkPolicy, and admission control gates reduce the blast radius to the minimum required for the tool's legitimate function.
Kubernetes Operator Security Disclosure: Reporting and Responding to Vulnerabilities in Custom Controllers
Kubernetes operators ship to production clusters with elevated RBAC permissions and direct API server access — a vulnerability in an operator can compromise the entire cluster. This guide covers how to report operator vulnerabilities responsibly, how operator maintainers should handle disclosures, CVSS scoring for Kubernetes-specific issues, and what cluster operators should do when a vulnerability is published.
Post-Quantum Certificate Management in Kubernetes: Migrating Cluster PKI to Hybrid Certificates
Kubernetes control plane PKI, service mesh CAs, SPIFFE SVIDs, and Ingress TLS certificates all rely on RSA or ECDSA — algorithms vulnerable to harvest-now-decrypt-later. This guide maps the Kubernetes certificate landscape, implements hybrid PQC certificates with cert-manager and step-ca, and provides a phased migration roadmap for production clusters.
runc CVE-2019-5736: Overwriting the Container Runtime from Inside a Container
CVE-2019-5736 allowed a malicious container to overwrite the host runc binary by exploiting /proc/self/exe during container exec. Any kubectl exec or docker exec into an attacker-controlled container gave root on the host. Every container runtime that used runc was affected.
Argo CD Secret Extraction via Read-Only Access: CVE-2026-42880
CVE-2026-42880 (CVSS 9.6) lets any read-only Argo CD user extract plaintext Kubernetes Secrets via the Server-Side Diffs API when IncludeMutationWebhook=true is annotated. Patch to v3.3.9, audit annotations, and harden Argo CD RBAC.
Hardening Kubernetes Against LLM-Automated Container Escapes
The UK AI Security Institute found LLMs escape containers ~50% of the time, 100% with exposed Docker sockets or privileged pods. Eliminate the specific misconfigurations that make automated escape trivial and harden the remaining attack surface against systematic AI exploitation.
Kubernetes PCI DSS Compliance: Scope Reduction, Network Isolation, and Audit Trails
Running card-processing workloads in Kubernetes requires explicit PCI DSS scope reduction, strict NetworkPolicy isolation, pod-level security controls, and per-node audit logging that satisfies Requirements 1, 2, 7, and 10. This guide maps Kubernetes controls to PCI DSS v4.0 and provides assessor-ready evidence commands.
gRPC-Go HTTP/2 Path Authorization Bypass: CVE-2026-33186
CVE-2026-33186 (CVSS 9.1) allows attackers to bypass path-based gRPC authorization by omitting the leading slash from the :path pseudo-header. Upgrade to gRPC-Go 1.79.3 and audit authorization interceptors for deny-list patterns.
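The reason a deny-list is fragile against this kind of malformed path can be shown with plain string matching. The Python sketch below only models the comparison described above; it is not gRPC-Go code, and the method names and helper functions are hypothetical.

```python
# Hypothetical fully qualified gRPC method paths.
DENIED_PATHS = {"/admin.Admin/DeleteUser"}
ALLOWED_PATHS = {"/orders.Orders/GetOrder"}


def deny_list_allows(path: str) -> bool:
    """Fragile: anything that does not exactly match a denied entry passes."""
    return path not in DENIED_PATHS


def allow_list_allows(path: str) -> bool:
    """Safer: reject malformed paths first, then permit only known methods."""
    if not path.startswith("/"):
        return False
    return path in ALLOWED_PATHS


# Same method as the denied entry, with the leading slash omitted.
malformed = "admin.Admin/DeleteUser"

print(deny_list_allows(malformed))   # True: the deny-list is bypassed
print(allow_list_allows(malformed))  # False: the allow-list rejects it
```

The design point is that an allow-list fails closed: a path the interceptor does not recognise is rejected regardless of how it was mangled.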
ingress-nginx Annotation Injection 2026: CVE-2026-24512 and the New Hardening Controls
CVE-2026-24512 and related April–May 2026 CVEs allow nginx config injection via Ingress annotations, leading to RCE with cluster-wide Secret access. Patch to v1.13.7+, disable configuration-snippet, and enforce annotation allowlisting.
Kubernetes Incident Response for npm Supply Chain Compromises
If your K8s cluster built or ran containers during the Axios attack window, you need a playbook. Scope affected pods via image provenance, identify exposed credentials, rotate secrets cluster-wide, and use network logs to determine if the RAT reached C2.
Contour Ingress Controller Security
Harden Contour against CVE-2026-41246 Lua code injection via HTTPProxy cookie-rewriting, xDS credential leakage, and tracking silent security fixes in Contour's rapid release cycle.
Kubernetes Defence Against Compromised npm Packages: Lessons from Axios
The Axios supply chain attack hit every CI pipeline running npm install during a 3-hour window. Enforce npm ci --ignore-scripts in Dockerfiles via Kyverno, block build-pod egress, and prevent runtime node_modules mutation in Kubernetes.
Kubernetes at the IT/OT Boundary: Zero Trust for Industrial Edge
CISA's OT Zero Trust guidance places IT-side infrastructure in a DMZ zone. Learn how to use Kubernetes network policy as ISA/IEC 62443 conduit enforcement, isolate OT-adjacent workloads, and prevent K8s from bridging into OT networks.
Kubernetes for OT Security Tooling: Deploying Malcolm and Zeek in the SOC
CISA recommends Malcolm for OT network traffic analysis. Deploy it on Kubernetes for reproducible SOC infrastructure — DaemonSet packet capture, persistent storage for 90-day retention, and RBAC-controlled analyst access.
Kubernetes SPDY Streaming DoS: Hardening Against CVE-2026-35469
CVE-2026-35469 lets an attacker crash kubelet and kube-apiserver via malformed SPDY frames. Learn how the silent-branch pattern works and how to close the window with version pinning, RBAC restrictions, and streaming endpoint controls.
Cluster API Security for Kubernetes Fleet Management
Secure Cluster API (CAPI) deployments by hardening controller RBAC, provider credentials, bootstrap token lifecycle, and Machine provisioning pipelines.
Kubernetes CSI NFS and SMB Driver Security
Harden Kubernetes CSI drivers for NFS and SMB against CVE-2026-3864/3865 subDir path traversal, malicious volume provisioning, and silent fixes in the fast-moving CSI driver ecosystem.
gRPC-Go HTTP/2 Authorization Bypass Hardening
Harden gRPC-Go services against CVE-2026-33186-class authorization bypass via malformed :path pseudo-headers, and track silent fixes in fast-moving google.golang.org/grpc releases.
ingress-nginx Annotation Injection Hardening
Harden ingress-nginx against annotation-based configuration injection attacks—CVE-2026-3288 class—with admission controls, annotation allowlisting, and upstream release monitoring.
KubeVirt VM Security on Kubernetes
Harden KubeVirt virtual machine workloads with virt-launcher pod security, VM isolation, live migration hardening, and tracking KubeVirt's open source CVE disclosure patterns.
OCI Image Volume Security in Kubernetes
Secure OCI image volumes (KEP-4639) in Kubernetes 1.31+ by hardening image pull credentials, mount path validation, and admission controls—and tracking silent fixes in evolving implementations.
CoreDNS Security Hardening: Rebinding Protection, Plugin Configuration, and DNSSEC Forwarding
CoreDNS is the authoritative DNS server for Kubernetes service discovery. Misconfigured plugins, missing rebinding protection, and unauthenticated health endpoints expose the cluster to DNS-based attacks. Locking down CoreDNS limits lateral movement and prevents DNS-based data exfiltration.
Karpenter Node Provisioning Security
Harden Karpenter-managed node provisioning by securing NodePools, EC2NodeClass IAM roles, node registration, and instance metadata access.
kube-bench: CIS Kubernetes Benchmark Automation and Remediation
The CIS Kubernetes Benchmark defines 200+ controls across the API server, etcd, kubelet, and scheduler. kube-bench automates this check and integrates into CI/CD so benchmark regressions are caught before they reach production.
Kubernetes CronJob Security: Least Privilege, Concurrency Controls, and Credential Isolation
CronJobs run privileged operations on a schedule — database backups, report generation, secret rotation. A CronJob that accumulates permissions over time, leaves credentials in completed pods, or runs with unbounded concurrency creates persistent attack surface. Hardening CronJobs applies the same least-privilege principles as long-running workloads.
Kubernetes Operator Security: RBAC Scoping, Webhook Hardening, and Privilege Minimisation
Operators run with elevated Kubernetes permissions to manage custom resources. Overpermissive ClusterRoles, insecure admission webhooks, and unvalidated CRD inputs are common attack vectors. Scoping operator permissions to the minimum required limits blast radius from operator compromise.
Kubernetes Resource Quotas and LimitRanges: Preventing Noisy Neighbour and Denial of Service
Without resource quotas, a single namespace can consume all cluster CPU, memory, and storage — starving other tenants or crashing the control plane. ResourceQuota and LimitRange enforce per-namespace and per-pod resource bounds, making resource exhaustion attacks and accidental runaway workloads containable.
Cilium Network Policy: FQDN Filtering, L7 Policies, and Hubble Observability
Cilium's CiliumNetworkPolicy extends standard Kubernetes NetworkPolicy with DNS-based egress control, HTTP/gRPC L7 rules, and cryptographic identity. Hubble provides flow-level visibility without packet capture.
Kubernetes OIDC Authentication and kubectl Access Control
Static kubeconfigs with long-lived certificates are the norm but not the standard. OIDC authentication gives kubectl short-lived tokens, group-based RBAC, and a full audit trail tied to real identities.
Kyverno Policy Development and Testing: Validate, Mutate, and Generate
Kyverno enforces Kubernetes security policy as YAML. Writing effective validate, mutate, and generate policies — and testing them with Chainsaw — turns admission control from a checkpoint into a continuous guardrail.
Kubernetes Backup Security with Velero: Encryption, RBAC, and Immutable Storage
Velero backups contain every Kubernetes secret, PersistentVolume, and workload configuration. Without encryption and immutable storage, they are a single-shot path to full cluster compromise or ransomware.
cert-manager PKI Hardening: Intermediate CAs, Short-Lived Certificates, and Trust Chain Design
cert-manager manages certificate lifecycle at scale, but its default configuration creates long-lived certs and flat trust hierarchies. Harden the PKI layer your services depend on.
CSI Driver Security: Volume-Mount Hardening, Privileged Drivers, and Inline Ephemeral Volumes
CSI drivers run with broad privileges by design. Their security posture often goes unaudited — until one is the exfil path or the privilege-escalation step.
External Secrets Operator: Pulling Secrets from KMS, Vault, and Cloud Stores into Kubernetes
Native Kubernetes Secrets are visible to anyone with namespace get. External Secrets Operator pulls from your real secret store on schedule, with rotation and audit.
Native Sidecar Containers in Kubernetes 1.29+: Lifecycle, Security, and Mesh Migration
Init containers with restartPolicy: Always, enabled by default since 1.29 and GA in 1.33, fix the long-standing init/main race. Bigger security wins for service-mesh and log-shipper deployments.
Kubernetes RuntimeClass: gVisor and Kata Containers for Production Workload Isolation
RuntimeClass lets you select a sandboxed container runtime per workload. gVisor intercepts syscalls in userspace; Kata Containers run workloads in lightweight VMs. Each changes the threat model.
Confidential Containers on Kubernetes: AMD SEV-SNP, Intel TDX, and the Attestation Flow
Confidential Containers move workload isolation from the kernel to the silicon. Encrypted memory, hardware-attested boot, and a different threat model than user namespaces.
User Namespaces for Pods: UID Remapping, Container Escape Defense, and the GA Path in Kubernetes 1.30+
hostUsers: false remaps Pod UIDs into a per-Pod range. A container running as root sees uid 0 inside; the host sees an unprivileged user. Big hardening win, easy to enable.
ValidatingAdmissionPolicy with CEL: Native Kubernetes Admission Without Webhooks
VAP replaces webhook admission for the policies you write most often. No Kyverno, no OPA, no network round-trip, no webhook availability risk.
Gateway API Security Patterns: Multi-Team Routing, ReferenceGrant, and Delegated Trust on Kubernetes
Gateway API replaces Ingress with a multi-role model that separates infrastructure, cluster operator, and application developer concerns. New surface, new threat model.
LLMs on Kubernetes: Understanding the Threat Model and Deploying an LLM Gateway
Kubernetes orchestrates LLM workloads but has no awareness of what those workloads do. An Ollama pod with healthy readiness probes and stable resource usage can still leak secrets, execute prompt injection, and grant models excessive agency over internal services. This article covers the LLM-specific threat model for Kubernetes and implements an LLM gateway as the policy enforcement layer.
Kubernetes Node Hardening: From OS Configuration to kubelet Lockdown
A Kubernetes node is a Linux machine running kubelet, a container runtime, and your workloads.
GPU Workload Isolation: MIG, MPS, and vGPU Security Boundaries
Multi-tenant GPU sharing without isolation risks data leakage between workloads through shared GPU memory.
GPU Cost and Security Monitoring: Detecting Abuse and Optimising Spend
GPU compute costs between $2 and $30 per hour per device. A single unauthorised cryptocurrency mining pod running on an A100 for a weekend generates...
LLM Rate Limiting in Production: Token Budgets, Per-User Quotas, and Abuse Detection
Request-count rate limiting fails for LLM workloads because a single request can consume 100K tokens. Token-based rate limiting with per-user quotas and abuse detection prevents runaway costs and catches prompt injection probing before it escalates.
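A minimal sketch of the difference, assuming a simple fixed-window budget counted in tokens per user. The `TokenBudget` class, the user name, and the 150,000-token window are hypothetical; a production limiter would also need window expiry, persistence, and token estimation before the request runs.

```python
class TokenBudget:
    """Fixed-window budget counted in tokens rather than requests."""

    def __init__(self, tokens_per_window: int):
        self.tokens_per_window = tokens_per_window
        self.used = {}  # user id -> tokens consumed in the current window

    def allow(self, user: str, tokens: int) -> bool:
        """Admit the request only if it fits in the user's remaining budget."""
        spent = self.used.get(user, 0)
        if spent + tokens > self.tokens_per_window:
            return False
        self.used[user] = spent + tokens
        return True

    def reset_window(self) -> None:
        """Call at the start of each window (for example, once per minute)."""
        self.used.clear()


budget = TokenBudget(tokens_per_window=150_000)
first = budget.allow("alice", 100_000)   # one large request fits
second = budget.allow("alice", 100_000)  # a second one would exceed the budget
small = budget.allow("alice", 50_000)    # a smaller request still fits exactly
print(first, second, small)
```

A request-count limiter would treat all three calls as equal; the token budget rejects only the one that would overrun the window.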
Runtime Security with Falco on Kubernetes: Rules, Tuning, and Response Automation
Prevention-only security has a binary failure mode: either the control holds and the attacker is stopped, or the control fails and the attacker...
Kubernetes Network Policies That Actually Work: From Default Deny to Microsegmentation
By default, every pod in a Kubernetes cluster can communicate with every other pod across all namespaces. There are no network boundaries.
LLM Cost Controls: Budget Enforcement, Token Metering, and Spend Alerting
Without enforced budgets, a single team can exhaust an organization's entire AI spend in days. Token metering with per-team budgets, automatic request rejection at limits, model routing by cost, and chargeback dashboards turn LLM spending from a surprise into a managed line item.
Kubelet Security Configuration: Authentication, Authorization, and Read-Only Port
The kubelet runs on every node in the cluster with root-level access to the container runtime, all pod specifications, mounted secrets, and the host...
Kubernetes RBAC Design Patterns: Least Privilege Without Paralysing Developers
RBAC sprawl in multi-team Kubernetes clusters grows past 100 role bindings within months.
Kubernetes Secrets Management: External Secrets Operator, Vault, and Sealed Secrets
Kubernetes Secrets are base64-encoded, not encrypted. Anyone with RBAC read access to secrets in a namespace can decode every credential stored there.
AI Incident Forensics: Reconstructing What an AI System Did, Why, and What Data It Accessed
When a traditional application causes an incident, you examine logs, traces, and database queries to reconstruct what happened.
Hardening Model Inference Endpoints: Authentication, Rate Limiting, and Input Validation
Model inference endpoints are GPU-backed and expensive, $2-30 per hour per GPU. A single unprotected endpoint exposed to the internet can accumulate...
Kubernetes Admission Control: From PodSecurity Standards to Custom OPA/Kyverno Policies
Without admission control, any user with deployment permissions can run privileged containers, mount the host filesystem, use the host network, run...
AI Data Leakage Prevention: Input Filtering, Output Scanning, and Audit Trails
AI systems leak data in ways traditional applications do not. A language model trained on customer data can reproduce verbatim customer records in...
Jupyter Notebook Security: Authentication, Isolation, and Data Protection
JupyterHub is a code execution platform. Every notebook cell is arbitrary code running with whatever permissions the notebook server process has.
Multi-Tenancy Hardening in Kubernetes: Namespace Isolation, Resource Quotas, and Network Boundaries
Kubernetes namespaces provide logical separation, not security isolation. By default, pods in namespace A can send network traffic to pods in...
Building a Content Filtering Pipeline for LLM Applications: From Raw Input to Safe Output
A single content filter is not a pipeline. Most LLM deployments add one filter (usually on output) and call it done.
AI Red Teaming Methodology: Structured Adversarial Testing for LLM Applications
Traditional security testing (penetration testing, vulnerability scanning) does not cover AI-specific attack surfaces.
Kubernetes Image Policy Enforcement: Cosign, Notation, and Admission Webhooks
Without image policy enforcement, any container image from any registry can run in a Kubernetes cluster.
Securing RAG Pipelines: Vector Database Access Control, Document Poisoning, and Retrieval Filtering
Retrieval-Augmented Generation (RAG) adds a knowledge base to LLM applications: the model retrieves relevant documents before generating a response.
Pod Security Context Deep Dive: runAsNonRoot, readOnlyRootFilesystem, and Capabilities
Kubernetes SecurityContext has over 15 configurable fields, but most teams only set runAsNonRoot: true and consider the job done.
Vector Database Security: Access Control, Embedding Protection, and Query Isolation
Vector databases are the backbone of RAG (Retrieval-Augmented Generation) systems.
A/B Model Deployment Safety: Canary Rollouts, Traffic Splitting, and Automated Rollback for ML Models
Deploying a new ML model version is not the same as deploying a new application version.
Kubernetes API Server Hardening: Flags, Authentication, and Audit Logging
The API server is the front door to the Kubernetes cluster. Every kubectl command, every controller reconciliation, every pod scheduling decision,...
Seccomp Profiles for Production Workloads: Writing, Testing, and Deploying Custom Profiles
The default container runtime allows approximately 300 syscalls. A compromised container can use unshare to create new namespaces, clone to spawn...
etcd Encryption at Rest: Configuration, Key Rotation, and Performance Impact
Kubernetes Secrets are stored in etcd as base64-encoded plaintext. Base64 is an encoding, not encryption.
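The point that base64 is reversible without any key can be checked in a few lines. This sketch uses only the Python standard library; the secret value is a made-up placeholder.

```python
import base64

# Encode a placeholder credential the way a Secret's data field represents it.
stored = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Anyone who can read the stored value can reverse it; no key is involved.
recovered = base64.b64decode(stored).decode("utf-8")

print(stored)
print(recovered)
```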
Implementing AI Guardrails: Input Validation, Output Filtering, and Safety Classifiers in Production
Deploying an LLM without guardrails is deploying an application where any user can make it say or do anything.
Hardening Kubernetes Ingress Controllers: NGINX, Traefik, and Envoy Compared
The ingress controller is the internet-facing entry point to a Kubernetes cluster.
LLM Observability in Production: Monitoring Latency, Token Usage, Safety Violations, and Drift
Traditional application monitoring (CPU, memory, HTTP status codes, latency) tells you nothing about what an LLM is doing.
Hardening Model Serving Frameworks: TorchServe, Triton, and vLLM Security Configuration
Model serving frameworks ship with defaults optimised for development: management APIs exposed on all interfaces without authentication, model files...
Securing Fine-Tuning Pipelines: Data Isolation, Checkpoint Integrity, and Access Control
Fine-tuning pipelines are high-value targets. They consume expensive GPU hours, process proprietary training data, and produce model checkpoints that...
Hardening the Kubernetes Scheduler: Topology Constraints and Security-Aware Placement
The Kubernetes scheduler places pods on nodes based on resource availability and basic constraints.
Kubernetes Audit Log Analysis: What to Log, How to Query, and What to Alert On
Kubernetes audit logs record every request to the API server: who made the request, what they asked for, and whether it succeeded.
Securing Model Artifact Pipelines: From Training to Serving
Model files are opaque binaries ranging from 1GB to over 1TB. You cannot code-review a set of weights.
RLHF Data Protection: Securing Human Feedback Loops, Preference Data, and Reward Models
Reinforcement Learning from Human Feedback (RLHF) pipelines introduce unique security surfaces that standard ML training workflows do not have.
AI API Key Management: Rotation, Scoping, and Abuse Detection
AI services have turned API keys into direct spending controls. A leaked OpenAI or Anthropic key can generate thousands of dollars in charges within...
Prompt Injection Defence in Production: Input Validation, Output Filtering, and Monitoring
Prompt injection is the SQL injection of AI systems, the most common and most damaging attack class against LLM-powered applications.
Network Segmentation for AI Training Infrastructure
AI training clusters frequently share networks with production services. A training job that can reach the production database is one compromised...
Observability for LLM Applications: Token Usage, Latency Anomalies, and Output Classification
LLM-powered applications have unique observability requirements that standard APM tools do not address: token-based cost tracking (not just request...
Model Registry Access Control: Versioning, Signing, and Promotion Gates
Model registries are the bridge between training and production. A model pushed to the production registry gets served to users.
Kubernetes Service Account Token Security: Bound Tokens, Projected Volumes, and OIDC
Every pod in Kubernetes receives a service account token by default. In clusters running older configurations or without explicit hardening, these...