# How AI Is Compressing the Attacker Timeline: What Defenders Need to Change Now
## Problem
The gap between vulnerability disclosure and weaponised exploit used to be measured in weeks. In 2020, the median time from CVE publication to first observed exploitation was 42 days. By 2024, it was under 7 days for critical vulnerabilities. AI-assisted exploit development is compressing this further, to hours.
This is not a theoretical projection. It is happening now:
- **Automated vulnerability discovery:** AI models are finding real, exploitable vulnerabilities in production codebases faster than human security researchers. LLMs directed at source code identify bug classes (buffer overflows, type confusion, race conditions) that traditional static analysis misses, because they understand code semantics, not just patterns.
- **AI-generated exploit code:** Given a CVE description and a proof-of-concept stub, an LLM can generate a working exploit chain. The barrier to exploitation has dropped from “skilled researcher with weeks of effort” to “anyone with API access and hours of iteration.”
- **Polymorphic payloads:** AI generates unique payload variants for every attack. Each phishing email is original. Each malware sample has a different signature. Each exploit variant uses different code paths to achieve the same objective. Signature-based detection (WAF CRS rules, static Falco rules, antivirus signatures) was designed for a world where attackers reuse known patterns. That world is ending.
- **Scaled social engineering:** AI produces personalised phishing at industrial volume, referencing real projects, mimicking the target’s communication style, and creating pretexts from publicly available information. The era of “obviously fake” phishing is over.
The defender’s historical advantage was time. AI is erasing it. Security architectures built on 7-day patch SLAs, signature-based detection, and perimeter trust are now operating on assumptions that no longer hold.
## Threat Model
- **Adversary:** Attacker using AI-assisted tooling. Not a nation-state exclusive: these tools are available to anyone with LLM API access or local model hosting capability.
- **Access level:** Varies. Network-based attacks (automated scanning + AI-generated exploits) require no initial access. Social engineering attacks (AI-generated phishing) target credential theft. Post-compromise actions (AI-adapted persistence, lateral movement) require an initial foothold.
- **Objective:** Same as traditional adversaries: access, exfiltration, persistence, destruction. AI does not change the objective. It changes the speed, scale, and cost.
- **Blast radius:** AI does not change what can be compromised. It changes how fast. A vulnerability that previously had a 14-day window for patching now has a 24-48 hour window. An organisation that takes 7 days to deploy a critical patch is now 5 days too slow.
**The key shift:** Defenders must assume that any publicly disclosed vulnerability will be exploited within 48 hours. Defenders must assume that signature-based detection will miss AI-generated attack variants. Defenders must assume that phishing will be indistinguishable from legitimate communication.
## Configuration

### Compressing Your Patch Pipeline
Your patch deployment pipeline is now your primary security control. If it takes 7 days to deploy a critical security patch, you have a 5-day window where you are knowingly vulnerable.
**Target state:** Critical vulnerabilities patched in production within 24 hours of detection.
**Automated vulnerability scanning in CI:**
```yaml
# .github/workflows/security-scan.yml
# Runs on every push and on a schedule (catch new CVEs in existing images).
name: Security Scan

on:
  push:
    branches: [main]
  schedule:
    - cron: '0 */6 * * *'  # Every 6 hours

jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build image
        run: docker build -t app:${{ github.sha }} .
      - name: Scan with Trivy
        uses: aquasecurity/trivy-action@0.28.0
        with:
          image-ref: 'app:${{ github.sha }}'
          format: 'json'
          output: 'trivy-results.json'
          severity: 'CRITICAL,HIGH'
          exit-code: '1'  # Fail the build on critical/high CVEs
      - name: Upload results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: trivy-results
          path: trivy-results.json
```
**Automated dependency update with auto-merge for patch versions:**
```yaml
# .github/dependabot.yml
version: 2
updates:
  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "daily"
  # Auto-merge patch version updates (e.g., 1.2.3 → 1.2.4)
  # These are almost always security patches.
  - package-ecosystem: "gomod"
    directory: "/"
    schedule:
      interval: "daily"
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "daily"
```
```yaml
# .github/workflows/auto-merge-patches.yml
# Auto-merge Dependabot patch updates that pass all tests.
# Requires "Allow auto-merge" to be enabled in the repository settings,
# since `gh pr merge --auto` relies on it.
name: Auto-merge patches

on:
  pull_request:

permissions:
  pull-requests: write
  contents: write

jobs:
  auto-merge:
    runs-on: ubuntu-latest
    if: github.actor == 'dependabot[bot]'
    steps:
      - uses: dependabot/fetch-metadata@v2
        id: metadata
      - name: Auto-merge patch updates
        if: steps.metadata.outputs.update-type == 'version-update:semver-patch'
        run: gh pr merge --auto --squash "$PR_URL"
        env:
          PR_URL: ${{ github.event.pull_request.html_url }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
```
**Staged rollout for security patches:**
```yaml
# Kubernetes deployment with canary rollout.
# Apply the patched image to canary first, verify, then full rollout.
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: app
spec:
  strategy:
    canary:
      steps:
        - setWeight: 5           # 5% of traffic to canary
        - pause: {duration: 5m}  # Monitor for 5 minutes
        - setWeight: 25
        - pause: {duration: 5m}
        - setWeight: 75
        - pause: {duration: 5m}
      # Automatic rollback if canary metrics degrade
      analysis:
        templates:
          - templateName: success-rate
        startingStep: 1
        args:
          - name: service-name
            value: app
```
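The Rollout references a `success-rate` analysis template that is not shown. A hedged sketch of what it might look like: the Prometheus address, metric names, and 99% threshold are illustrative assumptions, not values from this article.

```yaml
# analysis-template.yaml (illustrative)
# AnalysisTemplate consulted during the canary; the Rollout fails and
# rolls back automatically if the success condition is not met.
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  args:
    - name: service-name
  metrics:
    - name: success-rate
      interval: 1m
      successCondition: result[0] >= 0.99  # 99% non-5xx responses (assumed SLO)
      failureLimit: 1
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090  # assumed in-cluster address
          query: >
            sum(rate(http_requests_total{service="{{args.service-name}}",code!~"5.."}[5m]))
            /
            sum(rate(http_requests_total{service="{{args.service-name}}"}[5m]))
```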
**Break-glass procedure for zero-day critical patches:**

When a zero-day critical vulnerability is disclosed and active exploitation is confirmed:

- Skip normal PR review; deploy directly from a security branch
- Run automated tests only (no manual review gate)
- Deploy to production canary immediately
- Monitor canary for 10 minutes (not the normal 5-minute wait per stage)
- If the canary passes, full rollout
- Post-hoc review within 24 hours: the person who deployed reviews the change with a second engineer
- Document in the incident log with justification for bypassing normal review

This procedure must be tested quarterly. If the team has never used break-glass, it will fail under the pressure of a real zero-day.
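The break-glass path can be codified before it is ever needed. A hedged sketch as a manually triggered GitHub Actions workflow; the filename, branch name, inputs, and `make` targets are illustrative assumptions, not a prescribed implementation:

```yaml
# .github/workflows/break-glass-deploy.yml (hypothetical)
# Manually triggered; the required justification is logged for post-hoc review.
name: Break-glass deploy

on:
  workflow_dispatch:
    inputs:
      cve:
        description: 'CVE being patched'
        required: true
      justification:
        description: 'Why normal review is being bypassed'
        required: true

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production      # environment protection rules still apply if configured
    steps:
      - uses: actions/checkout@v4
        with:
          ref: security-hotfix   # deploy from the security branch, not main (assumed name)
      - name: Automated tests only
        run: make test           # illustrative test target; no manual review gate
      - name: Record justification for post-hoc review
        run: |
          echo "BREAK-GLASS: ${{ inputs.cve }} deployed by ${{ github.actor }}"
          echo "Justification: ${{ inputs.justification }}"
      - name: Deploy to production canary
        run: make deploy-canary  # illustrative deploy target
```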
### Moving from Signatures to Behavioural Detection
Signature-based detection matches “known bad.” Behavioural detection detects “different from known good.” Against AI-generated polymorphic attacks, only the second approach works.
**Establishing process execution baselines with Falco:**
```yaml
# falco-rules-behavioural.yaml
# These rules detect deviation from expected behaviour per container image,
# not generic "bad" patterns.

# Rule: Web server containers should never spawn a shell.
- rule: Shell in Web Container
  desc: A shell was spawned inside a container running a web server image.
  condition: >
    spawned_process
    and container
    and container.image.repository in (nginx, httpd, caddy, envoy)
    and proc.name in (bash, sh, dash, zsh, csh, ksh)
  output: >
    Shell spawned in web container
    (container=%container.name image=%container.image.repository
    process=%proc.name parent=%proc.pname user=%user.name)
  priority: WARNING
  tags: [behavioural, container, shell]

# Rule: Database containers should never make outbound connections
# to the internet (only to known replication peers and monitoring).
- rule: Unexpected Outbound from Database
  desc: A database container made a network connection to an unexpected destination.
  condition: >
    evt.type in (connect)
    and container
    and container.image.repository in (postgres, mysql, mariadb, mongo, redis)
    and fd.sip != "0.0.0.0"
    and not fd.sip in (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16, 127.0.0.0/8)
  output: >
    Database container connecting to external IP
    (container=%container.name image=%container.image.repository
    dest=%fd.sip:%fd.sport process=%proc.name)
  priority: CRITICAL
  tags: [behavioural, network, exfiltration]

# Rule: Detect unexpected binary execution.
# After a baseline period, any binary not seen in the first 30 days
# of a container's life is suspicious.
- rule: Unexpected Binary Execution
  desc: A process was executed that is not in the expected binary set for this image.
  condition: >
    spawned_process
    and container
    and not proc.name in (expected_binaries_list)
  output: >
    Unexpected binary executed in container
    (container=%container.name image=%container.image.repository
    binary=%proc.name parent=%proc.pname)
  priority: NOTICE
  tags: [behavioural, process]
```
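`expected_binaries_list` in the last rule is not a Falco built-in; it would be maintained as a Falco list, populated per image from the baseline period. A minimal definition, with illustrative contents:

```yaml
# Maintained per image from the 30-day baseline; these entries are examples only.
- list: expected_binaries_list
  items: [node, npm, tini, grep, cat]
```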
**Network flow baselines with Prometheus:**
```yaml
# prometheus-recording-rules.yaml
# Create baseline metrics for network connections per service.
groups:
  - name: network-baselines
    interval: 5m
    rules:
      # Track unique destination IPs per source service over 24 hours.
      - record: security:unique_destinations:count_24h
        expr: >
          count by (source_workload, destination_ip) (
            rate(hubble_flows_processed_total{verdict="FORWARDED"}[24h])
          )
      # 7-day baseline, offset by one day so it does not overlap the
      # 24-hour window above (otherwise the alert below could never fire).
      - record: security:unique_destinations:count_7d
        expr: >
          count by (source_workload, destination_ip) (
            rate(hubble_flows_processed_total{verdict="FORWARDED"}[7d] offset 1d)
          )
      # Alert when a service connects to a destination not seen in the
      # past 7 days (new destination = potential lateral movement or C2).
      - alert: NewNetworkDestination
        expr: >
          security:unique_destinations:count_24h
          unless on (source_workload, destination_ip)
          security:unique_destinations:count_7d
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.source_workload }} connected to new destination {{ $labels.destination_ip }}"
          runbook: "Verify this is expected. New deployments and scaling events create new connections. Investigate if no deployment occurred."
```
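The `unless` join is just set subtraction: flag any (workload, destination) pair seen in the recent window but absent from the trailing baseline. A minimal sketch of that logic in Python, with made-up example data:

```python
def new_destinations(recent, baseline):
    """Return (workload, dest_ip) pairs seen recently but not in the baseline.

    recent:   set of (workload, dest_ip) pairs from the last 24 hours
    baseline: set of (workload, dest_ip) pairs from the prior 7 days
    """
    return recent - baseline


# Illustrative flow data, not real metrics.
baseline = {("api", "10.0.1.5"), ("api", "10.0.2.9"), ("worker", "10.0.3.1")}
recent = {("api", "10.0.1.5"), ("api", "203.0.113.7")}  # one unfamiliar destination

print(sorted(new_destinations(recent, baseline)))  # [('api', '203.0.113.7')]
```

The same reasoning explains the offset in the recording rule: if the baseline window included the last 24 hours, the subtraction would always be empty.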
### Detection for AI-Speed Attacks
When attacks happen at machine speed, human-speed response is too slow. Automated response is necessary for high-confidence detections.
```yaml
# response-policy.yaml
# Automated response for confirmed threats, driven by Falco events.
# NOTE: Falcosidekick itself only forwards events to outputs; the schema
# below is an illustrative response policy for a responder consuming those
# events (e.g. Falco Talon or a custom function triggered by Falcosidekick).
# High-confidence detections trigger immediate containment.
# Medium-confidence detections alert for human investigation.

# Delete the pod running a confirmed crypto miner.
# Only auto-delete if the detection is high-confidence: crypto mining has
# distinct signatures (known pool IPs, known binary names) that produce
# very few false positives.
- action: kubernetes
  parameters:
    event_severity: Critical
    rule_name: "Crypto Mining Detected"
    action: delete

# Isolate a pod exhibiting container escape behaviour by labelling it.
# A separate NetworkPolicy matches quarantine=true and blocks all
# ingress and egress.
- action: kubernetes
  parameters:
    event_severity: Critical
    rule_name: "Container Escape Attempt"
    action: label
    labels:
      quarantine: "true"
```
```yaml
# quarantine-network-policy.yaml
# Applied to pods labelled quarantine=true by automated response.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: quarantine
spec:
  podSelector:
    matchLabels:
      quarantine: "true"
  policyTypes:
    - Ingress
    - Egress
  # Empty ingress and egress = deny all traffic.
  ingress: []
  egress: []
```
### Hardening Authentication Against AI-Powered Phishing
AI-generated phishing is personalised and indistinguishable from legitimate communication. Password-based authentication (regardless of complexity) is now fundamentally broken against targeted phishing.
```bash
# Deploy FIDO2/WebAuthn authentication.
# This eliminates phishing entirely: the authentication is bound to
# the origin (domain), so a credential cannot be used on a fake site.

# For SSH: use security keys with OpenSSH 8.2+.
# Generate a FIDO2 SSH key:
ssh-keygen -t ed25519-sk -O resident -O verify-required

# For web applications: require WebAuthn for all admin accounts.
# Configuration is application-specific, but the principle is universal:
#   FIDO2/WebAuthn > TOTP > SMS > password-only

# For infrastructure access: Tailscale (#40) provides mesh VPN
# with SSO/MFA integration, eliminating password-based VPN access.
```
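On the SSH server side, access can be tightened further so only security-key-backed public keys are accepted at all. A sketch for `sshd_config`; note that `PubkeyAcceptedKeyTypes` was renamed `PubkeyAcceptedAlgorithms` in OpenSSH 8.5, so use the name matching your version:

```
# /etc/ssh/sshd_config: accept only FIDO2 security-key public keys.
PubkeyAcceptedAlgorithms sk-ssh-ed25519@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com
PasswordAuthentication no
KbdInteractiveAuthentication no
```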
## Expected Behaviour
After implementing the changes in this article:
- **Patch pipeline:** Critical vulnerability detected in CI → automated PR created → tests pass → auto-merged (patch version) or human-reviewed (minor/major) → deployed to canary → verified → full production rollout. Total time: under 24 hours for patch versions, under 48 hours with human review.
- **Behavioural detection:** Baselines established for process execution, network connections, and API call patterns within 30 days. Anomaly alerts fire within 1 minute of deviation. False positive rate below 5 per day after the 30-day tuning period.
- **Automated response:** Confirmed crypto mining pods terminated within 2 minutes. Container escape attempts quarantined (network-isolated) within 2 minutes. Human notification sent simultaneously.
- **Authentication:** All infrastructure admin accounts use FIDO2/WebAuthn. Phishing attacks against these accounts fail regardless of sophistication.
## Trade-offs
| Control | Impact | Risk | Mitigation |
|---|---|---|---|
| Auto-merge patch versions | Fastest patch deployment (minutes) | Patch version breaks backward compatibility (rare but possible) | Comprehensive automated test suite catches regressions. Canary deployment with automated rollback. |
| 24-hour patch SLA | Requires automated testing pipeline; removes human review bottleneck for patches | Insufficient test coverage means broken patches reach production | Invest in test coverage before implementing auto-merge. |
| Behavioural baselines (30-day learning) | No behavioural detection for new workloads during the learning period | Attacker targets new workloads before baseline is established | Use strict allowlists for new workloads; transition to baseline after learning period. |
| Automated pod termination | Instant containment for high-confidence threats | False positive kills a legitimate pod, causing service disruption | Only auto-terminate for detections with near-zero false positive rates (crypto mining, known escape techniques). Alert-only for lower-confidence detections. |
| FIDO2-only authentication | Eliminates phishing entirely | Hardware key cost ($25-90 per user). Key loss requires recovery procedure. | Issue two keys per user (primary + backup). Store backup in a secure location. Recovery process requires in-person identity verification. |
| Break-glass deployment | Bypasses normal review for zero-day response | Insufficient testing could push a breaking change | Mandatory post-hoc review within 24 hours. Automated rollback if canary degrades. Quarterly break-glass drills. |
## Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Auto-merged patch breaks production | Service degradation after Dependabot auto-merge | Canary metrics degrade; Argo Rollout pauses; Prometheus alerts fire | Automatic rollback via Argo Rollout. Add failing test case. Manually review the patch version that broke. |
| Behavioural baseline too narrow | Every deployment triggers alerts | Alert volume spikes 10x during deployment windows | Add deployment-window suppression (detect ArgoCD sync events, suppress behavioural alerts for 15 minutes post-deploy). |
| Automated response kills legitimate pod | Service outage from false positive pod termination | Service monitoring detects pod disappearance; Falcosidekick log shows auto-action | Pod restarts automatically (Deployment controller). Tune the detection rule that fired. Add exception for the specific workload if the detection is not applicable. |
| Baseline not established for new workload | No behavioural detection for 30 days after deployment | Gap in detection coverage visible in security dashboard (workloads without baselines) | Use strict process/network allowlists for new workloads (more restrictive than baseline, but immediate). |
| Break-glass procedure fails under pressure | Team does not know the process during a real zero-day | Zero-day response is chaotic; patch deployment takes 3 days instead of 24 hours | Quarterly break-glass drills. Document the procedure with step-by-step and assign roles (who triggers, who deploys, who monitors). |
| FIDO2 key lost | User locked out of all admin systems | User reports inability to authenticate | Issue replacement from pre-registered backup key. Revoke lost key. If no backup: in-person identity verification + new key registration. |
## When to Consider a Managed Alternative
**Transition point:** Behavioural detection at scale requires 30-90 days of stored historical data, ML-capable anomaly analysis, and cross-signal correlation across network, process, and API layers. Self-managed Falco and Prometheus can handle small deployments (under 10 nodes, under 1000 events per second). Beyond that, the storage, query, and analysis requirements exceed what open-source tooling provides without significant infrastructure investment.
**Recommended providers:**

- **Sysdig** (#122): Built on Falco with ML-powered behavioural detection. Managed detection rules updated for emerging AI-generated attack techniques. Runtime vulnerability detection knows whether vulnerable code is actually executing (not just present in the image). Multi-cluster visibility.
- **Cloudflare** (#29): AI-powered WAF that adapts to novel attack patterns at the edge. Bot detection distinguishes AI-generated automated attacks from legitimate traffic. Edge rate limiting absorbs volumetric attacks before they reach your infrastructure.
- **Panther** (#127): Detection-as-code SIEM with Python-based behavioural rules. Enables security teams to write correlation rules that combine signals across network, process, and API layers.
- **Elastic Security** (#129): ML anomaly detection across logs and metrics. Useful for teams already running Elasticsearch for log aggregation.
**What you still control:** Patch pipeline design and automation. Falco rule writing for application-specific behavioural detection. Authentication policy (FIDO2 enforcement). Automated response thresholds (when to auto-kill vs when to alert). These are your security decisions: managed providers give you better data and faster detection, but the response strategy is yours.