Log Integrity and Tamper Detection: Ensuring Your Audit Trail Is Trustworthy
Problem
An attacker’s first post-compromise action is often covering their tracks. On a Linux host, that means deleting /var/log/audit/audit.log, clearing journal entries, and modifying application logs to remove evidence. If your audit logs are stored only on the host the attacker compromised, they are worthless for investigation.
Most organisations ship logs centrally, but few verify integrity. A sophisticated attacker who compromises the log shipper, the aggregation layer, or the storage backend can modify logs in transit or at rest. Without integrity verification, you cannot prove your audit trail is complete and unmodified.
This matters for incident response (was the evidence tampered with?) and compliance (SOC 2 requires provable log integrity).
Target systems: Any Linux host with auditd. Vector or Fluentd for shipping. S3-compatible object storage for immutable archival.
Threat Model
- Adversary: Post-compromise attacker with root access on a host, attempting to erase forensic evidence. Or: attacker who has compromised the log pipeline (shipper, aggregator, or storage backend).
- Objective: Delete, modify, or suppress log entries that would reveal the attack timeline, techniques, and scope.
- Blast radius: Without log integrity, complete loss of forensic evidence. The incident cannot be investigated, root cause cannot be determined, and compliance audits fail.
Configuration
Ship Logs Off-Host Before the Attacker Can Delete Them
The most important control: minimise the time window between log generation and off-host shipment.
```yaml
# /etc/vector/vector.yaml
# Ship audit logs via Unix socket - lowest possible latency.
# auditd writes to the socket; Vector reads immediately.
sources:
  auditd_socket:
    type: unix_datagram
    path: /var/run/audispd_events
    max_length: 8192

transforms:
  parse:
    type: remap
    inputs: [auditd_socket]
    source: |
      .host = get_hostname!()
      .shipped_at = now()
      .source = "auditd"

sinks:
  # Primary: queryable storage
  axiom:
    type: axiom
    inputs: [parse]
    dataset: "audit-logs"
    token: "${AXIOM_API_TOKEN}"

  # Secondary: immutable archival
  s3_immutable:
    type: aws_s3
    inputs: [parse]
    bucket: "audit-logs-immutable"
    region: "eu-west-1"
    key_prefix: "{{ host }}/{{ timestamp }}"
    encoding:
      codec: json
    batch:
      max_bytes: 10485760
      timeout_secs: 60
```
Configure auditd to write to the Unix socket via audisp:
```ini
# /etc/audit/plugins.d/vector.conf
# Uses the audisp af_unix builtin: with "type = builtin", the path names the
# builtin plugin, and the socket permissions and location go in args.
active = yes
direction = out
path = builtin_af_unix
type = builtin
args = 0640 /var/run/audispd_events
format = string
```
Latency target: Log entries should arrive in off-host storage within 5 seconds of generation. With the Unix socket approach, typical latency is under 1 second.
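Fast shipping only helps if events are not dropped while the network is down. Vector sinks support per-sink disk buffers; a sketch of one on the S3 sink follows (the 1 GiB size and `when_full` choice are illustrative assumptions, size the buffer for your expected outage window):

```yaml
# Addition to /etc/vector/vector.yaml - disk buffer on the archival sink.
sinks:
  s3_immutable:
    # ...sink settings as shown above...
    buffer:
      type: disk
      max_size: 1073741824   # 1 GiB spooled locally during outages
      when_full: block        # apply backpressure rather than drop events
```

With `when_full: block`, Vector slows the source rather than silently discarding events, which preserves the completeness guarantee the hash chain depends on.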
Immutable Storage with S3 Object Lock
```bash
# Create bucket with Object Lock (WORM - Write Once Read Many)
aws s3api create-bucket \
  --bucket audit-logs-immutable \
  --region eu-west-1 \
  --object-lock-enabled-for-bucket \
  --create-bucket-configuration LocationConstraint=eu-west-1

# Set default retention: 365 days in Compliance mode.
# Compliance mode: even the root account cannot delete or shorten retention.
aws s3api put-object-lock-configuration \
  --bucket audit-logs-immutable \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {
      "DefaultRetention": {
        "Mode": "COMPLIANCE",
        "Days": 365
      }
    }
  }'
```
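After enabling the lock, confirm that the default retention actually applies to newly written objects. A quick check, following the bucket name used above (the probe key is illustrative):

```shell
# Confirm the bucket's default Object Lock configuration
aws s3api get-object-lock-configuration --bucket audit-logs-immutable

# Upload a probe object, then inspect its effective retention;
# it should show Mode=COMPLIANCE with a RetainUntilDate ~365 days out.
aws s3api put-object --bucket audit-logs-immutable --key probe.txt --body /dev/null
aws s3api get-object-retention --bucket audit-logs-immutable --key probe.txt
```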
IAM policy for the log shipper: write-only, with no read and no delete permissions:
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::audit-logs-immutable/*"
    },
    {
      "Effect": "Deny",
      "Action": [
        "s3:DeleteObject",
        "s3:DeleteObjectVersion",
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::audit-logs-immutable",
        "arn:aws:s3:::audit-logs-immutable/*"
      ]
    }
  ]
}
```
The shipper can write new log objects but cannot read, list, or delete existing ones. Even if the attacker compromises the shipper’s credentials, they cannot modify or delete already-shipped logs.
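The policy can be exercised directly with the shipper's credentials; writes should succeed while reads and deletes should fail with AccessDenied (the object key here is illustrative):

```shell
# Run with the shipper's credentials. Writing a new object succeeds...
aws s3 cp batch-0001.json s3://audit-logs-immutable/web-01/batch-0001.json

# ...but the explicit Deny statements block reads and deletes.
aws s3api get-object --bucket audit-logs-immutable \
  --key web-01/batch-0001.json /tmp/out.json      # expect AccessDenied
aws s3api delete-object --bucket audit-logs-immutable \
  --key web-01/batch-0001.json                    # expect AccessDenied
```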
For non-AWS: Backblaze (#161) B2 with Object Lock, or Wasabi (#162) with immutable bucket policy.
Hash-Chaining for Tamper Detection
Each log batch includes the SHA-256 hash of the previous batch, creating an append-only chain. Any modification, insertion, or deletion breaks the chain.
```python
#!/usr/bin/env python3
# hash-chain-shipper.py
# Wraps log batches with hash-chain integrity before shipping.
import hashlib
import json
import sys
import time

previous_hash = "GENESIS"  # First batch has no predecessor


def create_batch(log_entries: list) -> dict:
    global previous_hash
    batch = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "host": "web-01.example.com",
        "previous_hash": previous_hash,
        "entries": log_entries,
        "entry_count": len(log_entries),
    }
    # Hash the entire batch content (excluding the batch_hash field itself)
    batch_content = json.dumps(batch, sort_keys=True)
    batch["batch_hash"] = hashlib.sha256(batch_content.encode()).hexdigest()
    # Update the chain
    previous_hash = batch["batch_hash"]
    return batch


# Example usage:
# entries = [line.strip() for line in sys.stdin]
# batch = create_batch(entries)
# print(json.dumps(batch))
# Ship batch to S3 and Axiom
```
Verification script:
```python
#!/usr/bin/env python3
# verify-hash-chain.py
# Verifies the integrity of a hash-chained log archive.
# Detects: insertions, deletions, modifications, and reordering.
import glob
import hashlib
import json
import sys


def verify_chain(batch_files: list) -> bool:
    """Verify hash chain integrity across a series of batch files."""
    expected_previous = "GENESIS"
    errors = []
    for i, filepath in enumerate(sorted(batch_files)):
        with open(filepath) as f:
            batch = json.load(f)
        # Verify previous_hash links correctly
        if batch["previous_hash"] != expected_previous:
            errors.append(
                f"CHAIN BROKEN at batch {i} ({filepath}): "
                f"expected previous_hash={expected_previous}, "
                f"got {batch['previous_hash']}"
            )
        # Verify batch_hash is correct for the content
        stored_hash = batch.pop("batch_hash")
        recomputed = hashlib.sha256(
            json.dumps(batch, sort_keys=True).encode()
        ).hexdigest()
        batch["batch_hash"] = stored_hash  # Restore
        if stored_hash != recomputed:
            errors.append(
                f"CONTENT MODIFIED in batch {i} ({filepath}): "
                f"stored hash={stored_hash}, computed={recomputed}"
            )
        expected_previous = stored_hash
    if errors:
        for e in errors:
            print(f"FAIL: {e}", file=sys.stderr)
        return False
    print(f"OK: {len(batch_files)} batches verified, chain intact.")
    return True


# Usage: python3 verify-hash-chain.py '/path/to/batch-*.json'
# (quote the glob so the shell passes the pattern through)
if __name__ == "__main__":
    files = glob.glob(sys.argv[1]) if len(sys.argv) > 1 else []
    if not files:
        print("Usage: verify-hash-chain.py '/path/to/batch-*.json'")
        sys.exit(1)
    sys.exit(0 if verify_chain(files) else 1)
```
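To sanity-check the chaining logic end to end, the following self-contained sketch re-implements the create/verify pair in miniature, without the file I/O, and shows that a post-hoc modification is caught. The helper names (`chain_batches`, `chain_intact`) are illustrative, not part of the scripts above:

```python
import hashlib
import json


def chain_batches(batches: list) -> list:
    """Link batches with the same SHA-256 chaining scheme as the shipper."""
    prev = "GENESIS"
    chained = []
    for entries in batches:
        batch = {"previous_hash": prev, "entries": entries}
        content = json.dumps(batch, sort_keys=True)
        batch["batch_hash"] = hashlib.sha256(content.encode()).hexdigest()
        prev = batch["batch_hash"]
        chained.append(batch)
    return chained


def chain_intact(chained: list) -> bool:
    """True only if every link and every per-batch hash checks out."""
    prev = "GENESIS"
    for batch in chained:
        if batch["previous_hash"] != prev:
            return False  # insertion, deletion, or reordering
        body = {k: v for k, v in batch.items() if k != "batch_hash"}
        digest = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if digest != batch["batch_hash"]:
            return False  # content modified after hashing
        prev = batch["batch_hash"]
    return True


chain = chain_batches([["login root"], ["sudo su"], ["rm -rf /var/log"]])
print(chain_intact(chain))           # True on the untampered chain
chain[1]["entries"] = ["(nothing)"]  # simulate post-hoc modification
print(chain_intact(chain))           # False: batch 1's hash no longer matches
```

Because each batch hash covers the previous hash, an attacker who rewrites one batch would have to rewrite every later batch as well, and the write-only, Object-Locked archive prevents exactly that.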
Log Gap Detection
Monitor the expected log rate per host. If a host stops sending logs, it may be compromised.
```yaml
# Prometheus alert: log rate drops to zero for a host.
# Note: absent_over_time carries only the labels fixed by equality matchers,
# so {{ $labels.host }} renders empty unless you match on host explicitly
# (e.g. one rule per host, or host=~"..." replaced via rule templating).
groups:
  - name: log-integrity
    rules:
      - alert: LogGapDetected
        expr: >
          absent_over_time(
            vector_component_events_out_total{component_id="auditd_socket"}[5m]
          )
        labels:
          severity: critical
        annotations:
          summary: "No audit logs received from {{ $labels.host }} for 5 minutes"
          runbook: |
            CRITICAL: A host has stopped sending audit logs.
            Possible causes:
            1. Host is down (check uptime monitoring)
            2. Vector/auditd crashed (check process status)
            3. Host is compromised and the attacker killed the log shipper
            If the cause is unknown: treat as potential compromise.
            Do NOT SSH to the host; use out-of-band console access.

      - alert: LogRateAnomaly
        expr: >
          rate(vector_component_events_out_total{component_id="auditd_socket"}[5m])
          < 0.1 * avg_over_time(
            rate(vector_component_events_out_total{component_id="auditd_socket"}[5m])[7d:5m]
          )
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Audit log rate from {{ $labels.host }} dropped to 10% of normal"
          description: "May indicate selective log suppression by an attacker."
```
Expected Behaviour
- Audit logs arrive in off-host storage within 5 seconds of generation (typically <1 second with Unix socket)
- Immutable S3 storage prevents deletion or modification of stored logs for 365 days
- Log shipper credentials are write-only, compromising the shipper does not allow reading or deleting existing logs
- Hash-chain verification script detects any tampered, inserted, or deleted batch
- Log gap alert fires within 5 minutes of a host going silent
- Log rate anomaly alert detects selective log suppression within 10 minutes
Trade-offs
| Control | Impact | Risk | Mitigation |
|---|---|---|---|
| Immediate off-host shipping | Network bandwidth: 1-5GB/host/day | Shipper failure creates a gap | Vector’s disk buffer preserves events during network outages; replays when connection restores |
| S3 Object Lock (Compliance mode) | Cannot delete logs even if you want to (365-day lock) | Accidental sensitive data in logs cannot be removed | Filter PII before shipping. Review log content in staging before enabling immutable shipping in production. |
| Hash-chaining | Adds processing overhead to each batch | Verification requires downloading all batches in sequence | Run verification on a schedule (daily) rather than real-time. Keep batch sizes manageable (1-10MB). |
| Write-only IAM | Shipper cannot verify its own uploads | Upload failures are silent from the shipper’s perspective | Monitor S3 PutObject success rate via CloudWatch. Vector reports delivery success/failure metrics. |
| Log gap detection (5-minute window) | Attacker has up to 5 minutes to operate before the gap is detected | Not instant; some log suppression goes undetected for up to 5 minutes | Reduce the alert window to 2 minutes for high-security hosts and accept the increased false-positive rate from brief network glitches. |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| Vector crashes on host | Logs buffer locally but don’t ship | Log gap alert fires within 5 minutes; Vector restart count increases | Vector auto-restarts via systemd. Disk buffer replays missed events. Investigate why Vector crashed. |
| S3 IAM policy misconfigured | Attacker with shipper credentials can delete logs | Hash-chain verification fails; log count mismatch between primary and archival | Fix IAM policy. Restore from secondary storage if available. This is why dual-shipping (Axiom + S3) matters. |
| auditd killed by attacker | No new log entries generated on the compromised host | Log gap alert fires; auditd process not running | Do NOT SSH to the host. Use out-of-band console access. Capture forensic image. The already-shipped logs are the primary evidence. |
| Hash chain broken | Verification script reports chain break at a specific batch | verify-hash-chain.py identifies the exact batch with the break | Investigate: is the break from a legitimate issue (shipper restart, missed batch) or tampering? Compare primary (Axiom) and archival (S3) copies. |
| Network outage prevents shipping | Logs accumulate locally; gap in centralized storage | Vector backlog metrics increase; log gap alert fires | Vector disk buffer holds events (configure buffer size for expected outage duration). Events ship automatically when network restores. |
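The "Vector auto-restarts via systemd" recovery above assumes the unit is configured to restart on failure. A minimal drop-in sketch (the restart values are assumptions, not Vector's packaged defaults, so check your distribution's unit first):

```ini
# /etc/systemd/system/vector.service.d/override.conf
# Ensure the log shipper restarts promptly and never gives up.
[Unit]
StartLimitIntervalSec=0

[Service]
Restart=always
RestartSec=2
```

Apply with `systemctl daemon-reload && systemctl restart vector`. A shipper that flaps repeatedly will still surface through the LogGapDetected alert and Vector's restart count.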
When to Consider a Managed Alternative
Transition point: Building tamper-proof log infrastructure requires immutable storage configuration, a hash-chaining implementation, gap-detection alerting, and a dual-shipping pipeline. For teams without dedicated security engineering, this is 20-30 hours of initial setup.
- Axiom (#112): Immutable storage by design. All data in Axiom is append-only. No configuration needed for immutability. 500GB/month free tier. Serverless query. This is the simplest path to tamper-proof log storage.
- Grafana Cloud (#108): Managed Loki for log storage. Not inherently immutable, but provider-managed infrastructure is significantly harder for an attacker to compromise than self-managed.
- Better Stack (#113): Integrated logging + incident management. Managed storage with retention controls.
- Backblaze (#161) B2 / Wasabi (#162): Cheapest immutable object storage ($0.006/GB/month) for long-term archival alongside a queryable primary (Axiom or Grafana Cloud).
Premium content pack: Log integrity toolkit. Vector pipeline configs for dual-shipping (queryable + immutable), hash-chaining scripts (Python), verification scripts, S3 Object Lock configuration templates, Prometheus alert rules for gap detection, and IAM policy templates for write-only shipping.