Securing Message Queues in Production: Kafka, RabbitMQ, and NATS Hardening
Problem
Message brokers carry some of the most sensitive data in any architecture: payment events, user actions, system commands, and PII in event streams. Yet Kafka, RabbitMQ, and NATS often run with no authentication between producers and consumers, no TLS for inter-broker communication, and no topic- or queue-level access control. A compromised service with message queue access can read every message in the system, inject fraudulent events, or disrupt processing by consuming messages from critical queues.
Threat Model
- Adversary: Compromised application with message queue credentials, or network attacker who can reach the broker port.
- Objective: Read messages from all topics/queues (data theft). Inject messages into processing queues (fraud, command injection). Consume messages from critical queues (denial of service to downstream consumers).
- Blast radius: Without access control, all topics, all queues, all messages. With per-topic ACLs, limited to the topics the compromised credential is authorised for.
Configuration
Kafka Hardening
# server.properties - Kafka broker security
# Authentication: SASL/SCRAM (password-based, no Kerberos dependency)
listeners=SASL_SSL://0.0.0.0:9093
advertised.listeners=SASL_SSL://kafka-0.example.com:9093
security.inter.broker.protocol=SASL_SSL
# SASL mechanism
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512
sasl.enabled.mechanisms=SCRAM-SHA-512
# TLS
ssl.keystore.location=/etc/kafka/kafka.keystore.jks
ssl.keystore.password=${KEYSTORE_PASSWORD}
ssl.truststore.location=/etc/kafka/kafka.truststore.jks
ssl.truststore.password=${TRUSTSTORE_PASSWORD}
ssl.protocol=TLSv1.3
# Authorization: ACLs. Note: kafka.security.authorizer.AclAuthorizer is for
# ZooKeeper-mode clusters; KRaft clusters use org.apache.kafka.metadata.authorizer.StandardAuthorizer
authorizer.class.name=kafka.security.authorizer.AclAuthorizer
super.users=User:admin
allow.everyone.if.no.acl.found=false
# CRITICAL: keep this false (deny by default). If set to true, any
# authenticated user can access topics that have no ACLs defined.
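Why SCRAM-SHA-512 rather than plain passwords: the broker never stores the password itself, only a salted, iterated hash. A minimal sketch of the SaltedPassword derivation from RFC 5802 (PBKDF2 over HMAC-SHA-512; Kafka's default iteration count is 4096) — illustration only, not Kafka's code:

```python
import hashlib
import os

# SCRAM SaltedPassword = PBKDF2-HMAC-SHA-512(password, salt, iterations).
# Kafka derives its stored SCRAM keys from this value, so a stolen
# credential store yields per-salt hashes, not replayable passwords.
def scram_salted_password(password: str, salt: bytes, iterations: int = 4096) -> bytes:
    return hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, iterations)

stored = scram_salted_password("strong-password-here", os.urandom(16))
```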
# Create SCRAM credentials for each service
kafka-configs.sh --bootstrap-server kafka:9093 --command-config admin.properties \
  --alter --entity-type users --entity-name payment-service \
  --add-config 'SCRAM-SHA-512=[password=strong-password-here]'
# Set ACLs: payment-service can produce to 'payments' topic only
kafka-acls.sh --bootstrap-server kafka:9093 --command-config admin.properties \
  --add --allow-principal User:payment-service \
  --producer --topic payments
# payment-service can consume from 'payment-results' topic only
kafka-acls.sh --bootstrap-server kafka:9093 --command-config admin.properties \
  --add --allow-principal User:payment-service \
  --consumer --topic payment-results --group payment-service-group
# Verify ACLs:
kafka-acls.sh --bootstrap-server kafka:9093 --command-config admin.properties --list
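On the client side, the matching settings look like this, sketched as a confluent-kafka (librdkafka) property dict; the bootstrap host, CA path, and credentials are placeholders for your environment:

```python
# Client-side counterpart of the SASL_SSL + SCRAM broker settings above,
# expressed as confluent-kafka (librdkafka) configuration properties.
def kafka_client_config(username: str, password: str) -> dict:
    return {
        "bootstrap.servers": "kafka-0.example.com:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "SCRAM-SHA-512",
        "sasl.username": username,
        "sasl.password": password,
        "ssl.ca.location": "/etc/kafka/ca.crt",  # pin the internal CA
    }

cfg = kafka_client_config("payment-service", "strong-password-here")
# from confluent_kafka import Producer          # requires confluent-kafka
# Producer(cfg).produce("payments", b"event")   # permitted by the ACL above;
# any other topic is rejected with an authorization error
```

With allow.everyone.if.no.acl.found=false, this credential can touch only the topics its ACLs name; everything else fails closed.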
RabbitMQ Hardening
# rabbitmq.conf - hardened configuration
# TLS
# Disable the unencrypted port; accept TLS connections only on 5671
listeners.tcp = none
listeners.ssl.default = 5671
ssl_options.cacertfile = /etc/rabbitmq/tls/ca.crt
ssl_options.certfile = /etc/rabbitmq/tls/server.crt
ssl_options.keyfile = /etc/rabbitmq/tls/server.key
ssl_options.versions.1 = tlsv1.3
ssl_options.verify = verify_peer
ssl_options.fail_if_no_peer_cert = false
# Set to true for mTLS (require client certificates)
# Management UI: TLS and separate port
management.ssl.port = 15671
management.ssl.cacertfile = /etc/rabbitmq/tls/ca.crt
management.ssl.certfile = /etc/rabbitmq/tls/server.crt
management.ssl.keyfile = /etc/rabbitmq/tls/server.key
# Default user: applied only on first node boot; rotate or remove afterwards
default_user = admin
default_pass = strong-admin-password
# Better: delete the default guest user after creating admin
# Create per-service users with specific permissions
rabbitmqctl add_user payment_service 'strong-password'
# Set permissions (three regexes: configure, write, read):
#   configure ""                    - cannot create or delete queues
#   write "^payments\..*"           - can publish to payments.* exchanges
#   read  "^payment-results\..*"    - can consume from payment-results.* queues
rabbitmqctl set_permissions -p / payment_service \
  "" \
  "^payments\\..*" \
  "^payment-results\\..*"
# Set user tags (no management access)
rabbitmqctl set_user_tags payment_service
# No tags = no management UI access
# Delete default guest user
rabbitmqctl delete_user guest
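The client-side TLS policy that pairs with the TLS-only listener can be built with the Python standard library alone; the pika usage shown in comments is an assumption for illustration, not part of the configuration above:

```python
import ssl

# Verify the broker certificate against the internal CA and insist on TLS 1.3,
# matching ssl_options.versions.1 = tlsv1.3 on the server side.
def broker_tls_context(ca_file=None) -> ssl.SSLContext:
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    if ca_file:
        ctx.load_verify_locations(ca_file)  # e.g. the CA behind /etc/rabbitmq/tls/ca.crt
    return ctx

ctx = broker_tls_context()
# With pika (assumed client library):
# params = pika.ConnectionParameters(
#     host="rabbitmq.example.com", port=5671,
#     credentials=pika.PlainCredentials("payment_service", "strong-password"),
#     ssl_options=pika.SSLOptions(ctx))
# channel = pika.BlockingConnection(params).channel()
```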
NATS Hardening
# nats-server.conf - hardened configuration
# TLS
tls {
  cert_file: "/etc/nats/tls/server.crt"
  key_file: "/etc/nats/tls/server.key"
  ca_file: "/etc/nats/tls/ca.crt"
  verify: true  # require and verify client certificates (mTLS)
  timeout: 2
}
# Authentication: NKey (Ed25519 public key)
authorization {
  users = [
    {
      nkey: "UDXB3GFMFPSTXIBQJYRJY5Z3MXHQ7HOP2OUXVP5FJHNDQ3S4KLENQ"
      permissions: {
        publish: {
          allow: ["payments.>"]
        }
        subscribe: {
          allow: ["payment-results.>"]
        }
      }
    }
  ]
}
# JetStream: enable with encryption at rest
jetstream {
  store_dir: "/data/nats"
  max_memory_store: 1G
  max_file_store: 10G
  # Encryption at rest (NATS 2.3+): uncomment to enable
  # cipher: chacha
  # key: "$NATS_ENCRYPTION_KEY"
}
# Cluster TLS
cluster {
  tls {
    cert_file: "/etc/nats/tls/cluster.crt"
    key_file: "/etc/nats/tls/cluster.key"
    ca_file: "/etc/nats/tls/ca.crt"
    verify: true
  }
}
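The subject permissions above rely on NATS wildcard matching: subjects are dot-separated tokens, "*" matches exactly one token, and ">" matches one or more trailing tokens, so "payments.>" covers "payments.eu.created" but not the bare subject "payments". A small sketch of those rules (my own illustration, not NATS code):

```python
# Minimal NATS-style subject matcher: '.' separates tokens,
# '*' matches exactly one token, '>' matches one or more trailing tokens.
def subject_matches(pattern: str, subject: str) -> bool:
    p_toks, s_toks = pattern.split("."), subject.split(".")
    for i, p in enumerate(p_toks):
        if p == ">":
            return len(s_toks) > i  # at least one token must remain
        if i >= len(s_toks):
            return False
        if p != "*" and p != s_toks[i]:
            return False
    return len(p_toks) == len(s_toks)

assert subject_matches("payments.>", "payments.eu.created")
assert not subject_matches("payments.>", "payments")
```

This is why granting "payments.>" for publish does not also authorise publishing to the parent subject itself.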
Common Patterns: Network Isolation
# Kubernetes NetworkPolicy for any message broker
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: broker-access
  namespace: messaging
spec:
  podSelector:
    matchLabels:
      app: kafka  # or rabbitmq, nats
  policyTypes:
    - Ingress
  ingress:
    # Only allow specific service namespaces to connect
    - from:
        - namespaceSelector:
            matchLabels:
              messaging-access: "allowed"
      ports:
        - port: 9093  # Kafka SASL_SSL
          protocol: TCP
        # - port: 5671  # RabbitMQ TLS
        # - port: 4222  # NATS TLS
Monitoring
groups:
  - name: message-queue-security
    rules:
      - alert: BrokerAuthFailure
        expr: increase(kafka_server_authentication_failed_total[5m]) > 5
        labels:
          severity: warning
        annotations:
          summary: "Kafka authentication failures: {{ $value }} in 5 minutes"
      - alert: UnauthorizedTopicAccess
        expr: increase(kafka_server_authorization_denied_total[5m]) > 0
        labels:
          severity: warning
        annotations:
          summary: "Kafka ACL denial: unauthorized topic access attempt"
      - alert: ConsumerLagAnomaly
        expr: >
          kafka_consumergroup_lag > 10000
          and kafka_consumergroup_lag > 5 * avg_over_time(kafka_consumergroup_lag[7d])
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Consumer group {{ $labels.group }} lag is 5x above baseline"
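The ConsumerLagAnomaly rule pairs an absolute floor with a relative baseline, so neither a busy-but-healthy group nor a small group with noisy lag fires the alert. The same condition sketched in plain Python (the constants mirror the rule above):

```python
# Lag is anomalous only if it exceeds BOTH an absolute floor (10k messages)
# and a multiple (5x) of the trailing average, mirroring the PromQL rule.
def lag_is_anomalous(current_lag: float, lag_history: list,
                     floor: float = 10_000, factor: float = 5.0) -> bool:
    if not lag_history:
        return current_lag > floor  # no baseline yet: fall back to the floor
    baseline = sum(lag_history) / len(lag_history)  # avg_over_time(...[7d])
    return current_lag > floor and current_lag > factor * baseline

assert lag_is_anomalous(60_000, [8_000, 9_000])       # above floor and 5x baseline
assert not lag_is_anomalous(12_000, [8_000, 9_000])   # above floor, near baseline
```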
Expected Behaviour
- All broker connections encrypted with TLS 1.3
- Each service authenticates with its own credentials (no shared passwords)
- Per-topic/queue ACLs restrict which services can produce to and consume from each topic
- Default/guest users disabled or deleted
- Inter-broker communication encrypted and authenticated
- Network policies restrict broker access to authorized namespaces only
Trade-offs
| Control | Impact | Risk | Mitigation |
|---|---|---|---|
| TLS on all connections | CPU overhead, typically a 5-15% throughput reduction; handshake cost is negligible for long-lived connections | Certificate management for all clients | Use cert-manager for automatic lifecycle. |
| Per-service credentials | More credentials to manage | Credential sprawl | Use Vault (#65) for dynamic broker credentials. |
| Per-topic ACLs | Restrict access to authorised topics | New services need ACL updates before they can produce/consume | Include ACL creation in the service deployment checklist. |
| Admin rights restricted to dedicated credentials | Prevents a compromised service credential from performing broker administration | Operations tooling may depend on admin access | Keep a separate admin identity (Kafka super.users, RabbitMQ administrator tag, NATS system account). |
Failure Modes
| Failure | Symptom | Detection | Recovery |
|---|---|---|---|
| TLS certificate expired | All new broker connections fail | cert-manager alerts; connection error logs from all clients | Renew the certificate and reload or restart brokers as needed. Kafka supports dynamic keystore reload via kafka-configs.sh; NATS supports hot reload via nats-server --signal reload. |
| ACL misconfigured | Service can’t produce/consume | Application logs show authorization error; Kafka ACL denial metric | Fix ACL with kafka-acls.sh / rabbitmqctl set_permissions / nats config. |
| Credential rotated without updating client | Client authentication fails | Auth failure metrics spike; application logs show SASL error | Update client credentials. Use Vault (#65) for automatic rotation. |
| Consumer group hijacked | Legitimate consumer starves (attacker consumes messages) | Consumer lag spikes; monitoring shows unexpected consumer in the group | Remove unauthorized consumer. Review and tighten ACLs. |
When to Consider a Managed Alternative
Self-managed broker HA is complex (Kafka ZooKeeper/KRaft, RabbitMQ quorum queues, NATS JetStream clustering), and TLS certificate rotation plus ACL management across many services are an ongoing operational burden.
- Confluent (#165): Managed Kafka with built-in RBAC, TLS, and Schema Registry. Free tier available.
- Redpanda (#166): Kafka-compatible, no ZooKeeper, simpler operations. Free OSS + managed cloud.
- CloudAMQP (#167): Managed RabbitMQ with TLS and user management. Free tier (Little Lemur).
- Synadia (#168): Managed NATS with JetStream.
Premium content pack: Message queue hardening configurations. Kafka SASL/TLS + ACL setup, RabbitMQ TLS + per-service permissions, NATS NKey + subject permissions, Kubernetes network policies, and Prometheus monitoring rules for all three brokers.