Apache Druid can be deployed on Kubernetes using Docker containers and the druid-operator for simplified cluster management.
Docker Images
Official Druid Docker images are published on Docker Hub as apache/druid and can be pulled directly.
Pull Druid Image
Latest Release
Specific Version
Verify Image
Druid Operator
The druid-operator provides Kubernetes-native management of Druid clusters.
Features
Declarative Management: define cluster state with Kubernetes CRDs
Automatic Scaling: scale Druid components independently
Rolling Updates: zero-downtime upgrades and configuration changes
High Availability: built-in HA for all Druid components
Installation
Install the Operator
kubectl create namespace druid-operator
kubectl apply -f https://raw.githubusercontent.com/datainfrahq/druid-operator/master/config/manager/manager.yaml
Verify Installation
kubectl get pods -n druid-operator
Create Druid Cluster
Apply a Druid cluster manifest. The namespace named in the manifest (druid in the example below) must already exist:
kubectl apply -f druid-cluster.yaml
Example Cluster Configuration
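druid-cluster.yaml (the PersistentVolumeClaims it references, historical-data and middlemanager-data, are defined separately in persistent-volumes.yaml)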
apiVersion: druid.apache.org/v1alpha1
kind: Druid
metadata:
  name: druid-cluster
  namespace: druid
spec:
  image: apache/druid:28.0.0
  startScript: /druid.sh
  # Metadata storage
  metadataStore:
    type: postgresql
    host: postgres.druid.svc.cluster.local
    port: 5432
    database: druid
  # Deep storage
  deepStorage:
    type: s3
    bucket: my-druid-bucket
    baseKey: druid/segments
  # ZooKeeper
  zookeeper:
    zkHosts: zk-cs.druid.svc.cluster.local:2181
  # Common runtime properties
  common.runtime.properties: |
    druid.extensions.loadList=["druid-kafka-indexing-service", "druid-s3-extensions", "postgresql-metadata-storage"]
    druid.startup.logging.logProperties=true
  # Node specifications
  nodes:
    coordinators:
      nodeType: coordinator
      druid.port: 8081
      replicas: 2
      resources:
        requests:
          memory: "4Gi"
          cpu: "2"
        limits:
          memory: "4Gi"
          cpu: "2"
      runtime.properties: |
        druid.coordinator.startDelay=PT30S
        druid.coordinator.period=PT30S
    brokers:
      nodeType: broker
      druid.port: 8082
      replicas: 3
      resources:
        requests:
          memory: "8Gi"
          cpu: "4"
        limits:
          memory: "8Gi"
          cpu: "4"
      runtime.properties: |
        druid.broker.http.numConnections=10
        druid.server.http.numThreads=40
    historicals:
      nodeType: historical
      druid.port: 8083
      replicas: 3
      resources:
        requests:
          memory: "16Gi"
          cpu: "8"
        limits:
          memory: "16Gi"
          cpu: "8"
      runtime.properties: |
        druid.processing.numThreads=7
        druid.processing.buffer.sizeBytes=536870912
        druid.segmentCache.locations=[{"path":"/druid/data/segments","maxSize":"100g"}]
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
      volumes:
        - name: data-volume
          persistentVolumeClaim:
            claimName: historical-data
    middleManagers:
      nodeType: middleManager
      druid.port: 8091
      replicas: 2
      resources:
        requests:
          memory: "8Gi"
          cpu: "4"
        limits:
          memory: "8Gi"
          cpu: "4"
      runtime.properties: |
        druid.worker.capacity=4
        druid.indexer.runner.javaOpts=-Xms2g -Xmx2g
      volumeMounts:
        - mountPath: /druid/data
          name: data-volume
      volumes:
        - name: data-volume
          persistentVolumeClaim:
            claimName: middlemanager-data
    routers:
      nodeType: router
      druid.port: 8888
      replicas: 2
      resources:
        requests:
          memory: "1Gi"
          cpu: "1"
        limits:
          memory: "1Gi"
          cpu: "1"
      runtime.properties: |
        druid.router.http.numConnections=50
        druid.router.http.numMaxThreads=100
ZooKeeper-less Deployment
Druid can run on Kubernetes without ZooKeeper by using the druid-kubernetes-extensions extension.
Enable Kubernetes Extensions
Load Extension
druid.extensions.loadList=["druid-kubernetes-extensions", ...]
Configure Discovery
# Disable ZooKeeper and use HTTP-based segment and task management
druid.zk.service.enabled=false
druid.serverview.type=http
druid.coordinator.loadqueuepeon.type=http
druid.indexer.runner.type=httpRemote
# Kubernetes-specific settings
druid.discovery.type=k8s
druid.discovery.k8s.clusterIdentifier=druid-cluster
Leader Election
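# With druid.discovery.type=k8s, Coordinator and Overlord leader election uses Kubernetes ConfigMaps (no ZooKeeper needed).
# Optional: namespace that holds the leader-election ConfigMaps
druid.discovery.k8s.coordinatorLeaderElectionConfigMapNamespace=druid
druid.discovery.k8s.overlordLeaderElectionConfigMapNamespace=druid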
ZooKeeper-less mode requires Kubernetes 1.19+ and uses Kubernetes ConfigMaps for coordination.
Service Exposure
Internal Services
Create Kubernetes Services for inter-pod communication:
apiVersion: v1
kind: Service
metadata:
  name: druid-broker
  namespace: druid
spec:
  type: ClusterIP
  ports:
    - port: 8082
      targetPort: 8082
      name: broker
  selector:
    nodeType: broker
---
apiVersion: v1
kind: Service
metadata:
  name: druid-router
  namespace: druid
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8888
      name: router
  selector:
    nodeType: router
External Access
LoadBalancer
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8888
Automatically provisions a cloud load balancer (AWS ELB, GCP LB, etc.).
Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: druid-ingress
  namespace: druid
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - druid.example.com
      secretName: druid-tls
  rules:
    - host: druid.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: druid-router
                port:
                  # Service port of druid-router as defined above (forwards to targetPort 8888)
                  number: 80
NodePort
spec:
  type: NodePort
  ports:
    - port: 8888
      nodePort: 30888
      targetPort: 8888
Access via http://<node-ip>:30888
Resource Management
Resource Requests and Limits
Set appropriate requests and limits to ensure Kubernetes schedules pods efficiently and prevents resource contention.
resources:
  requests:
    memory: "8Gi"  # Guaranteed memory
    cpu: "4"       # Guaranteed CPU
  limits:
    memory: "8Gi"  # Maximum memory (should match requests for production)
    cpu: "4"       # Maximum CPU
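For production, set memory limits equal to requests so that the node cannot overcommit memory for the pod. This reduces the risk of eviction under node memory pressure and keeps performance consistent. A container that exceeds its own memory limit is still OOM-killed.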
Quality of Service Classes
Guaranteed (Recommended)
resources:
  requests:
    memory: "8Gi"
    cpu: "4"
  limits:
    memory: "8Gi"  # Same as request
    cpu: "4"       # Same as request
Best for production workloads. Guaranteed pods are the last to be evicted when a node runs short of resources.
Burstable
resources:
  requests:
    memory: "4Gi"
    cpu: "2"
  limits:
    memory: "8Gi"  # Higher than request
    cpu: "4"       # Higher than request
Good for variable workloads. Containers can burst above their requests, up to their limits.
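To make the distinction between the two classes concrete, the sketch below applies the Kubernetes QoS rules to a single container's resources block. It is a simplification under stated assumptions: the pod has one container, and quantities are compared as plain strings, so "4" and "4000m" would be treated as different even though Kubernetes considers them equal. The function name qos_class is hypothetical.

```python
def qos_class(resources: dict) -> str:
    """Classify a single-container pod as Guaranteed, Burstable or BestEffort."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})

    # No requests and no limits at all: BestEffort.
    if not requests and not limits:
        return "BestEffort"

    # Guaranteed needs a CPU and a memory limit, each equal to its request.
    # A missing request defaults to the limit, as Kubernetes does.
    for name in ("cpu", "memory"):
        limit = limits.get(name)
        if limit is None or requests.get(name, limit) != limit:
            return "Burstable"
    return "Guaranteed"


guaranteed = {
    "requests": {"memory": "8Gi", "cpu": "4"},
    "limits": {"memory": "8Gi", "cpu": "4"},
}
burstable = {
    "requests": {"memory": "4Gi", "cpu": "2"},
    "limits": {"memory": "8Gi", "cpu": "4"},
}

print(qos_class(guaranteed))
print(qos_class(burstable))
```

On a running cluster, the class that Kubernetes actually assigned is reported in the pod's status.qosClass field.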
Monitoring and Observability
Prometheus Integration
Enable Prometheus Emitter
druid.extensions.loadList=["prometheus-emitter", ...]
druid.emitter=prometheus
druid.emitter.prometheus.strategy=exporter
druid.emitter.prometheus.port=8000
Create ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: druid-metrics
  namespace: druid
spec:
  selector:
    matchLabels:
      app: druid
  endpoints:
    - port: metrics
      interval: 30s
Health Checks
livenessProbe:
  httpGet:
    path: /status/health
    port: 8082  # Broker port; use each component's own druid.port
  initialDelaySeconds: 60
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /status/health
    port: 8082
  initialDelaySeconds: 30
  periodSeconds: 5
  timeoutSeconds: 3
  failureThreshold: 3
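The same endpoint can be checked by hand when debugging a probe failure. The sketch below is one minimal way to do that, assuming the service is reachable from where the script runs (for example through kubectl port-forward). It treats any HTTP 200 from /status/health as healthy, which mirrors how an httpGet probe judges success. The function name is_healthy is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/status/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/status/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure or a non-2xx status.
        return False
```

For example, after forwarding a broker pod's port 8082 to localhost, is_healthy("http://localhost:8082") should return True once the process has started.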
Best Practices
Use Init Containers for Dependencies
Ensure dependencies (ZooKeeper, metadata store) are ready before Druid starts:
initContainers:
  - name: wait-for-postgres
    image: busybox:1.35
    command:
      - sh
      - -c
      - |
        until nc -z postgres.druid.svc.cluster.local 5432; do
          echo "Waiting for PostgreSQL..."
          sleep 2
        done
Use Pod Anti-Affinity
Spread replicas across different nodes:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
            - key: nodeType
              operator: In
              values:
                - broker
        topologyKey: kubernetes.io/hostname
Use Horizontal Pod Autoscaler
Use a Horizontal Pod Autoscaler for dynamic scaling:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: druid-broker-hpa
  namespace: druid
spec:
  scaleTargetRef:
    # Adjust kind and name to the workload that actually runs the brokers
    apiVersion: apps/v1
    kind: Deployment
    name: druid-broker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
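To see how the 70% target translates into replica counts, the sketch below applies the scaling rule from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), and clamps the result to minReplicas and maxReplicas. It is an approximation: it includes the default 10% tolerance band but ignores stabilization windows, pod readiness and missing metrics. The function name desired_replicas is hypothetical.

```python
import math


def desired_replicas(current: int, current_util: float, target_util: float,
                     min_replicas: int, max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Approximate the HPA replica calculation for a utilization target."""
    ratio = current_util / target_util
    # Inside the tolerance band the HPA leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        wanted = current
    else:
        wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# Settings from the example: target 70% CPU, 2 to 10 replicas.
print(desired_replicas(3, 90, 70, 2, 10))   # three brokers at 90% CPU
print(desired_replicas(2, 35, 70, 2, 10))   # two brokers at 35% CPU
```

Three brokers averaging 90% CPU scale up to four, while two brokers at 35% stay at two because minReplicas prevents scaling down further.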
Troubleshooting
View Logs
kubectl logs -f <pod-name> -n druid
Describe Pod
kubectl describe pod <pod-name> -n druid
Shell into Pod
kubectl exec -it <pod-name> -n druid -- /bin/bash
Check Events
kubectl get events -n druid --sort-by='.lastTimestamp'
Additional Resources
Druid Operator: Kubernetes operator repository (datainfrahq/druid-operator)
Docker Hub: official Druid Docker images
Kubernetes Extensions: ZooKeeper-less deployment guide
Helm Charts: community Helm charts