Architecture overview
The following diagram shows the services that make up the Druid architecture, their typical arrangement across servers, and how queries and data flow through this architecture.Druid services
Druid has several types of services, each responsible for specific aspects of data processing and query execution:Coordinator
Manages data availability on the cluster
Overlord
Controls the assignment of data ingestion workloads
Broker
Handles queries from external clients
Router
Routes requests to Brokers, Coordinators, and Overlords
Historical
Stores queryable data
Middle Manager
Ingests data through Peon processes
Indexer
Alternative to Middle Manager + Peon task execution
Server types
For ease of deployment, we recommend organizing Druid services into three server types: Master, Query, and Data.Master server
A Master server manages data ingestion and availability. It is responsible for starting new ingestion jobs and coordinating availability of data on the Data server. Master servers divide operations between Coordinator and Overlord services.- Coordinator service
- Overlord service
The Coordinator service watches over the Historical services on the Data servers. It is responsible for:
- Segment assignment: Assigning segments to specific servers
- Load balancing: Ensuring segments are well-balanced across Historicals
- Replication: Managing segment replication across multiple nodes
- Segment lifecycle: Loading new segments and dropping outdated ones
How segment assignment works
Before any unassigned segments are serviced by Historical services, the Historical services for each tier are first sorted in terms of capacity, with least capacity servers having the highest priority. Unassigned segments are always assigned to the services with least capacity to maintain a level of balance between services.The Coordinator does not directly communicate with a Historical service when assigning it a new segment. Instead, the Coordinator creates temporary information about the new segment under the load queue path of the Historical service in ZooKeeper. Once this request is seen, the Historical service loads the segment and begins servicing it.Balancing strategies
The Coordinator employs different strategies to balance segments across Historicals:- cost (default): Picks the server with the minimum “cost” of placing a segment. The cost is a function of the data interval of the segment and the data intervals of all segments already present on the candidate server. This strategy tries to avoid placing segments with adjacent or overlapping data intervals on the same server.
- diskNormalized: A derivative of the cost strategy that weights the cost with the disk usage ratio of the server
- random: Distributes segments randomly across servers (experimental)
All strategies prioritize moving segments from the Historical with the least available disk space.
Automatic compaction
The Coordinator manages the automatic compaction system. Each run, the Coordinator compacts segments by merging small segments or splitting large ones. This is useful when the size of your segments is not optimized, which may degrade query performance.You can run the Coordinator and Overlord services as a single combined service by setting the
druid.coordinator.asOverlord.enabled property.Query server
A Query server provides the endpoints that users and client applications interact with, routing queries to Data servers or other Query servers (and optionally proxied Master server requests). Query servers divide operations between Broker and Router services.- Broker service
- Router service
The Broker service receives queries from external clients and forwards those queries to Data servers. When Brokers receive results from those subqueries, they merge those results and return them to the caller.
Query routing
To determine which services to forward queries to, the Broker service first builds a view of the world from information in ZooKeeper. ZooKeeper maintains information about Historical and streaming ingestion Peon services and the segments they are serving.For every datasource in ZooKeeper, the Broker service builds a timeline of segments and the services that serve them. When queries are received for a specific datasource and interval, the Broker service performs a lookup into the timeline associated with the query datasource for the query interval and retrieves the services that contain data for the query.Caching
Broker services employ a cache with an LRU cache invalidation strategy. The Broker cache stores per-segment results. The cache can be local to each Broker service or shared across multiple services using an external distributed cache such as memcached.Typically, you query Brokers rather than querying Historical or Middle Manager services on Data servers directly.
Data server
A Data server executes ingestion jobs and stores queryable data. Data servers divide operations between Historical and Middle Manager services.- Historical service
- Middle Manager service
- Indexer service (optional)
The Historical service handles storage and querying on historical data, including any streaming data that has been in the system long enough to be committed. Historical services download segments from deep storage and respond to queries about these segments. They don’t accept writes.
Loading and serving segments
Each Historical service copies or pulls segment files from deep storage to local disk in an area called the segment cache. The Coordinator controls the assignment of segments to Historicals and the balance of segments between Historicals.Historical services do not communicate directly with each other, nor do they communicate directly with the Coordinator. Instead, the Coordinator creates ephemeral entries in ZooKeeper in a load queue path. Each Historical service maintains a connection to ZooKeeper, watching those paths for segment information.Retrieve metadata
If not cached, Historical retrieves metadata from ZooKeeper about the segment, including where it’s located in deep storage
Memory-mapped caching
The segment cache uses memory mapping. The cache consumes memory from the underlying operating system so Historicals can hold parts of segment files in memory to increase query performance at the data level.At query time:- If the required part of a segment file is available in the memory mapped cache (“page cache”), the Historical reads it directly from memory
- If it is not in the memory-mapped cache, the Historical reads that part of the segment from disk
This memory-mapped segment cache is in addition to other query-level caches. To make data available for querying as soon as possible, Historical services search the local segment cache upon startup and advertise the segments found there.
Service colocation
Colocating Druid services by server type generally results in better utilization of hardware resources for most clusters. For very large scale clusters, it can be desirable to split the Druid services such that they run on individual servers to avoid resource contention.Coordinators and Overlords
The workload on the Coordinator service tends to increase with the number of segments in the cluster. The Overlord’s workload also increases based on the number of segments in the cluster, but to a lesser degree than the Coordinator.In clusters with very high segment counts, it can make sense to separate the Coordinator and Overlord services to provide more resources for the Coordinator’s segment balancing workload.
Historicals and Middle Managers
With higher levels of ingestion or query load, it can make sense to deploy the Historical and Middle Manager services on separate hosts to avoid CPU and memory contention. The Historical service also benefits from having free memory for memory mapped segments, which can be another reason to deploy the Historical and Middle Manager services separately.External dependencies
In addition to its built-in service types, Druid also has three external dependencies. These are intended to be able to leverage existing infrastructure, where present.Deep storage
Druid uses deep storage to store any data that has been ingested into the system. Deep storage is shared file storage accessible by every Druid server. In a clustered deployment, this is typically a distributed object store like S3 or HDFS, or a network mounted filesystem. In a single-server deployment, this is typically local disk.- Purpose
- Performance considerations
- Sizing
Druid uses deep storage for the following purposes:
- Data persistence: To store all the data you ingest. Segments that get loaded onto Historical services for low latency queries are also kept in deep storage for backup purposes.
- Query from deep storage: Segments that are only in deep storage can be used for queries from deep storage
- Data transfer: As a way to transfer data in the background between Druid services
Deep storage is an important part of Druid’s elastic, fault-tolerant design. Druid bootstraps from deep storage even if every single data server is lost and re-provisioned.
Metadata storage
The metadata storage holds various shared system metadata such as segment usage information and task information. In a clustered deployment, this is typically a traditional RDBMS like PostgreSQL or MySQL. In a single-server deployment, it is typically a locally-stored Apache Derby database.ZooKeeper
ZooKeeper is used for internal service discovery, coordination, and leader election. It maintains critical information about:- Historical and Peon services
- Segment distribution and availability
- Service health and status
- Coordination between services
Learn more
See the following topics for more information:Storage components
Learn about data storage in Druid
Segments
Learn about segment files
Query processing
High-level overview of how Druid processes queries
Web console
Learn about the Druid web console UI