Key Responsibilities
Query Routing
Determines which Historical and real-time services to forward queries to based on segment distribution
Result Consolidation
Merges and consolidates results from multiple services into a single response
Cache Management
Manages per-segment result caching with LRU invalidation strategy
Metadata Interpretation
Reads segment distribution information from ZooKeeper to build query plans
Configuration
For Apache Druid Broker service configuration, see the configuration reference:Running the Broker
Forwarding Queries
Most Druid queries contain an interval object that indicates a span of time for which data is requested. Similarly, Druid partitions segments to contain data for some interval of time and distributes the segments across a cluster.How Query Forwarding Works
Build view from ZooKeeper
The Broker service first builds a view of the world from information in ZooKeeper. ZooKeeper maintains information about Historical and streaming ingestion Peon services and the segments they are serving.
Build segment timeline
For every datasource in ZooKeeper, the Broker service builds a timeline of segments and the services that serve them.
Perform timeline lookup
When queries are received for a specific datasource and interval, the Broker service performs a lookup into the timeline associated with the query datasource for the query interval and retrieves the services that contain data for the query.
Example Scenario
Consider a simple datasource with seven segments where each segment contains data for a given day of the week. Any query issued to the datasource for more than one day of data will hit more than one segment. These segments will likely be distributed across multiple services, and hence, the query will likely hit multiple services.
Caching
Broker services employ a cache with an LRU (Least Recently Used) cache invalidation strategy. The Broker cache stores per-segment results.Cache Architecture
- Local Cache
- Distributed Cache
The cache can be local to each Broker service, providing fast access with no network overhead.
Cache Behavior
Each time a Broker service receives a query:Check cache for existing results
A subset of these segment results may already exist in the cache and the results can be directly pulled from the cache.
Forward uncached queries
For any segment results that do not exist in the cache, the Broker service will forward the query to the Historical services.
HTTP Endpoints
For a list of API endpoints supported by the Broker, see:Architecture Integration
The Broker service integrates with other Druid components:With ZooKeeper
- Maintains connection to ZooKeeper cluster for current cluster information
- Reads segment distribution metadata
- Monitors Historical and Peon service availability
With Historical Services
- Forwards queries to Historical services that hold relevant segments
- Receives and consolidates query results
- Does not manage segment loading/unloading (that’s the Coordinator’s job)
With Real-time Services
- Forwards queries for real-time data to streaming ingestion tasks
- Never caches real-time results due to data mutability