The Overlord service is responsible for accepting tasks, coordinating task distribution, creating locks around tasks, and returning statuses to callers. The Overlord can be configured to run in one of two modes: local or remote (local is the default).

Operating Modes

In local mode, the Overlord is also responsible for creating the Peons that execute tasks, so all Middle Manager and Peon configurations must be provided to the Overlord as well. Local mode is typically used for simple workflows. In remote mode, the Overlord and Middle Managers run as separate processes and the Overlord dispatches tasks to the Middle Managers; this is the recommended setup for production.
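The mode is selected with a single runner property. A minimal sketch, assuming the standard druid.indexer.runner.type property; verify the values against your Druid version's configuration reference:

```properties
# Local mode (default): the Overlord forks Peons itself
druid.indexer.runner.type=local

# Remote mode: tasks are dispatched to separately deployed Middle Managers
# druid.indexer.runner.type=remote
```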

Key Responsibilities

Task Acceptance

Receives and validates incoming indexing tasks

Task Distribution

Coordinates distribution of tasks to Middle Managers or Peons

Lock Management

Creates and manages locks around tasks to prevent conflicts

Status Reporting

Returns task statuses and progress to callers

Worker Blacklisting

Monitors task failures and blacklists problematic workers

Autoscaling

Automatically scales Middle Manager capacity based on task load

Configuration

For Apache Druid Overlord service configuration, see the Overlord section of the Druid configuration reference.

Running the Overlord

org.apache.druid.cli.Main server overlord

HTTP Endpoints

For a list of API endpoints supported by the Overlord, see the Druid API reference.
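As an illustration, two commonly used Overlord task endpoints are POST /druid/indexer/v1/task (submit a task) and GET /druid/indexer/v1/task/{taskId}/status (poll its status). The sketch below builds these requests with the Python standard library; the Overlord address and task spec are assumptions for illustration:

```python
import json
from urllib import request

# Assumed Overlord address; 8090 is the Overlord's default plaintext port.
OVERLORD = "http://localhost:8090"

def submit_task_request(task_spec: dict) -> request.Request:
    """Build (but do not send) a POST that submits an indexing task."""
    return request.Request(
        f"{OVERLORD}/druid/indexer/v1/task",
        data=json.dumps(task_spec).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def status_url(task_id: str) -> str:
    """URL that returns the status of a previously submitted task."""
    return f"{OVERLORD}/druid/indexer/v1/task/{task_id}/status"
```

Sending the request with request.urlopen() returns a JSON body containing the task ID, which can then be used to poll the status URL.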

Blacklisted Workers

If a Middle Manager's task failures exceed a threshold, the Overlord will blacklist that Middle Manager. This prevents problematic workers from receiving new tasks while they're experiencing issues.

Blacklisting Behavior

  1. Monitor task failures: The Overlord tracks task failures per Middle Manager against the configured threshold.
  2. Blacklist problematic workers: When a Middle Manager exceeds the failure threshold, it is added to the blacklist.
  3. Respect blacklist limits: No more than 20% of the Middle Managers can be blacklisted at any given time.
  4. Periodic whitelisting: Blacklisted Middle Managers are periodically whitelisted to allow recovery.

Configuration Variables

The following variables can be used to set the threshold and blacklist timeouts:
  • druid.indexer.runner.maxRetriesBeforeBlacklist (integer): Maximum number of task failures before a worker is blacklisted
  • druid.indexer.runner.workerBlackListBackoffTime (duration): Time period a worker remains blacklisted before being reconsidered
  • druid.indexer.runner.workerBlackListCleanupPeriod (duration): How often the Overlord checks for workers that can be removed from the blacklist
  • druid.indexer.runner.maxPercentageBlacklistWorkers (float): Maximum percentage of workers that can be blacklisted (default: 0.2, i.e. 20%)

Example Configuration

druid.indexer.runner.maxRetriesBeforeBlacklist=5
druid.indexer.runner.workerBlackListBackoffTime=PT15M
druid.indexer.runner.workerBlackListCleanupPeriod=PT5M
druid.indexer.runner.maxPercentageBlacklistWorkers=0.2
Blacklisting is a protective mechanism. If too many workers are failing, it may indicate a broader issue with your cluster configuration or data rather than individual worker problems.

Autoscaling

The autoscaling mechanisms currently in place are tightly coupled to specific deployment infrastructure, but the framework is in place for other implementations. The Druid community welcomes new implementations and extensions of the existing mechanisms.
If autoscaling is enabled, the Overlord can automatically adjust the number of Middle Managers based on task load:
New Middle Managers may be added when a task has been in pending state for too long.
This ensures that task queues don’t grow indefinitely during periods of high ingestion load.
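The pending-time trigger described above can be sketched as a simple check over the pending queue. The function name and threshold are illustrative assumptions, not Druid's provisioning-strategy API:

```python
def workers_to_add(pending_since: list[float], now: float,
                   max_pending_secs: float = 300.0) -> int:
    """Return 1 if any task has waited in the pending queue too long, else 0.

    pending_since: submission timestamps of tasks still waiting for a worker.
    """
    overdue = [t for t in pending_since if now - t > max_pending_secs]
    return 1 if overdue else 0
```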

Deployment-Specific Autoscaling

In some deployments, Middle Manager services are provisioned as cloud instances (for example, Amazon EC2) that register themselves with a deployment environment such as Galaxy.
When implementing autoscaling for your deployment:
  1. Monitor task queue depths
  2. Set appropriate scale-up thresholds to handle spikes
  3. Set conservative scale-down thresholds to avoid thrashing
  4. Test scaling behavior under load
  5. Consider cost vs. performance tradeoffs
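Guidelines 2 and 3 above can be sketched as a decision rule with a cooldown, so the cluster never scales up and down in rapid succession. All thresholds here are illustrative assumptions:

```python
def scale_decision(queue_depth: int, idle_workers: int,
                   secs_since_last_change: float,
                   cooldown_secs: float = 600.0) -> str:
    """Return 'up', 'down', or 'hold'."""
    if secs_since_last_change < cooldown_secs:
        return "hold"      # anti-thrashing: wait out the cooldown
    if queue_depth > 0 and idle_workers == 0:
        return "up"        # aggressive scale-up: tasks are waiting
    if queue_depth == 0 and idle_workers > 1:
        return "down"      # conservative scale-down: keep a spare worker
    return "hold"
```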

Task Distribution Workflow

At a high level, the Overlord distributes tasks to workers as follows:
  1. A client submits a task to the Overlord.
  2. The Overlord validates the task and persists its metadata.
  3. The Overlord acquires any locks the task requires.
  4. The task is assigned to an available, non-blacklisted Middle Manager (remote mode) or to a locally created Peon (local mode).
  5. The worker runs the task and reports progress.
  6. The Overlord records the final status and returns it to callers.

Architecture Integration

With Middle Managers

  • Assigns tasks to available Middle Managers
  • Monitors task execution and progress
  • Tracks worker health and blacklists problematic workers
  • Manages autoscaling of worker capacity

With Metadata Store

  • Stores task metadata and status
  • Maintains task lock information
  • Persists task history and audit logs

With Clients

  • Receives task submission requests
  • Returns task IDs and status information
  • Provides APIs for task management

With ZooKeeper

  • Coordinates distributed locking
  • Manages leader election for high availability
  • Publishes worker availability information

High Availability

The Overlord supports high availability through leader election. You can run multiple Overlord instances, and they will coordinate through ZooKeeper to ensure only one is active at a time. If the active Overlord fails, another will automatically take over.

Performance Considerations

To optimize Overlord performance:
  1. Size appropriately: The Overlord is typically lightweight but needs enough memory for task metadata
  2. Monitor task queues: High queue depths may indicate insufficient worker capacity
  3. Tune blacklisting parameters: Balance between giving workers chances to recover and protecting against bad workers
  4. Configure autoscaling: Set thresholds that match your workload patterns
  5. Use remote mode for production: Separating Overlord and Middle Managers provides better isolation and scalability
