The Tasks API allows you to retrieve, submit, and manage ingestion tasks in Apache Druid. Tasks are individual jobs that perform operations like data ingestion, querying, and compaction.

Task information and retrieval

Get an array of tasks

Retrieves an array of all tasks in the Druid cluster. Each task object includes the task ID, status, datasource, and related metadata. Optional query parameters narrow the result.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/tasks/?state=complete&datasource=wikipedia_api&createdTimeInterval=2015-09-12_2015-09-13&max=10&type=query_worker"
Query parameters:
state (string): Filter by task state: running, complete, waiting, or pending.
datasource (string): Filter tasks by datasource name.
createdTimeInterval (string): ISO-8601 interval for task creation time. Use _ as the delimiter (e.g., 2023-06-27_2023-06-28).
max (integer): Maximum number of complete tasks to return. Only applies when state=complete.
type (string): Filter by task type (e.g., index_parallel, query_worker).
[
    {
        "id": "query-223549f8-b993-4483-b028-1b0d54713cad-worker0_0",
        "groupId": "query-223549f8-b993-4483-b028-1b0d54713cad",
        "type": "query_worker",
        "createdTime": "2023-06-22T22:11:37.012Z",
        "statusCode": "SUCCESS",
        "duration": 17897,
        "location": {
            "host": "localhost",
            "port": 8101,
            "tlsPort": -1
        },
        "dataSource": "wikipedia_api",
        "errorMsg": null
    }
]
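
The query string in the request above can also be assembled programmatically. The following is a minimal Python sketch rather than an official client: the `build_tasks_url` helper and the `http://localhost:8888` base URL are hypothetical stand-ins for your own Router address.

```python
from urllib.parse import urlencode


def build_tasks_url(base_url: str, **filters) -> str:
    """Build a task-list URL, skipping filters whose value is None."""
    # Keep only the filters the caller actually set.
    params = {key: value for key, value in filters.items() if value is not None}
    url = f"{base_url.rstrip('/')}/druid/indexer/v1/tasks"
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical Router address; replace with ROUTER_IP:ROUTER_PORT.
url = build_tasks_url(
    "http://localhost:8888",
    state="complete",
    datasource="wikipedia_api",
    createdTimeInterval="2015-09-12_2015-09-13",
    max=10,
    type="query_worker",
)
print(url)
```

Dropping unset filters keeps the URL free of empty parameters, so the same helper works for both filtered and unfiltered listings.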

Get task payload

Retrieves the complete task configuration and specifications for a given task ID.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z"
Path parameters:
taskId (string, required): The unique identifier of the task.
Response fields:
task (string): The task ID.
payload (object): Complete task configuration including type, spec, dataSchema, ioConfig, and tuningConfig.
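
Task IDs like the one in this request embed a timestamp that contains `:` characters. The curl example passes the ID unescaped; a client may prefer to percent-encode it as a path segment. The sketch below shows one way to do that in Python. The `task_url` helper and the base URL are hypothetical, and whether your deployment needs the encoding is an assumption worth checking.

```python
from urllib.parse import quote


def task_url(base_url: str, task_id: str, suffix: str = "") -> str:
    """Return the URL for a task, optionally with a sub-resource such as 'status'."""
    # safe="" percent-encodes every reserved character, including ':' and '/'.
    encoded_id = quote(task_id, safe="")
    url = f"{base_url.rstrip('/')}/druid/indexer/v1/task/{encoded_id}"
    return f"{url}/{suffix}" if suffix else url


task_id = "index_parallel_wikipedia_short_iajoonnd_2023-07-07T17:53:12.174Z"
payload_url = task_url("http://localhost:8888", task_id)
status_url = task_url("http://localhost:8888", task_id, "status")
print(payload_url)
print(status_url)
```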

Get task status

Retrieves the current status of a task including state, duration, and error information.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-223549f8-b993-4483-b028-1b0d54713cad/status"
Path parameters:
taskId (string, required): The task ID to check status for.
Response fields:
statusCode (string): Current task status: SUCCESS, RUNNING, FAILED, or PENDING.
duration (integer): Task duration in milliseconds (-1 if still running).
location (object): Host, port, and TLS port where the task is running.
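
A common use of this endpoint is polling until a task finishes. The sketch below illustrates that loop under stated assumptions: it treats SUCCESS and FAILED as the terminal values of `statusCode`, and it accepts the status fields either at the top level of the response or nested under a `status` key, since this page does not show the full response shape. The helper names are hypothetical, and the HTTP call is injected as `fetch_status` so the example runs without a cluster.

```python
import time
from typing import Callable

# Assumed terminal values, taken from the statusCode list above.
TERMINAL_STATES = {"SUCCESS", "FAILED"}


def extract_status_code(response: dict) -> str:
    """Read statusCode from a status response, whether nested or flat."""
    nested = response.get("status")
    status = nested if isinstance(nested, dict) else response
    return status["statusCode"]


def wait_for_task(
    fetch_status: Callable[[], dict],
    poll_seconds: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the task reaches a terminal state, then return that state."""
    for _ in range(max_polls):
        code = extract_status_code(fetch_status())
        if code in TERMINAL_STATES:
            return code
        sleep(poll_seconds)
    raise TimeoutError(f"task not finished after {max_polls} polls")


# Demo with canned responses instead of a live cluster.
canned = iter([
    {"status": {"statusCode": "RUNNING", "duration": -1}},
    {"status": {"statusCode": "RUNNING", "duration": -1}},
    {"status": {"statusCode": "SUCCESS", "duration": 17897}},
])
result = wait_for_task(lambda: next(canned), sleep=lambda _seconds: None)
print(result)
```

In real use, `fetch_status` would issue the GET request shown above and return the decoded JSON body.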

Get task log

Retrieves the event log for a task, showing execution details, errors, and warnings.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/index_kafka_social_media_0e905aa31037879_nommnaeg/log"
Path parameters:
taskId (string, required): The task ID to retrieve logs for.
Query parameters:
offset (integer): Exclude the first N entries from the response.

Get task completion report

Retrieves the task completion report with ingestion statistics and parse exceptions.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/reports"
Response fields:
ingestionStatsAndErrors (object): Contains ingestion state, row statistics, and error information.
ingestionStatsAndErrors.payload.rowStats (object): Statistics on processed, unparseable, and thrown away rows.
{
    "ingestionStatsAndErrors": {
        "taskId": "query-52a8aafe-7265-4427-89fe-dc51275cc470",
        "payload": {
            "ingestionState": "COMPLETED",
            "rowStats": {
                "buildSegments": {
                    "processed": 39244,
                    "processedBytes": 17106256,
                    "processedWithError": 0,
                    "thrownAway": 0,
                    "unparseable": 0
                }
            }
        }
    }
}
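
The nested `rowStats` object is easier to act on once it is flattened into totals. The sketch below sums the counters across every phase present in the report, using the sample response above as input. The `summarize_row_stats` helper is hypothetical, and it assumes every entry under `rowStats` is an object of numeric counters like `buildSegments`.

```python
import json


def summarize_row_stats(report: dict) -> dict:
    """Sum row counters across all phases listed under rowStats."""
    row_stats = report["ingestionStatsAndErrors"]["payload"]["rowStats"]
    totals = {"processed": 0, "processedWithError": 0, "thrownAway": 0, "unparseable": 0}
    for phase_stats in row_stats.values():
        for key in totals:
            # Missing counters are treated as zero.
            totals[key] += phase_stats.get(key, 0)
    return totals


sample_report = json.loads("""
{
    "ingestionStatsAndErrors": {
        "taskId": "query-52a8aafe-7265-4427-89fe-dc51275cc470",
        "payload": {
            "ingestionState": "COMPLETED",
            "rowStats": {
                "buildSegments": {
                    "processed": 39244,
                    "processedBytes": 17106256,
                    "processedWithError": 0,
                    "thrownAway": 0,
                    "unparseable": 0
                }
            }
        }
    }
}
""")

totals = summarize_row_stats(sample_report)
rejected = totals["thrownAway"] + totals["unparseable"]
print(totals)
print("rejected rows:", rejected)
```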

Task operations

Submit a task

Submits a JSON-based ingestion spec to the Overlord. Returns the task ID.
For most batch ingestion, use the SQL-based ingestion API instead.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task" \
--header 'Content-Type: application/json' \
--data '{
  "type" : "index_parallel",
  "spec" : {
    "dataSchema" : {
      "dataSource" : "wikipedia_auto",
      "timestampSpec": {
        "column": "time",
        "format": "iso"
      },
      "dimensionsSpec" : {
        "useSchemaDiscovery": true
      },
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "day",
        "intervals" : ["2015-09-12/2015-09-13"]
      }
    },
    "ioConfig" : {
      "type" : "index_parallel",
      "inputSource" : {
        "type" : "local",
        "baseDir" : "quickstart/tutorial/",
        "filter" : "wikiticker-2015-09-12-sampled.json.gz"
      },
      "inputFormat" : {
        "type" : "json"
      }
    }
  }
}'
Request body fields:
type (string, required): Task type (e.g., index_parallel, index_hadoop).
spec (object, required): Task specification containing dataSchema, ioConfig, and tuningConfig.
{
    "task": "index_parallel_wikipedia_odofhkle_2023-06-23T21:07:28.226Z"
}
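
The same submission can be prepared from Python with the standard library. The sketch below only builds the request and parses a response body; it does not send anything. The helper names and the `http://localhost:8888` base URL are hypothetical, and the spec is trimmed to a placeholder.

```python
import json
from urllib.request import Request


def build_submit_request(base_url: str, spec: dict) -> Request:
    """Build a POST request carrying the ingestion spec as JSON."""
    body = json.dumps(spec).encode("utf-8")
    return Request(
        f"{base_url.rstrip('/')}/druid/indexer/v1/task",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_task_id(response_body: str) -> str:
    """Extract the task ID from the submission response."""
    return json.loads(response_body)["task"]


# Trimmed placeholder; a real spec needs dataSchema and ioConfig as shown above.
spec = {"type": "index_parallel", "spec": {}}
request = build_submit_request("http://localhost:8888", spec)

# Sending is left out here; urllib.request.urlopen(request) would perform it.
task_id = parse_task_id('{"task": "index_parallel_wikipedia_odofhkle_2023-06-23T21:07:28.226Z"}')
print(request.get_method(), request.full_url)
print(task_id)
```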

Shut down a task

Shuts down a running task by ID.
curl --request POST \
  "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/task/query-52a8aafe-7265-4427-89fe-dc51275cc470/shutdown"
Path parameters:
taskId (string, required): The ID of the task to shut down.
{
    "task": "query-52a8aafe-7265-4427-89fe-dc51275cc470"
}

Shut down all tasks for a datasource

Shuts down all tasks associated with a specific datasource.
curl --request POST \
  "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/datasources/wikipedia_auto/shutdownAllTasks"
Path parameters:
datasource (string, required): The datasource whose tasks should be shut down.
{
    "dataSource": "wikipedia_auto"
}
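
Both shutdown endpoints are POST requests with no body, which is easy to get wrong in clients that default to GET when no data is supplied. The sketch below builds the two requests in Python without sending them. The helper names and base URL are hypothetical.

```python
from urllib.parse import quote
from urllib.request import Request

INDEXER_PATH = "/druid/indexer/v1"


def shutdown_task_request(base_url: str, task_id: str) -> Request:
    """Build the POST request that shuts down a single task."""
    url = f"{base_url.rstrip('/')}{INDEXER_PATH}/task/{quote(task_id, safe='')}/shutdown"
    # method="POST" is explicit because urllib defaults to GET when data is None.
    return Request(url, method="POST")


def shutdown_datasource_request(base_url: str, datasource: str) -> Request:
    """Build the POST request that shuts down all tasks for a datasource."""
    url = (
        f"{base_url.rstrip('/')}{INDEXER_PATH}"
        f"/datasources/{quote(datasource, safe='')}/shutdownAllTasks"
    )
    return Request(url, method="POST")


single = shutdown_task_request("http://localhost:8888", "query-52a8aafe-7265-4427-89fe-dc51275cc470")
everything = shutdown_datasource_request("http://localhost:8888", "wikipedia_auto")
print(single.get_method(), single.full_url)
print(everything.get_method(), everything.full_url)
```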

Task management

Retrieve status objects for tasks

Retrieves status objects for a list of task IDs provided in the request body.
curl "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/taskStatus" \
--header 'Content-Type: application/json' \
--data '["index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z","index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z"]'
Request body:
taskIds (array, required): Array of task ID strings to retrieve status for.
{
    "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z": {
        "id": "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z",
        "status": "SUCCESS",
        "duration": 10630
    }
}
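
The response is keyed by task ID, so a client can group tasks by status and detect requested IDs that have no entry. The sketch below does both with the sample request and response above. The helper names are hypothetical, and treating an absent ID as "unknown to the cluster" is an assumption rather than documented behavior.

```python
import json
from collections import defaultdict


def group_by_status(statuses: dict) -> dict:
    """Map each status value to the list of task IDs reporting it."""
    groups = defaultdict(list)
    for task_id, status in statuses.items():
        groups[status["status"]].append(task_id)
    return dict(groups)


def missing_ids(requested: list, statuses: dict) -> list:
    """Return requested task IDs that have no entry in the response."""
    return [task_id for task_id in requested if task_id not in statuses]


requested = [
    "index_parallel_wikipedia_auto_jndhkpbo_2023-06-26T17:23:05.308Z",
    "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z",
]
response = json.loads("""
{
    "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z": {
        "id": "index_parallel_wikipedia_auto_jbgiianh_2023-06-26T23:17:56.769Z",
        "status": "SUCCESS",
        "duration": 10630
    }
}
""")

groups = group_by_status(response)
missing = missing_ids(requested, response)
print(groups)
print("no status returned for:", missing)
```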

Clean up pending segments

Manually cleans up the pending segments table in metadata storage for a datasource.
curl --request DELETE \
  "http://ROUTER_IP:ROUTER_PORT/druid/indexer/v1/pendingSegments/wikipedia_api"
Path parameters:
datasource (string, required): The datasource to clean up pending segments for.
{
    "numDeleted": 2
}
