Deletion Overview
Druid supports two levels of deletion:
- Soft delete (mark unused): Data unavailable for queries but remains in deep storage
- Hard delete (kill): Permanently removes data from deep storage and metadata store
Delete by Time Range
Deleting data by time range happens in two steps: first a fast, metadata-only soft delete, then an optional kill task for permanent removal.

Mark Segments as Unused
Segments are marked "unused" via drop rules or manual API calls. This is a soft delete: data becomes unavailable for queries but remains in deep storage.
Manual Time Range Deletion
Use the Coordinator API to mark segments as unused. This is a soft delete: segments remain in deep storage until you run a kill task.
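As a sketch (assuming a datasource named `wikipedia`; verify the path against your version's Coordinator API reference), POST the interval to soft-delete to `/druid/coordinator/v1/datasources/wikipedia/markUnused` on the Coordinator:

```json
{
  "interval": "2023-01-01/2023-02-01"
}
```

The endpoint also accepts a `segmentIds` array in place of an interval to target individual segments.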
Automatic Deletion with Drop Rules
Use retention rules to automatically mark segments as unused based on time. For example, a two-rule set that:
- Loads segments from the last 30 days
- Drops all other segments
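The 30-day example above can be sketched as a rule set POSTed to `/druid/coordinator/v1/rules/{dataSource}` (the tier name and replica count are assumptions):

```json
[
  {
    "type": "loadByPeriod",
    "period": "P30D",
    "tieredReplicants": { "_default_tier": 2 }
  },
  { "type": "dropForever" }
]
```

Rules are evaluated top to bottom and the first match wins, so any segment older than 30 days falls through to `dropForever`.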
Dropped segments remain in deep storage. Enable auto-kill or use kill tasks for permanent deletion.
Delete Specific Records
Druid doesn't support deleting individual records directly. Instead, use reindexing with a filter to exclude unwanted data.

Native Batch Reindex with Filter
Filter out records during reindex:
- `transformSpec.filter` with `type: "not"` excludes matching records
- `inputSource.type: "druid"` reads from the existing datasource
- `appendToExisting: false` replaces the existing segments
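A minimal sketch of such a reindexing spec, assuming a `wikipedia` datasource, a one-month interval, and a `user` dimension whose value `spam_bot` should be removed:

```json
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "druid",
        "dataSource": "wikipedia",
        "interval": "2023-01-01/2023-02-01"
      },
      "appendToExisting": false
    },
    "dataSchema": {
      "dataSource": "wikipedia",
      "timestampSpec": { "column": "__time", "format": "millis" },
      "dimensionsSpec": { "dimensions": ["channel", "page", "user"] },
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "day",
        "intervals": ["2023-01-01/2023-02-01"]
      },
      "transformSpec": {
        "filter": {
          "type": "not",
          "field": { "type": "selector", "dimension": "user", "value": "spam_bot" }
        }
      }
    },
    "tuningConfig": { "type": "index_parallel" }
  }
}
```

The dimension list is illustrative; in practice it should match your datasource's schema.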
SQL REPLACE with Filter
Use SQL to exclude specific records.

Delete Multiple Values
Exclude multiple values using NOT IN or complex filters.

Delete by Condition
Remove records matching complex conditions.

Data excluded by reindexing is marked unused but still remains in deep storage. Run a kill task for permanent deletion.
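The three SQL patterns above might look like the following `REPLACE` statements (datasource and column names are assumptions), run one at a time on the multi-stage query task engine:

```sql
-- Delete specific records: keep everything except one user
REPLACE INTO "wikipedia" OVERWRITE ALL
SELECT * FROM "wikipedia"
WHERE "user" <> 'spam_bot'
PARTITIONED BY DAY

-- Delete multiple values
REPLACE INTO "wikipedia" OVERWRITE ALL
SELECT * FROM "wikipedia"
WHERE "user" NOT IN ('spam_bot', 'test_user')
PARTITIONED BY DAY

-- Delete by condition: drop short anonymous edits
REPLACE INTO "wikipedia" OVERWRITE ALL
SELECT * FROM "wikipedia"
WHERE NOT ("isAnonymous" = 'true' AND "delta" < 10)
PARTITIONED BY DAY
```

`OVERWRITE ALL` rewrites the whole datasource; `OVERWRITE WHERE` can limit the rewrite to a time range.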
Delete Entire Datasource
To delete all data in a datasource:

Via Web Console
- Navigate to Datasources
- Click the datasource name
- Select Actions > Mark all segments as unused
- Optionally, submit a kill task for permanent deletion
Via API
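A sketch of the calls involved, assuming a datasource named `wikipedia` (verify the paths against your version's API reference):

```
# Soft delete: mark all segments in the datasource as unused
DELETE /druid/coordinator/v1/datasources/wikipedia

# Permanent delete: follow up with a kill task over a wide interval
POST /druid/indexer/v1/task
{ "type": "kill", "dataSource": "wikipedia", "interval": "1000-01-01/3000-01-01" }
```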
Permanent Deletion with Kill Tasks
Kill tasks permanently delete unused segments from deep storage and the metadata store.

Kill Task Syntax
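A minimal kill task spec might look like this (datasource name and interval are placeholders):

```json
{
  "type": "kill",
  "dataSource": "wikipedia",
  "interval": "2023-01-01/2023-02-01",
  "batchSize": 100
}
```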
Kill Task Parameters
| Parameter | Default | Description |
|---|---|---|
| `type` | - | Must be `"kill"` |
| `dataSource` | - | Datasource name |
| `interval` | - | Time range of segments to kill |
| `versions` | `null` (all) | Specific segment versions to delete |
| `batchSize` | 100 | Segments deleted per batch to avoid blocking the Overlord |
| `limit` | `null` (no limit) | Maximum number of segments to delete |
| `maxUsedStatusLastUpdatedTime` | `null` (no cutoff) | Only kill segments marked unused before this timestamp |
Submit Kill Task
Via API, POST the task spec to the Overlord. Or via the web console:
- Navigate to Tasks
- Click Submit task
- Paste kill task JSON
- Click Submit
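The API submission can be sketched as a POST of the spec to the Overlord's task endpoint (host and port are assumptions):

```
POST http://overlord:8090/druid/indexer/v1/task
Content-Type: application/json

{ "type": "kill", "dataSource": "wikipedia", "interval": "2023-01-01/2023-02-01" }
```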
Kill Specific Versions
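A sketch of a version-restricted kill task (the version string is a placeholder; segment versions are typically ingestion timestamps):

```json
{
  "type": "kill",
  "dataSource": "wikipedia",
  "interval": "2023-01-01/2023-02-01",
  "versions": ["2023-02-15T10:00:00.000Z"]
}
```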
The `versions` field restricts deletion to segments whose version strings match; segments with other versions in the interval are left alone.

Auto-Kill Unused Segments
Automate permanent deletion of unused segments.

Auto-Kill on Coordinator
Enable auto-kill in Coordinator runtime properties:
- `druid.coordinator.kill.on`: Enable auto-kill (default: `false`)
- `druid.coordinator.kill.period`: How often to run kill tasks
- `druid.coordinator.kill.durationToRetain`: Keep unused segments this long before killing
- `druid.coordinator.kill.maxSegments`: Maximum segments to kill per invocation

With auto-kill enabled, the Coordinator:
- Identifies unused segments older than `durationToRetain`
- Submits kill tasks for eligible intervals
- Processes up to `maxSegments` per run
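The settings above can be sketched in the Coordinator's `runtime.properties` (values are illustrative):

```properties
druid.coordinator.kill.on=true
druid.coordinator.kill.period=P1D
druid.coordinator.kill.durationToRetain=P7D
druid.coordinator.kill.maxSegments=100
```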
Auto-Kill on Overlord (Experimental)
Requires segment metadata caching enabled on the Overlord. Benefits:
- No REST API overhead between tasks and the Overlord
- Kills segments as soon as they become eligible
- Runs on the Overlord, doesn't consume task slots
- Faster execution (no task process launch overhead)
- Skips locked intervals to avoid blocking
- Handles large numbers of unused segments efficiently
Deletion Best Practices
Test First
Test deletion logic on a small time range before applying to production data.
Use Batch Size
Set an appropriate `batchSize` in kill tasks to avoid blocking Overlord operations.

Retention Buffer
Set `durationToRetain` to allow time to recover from accidental soft deletes.

Monitor Kill Tasks
Watch kill task metrics and logs to ensure deletions complete successfully.
Common Deletion Patterns
Delete Old Data Periodically
Combine retention rules with auto-kill. For example, a configuration that:
- Drops segments older than 90 days
- Permanently deletes them after 7 days
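The pattern above can be sketched as a retention rule set (tier and replica count are assumptions):

```json
[
  { "type": "loadByPeriod", "period": "P90D", "tieredReplicants": { "_default_tier": 2 } },
  { "type": "dropForever" }
]
```

combined with `druid.coordinator.kill.on=true` and `druid.coordinator.kill.durationToRetain=P7D`, so that dropped segments are permanently removed about 7 days later.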
Delete Test Data
Remove test data by filter.

Gradual Datasource Deletion
Delete a large datasource in batches: mark one time interval unused, run a kill task for it, and repeat, rather than issuing one giant kill task.

Delete PII After Retention Period
Reindex to remove sensitive fields.

Troubleshooting
Segments Not Deleting
Check:
- Are segments marked "unused"? Query the metadata store
- Do kill tasks have the correct interval?
- Is `maxUsedStatusLastUpdatedTime` filtering out segments?
- Check Overlord logs for kill task errors
Kill Tasks Timing Out
Reduce `batchSize` to process fewer segments per batch:
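For example (values are illustrative), a smaller batch with a cap on total segments per task:

```json
{
  "type": "kill",
  "dataSource": "wikipedia",
  "interval": "2023-01-01/2024-01-01",
  "batchSize": 25,
  "limit": 1000
}
```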
Accidental Deletion
If segments were marked unused by mistake:
- Quickly mark them as "used" via API before a kill task runs
- Restore from deep storage backups if already killed
- Re-ingest from source data if available
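Re-marking segments as used can be sketched with the counterpart of the `markUnused` endpoint, assuming a `wikipedia` datasource: POST to `/druid/coordinator/v1/datasources/wikipedia/markUsed` with the affected interval:

```json
{
  "interval": "2023-01-01/2023-02-01"
}
```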
Learn More
Deletion Tutorial
Step-by-step guide to deleting data
Retention Rules
Configure automatic data retention
Data Updates
Reindex data to modify or filter records
Metadata API
API reference for segment management