Overview
Retention rules control the lifecycle of segments in your Druid cluster:

- Load rules: Define which segments to keep on Historical servers and how many replicas to maintain
- Drop rules: Mark segments as unused based on time periods or intervals
- Broadcast rules: Load a copy of segments onto every server in the cluster
Retention rules are persistent and stored in Druid’s metadata store. They remain in effect until you change them.
Rule Types
You can specify data retention in three ways:

- Forever: All data in the segment
- Period: Segment data specified as an offset from the present time
- Interval: A fixed time range
Setting Retention Rules
Using the Web Console
Using the Coordinator API
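For example, a sketch of setting cluster-wide default rules with curl. The host, port, and rule values are placeholders; the endpoints follow Druid's Coordinator API (`/druid/coordinator/v1/rules/{dataSourceName}`, with `_default` for cluster-wide defaults):

```shell
# Set default retention rules for all datasources (the special "_default" entry).
# Host, port, and the rule payload are illustrative placeholders.
curl -X POST "http://coordinator:8081/druid/coordinator/v1/rules/_default" \
  -H "Content-Type: application/json" \
  -d '[
        {"type": "loadByPeriod", "period": "P30D", "includeFuture": true,
         "tieredReplicants": {"_default_tier": 2}},
        {"type": "dropForever"}
      ]'
```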
Set default rules for all datasources by POSTing a rule array to the Coordinator's /druid/coordinator/v1/rules/_default endpoint; rules for a single datasource go to /druid/coordinator/v1/rules/{dataSourceName}.

Rule Structure and Order
Rule order is critical. The Coordinator:

- Reads rules in the order they appear
- Cycles through all used segments
- Matches each segment with the first applicable rule

Each segment can match only a single rule.
In the web console, use the up and down arrows to reorder rules.
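To illustrate why order matters, consider this sketch (periods, tiers, and replica counts are illustrative). Placed first, the loadByPeriod rule claims all recent segments, so dropForever applies only to older data; if the rules were reversed, dropForever would match every segment first and nothing would be retained.

```json
[
  { "type": "loadByPeriod", "period": "P30D", "includeFuture": true,
    "tieredReplicants": { "_default_tier": 2 } },
  { "type": "dropForever" }
]
```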
Load Rules
Load rules define how Druid assigns segments to Historical process tiers and set replica counts.

Forever Load Rule
Assigns all datasource segments to specified tiers:

- tieredReplicants: Map of tier names to number of replicas (zero or a positive integer)
- useDefaultTierForNull: Determines the default value if tieredReplicants is null (default: true)
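A sketch of a loadForever rule, assuming two example tiers named hot and _default_tier:

```json
{
  "type": "loadForever",
  "tieredReplicants": { "hot": 1, "_default_tier": 2 }
}
```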
Period Load Rule
Assigns segment data in a specific period to a tier:

- period: ISO 8601 period from the past to the present (or the future if includeFuture is true)
- includeFuture: Match segments that start after the rule interval starts (default: true)
- tieredReplicants: Map of tier names to replica counts
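For example, a loadByPeriod rule that keeps the last month on an illustrative hot tier (tier names and period are placeholders):

```json
{
  "type": "loadByPeriod",
  "period": "P1M",
  "includeFuture": true,
  "tieredReplicants": { "hot": 2, "_default_tier": 1 }
}
```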
Interval Load Rule
Assigns a specific time range to a tier:

- interval: ISO 8601 time range encoded as a string
- tieredReplicants: Map of tier names to replica counts
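A sketch of a loadByInterval rule with a placeholder interval:

```json
{
  "type": "loadByInterval",
  "interval": "2020-01-01/2021-01-01",
  "tieredReplicants": { "_default_tier": 2 }
}
```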
Query from Deep Storage
You can configure segments to be queryable from deep storage without loading them onto Historicals: set tieredReplicants to an empty object and useDefaultTierForNull to false. Queries then read those segments directly from deep storage rather than from a Historical tier.

Drop Rules
Drop rules mark segments as unused, removing them from the cluster. Data remains in deep storage unless you run a kill task.

Forever Drop Rule
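This rule type takes no fields beyond its type; a sketch:

```json
{ "type": "dropForever" }
```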
Drops all segment data from the cluster.

Period Drop Rule
Drops segments within a specific period (drops recent data):

- period: ISO 8601 period from the past to the present or future
- includeFuture: Match segments starting after the rule interval (default: true)
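For example, a dropByPeriod rule covering the most recent month (the period value is illustrative):

```json
{
  "type": "dropByPeriod",
  "period": "P1M",
  "includeFuture": true
}
```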
Period Drop Before Rule
Drops segments before a specific period (drops old data):

- period: ISO 8601 period
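For example, a dropBeforeByPeriod rule that drops everything older than 90 days (the period value is illustrative):

```json
{ "type": "dropBeforeByPeriod", "period": "P90D" }
```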
The rule combination dropBeforeByPeriod + loadForever is equivalent to loadByPeriod(includeFuture = true) + dropForever.

Interval Drop Rule
Drops segments in a specific time range:

- interval: ISO 8601 time range
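A sketch of a dropByInterval rule with a placeholder interval:

```json
{ "type": "dropByInterval", "interval": "2020-01-01/2021-01-01" }
```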
Broadcast Rules
Broadcast rules load a copy of a datasource's segments onto every server in the cluster. They require druid.segmentCache.locations to be configured on both Brokers and Historicals.
Forever Broadcast Rule
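Broadcasts all of the datasource's segments; like dropForever, it takes no additional fields. A sketch:

```json
{ "type": "broadcastForever" }
```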
Period Broadcast Rule
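Broadcasts segments within a period offset from the present; a sketch with an illustrative period:

```json
{ "type": "broadcastByPeriod", "period": "P1M", "includeFuture": true }
```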
Interval Broadcast Rule
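Broadcasts segments within a fixed time range; a sketch with a placeholder interval:

```json
{ "type": "broadcastByInterval", "interval": "2020-01-01/2021-01-01" }
```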
Common Retention Patterns
Hot-Warm-Cold Architecture
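One way to sketch this pattern, assuming hypothetical hot and _default_tier tiers (tier names, periods, and replica counts are all illustrative): the last week gets extra replicas on the hot tier, the last year stays on the default tier, and anything older is dropped from Historicals but remains in deep storage, which serves as the cold layer.

```json
[
  { "type": "loadByPeriod", "period": "P7D", "includeFuture": true,
    "tieredReplicants": { "hot": 2, "_default_tier": 1 } },
  { "type": "loadByPeriod", "period": "P1Y", "includeFuture": true,
    "tieredReplicants": { "_default_tier": 1 } },
  { "type": "dropForever" }
]
```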
Keep recent data hot, older data warm, and archive the oldest.

Retain Last N Days
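A sketch of this pattern, assuming a 30-day window: a period load rule retains the recent window and a forever drop rule removes everything else.

```json
[
  { "type": "loadByPeriod", "period": "P30D", "includeFuture": true,
    "tieredReplicants": { "_default_tier": 2 } },
  { "type": "dropForever" }
]
```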
Keep only the last 30 days of data.

High Availability for Recent Data
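A sketch with illustrative replica counts: three replicas for the last week, two for everything else.

```json
[
  { "type": "loadByPeriod", "period": "P7D", "includeFuture": true,
    "tieredReplicants": { "_default_tier": 3 } },
  { "type": "loadForever",
    "tieredReplicants": { "_default_tier": 2 } }
]
```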
More replicas for recent data.

Managing Dropped Data
Permanently Delete Data
Dropped segments remain in deep storage. To permanently delete:

- Segments are marked “unused” via drop rules or manual action
- Submit a kill task to delete from deep storage
- Or enable auto-kill on the Coordinator
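The kill-task step above can be sketched as a task submitted to the Overlord. The host, datasource name, and interval are placeholders; the spec fields follow Druid's kill task format:

```shell
# Permanently delete unused segments in the given interval from deep storage.
# Host, datasource name, and interval are illustrative placeholders.
curl -X POST "http://overlord:8090/druid/indexer/v1/task" \
  -H "Content-Type: application/json" \
  -d '{
        "type": "kill",
        "dataSource": "wikipedia",
        "interval": "2020-01-01/2021-01-01"
      }'
```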
Reload Dropped Data
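Dropped data can be reloaded by marking its segments as used again, after which load rules pick them up. A sketch assuming the Coordinator's mark-used endpoint (verify the path against your Druid version; host, datasource, and interval are placeholders):

```shell
# Mark unused segments in an interval as used again so they reload per the rules.
curl -X POST "http://coordinator:8081/druid/coordinator/v1/datasources/wikipedia/markUsed" \
  -H "Content-Type: application/json" \
  -d '{"interval": "2020-01-01/2021-01-01"}'
```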
Viewing Retention Rules
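For example, using the Coordinator API (endpoint per Druid's API reference; host and port are placeholders):

```shell
# Fetch retention rules for all datasources.
curl "http://coordinator:8081/druid/coordinator/v1/rules"
```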
Retrieve all rules with a GET request to the Coordinator.

Best Practices
Set Default Rules
Configure default rules to prevent unlimited data retention across all datasources.
Use Period Rules
Prefer period-based rules over interval-based rules for dynamic retention that adapts as time progresses.
Test Rule Order
Verify rule order carefully; segments match only the first applicable rule.
Enable Auto-Kill
Configure auto-kill to automatically clean up unused segments from deep storage.
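A sketch of Coordinator runtime properties that enable auto-kill (property names per Druid's Coordinator configuration reference; the values are illustrative):

```
druid.coordinator.kill.on=true
druid.coordinator.kill.period=P1D
druid.coordinator.kill.durationToRetain=P90D
druid.coordinator.kill.maxSegments=100
```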
Learn More
- Retention Tutorial: Step-by-step guide to configuring retention rules
- Data Deletion: Permanently delete data with kill tasks
- Retention Rules API: Complete API reference for managing rules
- Mixed Workloads: Configure tiering for different workload types