Primary Timestamp
Druid schemas must always include a primary timestamp. The primary timestamp is fundamental to how Druid operates:
- Partitioning & sorting: Druid uses the primary timestamp to partition and sort your data.
- Query performance: the primary timestamp enables rapid identification and retrieval of data within time ranges.
- Data management: the primary timestamp powers time-based operations like dropping chunks and retention rules.
- Storage column: the primary timestamp is always stored in the __time column in your datasource.
Configuration
Druid parses the primary timestamp based on the timestampSpec configuration at ingestion time.
Regardless of the source field for the primary timestamp, Druid always stores the timestamp in the __time column.
Additional Timestamp Operations
You can control other important timestamp-based operations in the granularitySpec:
- segmentGranularity: controls how data is partitioned into time chunks (e.g., HOUR, DAY, WEEK, MONTH)
- queryGranularity: controls the timestamp resolution in query results (e.g., NONE, SECOND, MINUTE)
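As a sketch, a timestampSpec and granularitySpec might look like the following (the ts column name and the chosen granularities are illustrative):

```json
{
  "timestampSpec": {
    "column": "ts",
    "format": "iso"
  },
  "granularitySpec": {
    "type": "uniform",
    "segmentGranularity": "DAY",
    "queryGranularity": "MINUTE",
    "rollup": true
  }
}
```

Here Druid reads ISO 8601 timestamps from the ts field, stores them in __time, partitions data into day-sized time chunks, and truncates stored timestamps to minute resolution.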
If you have more than one timestamp column, you can store the others as secondary timestamps.
Dimensions
Dimensions are columns that Druid stores “as-is”. You can use dimensions for any purpose at query time.
Dimension Capabilities
- Filtering: filter your data based on dimension values
- Grouping
- Aggregation
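To make these capabilities concrete, here is a sketch of a native groupBy query that filters on one dimension and groups by another (the datasource, dimension names, and interval are illustrative):

```json
{
  "queryType": "groupBy",
  "dataSource": "my_datasource",
  "intervals": ["2024-01-01/2024-01-02"],
  "granularity": "all",
  "dimensions": ["country"],
  "filter": {
    "type": "selector",
    "dimension": "channel",
    "value": "web"
  },
  "aggregations": [
    {"type": "count", "name": "rows"}
  ]
}
```

The filter restricts rows to channel = web, the dimensions list groups results by country, and the count aggregator tallies rows per group.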
Dimension Types
Druid supports several dimension types:
- String dimensions: the default and most common dimension type.
- Long dimensions: for integer values.
- Double dimensions: for double-precision floating-point values.
- Float dimensions: for single-precision floating-point values.
- Multi-value dimensions: arrays of strings for one-to-many relationships.
Rollup and Dimensions
If you disable rollup, Druid treats the set of dimensions like a set of columns to ingest. The dimensions behave exactly as you would expect from any database that does not support a rollup feature.
At ingestion time, you configure dimensions in the dimensionsSpec:
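As a sketch, a dimensionsSpec can mix plain string dimensions with typed ones (all field names here are illustrative):

```json
{
  "dimensionsSpec": {
    "dimensions": [
      "page",
      "channel",
      {"type": "long", "name": "responseBytes"},
      {"type": "double", "name": "price"}
    ]
  }
}
```

Entries given as bare strings become string dimensions; entries given as objects let you declare a long, double, or float type explicitly.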
Metrics
Metrics are columns that Druid stores in an aggregated form. Metrics are most useful when you enable rollup.
When to Use Metrics
- Rollup: collapse dimensions while aggregating metric values to reduce row count
- Performance: pre-compute aggregations at ingestion time for faster queries
How Metrics Work with Rollup
When you enable rollup, Druid combines multiple rows with the same timestamp and dimension values.
Common Metric Types
- Count: count the number of rows
- Sum
- Min/Max
- First/Last
Example: NetFlow Data with Rollup
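As an illustrative sketch (field names and values are hypothetical), assume MINUTE queryGranularity and rollup over the srcIP and dstIP dimensions. Three input rows collapse into two stored rows, because two of them fall into the same minute with identical dimension values:

```json
{
  "inputRows": [
    {"timestamp": "2024-01-01T01:01:35Z", "srcIP": "1.1.1.1", "dstIP": "2.2.2.2", "packets": 100, "bytes": 1000},
    {"timestamp": "2024-01-01T01:01:51Z", "srcIP": "1.1.1.1", "dstIP": "2.2.2.2", "packets": 200, "bytes": 2000},
    {"timestamp": "2024-01-01T01:02:14Z", "srcIP": "1.1.1.1", "dstIP": "2.2.2.2", "packets": 300, "bytes": 3000}
  ],
  "storedRowsAfterRollup": [
    {"__time": "2024-01-01T01:01:00Z", "srcIP": "1.1.1.1", "dstIP": "2.2.2.2", "count": 2, "sum_packets": 300, "sum_bytes": 3000},
    {"__time": "2024-01-01T01:02:00Z", "srcIP": "1.1.1.1", "dstIP": "2.2.2.2", "count": 1, "sum_packets": 300, "sum_bytes": 3000}
  ]
}
```

Timestamps are truncated to the minute, the count metric records how many input rows each stored row represents, and the sum metrics accumulate packets and bytes.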
Metrics Configuration
At ingestion time, you configure metrics in the metricsSpec:
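As a sketch, a metricsSpec for the NetFlow-style data above might look like this (metric and field names are illustrative):

```json
{
  "metricsSpec": [
    {"type": "count", "name": "count"},
    {"type": "longSum", "name": "sum_packets", "fieldName": "packets"},
    {"type": "longSum", "name": "sum_bytes", "fieldName": "bytes"},
    {"type": "longMax", "name": "max_bytes", "fieldName": "bytes"}
  ]
}
```

Each entry names the output column and, except for count, the input field it aggregates over.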
Benefits of Pre-Aggregation
Even without rollup, pre-aggregating metrics at ingestion time can provide benefits:
- Query performance: some aggregators, especially approximate ones like HyperLogLog and Theta sketches, compute faster at query time if they are partially computed at ingestion time.
- Storage efficiency: approximate aggregators often use less storage than the raw data they represent.
- Flexibility: you can still compute additional aggregations at query time on pre-aggregated data.
Schema Configuration Example
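Tying the pieces together, here is a sketch of a dataSchema combining a primary timestamp, dimensions, metrics, and rollup settings (the datasource and field names are illustrative):

```json
{
  "dataSchema": {
    "dataSource": "netflow",
    "timestampSpec": {"column": "timestamp", "format": "iso"},
    "dimensionsSpec": {"dimensions": ["srcIP", "dstIP"]},
    "metricsSpec": [
      {"type": "count", "name": "count"},
      {"type": "longSum", "name": "sum_packets", "fieldName": "packets"},
      {"type": "longSum", "name": "sum_bytes", "fieldName": "bytes"}
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "HOUR",
      "queryGranularity": "MINUTE",
      "rollup": true
    }
  }
}
```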
Next Steps
- Schema Design: learn best practices for designing your schema
- Data Rollup: understand rollup in detail
- Partitioning: learn about partitioning strategies
- Ingestion Spec: complete ingestion specification reference