Aggregations - Apache Druid

Aggregations in Apache Druid can be used during ingestion to summarize data before it enters Druid, and at query time to summarize result data.

This document describes native query aggregations. For SQL aggregations, see SQL aggregation functions.

Exact Aggregations

Count Aggregator

Computes the count of Druid rows matching the filters:

{"type": "count", "name": "count"}

Property	Description	Required
`type`	Must be “count”	Yes
`name`	Output name	Yes

The count aggregator counts Druid rows, not raw ingested events. With rollup enabled, the two counts may differ. To count raw events, include a count aggregator at ingestion time and apply a longSum aggregator to the resulting column at query time.
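The difference between Druid rows and raw events can be illustrated with a small simulation. This is a plain Python sketch, not Druid code: the sample events, the `rollup` helper, and the `page` dimension are hypothetical, and it assumes rollup groups events by truncated timestamp and dimension values.

```python
from collections import defaultdict

# Hypothetical raw events: (timestamp truncated to the minute, page dimension)
raw_events = [
    ("2024-01-01T00:00", "home"),
    ("2024-01-01T00:00", "home"),
    ("2024-01-01T00:00", "about"),
    ("2024-01-01T00:01", "home"),
]

def rollup(events):
    """Simulate ingestion-time rollup with a `count` aggregator."""
    groups = defaultdict(int)
    for key in events:
        groups[key] += 1  # one increment per raw event in the group
    return [{"__time": t, "page": p, "count": c} for (t, p), c in groups.items()]

druid_rows = rollup(raw_events)

# Query-time `count`: counts the stored (rolled-up) rows.
row_count = len(druid_rows)

# Query-time `longSum` over the ingested `count` column: recovers raw events.
raw_event_count = sum(row["count"] for row in druid_rows)

print(row_count, raw_event_count)  # 3 4
```

Four raw events collapse into three stored rows, so only the sum over the ingested count column matches the number of events.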

Sum Aggregators

longSum

Computes the sum of values as a 64-bit signed integer:

{"type": "longSum", "name": "sumLong", "fieldName": "aLong"}

doubleSum

Computes the sum as a 64-bit floating point value:

{"type": "doubleSum", "name": "sumDouble", "fieldName": "aDouble"}

floatSum

Computes the sum as a 32-bit floating point value:

{"type": "floatSum", "name": "sumFloat", "fieldName": "aFloat"}

Common properties:

Property	Description	Required
`type`	“longSum”, “doubleSum”, or “floatSum”	Yes
`name`	Output name	Yes
`fieldName`	Input column name	No*
`expression`	Inline expression	No*

*Specify either `fieldName` or `expression`.

Min and Max Aggregators

doubleMin / doubleMax

{"type": "doubleMin", "name": "minDouble", "fieldName": "aDouble"}
{"type": "doubleMax", "name": "maxDouble", "fieldName": "aDouble"}

floatMin / floatMax

{"type": "floatMin", "name": "minFloat", "fieldName": "aFloat"}
{"type": "floatMax", "name": "maxFloat", "fieldName": "aFloat"}

longMin / longMax

{"type": "longMin", "name": "minLong", "fieldName": "aLong"}
{"type": "longMax", "name": "maxLong", "fieldName": "aLong"}

Common properties:

Property	Description	Required
`type`	Min/max type (e.g., “doubleMin”)	Yes
`name`	Output name	Yes
`fieldName`	Input column name	No*
`expression`	Inline expression	No*

*Specify either `fieldName` or `expression`.

doubleMean Aggregator

Computes the arithmetic mean as a 64-bit float:

{"type": "doubleMean", "name": "aMean", "fieldName": "aDouble"}

Available at query time only, not at ingestion. To compute a mean at ingestion time, use the DataSketches Quantiles aggregator.

First and Last Aggregators

Return the value of the input column from the row with the earliest (first) or latest (last) value in the time column.

With rollup enabled at ingestion, these return rolled-up values, not original raw data values.

Numeric First/Last

doubleFirst / doubleLast:

{
  "type": "doubleFirst",
  "name": "firstDouble",
  "fieldName": "aDouble"
}

floatFirst / floatLast:

{
  "type": "floatLast",
  "name": "lastFloat",
  "fieldName": "aFloat",
  "timeColumn": "longTime"
}

longFirst / longLast:

{
  "type": "longFirst",
  "name": "firstLong",
  "fieldName": "aLong"
}

Property	Description	Required
`type`	First/last type	Yes
`name`	Output name	Yes
`fieldName`	Input column	Yes
`timeColumn`	Time column (LONG type)	No (defaults to `__time`)
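The effect of `timeColumn` can be sketched in a few lines of Python. This is an illustration of the semantics described above, not Druid's implementation: the sample rows and the `first` and `last` helpers are hypothetical, and tie-breaking between rows with equal time values is left unspecified.

```python
# Hypothetical rows; `__time` and `longTime` both hold LONG values.
rows = [
    {"__time": 3000, "longTime": 10, "aDouble": 1.5},
    {"__time": 1000, "longTime": 30, "aDouble": 2.5},
    {"__time": 2000, "longTime": 20, "aDouble": 3.5},
]

def first(rows, field_name, time_column="__time"):
    """Value of `field_name` from the row with the smallest time value."""
    return min(rows, key=lambda r: r[time_column])[field_name]

def last(rows, field_name, time_column="__time"):
    """Value of `field_name` from the row with the largest time value."""
    return max(rows, key=lambda r: r[time_column])[field_name]

print(first(rows, "aDouble"))              # 2.5 (row with __time 1000)
print(last(rows, "aDouble"))               # 1.5 (row with __time 3000)
print(first(rows, "aDouble", "longTime"))  # 1.5 (row with longTime 10)
print(last(rows, "aDouble", "longTime"))   # 2.5 (row with longTime 30)
```

Switching the time column changes which row counts as first or last, even though the aggregated field is the same.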

String First/Last

stringFirst:

{
  "type": "stringFirst",
  "name": "firstString",
  "fieldName": "aString",
  "maxStringBytes": 2048,
  "timeColumn": "longTime"
}

stringLast:

{
  "type": "stringLast",
  "name": "lastString",
  "fieldName": "aString"
}

Property	Description	Required
`type`	“stringFirst” or “stringLast”	Yes
`name`	Output name	Yes
`fieldName`	Input column	Yes
`timeColumn`	Time column (LONG type)	No (defaults to `__time`)
`maxStringBytes`	Max string size to accumulate	No (defaults to 1024)

ANY Aggregators

Return any value encountered during aggregation, including null. Available at query time only.

Numeric ANY

doubleAny:

{"type": "doubleAny", "name": "anyDouble", "fieldName": "aDouble"}

floatAny:

{"type": "floatAny", "name": "anyFloat", "fieldName": "aFloat"}

longAny:

{"type": "longAny", "name": "anyLong", "fieldName": "aLong"}

stringAny

{
  "type": "stringAny",
  "name": "anyString",
  "fieldName": "aString",
  "maxStringBytes": 2048,
  "aggregateMultipleValues": true
}

Property	Description	Required
`type`	“stringAny”	Yes
`name`	Output name	Yes
`fieldName`	Input column	Yes
`maxStringBytes`	Max string size	No (defaults to 1024)
`aggregateMultipleValues`	If true, returns a stringified array for multi-value dimensions; if false, returns the first value	No (defaults to true)

Approximate Aggregations

Count Distinct

DataSketches Theta Sketch

Provides distinct count estimates with set operations (union, intersection, difference). See DataSketches Theta Sketch.

DataSketches HLL Sketch

HyperLogLog-based distinct counting. More space-efficient than Theta, but does not support set operations. See DataSketches HLL Sketch.

Legacy: Cardinality and hyperUnique

For new use cases, use DataSketches Theta or HLL instead. The legacy aggregators are kept for backwards compatibility. See Cardinality and HyperUnique.

Histograms and Quantiles

DataSketches Quantiles Sketch (Recommended)

Provides quantile estimates and histogram approximations with formal error bounds. See DataSketches Quantiles Sketch.

Moments Sketch (Experimental)

Optimized for merge speed, but accuracy depends on the data distribution. See Moments Sketch.

Fixed Buckets Histogram

Simple histogram with a fixed range and bucket count. See Fixed Buckets Histogram.

Approximate Histogram (Deprecated)

Deprecated due to accuracy issues. Use DataSketches Quantiles instead.

Expression Aggregations

Expression Aggregator

Custom aggregations using Druid expressions (query time only):

{
  "type": "expression",
  "name": "expression_sum",
  "fields": ["column_a"],
  "accumulatorIdentifier": "__acc",
  "initialValue": "0",
  "fold": "__acc + column_a",
  "combine": "__acc + expression_sum"
}

Property	Description	Required
`type`	Must be “expression”	Yes
`name`	Output name	Yes
`fields`	Input columns	Yes
`accumulatorIdentifier`	Variable for accumulator	No (defaults to `__acc`)
`fold`	Expression to accumulate values	Yes
`combine`	Expression to merge fold results	No (defaults to fold when `fields` has a single input)
`compare`	Comparator expression (inputs: `o1`, `o2`)	No
`finalize`	Final transformation (input: `o`)	No
`initialValue`	Initial accumulator value	Yes
`initialCombineValue`	Initial combine value	No (defaults to initialValue)
`isNullUnlessAggregated`	Return null if no rows processed	No (defaults to true)
`shouldAggregateNullInputs`	Process null inputs in fold	No (defaults to true)
`shouldCombineAggregateNullInputs`	Process null inputs in combine	No (defaults to shouldAggregateNullInputs)
`maxSizeBytes`	Max size for variable-sized outputs	No (defaults to 8192)
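How `fold`, `combine`, `initialValue`, and `initialCombineValue` relate can be sketched as a two-stage reduction. The Python below is a simplified model, not Druid's execution engine: `run_expression_aggregator` and the sample segments are hypothetical, and it assumes each segment is folded independently before the partial results are combined.

```python
from functools import reduce

def run_expression_aggregator(segments, initial_value, fold, combine,
                              initial_combine_value=None):
    """Fold each segment's rows into a partial, then combine the partials."""
    if initial_combine_value is None:
        initial_combine_value = initial_value  # mirrors the documented default
    partials = [reduce(fold, rows, initial_value) for rows in segments]
    return reduce(combine, partials, initial_combine_value)

# Hypothetical values of column_a, split across two segments.
segments = [[1, 2, 3], [4, 5]]

# Analogue of fold "__acc + column_a" and combine "__acc + expression_sum".
expression_sum = run_expression_aggregator(
    segments, 0,
    fold=lambda acc, column_a: acc + column_a,
    combine=lambda acc, partial: acc + partial,
)

# Analogue of fold "__acc + 1" and combine "__acc + expression_count".
expression_count = run_expression_aggregator(
    segments, 0,
    fold=lambda acc, _row: acc + 1,
    combine=lambda acc, partial: acc + partial,
)

print(expression_sum, expression_count)  # 15 5
```

In the count case the combine step must add partial counts rather than increment by one, which is why `combine` is written separately from `fold`.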

Example: Count Aggregator

{
  "type": "expression",
  "name": "expression_count",
  "fields": [],
  "initialValue": "0",
  "fold": "__acc + 1",
  "combine": "__acc + expression_count"
}

Example: Distinct Array with Ordering

{
  "type": "expression",
  "name": "expression_array_agg_distinct",
  "fields": ["column_a"],
  "initialValue": "[]",
  "fold": "array_set_add(__acc, column_a)",
  "combine": "array_set_add_all(__acc, expression_array_agg_distinct)",
  "compare": "if(array_length(o1) > array_length(o2), 1, if(array_length(o1) == array_length(o2), 0, -1))"
}
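The `compare` expression above acts as a three-way comparator over two aggregated arrays, `o1` and `o2`. A rough Python equivalent, with hypothetical sample arrays, shows the ordering it produces (shorter arrays sort first):

```python
from functools import cmp_to_key

def compare(o1, o2):
    """Python analogue of the `compare` expression: order arrays by length."""
    if len(o1) > len(o2):
        return 1
    return 0 if len(o1) == len(o2) else -1

# Hypothetical aggregated arrays from three result rows.
results = [["a", "b", "c"], ["a"], ["a", "b"]]

ordered = sorted(results, key=cmp_to_key(compare))
print(ordered)  # [['a'], ['a', 'b'], ['a', 'b', 'c']]
```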

JavaScript Aggregator

Custom JavaScript functions for aggregation:

{
  "type": "javascript",
  "name": "sum(log(x)*y) + 10",
  "fieldNames": ["x", "y"],
  "fnAggregate": "function(current, a, b) { return current + (Math.log(a) * b); }",
  "fnCombine": "function(partialA, partialB) { return partialA + partialB; }",
  "fnReset": "function() { return 10; }"
}

JavaScript is disabled by default. See JavaScript programming guide for how to enable.
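To make the roles of the three functions concrete, the sketch below translates them to Python and runs them over a single partition of rows. This is only a model of the call pattern (reset once, aggregate per row, combine partial results); the sample rows are hypothetical, and how Druid schedules these calls across segments is not shown.

```python
import math

def fn_reset():
    """Initial accumulator value (fnReset)."""
    return 10

def fn_aggregate(current, a, b):
    """Per-row update (fnAggregate): add log(a) * b to the accumulator."""
    return current + math.log(a) * b

def fn_combine(partial_a, partial_b):
    """Merge two partial results (fnCombine)."""
    return partial_a + partial_b

# Hypothetical (x, y) rows, matching fieldNames ["x", "y"].
rows = [(1.0, 5.0), (math.e, 2.0)]

acc = fn_reset()
for x, y in rows:
    acc = fn_aggregate(acc, x, y)

# log(1) * 5 contributes 0 and log(e) * 2 contributes 2, on top of the initial 10.
print(round(acc, 6))  # 12.0
```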

Miscellaneous Aggregations

Filtered Aggregator

Wraps any aggregator to only aggregate matching rows:

{
  "type": "filtered",
  "filter": {
    "type": "selector",
    "dimension": "someColumn",
    "value": "abcdef"
  },
  "aggregator": {
    "type": "longSum",
    "name": "sumLong",
    "fieldName": "aLong"
  }
}

If you only need filtered results, put the filter on the query itself for better performance.
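The distinction can be sketched with a small Python model. The rows and the `filtered_long_sum` helper are hypothetical; the point is that a filtered aggregator restricts only its own input, so a filtered and an unfiltered aggregate can be computed over the same rows.

```python
# Hypothetical rows with the columns used in the example above.
rows = [
    {"someColumn": "abcdef", "aLong": 4},
    {"someColumn": "other", "aLong": 10},
    {"someColumn": "abcdef", "aLong": 6},
]

def filtered_long_sum(rows, dimension, value, field_name):
    """Sum `field_name` over rows where `dimension` equals `value`."""
    return sum(r[field_name] for r in rows if r.get(dimension) == value)

# Filtered aggregator: only rows with someColumn == "abcdef" contribute.
sum_long_filtered = filtered_long_sum(rows, "someColumn", "abcdef", "aLong")

# Unfiltered longSum over the same rows, computed alongside it.
sum_long_all = sum(r["aLong"] for r in rows)

print(sum_long_filtered, sum_long_all)  # 10 20
```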

Grouping Aggregator

Used with GroupBy queries that specify subtotals. Returns a number indicating which dimensions are included in the sub-grouping:

{
  "type": "grouping",
  "name": "someGrouping",
  "groupings": ["dim1", "dim2"]
}

With subtotals [["dim1", "dim2"], ["dim1"], ["dim2"], []]:

Subtotal	Output	Binary
`["dim1", "dim2"]`	0	(00)
`["dim1"]`	1	(01)
`["dim2"]`	2	(10)
`[]`	3	(11)

Reading the binary value from left to right, the bit at position X is 0 if the dimension at position X in `groupings` is included in the sub-grouping, and 1 otherwise.
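The table can be reproduced with a short Python sketch of the bit rule. The `grouping_value` function is a hypothetical helper written for this illustration, not part of Druid; it treats the first entry in `groupings` as the most significant bit, which is the reading that matches the table.

```python
def grouping_value(groupings, subtotal):
    """Build the grouping number: one bit per dimension, first dimension leftmost."""
    value = 0
    for dim in groupings:
        bit = 0 if dim in subtotal else 1  # 0 means the dimension is included
        value = (value << 1) | bit
    return value

groupings = ["dim1", "dim2"]
subtotals = [["dim1", "dim2"], ["dim1"], ["dim2"], []]

for subtotal in subtotals:
    value = grouping_value(groupings, subtotal)
    print(subtotal, value, format(value, "02b"))
# ['dim1', 'dim2'] 0 00
# ['dim1'] 1 01
# ['dim2'] 2 10
# [] 3 11
```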
