Aggregations in Apache Druid can be used during ingestion to summarize data before it enters Druid, and at query time to summarize result data.
This document describes native query aggregations. For SQL aggregations, see SQL aggregation functions.

Exact Aggregations

Count Aggregator

Computes the count of Druid rows matching the filters:
{"type": "count", "name": "count"}
Property | Description | Required
type | Must be "count" | Yes
name | Output name | Yes
The count aggregator counts Druid rows, not raw ingested events. With rollup enabled, one Druid row can represent several raw events, so the two counts can differ. To count raw events, include a count aggregator at ingestion time and sum the resulting column with a longSum aggregator at query time.
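The difference between counting rows and counting raw events can be sketched with a small model. This is a simplified simulation of rollup, not Druid's implementation; the event data, the "page" dimension, and the roll_up helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw events: (timestamp truncated to the hour, page dimension).
raw_events = [
    ("2024-01-01T00", "home"),
    ("2024-01-01T00", "home"),
    ("2024-01-01T00", "about"),
    ("2024-01-01T01", "home"),
    ("2024-01-01T01", "home"),
]


def roll_up(events):
    """Model ingestion-time rollup with a count aggregator named "count"."""
    counts = defaultdict(int)
    for key in events:
        counts[key] += 1
    # One stored row per distinct (time bucket, dimension) combination.
    return [{"__time": t, "page": p, "count": n} for (t, p), n in counts.items()]


druid_rows = roll_up(raw_events)

# A query-time count aggregator counts stored rows.
row_count = len(druid_rows)
# A query-time longSum over the ingested "count" column counts raw events.
event_count = sum(row["count"] for row in druid_rows)

print(row_count, event_count)  # 3 5
```

Five raw events collapse into three stored rows here, so only the longSum over the ingested count recovers the number of raw events.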

Sum Aggregators

longSum

Computes the sum of values as a 64-bit signed integer:
{"type": "longSum", "name": "sumLong", "fieldName": "aLong"}

doubleSum

Computes the sum as a 64-bit floating point value:
{"type": "doubleSum", "name": "sumDouble", "fieldName": "aDouble"}

floatSum

Computes the sum as a 32-bit floating point value:
{"type": "floatSum", "name": "sumFloat", "fieldName": "aFloat"}
Common properties:
Property | Description | Required
type | "longSum", "doubleSum", or "floatSum" | Yes
name | Output name | Yes
fieldName | Input column name | No*
expression | Inline expression | No*
* Must specify either fieldName or expression.
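As a sketch of the fieldName/expression rule, the helper below builds a sum aggregator spec and rejects specs that set neither or both properties. The helper name sum_aggregator is hypothetical, reading the rule as "exactly one of the two" is an assumption, and the expression "aDouble * 2" is only an illustration.

```python
import json

SUM_TYPES = {"longSum", "doubleSum", "floatSum"}


def sum_aggregator(agg_type, name, field_name=None, expression=None):
    """Build a sum aggregator spec as a dict (hypothetical helper)."""
    if agg_type not in SUM_TYPES:
        raise ValueError(f"unsupported sum aggregator type: {agg_type}")
    # Assumption: exactly one of fieldName and expression may be set.
    if (field_name is None) == (expression is None):
        raise ValueError("specify exactly one of fieldName or expression")
    spec = {"type": agg_type, "name": name}
    if field_name is not None:
        spec["fieldName"] = field_name
    else:
        spec["expression"] = expression
    return spec


# Sum a stored column directly.
print(json.dumps(sum_aggregator("longSum", "sumLong", field_name="aLong")))
# Sum the result of an inline expression instead of a stored column.
print(json.dumps(sum_aggregator("doubleSum", "sumScaled", expression="aDouble * 2")))
```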

Min and Max Aggregators

doubleMin / doubleMax

{"type": "doubleMin", "name": "minDouble", "fieldName": "aDouble"}
{"type": "doubleMax", "name": "maxDouble", "fieldName": "aDouble"}

floatMin / floatMax

{"type": "floatMin", "name": "minFloat", "fieldName": "aFloat"}
{"type": "floatMax", "name": "maxFloat", "fieldName": "aFloat"}

longMin / longMax

{"type": "longMin", "name": "minLong", "fieldName": "aLong"}
{"type": "longMax", "name": "maxLong", "fieldName": "aLong"}
Common properties:
Property | Description | Required
type | Min/max type (e.g., "doubleMin") | Yes
name | Output name | Yes
fieldName | Input column name | No*
expression | Inline expression | No*
* Must specify either fieldName or expression.

doubleMean Aggregator

Computes the arithmetic mean as a 64-bit float:
{"type": "doubleMean", "name": "aMean", "fieldName": "aDouble"}
doubleMean is a query-time aggregator only; it is not available at ingestion. To compute a mean at ingestion time, use the DataSketches Quantiles aggregator.

First and Last Aggregators

Return the value of the input column from the row with the earliest (first) or latest (last) value in the time column.
With rollup enabled at ingestion, these aggregators return rolled-up values, not the original raw values.

Numeric First/Last

doubleFirst / doubleLast:
{
  "type": "doubleFirst",
  "name": "firstDouble",
  "fieldName": "aDouble"
}
floatFirst / floatLast:
{
  "type": "floatLast",
  "name": "lastFloat",
  "fieldName": "aFloat",
  "timeColumn": "longTime"
}
longFirst / longLast:
{
  "type": "longFirst",
  "name": "firstLong",
  "fieldName": "aLong"
}
Property | Description | Required
type | First/last type | Yes
name | Output name | Yes
fieldName | Input column | Yes
timeColumn | Time column (LONG type) | No (defaults to __time)
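The effect of timeColumn can be sketched as follows. This is a plain-Python model of the selection rule, not Druid's implementation; the rows and the first_value and last_value helpers are hypothetical, and behavior for tied timestamps is left unspecified.

```python
# Hypothetical rows with the default time column and a custom LONG time column.
rows = [
    {"__time": 3, "longTime": 10, "aDouble": 1.5},
    {"__time": 1, "longTime": 30, "aDouble": 2.5},
    {"__time": 2, "longTime": 20, "aDouble": 3.5},
]


def first_value(rows, field_name, time_column="__time"):
    """Value of field_name from the row with the smallest time_column value."""
    return min(rows, key=lambda r: r[time_column])[field_name]


def last_value(rows, field_name, time_column="__time"):
    """Value of field_name from the row with the largest time_column value."""
    return max(rows, key=lambda r: r[time_column])[field_name]


print(first_value(rows, "aDouble"))             # 2.5 (row with __time == 1)
print(last_value(rows, "aDouble"))              # 1.5 (row with __time == 3)
print(first_value(rows, "aDouble", "longTime"))  # 1.5 (row with longTime == 10)
print(last_value(rows, "aDouble", "longTime"))   # 2.5 (row with longTime == 30)
```

Switching the time column changes which row counts as first or last, even though the aggregated column stays the same.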

String First/Last

stringFirst:
{
  "type": "stringFirst",
  "name": "firstString",
  "fieldName": "aString",
  "maxStringBytes": 2048,
  "timeColumn": "longTime"
}
stringLast:
{
  "type": "stringLast",
  "name": "lastString",
  "fieldName": "aString"
}
Property | Description | Required
type | "stringFirst" or "stringLast" | Yes
name | Output name | Yes
fieldName | Input column | Yes
timeColumn | Time column (LONG type) | No (defaults to __time)
maxStringBytes | Max string size to accumulate | No (defaults to 1024)

ANY Aggregators

Return any encountered value (including null). Query time only.

Numeric ANY

doubleAny:
{"type": "doubleAny", "name": "anyDouble", "fieldName": "aDouble"}
floatAny:
{"type": "floatAny", "name": "anyFloat", "fieldName": "aFloat"}
longAny:
{"type": "longAny", "name": "anyLong", "fieldName": "aLong"}

stringAny

{
  "type": "stringAny",
  "name": "anyString",
  "fieldName": "aString",
  "maxStringBytes": 2048,
  "aggregateMultipleValues": true
}
Property | Description | Required
type | "stringAny" | Yes
name | Output name | Yes
fieldName | Input column | Yes
maxStringBytes | Max string size | No (defaults to 1024)
aggregateMultipleValues | If true, returns a stringified array for multi-value dimensions | No (defaults to true)

Approximate Aggregations

Count Distinct

DataSketches Theta Sketch

Provides distinct count estimates and supports set operations (union, intersection, difference).

DataSketches HLL Sketch

HyperLogLog-based distinct counting. More space-efficient than Theta sketches, but does not support set operations.

Legacy: Cardinality and hyperUnique

For new use cases, use the DataSketches Theta or HLL sketch instead. The legacy aggregators are kept for backwards compatibility.

Histograms and Quantiles

The DataSketches Quantiles sketch provides quantile estimates and histogram approximations with formal error bounds.

Moments Sketch (Experimental)

Optimized for merge speed, but accuracy depends on the data distribution.

Fixed Buckets Histogram

Simple histogram with a fixed range and bucket count.

Approximate Histogram (Deprecated)

Deprecated due to accuracy issues. Use DataSketches Quantiles instead.

Expression Aggregations

Expression Aggregator

Custom aggregations using Druid expressions (query time only):
{
  "type": "expression",
  "name": "expression_sum",
  "fields": ["column_a"],
  "accumulatorIdentifier": "__acc",
  "initialValue": "0",
  "fold": "__acc + column_a",
  "combine": "__acc + expression_sum"
}
Property | Description | Required
type | Must be "expression" | Yes
name | Output name | Yes
fields | Input columns | Yes
accumulatorIdentifier | Variable for accumulator | No (defaults to __acc)
fold | Expression to accumulate values | Yes
combine | Expression to merge fold results | No (defaults to fold)
compare | Comparator expression (inputs: o1, o2) | No
finalize | Final transformation (input: o) | No
initialValue | Initial accumulator value | Yes
initialCombineValue | Initial combine value | No (defaults to initialValue)
isNullUnlessAggregated | Return null if no rows processed | No (defaults to true)
shouldAggregateNullInputs | Process null inputs in fold | No (defaults to true)
shouldCombineAggregateNullInputs | Process null inputs in combine | No (defaults to shouldAggregateNullInputs)
maxSizeBytes | Max size for variable-sized outputs | No (defaults to 8192)

Example: Count Aggregator

{
  "type": "expression",
  "name": "expression_count",
  "fields": [],
  "initialValue": "0",
  "fold": "__acc + 1",
  "combine": "__acc + expression_count"
}
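The roles of fold and combine in the count example can be sketched with a small model: fold runs once per row within each segment, and combine merges the per-segment partial results. This is an illustrative simulation, not Druid's execution engine; the segments and row contents are hypothetical.

```python
from functools import reduce

INITIAL_VALUE = 0  # "initialValue": "0", also used as the initial combine value


def fold(acc, row):
    # Mirrors "fold": "__acc + 1" (the row itself is not inspected).
    return acc + 1


def combine(acc, partial):
    # Mirrors "combine": "__acc + expression_count".
    return acc + partial


# Hypothetical segments, each holding some rows.
segments = [
    [{"x": 1}, {"x": 2}, {"x": 3}],
    [{"x": 4}],
    [{"x": 5}, {"x": 6}],
]

# Step 1: fold over the rows of each segment independently.
partials = [reduce(fold, segment, INITIAL_VALUE) for segment in segments]
# Step 2: combine the partial counts into the final result.
total = reduce(combine, partials, INITIAL_VALUE)

print(partials, total)  # [3, 1, 2] 6
```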

Example: Distinct Array with Ordering

{
  "type": "expression",
  "name": "expression_array_agg_distinct",
  "fields": ["column_a"],
  "initialValue": "[]",
  "fold": "array_set_add(__acc, column_a)",
  "combine": "array_set_add_all(__acc, expression_array_agg_distinct)",
  "compare": "if(array_length(o1) > array_length(o2), 1, if(array_length(o1) == array_length(o2), 0, -1))"
}
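The distinct-array example can be modeled the same way. The Python helpers below only approximate the behavior suggested by the names array_set_add and array_set_add_all (add values that are not already present); they are stand-ins, not Druid's functions, and the element order they produce should not be read as Druid's. The input values are hypothetical.

```python
from functools import reduce


def array_set_add(arr, value):
    """Return arr with value appended unless it is already present."""
    return arr if value in arr else arr + [value]


def array_set_add_all(left, right):
    """Return left extended with every element of right not already present."""
    return reduce(array_set_add, right, list(left))


def compare(o1, o2):
    """Mirror the "compare" expression: order arrays by length (1, 0, or -1)."""
    return (len(o1) > len(o2)) - (len(o1) < len(o2))


# Hypothetical column_a values in two segments.
segments = [["a", "b", "a"], ["b", "c"]]

# fold: build a distinct array per segment, starting from "[]".
partials = [reduce(array_set_add, values, []) for values in segments]
# combine: merge the per-segment arrays into one distinct array.
merged = reduce(array_set_add_all, partials, [])

print(partials)  # [['a', 'b'], ['b', 'c']]
print(merged)    # ['a', 'b', 'c']
print(compare(merged, partials[0]))  # 1, because the merged array is longer
```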

JavaScript Aggregator

Custom JavaScript functions for aggregation:
{
  "type": "javascript",
  "name": "sum(log(x)*y) + 10",
  "fieldNames": ["x", "y"],
  "fnAggregate": "function(current, a, b) { return current + (Math.log(a) * b); }",
  "fnCombine": "function(partialA, partialB) { return partialA + partialB; }",
  "fnReset": "function() { return 10; }"
}
JavaScript is disabled by default. See the JavaScript programming guide for how to enable it.

Miscellaneous Aggregations

Filtered Aggregator

Wraps any aggregator to only aggregate matching rows:
{
  "type": "filtered",
  "filter": {
    "type": "selector",
    "dimension": "someColumn",
    "value": "abcdef"
  },
  "aggregator": {
    "type": "longSum",
    "name": "sumLong",
    "fieldName": "aLong"
  }
}
If you only need filtered results, put the filter on the query itself for better performance.
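A filtered aggregator is mainly useful when one query needs both filtered and unfiltered results. The sketch below models that case in plain Python; the rows and the long_sum and filtered helpers are hypothetical, and the equality check stands in for the selector filter shown above.

```python
# Hypothetical rows using the column names from the example spec.
rows = [
    {"someColumn": "abcdef", "aLong": 5},
    {"someColumn": "other", "aLong": 7},
    {"someColumn": "abcdef", "aLong": 11},
]


def long_sum(rows, field_name):
    """Model a longSum aggregator over field_name."""
    return sum(int(row[field_name]) for row in rows)


def filtered(rows, dimension, value, aggregate):
    """Model a filtered aggregator: aggregate only rows where dimension == value."""
    matching = [row for row in rows if row.get(dimension) == value]
    return aggregate(matching)


# Unfiltered and filtered sums computed over the same set of rows.
sum_all = long_sum(rows, "aLong")
sum_filtered = filtered(rows, "someColumn", "abcdef", lambda rs: long_sum(rs, "aLong"))

print(sum_all, sum_filtered)  # 23 16
```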

Grouping Aggregator

Used with GroupBy queries that specify subtotals. Returns a number indicating which dimensions are included in the sub-grouping of each result row:
{
  "type": "grouping",
  "name": "someGrouping",
  "groupings": ["dim1", "dim2"]
}
With subtotals [["dim1", "dim2"], ["dim1"], ["dim2"], []]:
Subtotal | Output | Binary
["dim1", "dim2"] | 0 | 00
["dim1"] | 1 | 01
["dim2"] | 2 | 10
[] | 3 | 11
The bit at position X is 0 if the dimension at position X in groupings is included in the sub-grouping; otherwise it is 1.
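The bit rule can be checked against the table with a short sketch. The function name grouping_value is hypothetical; the first dimension in groupings is treated as the most significant bit, which is what the table values imply.

```python
def grouping_value(groupings, subtotal):
    """Compute the grouping output for one subtotal.

    Each dimension in groupings contributes one bit, most significant first:
    0 if the dimension is in the subtotal, 1 if it is not.
    """
    value = 0
    for dimension in groupings:
        bit = 0 if dimension in subtotal else 1
        value = (value << 1) | bit
    return value


groupings = ["dim1", "dim2"]
subtotals = [["dim1", "dim2"], ["dim1"], ["dim2"], []]

for subtotal in subtotals:
    value = grouping_value(groupings, subtotal)
    # Print the subtotal, the decimal output, and its zero-padded binary form.
    print(subtotal, value, format(value, f"0{len(groupings)}b"))
```

A client can use this value to tell a subtotal row, where an excluded dimension appears as null, from a regular row whose dimension value is actually null.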
