Apache Druid supports two query languages: Druid SQL and native queries. This page describes the SQL language and its syntax.
Druid translates SQL queries into its native query language for execution. To learn how this translation works and how to tune it for performance, see SQL query translation.

Overview

You can query data in Druid datasources using Druid SQL. SQL planning occurs on the Broker, so configure Broker runtime properties to adjust query planning and JDBC settings. For information on permissions needed to make SQL queries, see Defining SQL permissions.

SQL Query Syntax

Druid SQL supports SELECT queries with the following structure:
[ EXPLAIN PLAN FOR ]
[ WITH tableName [ ( column1, column2, ... ) ] AS ( query ) ]
SELECT [ ALL | DISTINCT ] { * | exprs }
FROM { <table> | (<subquery>) | <o1> [ INNER | LEFT ] JOIN <o2> ON condition }
[PIVOT (aggregation_function(column_to_aggregate) FOR column_with_values_to_pivot IN (pivoted_column1 [, pivoted_column2 ...]))]
[UNPIVOT (values_column FOR names_column IN (unpivoted_column1 [, unpivoted_column2 ... ]))]
[ CROSS JOIN UNNEST(source_expression) AS table_alias_name(column_alias_name) ]
[ WHERE expr ]
[ GROUP BY [ exprs | GROUPING SETS ( (exprs), ... ) | ROLLUP (exprs) | CUBE (exprs) ] ]
[ HAVING expr ]
[ ORDER BY expr [ ASC | DESC ], expr [ ASC | DESC ], ... ]
[ LIMIT limit ]
[ OFFSET offset ]
[ UNION ALL <another query> ]

FROM Clause

The FROM clause can refer to:
  • Table datasources from the druid schema (default). Reference as druid.dataSourceName or simply dataSourceName
  • Lookups from the lookup schema, for example lookup.countries
  • Subqueries enclosed in parentheses
  • Joins between datasources (joins between native datasources and system tables are not supported)
  • Metadata tables from INFORMATION_SCHEMA or sys schemas
For more information about datasources, refer to the Datasources documentation.
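For example, lookups and metadata tables can be queried like ordinary tables. Lookups expose key and value columns named k and v; the lookup and datasource names below are illustrative:

```sql
-- Query a lookup from the lookup schema
SELECT k, v
FROM lookup.countries
LIMIT 5
```

```sql
-- List the columns of a datasource via the INFORMATION_SCHEMA metadata tables
SELECT COLUMN_NAME, DATA_TYPE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'wikipedia'
```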

Example Queries

Basic SELECT

SELECT 
  channel,
  user,
  SUM(added) AS total_added
FROM wikipedia
WHERE channel LIKE '#en%'
GROUP BY channel, user
ORDER BY total_added DESC
LIMIT 10

PIVOT Example

Transform row values into columns:
SELECT user, channel, ba_sum_deleted, ny_sum_deleted
FROM wikipedia
PIVOT (
  SUM(deleted) AS "sum_deleted" 
  FOR "cityName" IN ('Buenos Aires' AS ba, 'New York' AS ny)
)
WHERE ba_sum_deleted IS NOT NULL OR ny_sum_deleted IS NOT NULL
LIMIT 15

UNPIVOT Example

Transform columns into rows:
SELECT channel, user, action, SUM(changes) AS total_changes
FROM wikipedia 
UNPIVOT (
  changes FOR action IN ("added", "deleted")
)
WHERE channel LIKE '#ar%'
GROUP BY channel, user, action
LIMIT 15

UNNEST

The UNNEST clause expands ARRAY-typed values into individual rows. Use MV_TO_ARRAY to convert multi-value dimensions to arrays first:
SELECT column_alias_name
FROM datasource
CROSS JOIN UNNEST(source_expression1) AS table_alias_name1(column_alias_name1)
Key points:
  • Works directly on ARRAY columns
  • Use MV_TO_ARRAY(dimension) for multi-value VARCHAR columns
  • CROSS JOIN is required (except for inline arrays)
  • Preserves array ordering
  • Does not remove duplicates or nulls
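Putting the points above together, a multi-value VARCHAR dimension can be unnested as follows (the tags dimension is illustrative and not part of the standard wikipedia dataset):

```sql
-- Each multi-value row in "tags" becomes one output row per element
SELECT d.channel, t.tag
FROM wikipedia AS d
CROSS JOIN UNNEST(MV_TO_ARRAY("tags")) AS t(tag)
LIMIT 10
```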

WHERE Clause

The WHERE clause filters rows and is translated to native filters. It can also reference subqueries:
WHERE col1 IN (SELECT foo FROM ...)
Type comparison tips:
  • Strings and numbers can be compared through implicit conversion
  • For best performance, explicitly cast numbers to strings when comparing against string dimensions:
WHERE stringDim = '1'  -- Better than stringDim = 1
WHERE stringDim IN ('1', '2', '3')  -- Cast array elements

GROUP BY

The GROUP BY clause aggregates data. You can:
  • Group by expressions or column ordinals: GROUP BY 2
  • Use GROUPING SETS for multiple grouping levels:
GROUP BY GROUPING SETS ((country, city), ())
  • Use ROLLUP for hierarchical aggregation:
GROUP BY ROLLUP (country, city)
-- Equivalent to: GROUPING SETS ((country, city), (country), ())
  • Use CUBE for all combinations:
GROUP BY CUBE (country, city)
-- Equivalent to: GROUPING SETS ((country, city), (country), (city), ())

UNION ALL

Combine results from multiple queries:

Top-level UNION ALL

SELECT COUNT(*) FROM tbl WHERE my_column = 'value1'
UNION ALL
SELECT COUNT(*) FROM tbl WHERE my_column = 'value2'

Table-level UNION ALL

SELECT col1, COUNT(*)
FROM (
  SELECT col1, col2, col3 FROM tbl1
  UNION ALL
  SELECT col1, col2, col3 FROM tbl2
)
GROUP BY col1
Or use the TABLE(APPEND()) syntax:
SELECT col1, COUNT(*) FROM TABLE(APPEND('tbl1', 'tbl2'))
GROUP BY col1

EXPLAIN PLAN

View how queries are translated to native queries:
EXPLAIN PLAN FOR
SELECT COUNT(*) FROM datasource WHERE column = 'value'

SET Statement

Use SET statements before a query to configure query context parameters:
SET useApproximateTopN = false;
SET sqlTimeZone = 'America/Los_Angeles';
SET timeout = 90000;
SELECT some_column, COUNT(*) 
FROM druid.foo 
WHERE other_column = 'foo' 
GROUP BY 1 
ORDER BY 2 DESC

Dynamic Parameters

Use ? placeholders for parameterized queries:
{
  "query": "SELECT * FROM druid.foo WHERE dim1 = ?",
  "parameters": [
    {"type": "VARCHAR", "value": "bar"}
  ]
}
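Parameterized queries are sent to the Druid SQL HTTP API at /druid/v2/sql, with one entry in the parameters array per ? placeholder, in order of appearance. The sketch below builds and posts such a request; the broker URL and datasource are assumptions (a local quickstart cluster is used for illustration):

```python
import json
import urllib.request

def build_sql_payload(query, parameters):
    """Build the JSON body for Druid's SQL HTTP API (/druid/v2/sql).

    Each parameter is a {"type": ..., "value": ...} dict matching one
    `?` placeholder in the query, in order of appearance.
    """
    return {"query": query, "parameters": parameters}

def post_sql(broker_url, payload):
    """POST the payload to the broker. The broker_url is an assumption,
    e.g. http://localhost:8888 on a local quickstart cluster."""
    req = urllib.request.Request(
        broker_url + "/druid/v2/sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_sql_payload(
    "SELECT * FROM druid.foo WHERE dim1 = ?",
    [{"type": "VARCHAR", "value": "bar"}],
)
print(json.dumps(payload))
```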

Array Parameters

{
  "query": "SELECT * FROM druid.foo WHERE ARRAY_CONTAINS(arrayColumn, ?)",
  "parameters": [
    {"type": "ARRAY", "value": [-25.7, null, 36.85]}
  ]
}

Type Casting

Explicitly cast parameters when needed:
SELECT * FROM druid.foo WHERE dim1 like CONCAT('%', CAST(? AS VARCHAR), '%')

Identifiers and Literals

Identifiers:
  • Optionally quote with double quotes: "columnName"
  • Escape quotes by doubling: "My ""column"" name"
  • Case-sensitive, with no implicit case conversion
Literals:
  • Strings: 'foo'
  • Unicode strings: U&'fo\00F6'
  • Numbers: 100, 100.0, 1.0e5
  • Timestamps: TIMESTAMP '2000-01-01 00:00:00'
  • Intervals: INTERVAL '1' HOUR, INTERVAL '1-2' YEAR TO MONTH
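These forms combine naturally in a single query; the table name below is illustrative:

```sql
-- Quoted identifier, timestamp literal, and interval arithmetic together
SELECT
  "My ""column"" name",
  TIMESTAMP '2000-01-01 00:00:00' + INTERVAL '1' HOUR AS one_hour_later
FROM druid.foo
WHERE __time >= TIMESTAMP '2000-01-01 00:00:00'
```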

Reserved Keywords

Druid inherits reserved keywords from Apache Calcite, plus:
  • CLUSTERED
  • PARTITIONED
Enclose reserved keywords in double quotes to use them:
SELECT "PARTITIONED" FROM druid.wikipedia
