Apache Druid Documentation

Build high-performance analytics applications with sub-second queries on streaming and batch data at any scale.

Quick start

Get Apache Druid running locally in minutes and start querying your first dataset.

1. Download and extract Druid

Download the latest Apache Druid release and extract it to your local system.
wget https://dlcdn.apache.org/druid/XX.X.X/apache-druid-XX.X.X-bin.tar.gz
tar -xzf apache-druid-XX.X.X-bin.tar.gz
cd apache-druid-XX.X.X
Replace XX.X.X with the latest version number. Druid requires Java 17 or Java 21.
2. Start Druid services

Start Druid using the single-server quickstart configuration.
./bin/start-micro-quickstart
This starts all Druid services on your local machine. Wait for the services to fully initialize (about 30-60 seconds).
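To confirm the services came up, you can poll the router's health endpoint. This is a small sketch that assumes the default quickstart port (8888); the `--max-time` flag and the fallback message are illustrative additions, not part of the quickstart itself.

```shell
# Ask the Druid router whether it is healthy.
# /status/health returns "true" once the service is ready to accept requests.
curl -s --max-time 5 http://localhost:8888/status/health \
  || echo "router not reachable yet - wait a bit and retry"
```

If the services are still initializing, retry after a few seconds until the endpoint returns `true`.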
3. Access the web console

Open your browser and navigate to the Druid web console:
http://localhost:8888
The console provides a complete UI for data ingestion, query execution, and cluster management.
4. Load sample data

Use the data loader in the web console to ingest your first dataset. Navigate to Load data and select one of the built-in sample datasets like Wikipedia edits or NYC taxi trips.
The quickstart includes example datasets to help you explore Druid’s capabilities immediately.
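If you prefer the command line to the console's data loader, the distribution bundles an ingestion spec for the Wikipedia sample that can be submitted with the `bin/post-index-task` helper. A minimal sketch, assuming you run it from the extracted Druid directory and that the Overlord listens on the default quickstart port 8081; the guard around the call is an illustrative addition so the snippet degrades gracefully elsewhere.

```shell
# Submit the bundled Wikipedia ingestion spec from the command line.
# Assumes the current directory is the Druid install root.
if [ -x bin/post-index-task ]; then
  bin/post-index-task \
    --file quickstart/tutorial/wikipedia-index.json \
    --url http://localhost:8081
else
  echo "run this from the extracted Druid directory"
fi
```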
5. Run your first query

Once data is loaded, navigate to the Query view and run a SQL query:
SELECT 
  TIME_FLOOR(__time, 'PT1H') AS hour,
  COUNT(*) AS events
FROM wikipedia
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '24' HOUR
GROUP BY 1
ORDER BY 1 DESC
Results should return with sub-second response times.
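The same SQL can be sent programmatically through Druid's SQL HTTP API, which accepts a JSON body on the `/druid/v2/sql` endpoint of the router. A hedged sketch, assuming the default quickstart port 8888 and a loaded `wikipedia` datasource; the simplified count query and the fallback message are illustrative.

```shell
# Run a SQL query against Druid's HTTP SQL endpoint.
# The request body is JSON with the statement under the "query" key.
curl -s --max-time 5 -X POST http://localhost:8888/druid/v2/sql \
  -H 'Content-Type: application/json' \
  -d '{"query": "SELECT COUNT(*) AS edits FROM wikipedia"}' \
  || echo "Druid is not running on localhost:8888"
```

The response is a JSON array of result rows, which makes the endpoint easy to wire into scripts and applications.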

Explore by topic

Learn about Druid’s architecture, data ingestion, querying capabilities, and operations.

Architecture

Understand Druid’s distributed architecture and core services

Data ingestion

Load streaming and batch data from Kafka, Kinesis, files, and more

SQL queries

Query your data using standard SQL with Druid’s query engine

API reference

Integrate with Druid using HTTP APIs and JDBC

Data management

Manage segments, compaction, retention, and updates

Operations

Deploy, monitor, and scale your Druid cluster

Key features

Apache Druid is optimized for real-time analytics at scale.

Sub-second queries

Columnar storage and indexing deliver interactive query performance on billions of rows

Real-time ingestion

Stream data from Kafka and Kinesis with exactly-once semantics and immediate query availability

Scalable architecture

Scale to petabytes of data across hundreds of nodes with horizontal scalability

High availability

Built-in replication and automated recovery ensure continuous operation

Ready to start building?

Follow our quickstart guide to get Druid running locally, or explore deployment options for production environments.