Architecture

TiDB is a distributed database built from four components. The compute layer (TiDB Server) is stateless and can scale independently from the storage layer (TiKV and TiFlash). PD (Placement Driver) acts as the cluster brain, managing metadata and scheduling data movement.

┌─────────────────────────────────────────────────┐
│                  MySQL Clients                   │
└────────────────────┬────────────────────────────┘
                     │ MySQL protocol (port 4000)
         ┌───────────▼───────────┐
         │      TiDB Server      │  ← Stateless SQL layer
         │  (parse, plan, exec)  │
         └──────┬────────────────┘
                │ gRPC
    ┌───────────┼───────────────┐
    │           │               │
┌───▼───┐  ┌───▼───┐      ┌────▼────┐
│  PD   │  │ TiKV  │      │TiFlash  │
│       │  │(row)  │      │(columnar│
│       │  │       │      │ store)  │
└───────┘  └───────┘      └─────────┘
 Metadata   OLTP data      OLAP data
 Scheduling  (Raft)       (replicated
                           from TiKV)

Components

TiDB Server

The stateless SQL layer. Parses queries, builds and optimizes execution plans, and coordinates execution across TiKV and TiFlash. Multiple TiDB Server nodes can run behind a load balancer; they share no state with each other.

TiKV

The distributed row-based storage engine. Data is divided into Regions (default 96 MB each). Each Region is replicated across three TiKV nodes using the Raft consensus protocol. TiKV handles all transactional (OLTP) reads and writes.

TiFlash

The columnar storage engine for analytical queries. TiFlash nodes receive real-time data replication from TiKV using the Multi-Raft Learner protocol. TiDB Server chooses TiFlash automatically for queries that benefit from columnar access, using MPP (Massively Parallel Processing) execution.

PD (Placement Driver)

The cluster management component. PD stores cluster metadata in etcd, allocates globally unique timestamps for transactions, monitors Region health, and schedules Region movement between TiKV nodes to balance load and recover from failures.

How SQL queries flow through the system

Client sends SQL

A client connects to any TiDB Server node on port 4000 using the MySQL protocol and sends a SQL statement.

TiDB Server parses and plans

TiDB Server parses the SQL into an AST, runs the optimizer to build an execution plan, and determines which storage engines to read from (TiKV, TiFlash, or both).

Execution is dispatched to storage

For point queries and small range scans, TiDB sends coprocessor requests to TiKV nodes that own the relevant Regions. For large analytical queries, TiDB dispatches MPP tasks to TiFlash nodes and collects the results.

Results are returned to the client

TiDB Server assembles the results from one or more storage nodes and returns them to the client in MySQL protocol format.

Distributed transactions

TiDB implements distributed transactions using a two-phase commit (2PC) protocol. When a transaction writes data across multiple Regions:

Prewrite phase — TiDB writes locks and data to all affected Regions.
Commit phase — After a majority of replicas acknowledge the prewrite, TiDB commits the transaction and removes the locks.

PD provides a globally monotonically increasing timestamp (TSO) to each transaction. This timestamp serves as both the transaction start time and the commit version, enabling snapshot isolation across the entire cluster.

Transactions in TiDB are ACID-compliant. Strong consistency is guaranteed even when some replicas fail, because writes are only committed after the Raft majority acknowledges them.

Raft consensus and data replication

TiKV divides all data into contiguous key ranges called Regions. Each Region is approximately 96 MB and has three replicas by default, each on a different TiKV node. The replicas for a Region form a Raft group.

One replica is the leader and handles all reads and writes for that Region.
The other two replicas are followers and apply log entries replicated from the leader.
If the leader fails, Raft elects a new leader from the followers automatically.

This means TiDB can tolerate the loss of any minority of TiKV nodes without data loss or manual intervention.

Region-based data sharding

TiDB does not use hash-based or range-based sharding that you configure manually. Instead, TiKV automatically splits Regions when they grow too large and merges them when they shrink. PD monitors Region sizes and schedules splits, merges, and migrations to keep data evenly distributed across TiKV nodes. From the SQL layer, this is transparent: you interact with a single logical table and TiDB handles all Region routing internally.

Separation of compute and storage

Because TiDB Server nodes are stateless, you can:

Scale compute independently by adding or removing TiDB Server nodes with no data movement.
Scale storage independently by adding TiKV or TiFlash nodes; PD automatically rebalances Regions onto the new nodes.

This separation is what allows TiDB to scale both horizontally (more nodes) and vertically (larger nodes) without downtime.

TiFlash and HTAP

TiFlash uses the Multi-Raft Learner protocol to replicate data from TiKV in real time. A Raft Learner receives log entries from TiKV leaders but does not participate in leader elections, so it adds no overhead to the OLTP write path. TiDB Server tracks which tables have TiFlash replicas. For a query involving a table with a TiFlash replica, the optimizer can choose to:

Push the full query to TiFlash (columnar scan, aggregation via MPP).
Split execution: scan from TiFlash for the analytical part, fetch small lookup results from TiKV.

Data consistency between TiKV and TiFlash is guaranteed by the replication protocol; queries always see the same committed data regardless of which engine they read from.

# Control which engines TiDB is allowed to read from:
[isolation-read]
engines = ["tikv", "tiflash", "tidb"]

Get Started

Core Concepts

Deployment

Data Management

Operations

Architecture

Components

TiDB Server

TiKV

TiFlash

PD (Placement Driver)

How SQL queries flow through the system

Distributed transactions

Raft consensus and data replication

Region-based data sharding

Separation of compute and storage

TiFlash and HTAP

Next steps

Quick start

Introduction

Build docs developers (and LLMs) love

Get Started

Core Concepts

Deployment

Data Management

Operations

Documentation Index

​Components

TiDB Server

TiKV

TiFlash

PD (Placement Driver)

​How SQL queries flow through the system

​Distributed transactions

​Raft consensus and data replication

​Region-based data sharding

​Separation of compute and storage

​TiFlash and HTAP

​Next steps

Quick start

Introduction

Build docs developers (and LLMs) love

Components

How SQL queries flow through the system

Distributed transactions

Raft consensus and data replication

Region-based data sharding

Separation of compute and storage

TiFlash and HTAP

Next steps