parser/src/parser.rs and is approximately 11,000 lines of recursive descent parsing code.
Background
Turso’s parser is an in-tree fork of lemon-rs, which is itself a port of the SQLite parser (originally written using the Lemon parser generator) into Rust. The parser produces the same AST shape that SQLite’s grammar defines, so Turso stays compatible with SQLite’s SQL dialect.
Source files
| File | Purpose |
|---|---|
| parser/src/parser.rs | Main recursive descent parser (~11k LOC) |
| parser/src/lexer.rs | Tokenizer: bytes → TokenType stream |
| parser/src/token.rs | TokenType enum: all SQL tokens |
| parser/src/ast.rs | AST node types for all statements and expressions |
| parser/src/error.rs | Parse error types |
| parser/src/lib.rs | Public crate entry point |
The lexer
The lexer (parser/src/lexer.rs) converts a byte slice into a stream of typed tokens. It:
- Recognizes SQL keywords case-insensitively (SELECT, FROM, WHERE, etc.) via a lookup table.
- Emits a TK_ID token for unrecognized identifiers.
- Handles string literals (single-quoted), blob literals (x'...'), numeric literals, and operators.
- Reports the byte offset of each token so parse errors can include position information.
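The behaviors listed above can be illustrated with a deliberately small tokenizer. This is a sketch under stated assumptions, not the code in parser/src/lexer.rs: `TokenKind`, `Token`, `keyword`, `scan_quoted`, and `tokenize` are hypothetical names, only three keywords and two operators are covered, and `TokenKind::Id` merely stands in for TK_ID.

```rust
/// Hypothetical, heavily reduced token set; the real TokenType enum is far larger.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TokenKind {
    Select,
    From,
    Where,
    Id, // stands in for TK_ID
    Integer,
    Str,
    Blob,
    Star,
    Eq,
}

/// A token plus the byte range it occupies in the input.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub kind: TokenKind,
    pub start: usize,
    pub end: usize,
}

/// Case-insensitive keyword lookup over a small table.
fn keyword(word: &[u8]) -> Option<TokenKind> {
    const KEYWORDS: [(&str, TokenKind); 3] = [
        ("SELECT", TokenKind::Select),
        ("FROM", TokenKind::From),
        ("WHERE", TokenKind::Where),
    ];
    KEYWORDS
        .iter()
        .find(|(name, _)| name.as_bytes().eq_ignore_ascii_case(word))
        .map(|&(_, kind)| kind)
}

/// `open` is the index of an opening single quote. Returns the index just
/// past the closing quote, treating a doubled quote as an escaped quote.
fn scan_quoted(input: &[u8], open: usize) -> Option<usize> {
    let mut pos = open + 1;
    while pos < input.len() {
        if input[pos] == b'\'' {
            if input.get(pos + 1) == Some(&b'\'') {
                pos += 2;
                continue;
            }
            return Some(pos + 1);
        }
        pos += 1;
    }
    None
}

/// Tokenize `input`; on failure, return the byte offset of the offending token.
pub fn tokenize(input: &[u8]) -> Result<Vec<Token>, usize> {
    let mut tokens = Vec::new();
    let mut pos = 0;
    while pos < input.len() {
        let start = pos;
        let byte = input[pos];
        if byte.is_ascii_whitespace() {
            pos += 1;
            continue;
        }
        let kind = if (byte == b'x' || byte == b'X') && input.get(pos + 1) == Some(&b'\'') {
            pos = scan_quoted(input, pos + 1).ok_or(start)?;
            TokenKind::Blob
        } else if byte == b'\'' {
            pos = scan_quoted(input, pos).ok_or(start)?;
            TokenKind::Str
        } else if byte.is_ascii_digit() {
            while pos < input.len() && input[pos].is_ascii_digit() {
                pos += 1;
            }
            TokenKind::Integer
        } else if byte.is_ascii_alphabetic() || byte == b'_' {
            while pos < input.len() && (input[pos].is_ascii_alphanumeric() || input[pos] == b'_') {
                pos += 1;
            }
            // Anything that is not a known keyword becomes an identifier.
            keyword(&input[start..pos]).unwrap_or(TokenKind::Id)
        } else {
            pos += 1;
            match byte {
                b'*' => TokenKind::Star,
                b'=' => TokenKind::Eq,
                _ => return Err(start),
            }
        };
        tokens.push(Token { kind, start, end: pos });
    }
    Ok(tokens)
}
```

Calling `tokenize(b"select * FROM t")` yields keyword tokens regardless of case, and every `Token` carries its byte range, which is the information a parser needs to attach positions to error messages.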
The parser
The parser is a hand-written recursive descent parser. Each grammar production has a corresponding function. The entry points handle the full statement grammar:
- cmd(): parses a single SQL command and returns a Cmd.
  - Cmd::Stmt(stmt): a DML/DDL/query statement.
  - Cmd::Explain(stmt) / Cmd::ExplainQueryPlan(stmt): EXPLAIN variants.
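To make the "one function per production" idea concrete, here is a toy recursive descent parser. It is a sketch, not Turso's parser: the grammar is limited to `[EXPLAIN [QUERY PLAN]] SELECT <ident> FROM <ident>`, input is split on whitespace instead of being lexed, and the `Parser` struct, its helper methods, and the `Stmt::Select` fields are assumptions made for illustration. Only the names `cmd`, `Cmd::Stmt`, `Cmd::Explain`, and `Cmd::ExplainQueryPlan` mirror the description above.

```rust
#[derive(Debug, PartialEq)]
pub enum Stmt {
    /// The only statement this toy grammar knows: SELECT <column> FROM <table>.
    Select { column: String, table: String },
}

#[derive(Debug, PartialEq)]
pub enum Cmd {
    Stmt(Stmt),
    Explain(Stmt),
    ExplainQueryPlan(Stmt),
}

/// Hypothetical parser state: whitespace-separated words and a cursor.
pub struct Parser<'a> {
    words: Vec<&'a str>,
    pos: usize,
}

impl<'a> Parser<'a> {
    pub fn new(sql: &'a str) -> Self {
        Parser { words: sql.split_whitespace().collect(), pos: 0 }
    }

    /// Consume the next word if it matches `kw` case-insensitively.
    fn eat(&mut self, kw: &str) -> bool {
        let hit = self
            .words
            .get(self.pos)
            .map_or(false, |w| w.eq_ignore_ascii_case(kw));
        if hit {
            self.pos += 1;
        }
        hit
    }

    fn expect(&mut self, kw: &str) -> Result<(), String> {
        if self.eat(kw) {
            Ok(())
        } else {
            Err(format!("expected {} at word {}", kw, self.pos))
        }
    }

    fn ident(&mut self) -> Result<String, String> {
        let word = self
            .words
            .get(self.pos)
            .copied()
            .ok_or_else(|| format!("expected identifier at word {}", self.pos))?;
        self.pos += 1;
        Ok(word.to_string())
    }

    /// cmd := [EXPLAIN [QUERY PLAN]] stmt
    pub fn cmd(&mut self) -> Result<Cmd, String> {
        let cmd = if self.eat("EXPLAIN") {
            if self.eat("QUERY") {
                self.expect("PLAN")?;
                Cmd::ExplainQueryPlan(self.stmt()?)
            } else {
                Cmd::Explain(self.stmt()?)
            }
        } else {
            Cmd::Stmt(self.stmt()?)
        };
        if self.pos != self.words.len() {
            return Err(format!("unexpected trailing input at word {}", self.pos));
        }
        Ok(cmd)
    }

    /// stmt := select
    fn stmt(&mut self) -> Result<Stmt, String> {
        self.select()
    }

    /// select := SELECT ident FROM ident
    fn select(&mut self) -> Result<Stmt, String> {
        self.expect("SELECT")?;
        let column = self.ident()?;
        self.expect("FROM")?;
        let table = self.ident()?;
        Ok(Stmt::Select { column, table })
    }
}
```

For example, `Parser::new("EXPLAIN SELECT a FROM t").cmd()` returns a `Cmd::Explain` wrapping the parsed statement, while a plain `SELECT` comes back as `Cmd::Stmt`. Each production maps to one method, so the call graph follows the grammar.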
AST structure
All AST node types are defined in parser/src/ast.rs. The root type for a complete statement is Stmt, an enum whose variants correspond to the different statement kinds.
Expressions are represented by the Expr enum, which covers literals, column references, binary/unary operators, function calls, subqueries, and more.
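As an illustration of what an expression tree of this kind looks like, the following sketch defines a much smaller enum and a recursive walk over it. The variant names, fields, and the `column_names` helper are hypothetical and do not reproduce the definitions in parser/src/ast.rs; subqueries are omitted to keep the example short.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Integer(i64),
    Text(String),
    Null,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum BinaryOp {
    Add,
    Eq,
    And,
}

/// Hypothetical, reduced expression tree.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Literal(Literal),
    Column { table: Option<String>, name: String },
    Not(Box<Expr>),
    Binary { left: Box<Expr>, op: BinaryOp, right: Box<Expr> },
    FunctionCall { name: String, args: Vec<Expr> },
}

/// Collect the names of all column references, in left-to-right order.
pub fn column_names(expr: &Expr, out: &mut Vec<String>) {
    match expr {
        Expr::Literal(_) => {}
        Expr::Column { name, .. } => out.push(name.clone()),
        Expr::Not(inner) => column_names(inner, out),
        Expr::Binary { left, right, .. } => {
            column_names(left, out);
            column_names(right, out);
        }
        Expr::FunctionCall { args, .. } => {
            for arg in args {
                column_names(arg, out);
            }
        }
    }
}
```

Because each node owns its children through `Box` or `Vec`, later stages can process the tree with a single exhaustive `match`, and the compiler flags any variant a pass forgets to handle.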
SQLite grammar compatibility
The parser targets full SQLite grammar compatibility. It recognizes all SQLite statement types including:
- DML: SELECT, INSERT, UPDATE, DELETE, UPSERT
- DDL: CREATE/DROP/ALTER TABLE, CREATE/DROP INDEX, CREATE VIEW, CREATE TRIGGER
- Transactions: BEGIN, COMMIT, ROLLBACK, SAVEPOINT, RELEASE
- Administrative: PRAGMA, ATTACH, DETACH, ANALYZE, VACUUM
- Special: EXPLAIN, EXPLAIN QUERY PLAN, WITH (CTEs), window functions
Turso also adds a CONCURRENT keyword for use with BEGIN CONCURRENT, which is specific to the MVCC journal mode.
Error handling
Parse errors are returned as Error values from parser/src/error.rs. The parser does not attempt recovery: on a syntax error it returns immediately with a description of the unexpected token and the byte offset where parsing failed.
The design follows the SQLite approach: prepare-time errors (including parse errors) are surfaced before any execution begins, so no partial state is left behind.
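The fail-fast behavior described in this section can be sketched as follows. This is an assumption-laden toy, not the error type in parser/src/error.rs: `ParseError`, its fields, and `parse_select_integer` are hypothetical, and the "grammar" accepts only `SELECT <integer>`.

```rust
use std::fmt;

/// Hypothetical error shape: a message plus the byte offset of the failure.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParseError {
    pub message: String,
    pub offset: usize,
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} at byte offset {}", self.message, self.offset)
    }
}

impl std::error::Error for ParseError {}

/// Toy grammar: SELECT followed by an integer literal. Returns on the first
/// problem it finds and never tries to resynchronize.
pub fn parse_select_integer(sql: &str) -> Result<i64, ParseError> {
    let rest = sql.trim_start();
    let offset = sql.len() - rest.len();

    // `get(..6)` is None for short input, which falls through to the error.
    let keyword = rest.get(..6).unwrap_or("");
    if !keyword.eq_ignore_ascii_case("SELECT") {
        return Err(ParseError { message: "expected SELECT".to_string(), offset });
    }

    let after = &rest[6..];
    let literal = after.trim();
    let literal_offset = offset + 6 + (after.len() - after.trim_start().len());
    literal.parse::<i64>().map_err(|_| ParseError {
        message: format!("unexpected token {:?}", literal),
        offset: literal_offset,
    })
}
```

Since the function either returns a value or a single error and touches no shared state, a caller that receives `Err` has nothing to roll back, which is the property the prepare-time error model relies on.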
How the AST flows downstream
Once the parser produces a Cmd::Stmt(stmt), the Connection passes it to core/translate/ for code generation. There the statement is first lowered into a Plan representation, before bytecode is emitted.
See Query optimizer and Virtual machine for what happens next.