What are Syntax Trees?
In Loretta, a syntax tree is an immutable representation of Lua source code. Every piece of Lua code you parse is transformed into a tree structure where each node represents a syntactic construct like an expression, statement, or token.
Syntax trees in Loretta are:
- Immutable: Once created, they cannot be modified. Any changes create new tree instances.
- Lossless: They preserve all information from the source, including whitespace and comments (as trivia).
- Hierarchical: Each node knows its parent and children, forming a complete tree structure.
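The lossless property can be checked with a round trip. The following is a minimal sketch that uses only ParseText, GetRoot, and ToFullString as they appear elsewhere on this page; the sample Lua text is made up for illustration.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

// Illustrative input with a comment and irregular spacing, so trivia is present.
var code = "-- answer\nlocal   x = 42  -- trailing note\n";

var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetRoot();

// If the tree is lossless, the full text of the root (trivia included)
// reproduces the input exactly.
Console.WriteLine($"Round trip matches input: {root.ToFullString() == code}");
```

If the comparison ever came out false, it would mean some whitespace or comment text was dropped during parsing, which is exactly what the lossless guarantee rules out.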
Core Types
LuaSyntaxTree
The LuaSyntaxTree class represents the parsed representation of a Lua source document. It’s the entry point for working with syntax trees.
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;
// Parse Lua code into a syntax tree
var code = "local x = 10";
var tree = LuaSyntaxTree.ParseText(code);
Key members of LuaSyntaxTree:
- GetRoot() - Returns the root LuaSyntaxNode (typically a CompilationUnitSyntax)
- Options - The LuaParseOptions used to parse the tree
- GetDiagnostics() - Returns all parsing errors and warnings
- FilePath - The optional path associated with the tree
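As a rough illustration of these members, the sketch below parses deliberately malformed input and prints whatever GetDiagnostics reports. It assumes each diagnostic has a readable ToString(), as in Roslyn-style APIs; the exact message text is not shown here because it depends on the library version.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

// Deliberately malformed input: the initializer expression is missing.
var tree = LuaSyntaxTree.ParseText("local x =");

// FilePath is optional; nothing was supplied here, so it is expected to be empty.
Console.WriteLine($"File path: '{tree.FilePath}'");

// Print every parse error or warning attached to the tree.
foreach (var diagnostic in tree.GetDiagnostics())
{
    Console.WriteLine(diagnostic);
}
```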
LuaSyntaxNode
The LuaSyntaxNode class is the base class for all syntax nodes in the tree. It represents non-terminal nodes (nodes that have children).
Key members:
- Parent - The parent node in the tree
- SyntaxTree - The tree this node belongs to
- Span - The text span this node covers
- Kind() - Returns the SyntaxKind enum value
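To see these members together, the sketch below lists every node under the root with its kind, span, and parent kind. It relies only on members named on this page (DescendantNodes, Kind(), Span, Parent, SyntaxTree); the output format is arbitrary.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

var tree = LuaSyntaxTree.ParseText("local x = 10");
var root = tree.GetRoot();

foreach (var node in root.DescendantNodes())
{
    // Every descendant has a parent; only the root's Parent is null.
    var parentKind = node.Parent is null ? "(none)" : node.Parent.Kind().ToString();

    // SyntaxTree should point back at the tree the node was parsed into.
    var sameTree = ReferenceEquals(node.SyntaxTree, tree);

    Console.WriteLine($"{node.Kind()} span={node.Span} parent={parentKind} sameTree={sameTree}");
}
```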
CompilationUnitSyntax
CompilationUnitSyntax is the root node of a complete Lua file. It represents the entire source file and contains all statements and an end-of-file token.
var tree = LuaSyntaxTree.ParseText("local x = 10\nprint(x)");
var root = tree.GetCompilationUnitRoot();
// Access statements in the file
foreach (var statement in root.Statements)
{
    Console.WriteLine($"Statement: {statement.Kind()}");
}
// Access the EOF token
var eofToken = root.EndOfFileToken;
Tree Hierarchy
Every syntax tree follows this hierarchy:
LuaSyntaxTree
└── CompilationUnitSyntax (root node)
    ├── Statements (SyntaxList<StatementSyntax>)
    │   ├── LocalVariableDeclarationStatementSyntax
    │   ├── ExpressionStatementSyntax
    │   └── ...
    └── EndOfFileToken (SyntaxToken)
Navigating the Tree
Accessing Children
You can navigate down the tree by accessing node properties:
var code = @"
local x = 10
local y = x + 5
";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();
// Get the first statement
var firstStatement = root.Statements[0];
if (firstStatement is LocalVariableDeclarationStatementSyntax localDecl)
{
    // Access the 'local' keyword
    var localKeyword = localDecl.LocalKeyword;
    // Access the variable names
    foreach (var name in localDecl.Names)
    {
        Console.WriteLine($"Variable: {name}");
    }
    // Access the initializer values
    if (localDecl.EqualsValues != null)
    {
        foreach (var value in localDecl.EqualsValues.Values)
        {
            Console.WriteLine($"Value: {value}");
        }
    }
}
Accessing Parents
You can navigate up the tree using the Parent property:
var token = root.FindToken(5); // Find token at position 5
var parent = token.Parent; // Get the parent node
var grandparent = parent?.Parent;
Walking the Tree
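For more complex traversals, use LuaSyntaxWalker or the descendant enumeration methods combined with LINQ. The snippet below takes the LINQ approach:
using System.Linq;
using Loretta.CodeAnalysis;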
// Get all descendant nodes
var allNodes = root.DescendantNodes();
// Get all tokens
var allTokens = root.DescendantTokens();
// Find specific node types
var functionCalls = root.DescendantNodes()
    .OfType<FunctionCallExpressionSyntax>();
foreach (var call in functionCalls)
{
    Console.WriteLine($"Function call at {call.Span}");
}
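The walker-based alternative might look like the sketch below. CallCounter is a hypothetical class written for this example, and the sketch assumes LuaSyntaxWalker exposes an overridable VisitFunctionCallExpression method, following the Roslyn visitor naming convention. Check the method name against the Loretta API reference before relying on it.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;

var tree = LuaSyntaxTree.ParseText("print(math.max(1, 2))");

var counter = new CallCounter();
counter.Visit(tree.GetRoot());
Console.WriteLine($"Function calls found: {counter.Count}");

// Hypothetical walker that counts function call expressions.
// Assumption: LuaSyntaxWalker has a virtual VisitFunctionCallExpression method.
class CallCounter : LuaSyntaxWalker
{
    public int Count { get; private set; }

    public override void VisitFunctionCallExpression(FunctionCallExpressionSyntax node)
    {
        Count++;
        // Keep descending so calls nested inside arguments are visited too.
        base.VisitFunctionCallExpression(node);
    }
}
```

A walker is mainly useful when a traversal needs state or must treat several node types differently in one pass. For a simple filter, the LINQ form is usually shorter.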
Tokens and Trivia
SyntaxToken
Tokens are the leaves of the syntax tree - they represent individual keywords, identifiers, operators, and literals.
var code = "local x = 10";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();
// Find a token at a specific position
var token = root.FindToken(6); // The 'x' identifier
Console.WriteLine($"Token: {token.Text}");
Console.WriteLine($"Kind: {token.Kind()}");
Console.WriteLine($"Span: {token.Span}");
SyntaxTrivia
Trivia represents whitespace, comments, and other elements that carry no syntactic meaning. Each piece of trivia is attached to a token as either leading or trailing trivia:
var code = @"
-- This is a comment
local x = 10 -- inline comment
";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();
var localKeyword = root.DescendantTokens()
    .First(t => t.Kind() == SyntaxKind.LocalKeyword);
// Leading trivia (before the token)
foreach (var trivia in localKeyword.LeadingTrivia)
{
    if (trivia.Kind() == SyntaxKind.SingleLineCommentTrivia)
    {
        Console.WriteLine($"Comment: {trivia.ToFullString()}");
    }
}
// Trailing trivia (after the token)
var equalsToken = root.DescendantTokens()
    .First(t => t.Kind() == SyntaxKind.EqualsToken);
foreach (var trivia in equalsToken.TrailingTrivia)
{
    Console.WriteLine($"Trivia: {trivia}");
}
Modifying Syntax Trees
Syntax trees are immutable. To make a change, you build a new tree using methods such as ReplaceNode and the WithXxx family, or by constructing nodes with SyntaxFactory:
var code = "local x = 10";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();
// Find a node to replace
var literalExpr = root.DescendantNodes()
    .OfType<LiteralExpressionSyntax>()
    .First(lit => lit.Kind() == SyntaxKind.NumericalLiteralExpression);
// Create a new literal
var newLiteral = SyntaxFactory.LiteralExpression(
    SyntaxKind.NumericalLiteralExpression,
    SyntaxFactory.Literal(20)
);
// Replace the node (creates a new tree)
var newRoot = root.ReplaceNode(literalExpr, newLiteral);
Console.WriteLine(newRoot.ToFullString()); // "local x = 20"
The original root and tree remain unchanged. ReplaceNode returns a new root with the modification applied.
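To confirm that the original is left untouched, a sketch along these lines compares the two roots after a replacement. It reuses the ReplaceNode and SyntaxFactory calls from the example above and locates the literal by kind, so it does not depend on a specific node class name.

```csharp
using System;
using System.Linq;
using Loretta.CodeAnalysis;
using Loretta.CodeAnalysis.Lua;

var tree = LuaSyntaxTree.ParseText("local x = 10");
var root = tree.GetRoot();

// Locate the numeric literal by kind rather than by a concrete node type.
var oldLiteral = root.DescendantNodes()
    .First(n => n.Kind() == SyntaxKind.NumericalLiteralExpression);

// Build the replacement with the same factory calls as the example above.
var newLiteral = SyntaxFactory.LiteralExpression(
    SyntaxKind.NumericalLiteralExpression,
    SyntaxFactory.Literal(20));

var newRoot = root.ReplaceNode(oldLiteral, newLiteral);

// The two roots should be distinct objects, and the old one keeps its text.
Console.WriteLine($"Same instance: {ReferenceEquals(root, newRoot)}");
Console.WriteLine($"Original text: {root.ToFullString()}");
Console.WriteLine($"New text:      {newRoot.ToFullString()}");
```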
Example: Analyzing a Complete Tree
The following example parses a small program and inspects its structure:
using System;
using System.Linq;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;
var code = @"
local function add(a, b)
return a + b
end
local result = add(5, 10)
print(result)
";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();
Console.WriteLine($"File has {root.Statements.Count} statements");
// Analyze the function declaration
var funcDecl = root.Statements[0] as LocalFunctionDeclarationStatementSyntax;
if (funcDecl != null)
{
    Console.WriteLine($"Function name: {funcDecl.Name.Text}");
    Console.WriteLine($"Parameter count: {funcDecl.Parameters.Parameters.Count}");
    // Analyze function body
    var returnStmt = funcDecl.Body.Statements[0] as ReturnStatementSyntax;
    if (returnStmt != null)
    {
        Console.WriteLine($"Returns {returnStmt.Expressions.Count} value(s)");
    }
}
// Collect the distinct identifier names used in the file
var identifiers = root.DescendantTokens()
    .Where(t => t.Kind() == SyntaxKind.IdentifierToken)
    .Select(t => t.Text)
    .Distinct();
Console.WriteLine($"Unique identifiers: {string.Join(", ", identifiers)}");
Common Patterns
Type Testing
Use pattern matching to work with specific node types:
foreach (var statement in root.Statements)
{
    switch (statement)
    {
        case LocalVariableDeclarationStatementSyntax local:
            Console.WriteLine($"Local variable: {local.Names[0]}");
            break;
        case FunctionDeclarationStatementSyntax func:
            Console.WriteLine($"Function: {func.Name}");
            break;
        case ExpressionStatementSyntax:
            // No interpolation needed; the matched node itself is not used here
            Console.WriteLine("Expression statement");
            break;
    }
}
Finding Nodes by Position
using Loretta.CodeAnalysis.Text; // TextSpan
// Find the node at a specific position
var position = 10;
var node = root.FindNode(new TextSpan(position, 0));
Console.WriteLine($"Node at position {position}: {node.Kind()}");
// Find token at position
var token = root.FindToken(position);
Console.WriteLine($"Token at position {position}: {token.Text}");
See Also
- Parsing - Learn how to create syntax trees from text
- Diagnostics - Working with parse errors and warnings
- Scoping - Analyzing variable scope using Script