What are Syntax Trees?

In Loretta, a syntax tree is an immutable representation of Lua source code. Parsing turns Lua code into a tree of nodes, which represent syntactic constructs such as statements and expressions, and tokens, which represent individual keywords, identifiers, operators, and literals. Syntax trees in Loretta are:
  • Immutable: Once created, they cannot be modified. Any changes create new tree instances.
  • Lossless: They preserve all information from the source, including whitespace and comments (as trivia).
  • Hierarchical: Each node knows its parent and children, forming a complete tree structure.
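
The lossless property can be checked directly. The following is a minimal sketch, assuming the Loretta package is referenced by the project; the sample Lua text is purely illustrative. It relies on ToFullString, which is also used later on this page, to emit the text of the tree including trivia.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

// Illustrative source with comments and irregular spacing.
var code = "-- setup\nlocal   x =  10  -- ten\n";

var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetRoot();

// ToFullString includes leading and trailing trivia, so the text
// is expected to round-trip character for character.
var roundTripped = root.ToFullString();
Console.WriteLine(roundTripped == code);
```

If the comparison ever prints False, some source text was dropped, which would contradict the lossless guarantee described above.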

Core Types

LuaSyntaxTree

The LuaSyntaxTree class is the parsed form of a Lua source document and the entry point for working with syntax trees.
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;

// Parse Lua code into a syntax tree
var code = "local x = 10";
var tree = LuaSyntaxTree.ParseText(code);
Key members of LuaSyntaxTree:
  • GetRoot() - Returns the root LuaSyntaxNode (typically a CompilationUnitSyntax)
  • Options - The LuaParseOptions used to parse the tree
  • GetDiagnostics() - Returns all parsing errors and warnings
  • FilePath - The optional path associated with the tree
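
As a rough illustration of GetDiagnostics, the sketch below parses deliberately incomplete Lua and prints whatever the parser reports. It assumes the diagnostics expose Severity, Id, and GetMessage() in the Roslyn style that Loretta follows; the exact ids and messages are not shown here because they depend on the library version.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

// Deliberately incomplete Lua: the initializer expression is missing.
var tree = LuaSyntaxTree.ParseText("local x =");

// Print each reported problem with its severity, id, and message.
foreach (var diagnostic in tree.GetDiagnostics())
{
    Console.WriteLine($"{diagnostic.Severity} {diagnostic.Id}: {diagnostic.GetMessage()}");
}
```

Parsing still produces a tree when the input is invalid, so checking the diagnostics before analysing the root is usually a sensible first step.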

LuaSyntaxNode

The LuaSyntaxNode class is the base class for all syntax nodes in the tree. It represents non-terminal nodes (nodes that have children). Key members:
  • Parent - The parent node in the tree
  • SyntaxTree - The tree this node belongs to
  • Span - The text span this node covers
  • Kind() - Returns the SyntaxKind enum value
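
The members listed above can be seen together in a short sketch. It assumes only the members named on this page (Parent, SyntaxTree, Span, Kind()) plus DescendantNodes; the Lua snippet is illustrative.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

var tree = LuaSyntaxTree.ParseText("local x = 10");
var root = tree.GetRoot();

foreach (var node in root.DescendantNodes())
{
    // Kind identifies the construct, Span its position in the source text,
    // and Parent links back up the tree.
    Console.WriteLine($"{node.Kind()} {node.Span} parent={node.Parent?.Kind()}");

    // Every node reports the tree it was parsed into.
    Console.WriteLine(ReferenceEquals(node.SyntaxTree, tree));
}
```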

CompilationUnitSyntax

CompilationUnitSyntax is the root node of a complete Lua file. It represents the entire source file and contains all statements and an end-of-file token.
var tree = LuaSyntaxTree.ParseText("local x = 10\nprint(x)");
var root = tree.GetCompilationUnitRoot();

// Access statements in the file
foreach (var statement in root.Statements)
{
    Console.WriteLine($"Statement: {statement.Kind()}");
}

// Access the EOF token
var eofToken = root.EndOfFileToken;

Tree Hierarchy

Every syntax tree follows this hierarchy:
LuaSyntaxTree
└── CompilationUnitSyntax (root node)
    ├── Statements (SyntaxList<StatementSyntax>)
    │   ├── LocalVariableDeclarationStatementSyntax
    │   ├── ExpressionStatementSyntax
    │   └── ...
    └── EndOfFileToken (SyntaxToken)

Accessing Children

You can navigate down the tree by accessing node properties:
var code = @"
local x = 10
local y = x + 5
";

var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();

// Get the first statement
var firstStatement = root.Statements[0];

if (firstStatement is LocalVariableDeclarationStatementSyntax localDecl)
{
    // Access the 'local' keyword
    var localKeyword = localDecl.LocalKeyword;
    
    // Access the variable names
    foreach (var name in localDecl.Names)
    {
        Console.WriteLine($"Variable: {name}");
    }
    
    // Access the initializer values
    if (localDecl.EqualsValues != null)
    {
        foreach (var value in localDecl.EqualsValues.Values)
        {
            Console.WriteLine($"Value: {value}");
        }
    }
}

Accessing Parents

You can navigate up the tree using the Parent property:
var token = root.FindToken(5); // Find token at position 5
var parent = token.Parent;     // Get the parent node
var grandparent = parent?.Parent;
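
Following Parent repeatedly reaches the root. The sketch below, which uses only FindToken, Parent, and Kind() as introduced on this page, prints the chain of enclosing nodes for one token; the Lua snippet and the offset are illustrative.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;

var tree = LuaSyntaxTree.ParseText("local y = x + 5");
var root = tree.GetRoot();

// Offset 10 falls on the identifier 'x' in this source.
var token = root.FindToken(10);
Console.WriteLine($"Token: {token.Text}");

// Walk from the token's parent node up to the root, printing each kind.
for (var node = token.Parent; node != null; node = node.Parent)
{
    Console.WriteLine(node.Kind());
}
```

The loop ends when Parent is null, which is the case only for the root node.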

Walking the Tree

For more complex traversals, use LuaSyntaxWalker or LINQ methods:
using System.Linq; // for OfType<T>()
using Loretta.CodeAnalysis;

// Get all descendant nodes
var allNodes = root.DescendantNodes();

// Get all tokens
var allTokens = root.DescendantTokens();

// Find specific node types
var functionCalls = root.DescendantNodes()
    .OfType<FunctionCallExpressionSyntax>();

foreach (var call in functionCalls)
{
    Console.WriteLine($"Function call at {call.Span}");
}
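
The LuaSyntaxWalker alternative mentioned above might look like the following sketch. CallCounter is a hypothetical class written for this example, and the sketch assumes the walker exposes a VisitFunctionCallExpression override in the usual Roslyn visitor style.

```csharp
using System;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;

var tree = LuaSyntaxTree.ParseText("print(math.max(1, 2))\nprint('done')");

var walker = new CallCounter();
walker.Visit(tree.GetRoot());
Console.WriteLine($"Calls found: {walker.Count}");

// Hypothetical walker that counts function call expressions.
class CallCounter : LuaSyntaxWalker
{
    public int Count { get; private set; }

    public override void VisitFunctionCallExpression(FunctionCallExpressionSyntax node)
    {
        Count++;
        // Keep descending so calls nested in the arguments are also visited.
        base.VisitFunctionCallExpression(node);
    }
}
```

A walker tends to pay off when several node types need different handling in one pass; for a single query, the LINQ form is usually shorter.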

Tokens and Trivia

SyntaxToken

Tokens are the leaves of the syntax tree - they represent individual keywords, identifiers, operators, and literals.
var code = "local x = 10";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();

// Find a token at a specific position
var token = root.FindToken(6); // The 'x' identifier

Console.WriteLine($"Token: {token.Text}");
Console.WriteLine($"Kind: {token.Kind()}");
Console.WriteLine($"Span: {token.Span}");

SyntaxTrivia

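Trivia represents whitespace, comments, and other text that carries no syntactic meaning. Each piece of trivia is attached to a token, either as leading trivia (before the token) or as trailing trivia (after it):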
var code = @"
-- This is a comment
local x = 10  -- inline comment
";

var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();
var localKeyword = root.DescendantTokens()
    .First(t => t.Kind() == SyntaxKind.LocalKeyword);

// Leading trivia (before the token)
foreach (var trivia in localKeyword.LeadingTrivia)
{
    if (trivia.Kind() == SyntaxKind.SingleLineCommentTrivia)
    {
        Console.WriteLine($"Comment: {trivia.ToFullString()}");
    }
}

// Trailing trivia (after the token)
var equalsToken = root.DescendantTokens()
    .First(t => t.Kind() == SyntaxKind.EqualsToken);

foreach (var trivia in equalsToken.TrailingTrivia)
{
    Console.WriteLine($"Trivia: {trivia}");
}
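
When it does not matter which token a comment is attached to, the trivia of the whole tree can be scanned at once. This is a sketch that assumes the Roslyn-style DescendantTrivia method is available on nodes; it is not otherwise shown on this page.

```csharp
using System;
using System.Linq;
using Loretta.CodeAnalysis.Lua;

var code = "-- header comment\nlocal x = 10  -- inline comment\n";
var root = LuaSyntaxTree.ParseText(code).GetRoot();

// DescendantTrivia (assumed API) enumerates the leading and trailing
// trivia of every token under the node.
var comments = root.DescendantTrivia()
    .Where(t => t.Kind() == SyntaxKind.SingleLineCommentTrivia)
    .ToList();

foreach (var comment in comments)
{
    Console.WriteLine($"{comment.Span}: {comment}");
}
```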

Immutability and Tree Transformations

Syntax trees are immutable. To make changes, you create new trees with methods such as ReplaceNode or the WithXxx family, and build any new nodes with SyntaxFactory:
var code = "local x = 10";
var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();

// Find a node to replace
var literalExpr = root.DescendantNodes()
    .OfType<LiteralExpressionSyntax>()
    .First(n => n.Kind() == SyntaxKind.NumericalLiteralExpression);

// Create a new literal
var newLiteral = SyntaxFactory.LiteralExpression(
    SyntaxKind.NumericalLiteralExpression,
    SyntaxFactory.Literal(20)
);

// Replace the node (returns a new root; the original is left unchanged)
var newRoot = root.ReplaceNode(literalExpr, newLiteral);

Console.WriteLine(newRoot.ToFullString()); // "local x = 20"
The original root and tree remain unchanged. ReplaceNode returns a new root with the modification applied.
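
The claim that the original is untouched can be checked with a small sketch. It reuses the calls from the example above but locates the literal by kind, and it assumes ReplaceNode is reachable through the Loretta.CodeAnalysis namespace.

```csharp
using System;
using System.Linq;
using Loretta.CodeAnalysis;
using Loretta.CodeAnalysis.Lua;

var root = LuaSyntaxTree.ParseText("local x = 10").GetRoot();

// Locate the numeric literal by kind rather than by node type.
var oldLiteral = root.DescendantNodes()
    .First(n => n.Kind() == SyntaxKind.NumericalLiteralExpression);

var newLiteral = SyntaxFactory.LiteralExpression(
    SyntaxKind.NumericalLiteralExpression,
    SyntaxFactory.Literal(20));

var newRoot = root.ReplaceNode(oldLiteral, newLiteral);

// The two roots are distinct objects, and the old one keeps its text.
Console.WriteLine(ReferenceEquals(root, newRoot));
Console.WriteLine(root.ToFullString());
Console.WriteLine(newRoot.ToFullString());
```

Because the old root stays valid, code that already holds a reference to it can keep using it safely while the new root is passed on.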

Example: Analyzing a Complete Tree

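The following example parses a short script, inspects its first function declaration, and lists the unique identifiers it contains:
using System;
using System.Linq;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;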

var code = @"
local function add(a, b)
    return a + b
end

local result = add(5, 10)
print(result)
";

var tree = LuaSyntaxTree.ParseText(code);
var root = tree.GetCompilationUnitRoot();

Console.WriteLine($"File has {root.Statements.Count} statements");

// Analyze the function declaration
var funcDecl = root.Statements[0] as LocalFunctionDeclarationStatementSyntax;
if (funcDecl != null)
{
    Console.WriteLine($"Function name: {funcDecl.Name.Text}");
    Console.WriteLine($"Parameter count: {funcDecl.Parameters.Parameters.Count}");
    
    // Analyze function body
    var returnStmt = funcDecl.Body.Statements[0] as ReturnStatementSyntax;
    if (returnStmt != null)
    {
        Console.WriteLine($"Returns {returnStmt.Expressions.Count} value(s)");
    }
}

// Collect the distinct identifier names in the file
var identifiers = root.DescendantTokens()
    .Where(t => t.Kind() == SyntaxKind.IdentifierToken)
    .Select(t => t.Text)
    .Distinct();

Console.WriteLine($"Unique identifiers: {string.Join(", ", identifiers)}");

Common Patterns

Type Testing

Use pattern matching to work with specific node types:
foreach (var statement in root.Statements)
{
    switch (statement)
    {
        case LocalVariableDeclarationStatementSyntax local:
            Console.WriteLine($"Local variable: {local.Names[0]}");
            break;
        case FunctionDeclarationStatementSyntax func:
            Console.WriteLine($"Function: {func.Name}");
            break;
        case ExpressionStatementSyntax expr:
            Console.WriteLine("Expression statement");
            break;
    }
}

Finding Nodes by Position

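Use FindNode and FindToken to look up the syntax element at a given character offset. TextSpan lives in the Loretta.CodeAnalysis.Text namespace:
using Loretta.CodeAnalysis.Text;

// Find the node at a specific position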
var position = 10;
var node = root.FindNode(new TextSpan(position, 0));
Console.WriteLine($"Node at position {position}: {node.Kind()}");

// Find token at position
var token = root.FindToken(position);
Console.WriteLine($"Token at position {position}: {token.Text}");

See Also

  • Parsing - Learn how to create syntax trees from text
  • Diagnostics - Working with parse errors and warnings
  • Scoping - Analyzing variable scope using Script