Creating a console project
Since our program will be invoked from the command line, we’ll need to create a console project.
First let’s create a new console project by running dotnet new console in an empty directory.
And now let’s add a reference to the Loretta NuGet package with dotnet add package Loretta.CodeAnalysis.Lua.
And then save the initial code that was presented in the introduction (the one without all the locals at the top) in a file named sample.lua.
Implementing file loading
Now it is time for us to make our program load Lua files and parse them. We’ll start by loading the files into a SourceText by using SourceText.From.
Also note that we’re using the new templates.
First we’ll add the using for Loretta.CodeAnalysis.Text so that we can access SourceText, and then we’ll add validation for the provided file path:
Then we actually load the file into a SourceText:
SourceText is responsible for storing the code in memory in a format that won’t make it end up in the LOH (Large Object Heap). It also lets us obtain specific characters or substrings of the code, and split the code into multiple TextLines.
We can also use the SourceText to map a TextSpan to a LinePositionSpan, which can then be used for error reporting.
SourceText is also important for obtaining the checksum of a file, for calculating the changes between two versions of a file with SourceText.GetTextChanges, and for applying a set of changes to a file with SourceText.WithChanges.
Our code up to this point
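As a rough sketch of the loading step described above (not necessarily the exact code of this tutorial), the program below validates the path, reads the file into a SourceText and maps the span of the first line to a LinePositionSpan. The messages and exit codes are assumptions made for illustration, and it assumes the SourceText members mirror their Roslyn counterparts.

```csharp
using System;
using System.IO;
using Loretta.CodeAnalysis.Text;

// Validate the command-line argument before touching the file system.
if (args.Length < 1)
{
    Console.Error.WriteLine("No file path provided.");
    return 1;
}

var path = args[0];
if (!File.Exists(path))
{
    Console.Error.WriteLine($"File not found: {path}");
    return 1;
}

// Load the file into a SourceText straight from the stream.
SourceText sourceText;
using (var stream = File.OpenRead(path))
{
    sourceText = SourceText.From(stream);
}

Console.WriteLine($"Loaded {sourceText.Length} characters in {sourceText.Lines.Count} line(s).");

// Map a TextSpan (here: the first line) to line/column coordinates,
// which is the kind of information used for error reporting.
if (sourceText.Length > 0)
{
    TextSpan firstLineSpan = sourceText.Lines[0].Span;
    LinePositionSpan position = sourceText.Lines.GetLinePositionSpan(firstLineSpan);
    Console.WriteLine($"First line spans {position}.");
}

return 0;
```

Reading from the stream rather than from a string leaves the choice of in-memory representation to SourceText.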
Parsing code into a tree
Now it is time for us to parse the file we just loaded into a tree so we can manipulate it.
First we’ll need to add another using for Loretta.CodeAnalysis.Lua so we can access LuaSyntaxTree and call its ParseText method. Then we’ll need to pick a preset for the files we’ll be loading.
The “preset” we’ll be choosing is a set of options in the LuaSyntaxOptions class which defines which errors the parser will generate as well as which constructs the parser will or won’t accept (such as integers, C comments and C boolean operators for Garry’s Mod Lua, Typed Lua for Luau/Roblox Lua and others).
The presets that are currently available are the following:
- LuaSyntaxOptions.Lua51: The preset for Lua 5.1
- LuaSyntaxOptions.Lua52: The preset for Lua 5.2
- LuaSyntaxOptions.Lua53: The preset for Lua 5.3
- LuaSyntaxOptions.Lua54: The preset for Lua 5.4
- LuaSyntaxOptions.LuaJIT20: The preset for LuaJIT 2.0
- LuaSyntaxOptions.LuaJIT21: The preset for LuaJIT 2.1
- LuaSyntaxOptions.FiveM: The preset for FiveM’s flavor of Lua 5.3
- LuaSyntaxOptions.GMod: The preset for Garry’s Mod’s flavor of LuaJIT 2.0
- LuaSyntaxOptions.All: The preset for accepting the most Lua without integers
- LuaSyntaxOptions.AllWithIntegers: The preset for accepting the most Lua with integers. The side effect of accepting integers is that this preset will not accept C comment syntax.
Here we do 2 things: we define a LuaParseOptions using the LuaSyntaxOptions.Lua51 preset and then call LuaSyntaxTree.ParseText with the text we loaded earlier as well as the parse options and the file name (args[0]).
Now, we need to check that the parsed code contains no errors. We’ll do that by using SyntaxTree.GetDiagnostics and checking that the list of diagnostics has no errors:
In Loretta (as in Roslyn), errors, warnings and infos are called diagnostics. A diagnostic contains important information about an error such as:
- Diagnostic.Id: The diagnostic’s ID. As an example, LUA0001 is the diagnostic ID for an invalid string escape.
- Diagnostic.Location: The location the diagnostic was reported at. This is important for being able to point the user to where an error or warning is in their text editor, or to output it to the command line.
- Diagnostic.Severity: The diagnostic’s severity (whether it’s an error, warning, info or suggestion). The value is a member of the DiagnosticSeverity enum.
- Diagnostic.Descriptor: The instance of the diagnostic’s definition, which we call a DiagnosticDescriptor. You can think of the DiagnosticDescriptor as a class’ definition and the Diagnostic as the class’ instance.
Our code up to this point
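The parse-and-check step can be sketched roughly as follows. To stay self-contained, this version parses a hard-coded and deliberately broken snippet instead of the file from args[0]; the snippet, the file name and the output format are assumptions made for illustration only.

```csharp
using System;
using Loretta.CodeAnalysis;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Text;

// Hypothetical input: the closing parenthesis is missing on purpose.
var sourceText = SourceText.From("print('hello'");

// Pick the Lua 5.1 preset and parse the text into a syntax tree.
var parseOptions = new LuaParseOptions(LuaSyntaxOptions.Lua51);
var syntaxTree = LuaSyntaxTree.ParseText(sourceText, parseOptions, "sample.lua");

// Walk the diagnostics and remember whether any of them is an error.
var hasErrors = false;
foreach (var diagnostic in syntaxTree.GetDiagnostics())
{
    Console.WriteLine(
        $"{diagnostic.Id} [{diagnostic.Severity}] at {diagnostic.Location.GetLineSpan()}: {diagnostic.GetMessage()}");
    hasErrors |= diagnostic.Severity == DiagnosticSeverity.Error;
}

Console.WriteLine(hasErrors ? "The file has errors." : "The file parsed cleanly.");
```

Only diagnostics with an error severity should stop the program; warnings and infos can be printed and ignored.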
Collecting function calls
Now that we have the parsed tree from the file and have confirmed it does not have any errors, it is time for us to start extracting the function calls from the tree so that we can create local variables for them.
For that we’ll use one of the fundamental building blocks of working with trees in Loretta: LuaSyntaxWalker. The walker allows us to go through every node of the tree recursively and only act upon the nodes we’re interested in, which in our case is the FunctionCallExpressionSyntax.
Since we’ll be implementing a new class for the walker, we’ll create a new file called FunctionCallCollector.cs which will start out with 3 usings for namespaces which we’ll need as well as our namespace:
Then we need to actually create our class and make it inherit from LuaSyntaxWalker, as well as add a proper constructor for it that passes SyntaxWalkerDepth.Node to the LuaSyntaxWalker constructor, as we are not interested in anything below nodes for this walker:
The constructor is private because we’ll be exposing the functionality of this class as a public static method, and it being a LuaSyntaxWalker will be an internal implementation detail of the class.
But you might’ve noticed we’re missing something. That’s right! We’re missing a list so we can store the function calls we’ll be collecting!
For that we’ll be using an ImmutableArray<FunctionCallExpressionSyntax>.Builder so that later we can return an ImmutableArray<FunctionCallExpressionSyntax>:
Now, we have to actually add the function calls to the list. We’ll do that by overriding the VisitFunctionCallExpression method so that we can do something whenever it finds a function call. It’s also important to keep in mind that we’ll have to call the base method, otherwise other function calls that might be contained inside the one we’re visiting will not be visited.
And lastly, let’s add our public static method at the top of our class so that we can actually use this walker:
Now that we have our function call collector done, it’s time for us to actually use it back in Program.cs.
Our code so far
- Program.cs
- FunctionCallCollector.cs
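A minimal sketch of the collector described in this section, assuming the walker API mirrors Roslyn’s. The class is shown in the same file as a small driver so the example is self-contained; the Collect method name and the sample snippet are illustrative assumptions rather than the tutorial’s exact code.

```csharp
using System;
using System.Collections.Immutable;
using Loretta.CodeAnalysis;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;
using Loretta.CodeAnalysis.Text;

// Driver: parse a tiny snippet and list the expression being called in each call.
var text = SourceText.From("print(math.ceil(1.5))");
var tree = LuaSyntaxTree.ParseText(text, new LuaParseOptions(LuaSyntaxOptions.Lua51), "sample.lua");
var calls = FunctionCallCollector.Collect(tree.GetRoot());

foreach (var call in calls)
    Console.WriteLine(call.Expression.ToString());

internal sealed class FunctionCallCollector : LuaSyntaxWalker
{
    // Public entry point; the walker itself stays an implementation detail.
    public static ImmutableArray<FunctionCallExpressionSyntax> Collect(SyntaxNode node)
    {
        var collector = new FunctionCallCollector();
        collector.Visit(node);
        return collector._functionCalls.ToImmutable();
    }

    private readonly ImmutableArray<FunctionCallExpressionSyntax>.Builder _functionCalls =
        ImmutableArray.CreateBuilder<FunctionCallExpressionSyntax>();

    // We only care about nodes, so there is no need to descend into tokens or trivia.
    private FunctionCallCollector() : base(SyntaxWalkerDepth.Node)
    {
    }

    public override void VisitFunctionCallExpression(FunctionCallExpressionSyntax node)
    {
        // Record the outer call first, then let the base visit nested calls.
        _functionCalls.Add(node);
        base.VisitFunctionCallExpression(node);
    }
}
```

Because the call is added before the base method runs, an outer call always appears in the result before the calls nested in its arguments.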
Function call processing
Now that we have all the function calls in the script, we need to deduplicate them and group them up so we can map each function call to a local variable.
First we need to filter the list of function calls to the ones that we can process. For simplicity’s sake, we’ll only be accepting function calls on identifiers and members of identifiers (e.g.: print or math.ceil).
For that, we’ll be implementing a method to check a function call’s Expression to see if it is an IdentifierNameSyntax or a MemberAccessExpressionSyntax whose base Expression is an IdentifierNameSyntax.
We can check if a node is of a certain type by using the IsKind method:
Then afterwards, we’ll make a function that will convert a node into its local name:
And then finally, we’ll use a bit of LINQ to glue everything together:
Lastly, we can print out the results of the array so we can see something in the console for the first time!
Which results in the following output:
Our code so far
- Program.cs
- FunctionCallCollector.cs
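The filtering, naming and grouping steps of this section could look roughly like the sketch below. Two things are assumptions made to keep it short and self-contained: the calls are gathered with DescendantNodes instead of the collector class, and the local name is derived by replacing dots with underscores in the node’s text, which may differ from the naming scheme the tutorial uses. CanConvert and GetLocalName are hypothetical helper names.

```csharp
using System;
using System.Linq;
using Loretta.CodeAnalysis;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;
using Loretta.CodeAnalysis.Text;

var text = SourceText.From("print(math.ceil(1.5))\nprint(math.floor(2.5))\na.b.c(1)\n");
var tree = LuaSyntaxTree.ParseText(text, new LuaParseOptions(LuaSyntaxOptions.Lua51), "sample.lua");

// Shortcut for this sketch: collect the calls without the dedicated walker.
var calls = tree.GetRoot().DescendantNodes().OfType<FunctionCallExpressionSyntax>();

// Keep only the calls we can handle and group them by their future local name.
var groups = calls
    .Where(CanConvert)
    .GroupBy(call => GetLocalName(call.Expression))
    .ToArray();

foreach (var group in groups)
    Console.WriteLine($"{group.Key}: {group.Count()} call(s)");

// Accepts `name(...)` and `name.member(...)`, rejects everything else (e.g. `a.b.c(...)`).
static bool CanConvert(FunctionCallExpressionSyntax call)
{
    var target = call.Expression;
    if (target.IsKind(SyntaxKind.IdentifierName))
        return true;
    return target is MemberAccessExpressionSyntax member
        && member.Expression.IsKind(SyntaxKind.IdentifierName);
}

// Simplified naming: `math.ceil` becomes `math_ceil`. This relies on the node's text
// and assumes there is no whitespace or comment around the dot.
static string GetLocalName(SyntaxNode node) => node.ToString().Replace('.', '_');
```

Grouping by the local name is what deduplicates the calls: every call to the same function ends up in a single group that maps to a single local variable.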
Rewriting the input file
For this last step, we’ll rewrite the input file to add the local declaration at the top of the file as well as rewriting all function calls to their local counterparts.
For this we’ll be using another fundamental building block of working with Loretta trees: LuaSyntaxRewriter. The rewriter allows you to replace certain nodes in the tree without having to modify all parents yourself, and it makes your life easier since you only handle the nodes you’re interested in.
Another important component of this will be the SyntaxFactory, which is the static class that’s used to create everything related to nodes, including the nodes themselves but also the SyntaxList and the SeparatedSyntaxList.
With the LuaSyntaxRewriter and SyntaxFactory introduction out of the way, let’s get started on writing the code with the following usings as well as the namespace in a file named Rewriter.cs:
Then we’ll create our rewriter, which will inherit from LuaSyntaxRewriter and have 2 private fields:
- A field to store the groups we generated when mapping the function calls to their local name counterparts;
- And another field to store the mapping of the strings to the IdentifierNameSyntax.
Then for the first step of our rewriter, we’ll override VisitCompilationUnit so that we can add the local variable declaration at the top of it. The CompilationUnitSyntax represents a parsed file and contains only the list of statements at the root of the file as well as the EOF token.
But first we’ll make it visit every statement in the compilation unit and update the compilation unit with the results of it:
Now we need to create the LocalVariableDeclarationStatementSyntax node:
A lot is being done in the code above so take a while to read it through carefully.
Introduction to Trivia
In the code above, we used WithTrailingTrivia as well as GetTrailingTrivia to manipulate the trivia of the local variable declaration node we created. But what is trivia?
In Loretta (as in Roslyn), we call extraneous syntax that doesn’t necessarily impact parsing, such as whitespace, line breaks, comments and shebangs (the #!/bin/bash you see at the start of some Linux scripts), trivia. Trivia is stored as part of the token preceding or following it:
- Leading trivia is all trivia that comes after the first line break following the previous token, up to the token itself.
- Trailing trivia is all trivia after a token up to (and including) the first line break.
Now we can go back to Program.cs to replace the loop where we print the groups with the following:
Which prints out the rewritten node to the console, resulting in the following output:
Lastly, we now need to rewrite the function calls to use the locals instead of the globals by overriding VisitFunctionCallExpression:
And now when we run the program again, we get the following output in the console:
Final code
- Program.cs
- FunctionCallCollector.cs
- Rewriter.cs
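To illustrate the call-rewriting step and the role of trivia described above, here is a reduced sketch of a rewriter. It is not the tutorial’s Rewriter.cs: it skips the local declaration at the top of the file, derives the local name by replacing dots with underscores (an assumption), and the LocalCallRewriter name is hypothetical. It assumes SyntaxFactory.IdentifierName accepts a string and that WithExpression, WithLeadingTrivia and GetLeadingTrivia exist alongside the trailing-trivia methods mentioned above.

```csharp
using System;
using Loretta.CodeAnalysis;
using Loretta.CodeAnalysis.Lua;
using Loretta.CodeAnalysis.Lua.Syntax;
using Loretta.CodeAnalysis.Text;

var text = SourceText.From("print(math.ceil(1.5)) -- trailing comment\n");
var tree = LuaSyntaxTree.ParseText(text, new LuaParseOptions(LuaSyntaxOptions.Lua51), "sample.lua");

// Visit returns the rewritten root; ToFullString includes all trivia.
var rewritten = new LocalCallRewriter().Visit(tree.GetRoot())!;
Console.Write(rewritten.ToFullString());

internal sealed class LocalCallRewriter : LuaSyntaxRewriter
{
    public override SyntaxNode? VisitFunctionCallExpression(FunctionCallExpressionSyntax node)
    {
        // Let the base implementation rewrite nested calls (e.g. in the arguments) first.
        var visited = (FunctionCallExpressionSyntax) base.VisitFunctionCallExpression(node)!;

        var target = visited.Expression;
        var isSimple = target is IdentifierNameSyntax
            || (target is MemberAccessExpressionSyntax member
                && member.Expression is IdentifierNameSyntax);
        if (!isSimple)
            return visited;

        // Simplified naming scheme for this sketch: `math.ceil` becomes `math_ceil`.
        var localName = target.ToString().Replace('.', '_');

        // Nodes created by the factory carry no trivia, so copy the trivia from the
        // expression being replaced to keep spacing and comments intact.
        var replacement = SyntaxFactory.IdentifierName(localName)
            .WithLeadingTrivia(target.GetLeadingTrivia())
            .WithTrailingTrivia(target.GetTrailingTrivia());

        return visited.WithExpression(replacement);
    }
}
```

The trailing comment survives the rewrite because it is trailing trivia of the closing parenthesis, a token the rewriter never replaces.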