The Origin parser (parser.py) transforms the flat list[Token] produced by lex() into a structured Abstract Syntax Tree (AST). It is a hand-written, single-pass recursive-descent parser with no external grammar tool dependency. Every syntactic form in Origin, from simple arithmetic expressions to class definitions and parallel {} blocks, is handled by a dedicated method on the Parser class. The resulting AST is a tree of node instances from classes.py, rooted at a ProgramNode, which the Interpreter then walks to emit Python source.
Parser(tokens)
Instantiate the parser by passing the token list returned by lex(). The constructor stores tokens and sets self.pos = 0. The parser advances pos by consuming tokens through eat().
Top-Level Entry Points
parser.program() -> ProgramNode
Parses a complete Origin program. Calls statement() in a loop until the current token is EOF, then returns a ProgramNode whose .statements list contains every top-level ASTNode. This is the method you call to parse a full .or file.
parser.statement() -> ASTNode
Parses a single statement. It first calls skip_newlines() to discard leading blank lines, records the current token’s line number for error reporting, then delegates to the internal _statement() dispatcher. The line number is attached to the returned node via _set_line(node, line).
parser.block() -> BlockNode
Parses a braced block: { statement* }. Calls skip_newlines(), consumes the opening { bracket, then repeatedly calls statement() until the closing } is reached. Returns a BlockNode containing the collected statements.
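The relationship between program(), statement(), and block() described above can be sketched as follows. This is a minimal illustration, not Origin's source: MiniParser, Node, the token type names (IDENT, NEWLINE, LBRACE, RBRACE, EOF), and the demo tokenize() helper are all hypothetical, and the toy _statement() only understands bare identifiers and nested blocks.

```python
from typing import NamedTuple


class Token(NamedTuple):
    type: str   # hypothetical type names: IDENT, NEWLINE, LBRACE, RBRACE, EOF
    value: str
    line: int


class Node:
    """Hypothetical AST node: a kind, a payload, and a source line."""

    def __init__(self, kind, payload):
        self.kind, self.payload, self.line = kind, payload, None


class MiniParser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def current(self):
        return self.tokens[self.pos]

    def eat(self, type_):
        tok = self.current()
        if tok.type != type_:
            raise SyntaxError(f"Expected {type_}, got {tok.type} on line {tok.line}")
        self.pos += 1
        return tok

    def skip_newlines(self):
        while self.current().type == "NEWLINE":
            self.pos += 1

    def program(self):
        statements = []
        self.skip_newlines()
        while self.current().type != "EOF":
            statements.append(self.statement())
            self.skip_newlines()   # keep trailing blank lines away from statement()
        return Node("program", statements)

    def statement(self):
        self.skip_newlines()
        line = self.current().line   # remembered before any token is consumed
        node = self._statement()
        node.line = line             # plays the role of _set_line(node, line)
        return node

    def _statement(self):
        # Toy dispatcher: a statement is either a nested block or one identifier.
        if self.current().type == "LBRACE":
            return self.block()
        return Node("name", self.eat("IDENT").value)

    def block(self):
        self.skip_newlines()
        self.eat("LBRACE")
        statements = []
        self.skip_newlines()
        while self.current().type != "RBRACE":
            statements.append(self.statement())
            self.skip_newlines()
        self.eat("RBRACE")
        return Node("block", statements)


def tokenize(source):
    """Tiny whitespace tokenizer for the demo only; not Origin's lex()."""
    tokens, line = [], 1
    for raw_line in source.split("\n"):
        for word in raw_line.split():
            kind = {"{": "LBRACE", "}": "RBRACE"}.get(word, "IDENT")
            tokens.append(Token(kind, word, line))
        tokens.append(Token("NEWLINE", "\n", line))
        line += 1
    tokens.append(Token("EOF", "", line))
    return tokens


tree = MiniParser(tokenize("a\n\n{\n  b\n  c\n}\n")).program()
print([(n.kind, n.line) for n in tree.payload])   # [('name', 1), ('block', 3)]
```

Because statement() records the line before delegating, the block node carries the line of its opening brace rather than the line where parsing finished. An unterminated block eventually reaches EOF inside the loop, where the toy _statement() raises SyntaxError instead of running off the end of the token list.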
Internal Token Helpers
eat(type_) -> Token
Consumes and returns the current token when its type matches type_, advancing self.pos by one. Raises SyntaxError if the types do not match.
skip_newlines()
Skips zero or more consecutive NEWLINE tokens. Origin is not whitespace-sensitive within blocks — newlines are optional statement separators and are stripped before each statement parse.
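A minimal sketch of the two helpers, assuming tokens expose type, value, and line fields. TokenCursor is a hypothetical holder class (Origin defines these methods on Parser), and the wording of the error message is an assumption rather than Origin's actual text.

```python
from typing import NamedTuple


class Token(NamedTuple):
    type: str
    value: str
    line: int


class TokenCursor:
    """Hypothetical holder for the two helpers; Origin defines them on Parser."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def eat(self, type_):
        tok = self.tokens[self.pos]
        if tok.type != type_:
            # Message wording is an assumption, not Origin's actual text.
            raise SyntaxError(f"Expected {type_} but found {tok.type} on line {tok.line}")
        self.pos += 1
        return tok

    def skip_newlines(self):
        # Stops at the first non-NEWLINE token, so EOF is never skipped.
        while self.tokens[self.pos].type == "NEWLINE":
            self.pos += 1


cursor = TokenCursor([
    Token("NEWLINE", "\n", 1),
    Token("NEWLINE", "\n", 2),
    Token("IDENT", "x", 3),
    Token("EOF", "", 3),
])
cursor.skip_newlines()        # pos moves from 0 to 2
name = cursor.eat("IDENT")    # returns the IDENT token; pos becomes 3
try:
    cursor.eat("ASSIGN")      # current token is EOF, so this raises
except SyntaxError as err:
    message = str(err)
```

In this sketch a failed eat() leaves pos untouched, which is what makes it safe for a caller to catch the error and retry from a saved position.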
Expression Precedence
Expressions are parsed through a layered call chain. Each level calls the next-higher-precedence level, so the chain enforces standard operator precedence without an explicit precedence table. From lowest to highest:

| Level | Method | Operators handled |
|---|---|---|
| 1 (lowest) | special_expr() | ??, ->, =>, <=>, :: |
| 2 | logic() | &&, \|\|, and, or |
| 3 | comparison() | ===, !==, ==, !=, <, >, <=, >=, <> |
| 4 | expr() | +, - |
| 5 | term() | *, /, //, %, ** |
| 6 | unary() | - (negation), not, !, ++, -- |
| 7 (highest) | factor() | literals, identifiers, calls, indexing, built-ins |
Each binary level combines its operands into a BinOpNode or LogicOpNode tree.
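The layering idea can be illustrated with a reduced chain of four levels. This is a sketch under stated assumptions, not Origin's parser: PrecedenceDemo, its string tokens, and the tuple-shaped nodes are hypothetical, only +, -, *, /, unary minus, integers, and parentheses are modeled, and left associativity is assumed.

```python
import re

TOKEN_RE = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Demo tokenizer producing plain strings; not Origin's lex()."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if not match:
            raise SyntaxError(f"Unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    tokens.append("EOF")
    return tokens


class PrecedenceDemo:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):      # lowest level modeled here: + and -
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):      # binds tighter than expr(): * and /
        node = self.unary()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.unary())
        return node

    def unary(self):     # prefix minus binds tighter than any binary operator
        if self.peek() == "-":
            self.advance()
            return ("neg", self.unary())
        return self.factor()

    def factor(self):    # atoms: integers and parenthesized expressions
        tok = self.advance()
        if tok.isdigit():
            return int(tok)
        if tok == "(":
            node = self.expr()
            if self.advance() != ")":
                raise SyntaxError("Expected )")
            return node
        raise SyntaxError(f"Unexpected token {tok}")


def parse(text):
    parser = PrecedenceDemo(tokenize(text))
    node = parser.expr()
    if parser.peek() != "EOF":
        raise SyntaxError(f"Unexpected token {parser.peek()}")
    return node


print(parse("1 + 2 * 3"))     # ('+', 1, ('*', 2, 3))
print(parse("(1 + 2) * 3"))   # ('*', ('+', 1, 2), 3)
print(parse("8 - 3 - 2"))     # ('-', ('-', 8, 3), 2)
```

The multiplication ends up deeper in the tree than the addition only because expr() asks term() for each operand, so no precedence table is needed.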
parser.factor() — Atomic Expressions
factor() handles the highest-precedence syntactic atoms. It recognizes:
- Integer literals (INT) → NumberNode(int(value), "int")
- Hexadecimal literals (HEX) → NumberNode(int(value, 16), "int")
- Float literals (FLOAT) → NumberNode(float(value), "float")
- String literals (STRING) → StringNode(value[1:-1], "str") (strips quotes)
- Boolean keywords true/false → BoolNode(True)/BoolNode(False)
- Built-in functions range(s, e), sqrt(v), rand_num(s, e), len(v), call[list, pos], int(v), str(v), float(v), bool(v) → the corresponding AST nodes
- input (with optional string prompt) → InputNode
- Identifiers, followed by optional chains of:
  - [index] → IndexNode
  - (args...) → CallNode
  - .attr → AttributeNode
- Hardware primitives: identifiers i2c, spi, or uart followed by .method(args) → HardwarePrimitiveNode
- Parenthesized expressions (expr), or a comma-separated list → TupleNode
- List literals [...] → ListNode (via list_literal())
- Dict literals {key: val, ...} → DictNode (via dict_literal())
If no form matches, factor() raises SyntaxError(f"Unexpected token {tok}").
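The postfix chains that may follow an identifier can be sketched as a loop that keeps wrapping the node built so far. This is an illustration only: PostfixDemo, its string tokens, and the tuple nodes are hypothetical stand-ins for IndexNode, CallNode, and AttributeNode, and indices and arguments are limited here to other factors rather than full expressions.

```python
class PostfixDemo:
    def __init__(self, tokens):
        self.tokens = tokens + ["<eof>"]   # sentinel so peek() never overruns
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def eat(self, expected=None):
        tok = self.tokens[self.pos]
        if expected is not None and tok != expected:
            raise SyntaxError(f"Expected {expected}, got {tok}")
        self.pos += 1
        return tok

    def factor(self):
        tok = self.eat()
        if tok.isdigit():
            return ("num", int(tok))
        if not tok.isidentifier():
            raise SyntaxError(f"Unexpected token {tok}")
        node = ("name", tok)
        # Each pass wraps the node so far, so chains nest left to right.
        while True:
            if self.peek() == "[":
                self.eat("[")
                index = self.factor()
                self.eat("]")
                node = ("index", node, index)
            elif self.peek() == "(":
                self.eat("(")
                args = []
                if self.peek() != ")":
                    args.append(self.factor())
                    while self.peek() == ",":
                        self.eat(",")
                        args.append(self.factor())
                self.eat(")")
                node = ("call", node, args)
            elif self.peek() == ".":
                self.eat(".")
                attr = self.eat()
                if not attr.isidentifier():
                    raise SyntaxError(f"Unexpected token {attr}")
                node = ("attr", node, attr)
            else:
                return node


# Token list standing in for the source text: grid[0].rows(n, 2)
tokens = ["grid", "[", "0", "]", ".", "rows", "(", "n", ",", "2", ")"]
tree = PostfixDemo(tokens).factor()
```

Here tree is a call whose callee is an attribute access on an indexed name, mirroring how the innermost operation sits deepest in the tree.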
Statement Dispatch
_statement() is the internal method that actually dispatches on the current token to produce an ASTNode. It handles two broad categories:
Identifier-led statements: when the current token is IDENT, _statement() optimistically parses a special_expr() and then checks whether the next token is ASSIGN (=) or ASSIGN_OP (+=, -=, …). If it is, it produces an AssignNode, IndexAssignNode, AttributeAssignNode, or CompoundAssignNode. If the expression parse or assignment check fails with a SyntaxError, self.pos is reset to its saved value and the fallback path re-parses the input as a bare expression (for example, a call statement).
This limited backtrack, resetting self.pos on SyntaxError, is the only place the parser is not fully deterministic. It exists solely to distinguish target = value assignments from bare expression statements when the left-hand side begins with an identifier.

Keyword-led statements: when the current token is KEYWORD, _statement() matches on tok.value and calls the appropriate handler:
| Keyword | Returns |
|---|---|
| let | AssignNode(name, value, optional_type) |
| const | ConstAssignNode(name, value) |
| set | SetNode(name, num, type_, params) |
| print | PrintNode(expr) |
| if | IfNode(...) via if_stmt() |
| elif | parsed inside if_stmt() → ElifNode |
| else | parsed inside if_stmt() → attached to IfNode.else_body |
| while | WhileNode(condition, body) |
| for | ForNode(var_name, iterable, body) |
| def | FuncNode(name, params, body) |
| class | ClassNode(name, fields, body) |
| try | TryNode(try_body, except_nodes, else_body) |
| except | parsed inside try handler → BlockNode list |
| parallel | ParallelNode(body, threads) |
| import | ImportNode(name) or ImportAsNode(name, alias) |
| from | ImportFromNode(name, library) |
| return | ReturnNode(value) |
| break | BreakNode() |
| continue | ContinueNode() |
| pass | PassNode() |
| exec | ExecNode(string_literal) |
| py | PyNode(raw_python); consumes tokens until the matching } |
Any token that is neither IDENT nor one of the above keywords falls through to special_expr(), which allows bare expression statements (such as a standalone function call) to appear at the statement level.
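The save-and-reset backtrack used for identifier-led statements can be sketched as below. This is a simplified model, not Origin's _statement(): BacktrackDemo, its token type names, and the tuple nodes are hypothetical, and the optimistic parse is narrowed to a name with optional .attr accesses followed by the ASSIGN check.

```python
from typing import NamedTuple


class Token(NamedTuple):
    type: str    # hypothetical names: IDENT, INT, DOT, ASSIGN, LPAREN, RPAREN, EOF
    value: str


class BacktrackDemo:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def current(self):
        return self.tokens[self.pos]

    def eat(self, type_):
        tok = self.current()
        if tok.type != type_:
            raise SyntaxError(f"Expected {type_}, got {tok.type}")
        self.pos += 1
        return tok

    def target(self):
        # A name optionally followed by .attr accesses.
        node = ("name", self.eat("IDENT").value)
        while self.current().type == "DOT":
            self.eat("DOT")
            node = ("attr", node, self.eat("IDENT").value)
        return node

    def expression(self):
        # Toy expression: an integer, or a target with an optional empty call.
        if self.current().type == "INT":
            return ("num", int(self.eat("INT").value))
        node = self.target()
        if self.current().type == "LPAREN":
            self.eat("LPAREN")
            self.eat("RPAREN")
            node = ("call", node)
        return node

    def statement(self):
        if self.current().type == "IDENT":
            saved = self.pos              # remember where the statement began
            try:
                lhs = self.target()
                self.eat("ASSIGN")        # raises if this is not an assignment
            except SyntaxError:
                self.pos = saved          # rewind and fall through
            else:
                return ("assign", lhs, self.expression())
        return ("expr", self.expression())


assign = BacktrackDemo([
    Token("IDENT", "x"), Token("ASSIGN", "="), Token("INT", "1"), Token("EOF", ""),
]).statement()

call_parser = BacktrackDemo([
    Token("IDENT", "foo"), Token("LPAREN", "("), Token("RPAREN", ")"), Token("EOF", ""),
])
call = call_parser.statement()
```

For foo() the optimistic path consumes foo, fails on the ASSIGN check, rewinds to position 0, and the fallback parses the same tokens again as a call expression. The cost is re-reading at most one left-hand side, which is why the backtrack stays cheap.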
Type Annotations
The parser recognizes optional type annotations on let and const declarations, written with a colon.
The annotation is stored in AssignNode.type and later used by Interpreter.get_type() for type-mismatch detection at code-generation time. The parser itself does not enforce types; it only records them.
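The record-but-do-not-check behavior can be sketched as follows. Several details here are assumptions rather than documented facts: the surface syntax is taken to be let name: type = value, the token type names (KEYWORD, IDENT, COLON, ASSIGN, INT) are hypothetical, the type name is assumed to arrive as an IDENT token, the value is limited to an integer literal, and this AssignNode is a simplified stand-in for the class in classes.py.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional


class Token(NamedTuple):
    type: str
    value: str


@dataclass
class AssignNode:
    """Simplified stand-in; only the three fields named in the table above."""
    name: str
    value: object
    type: Optional[str] = None


def parse_let(tokens):
    pos = 0

    def eat(type_):
        nonlocal pos
        tok = tokens[pos]
        if tok.type != type_:
            raise SyntaxError(f"Expected {type_}, got {tok.type}")
        pos += 1
        return tok

    if eat("KEYWORD").value != "let":
        raise SyntaxError("Expected let")
    name = eat("IDENT").value
    declared = None
    if tokens[pos].type == "COLON":      # the annotation is optional
        eat("COLON")
        declared = eat("IDENT").value    # recorded as text, never checked here
    eat("ASSIGN")
    value = int(eat("INT").value)
    return AssignNode(name, value, declared)


def let_tokens(name, value, type_name=None):
    """Demo helper that builds the token list lex() would be expected to produce."""
    tokens = [Token("KEYWORD", "let"), Token("IDENT", name)]
    if type_name is not None:
        tokens += [Token("COLON", ":"), Token("IDENT", type_name)]
    return tokens + [Token("ASSIGN", "="), Token("INT", value), Token("EOF", "")]


annotated = parse_let(let_tokens("count", "5", "int"))
plain = parse_let(let_tokens("count", "5"))
mismatched = parse_let(let_tokens("count", "5", "str"))   # parses without complaint
```

The mismatched declaration parses successfully with type set to "str", which reflects the division of labor described above: detecting the mismatch is left to a later stage.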