02_grammar

Grammar Specification

The ARES Grammar

Purpose

The ARES Grammar is the set of structural rules that define the ARES language. It is a system that specifies exactly how words (tokens) must be combined to form valid commands, calculations, and programs. It ensures that the compiler always understands exactly what you mean when you write a line of code.

Why it exists

Without a strict grammar, a programming language would be ambiguous. The computer wouldn't know whether a + b * c means you should add first or multiply first. The grammar exists to provide this order. It defines the "precedence" of operations: the rules that tell the compiler which part of an expression to group first. It also ensures that every statement has a clear beginning and end, so your intent is never misunderstood. It provides the "blueprint" that allows the compiler to turn your text into a well-formed syntax tree.

How it works

The grammar is implemented as a parser that uses a method called Recursive Descent.

  1. Top-down organization. The system starts with the biggest part of your program (the whole script) and recursively breaks it down into smaller and smaller pieces, like statements and expressions. Technically, this follows an LL(k) parsing strategy: the first "L" stands for scanning the input from Left to right, the second "L" stands for producing a Leftmost derivation, and k is the number of tokens of lookahead. The parser begins at the root "Program" node and descends into sub-rules such as "StatementList" and "Expression", matching the input tokens against the expected grammar productions.
  2. The Precedence Ladder. Math and logic are organized into levels. Operators on the higher levels (like multiplication) bind more tightly than those on the lower levels (like addition), so they are grouped first. This ensures your calculations are always grouped in the conventional order. The grammar implements this hierarchy by layering rules, as in a Backus-Naur Form (BNF) grammar. For example, an "AddExpression" rule calls a "MultExpression" rule, which in turn calls a "PrimaryExpression" rule. Because each rule must finish parsing its tighter-binding operands before it can combine them, this nesting naturally enforces the order of operations without requiring complex post-parsing transformations.
  3. Statement priority. Standard commands like read, use, and let are given priority. When the system sees a new line, it checks if it matches one of these important commands first to ensure your intent is captured accurately. The parser uses a prioritized choice operator to evaluate different statement types. It attempts to match more specific keyword-based structures, such as a "RegistryCommand", before falling back to more general patterns like a "VariableAssignment". This ensures that high-level ARES features are always identified correctly.
  4. Lookahead. To tell the difference between two similar commands, the system "looks ahead" at the next few words in your script. This allows it to distinguish between a variable creation and a simple math statement. ARES employs a k = 1 token lookahead to resolve grammar branch points. By inspecting the next symbol in the token stream without consuming it, the parser can jump to the correct logic path. This lookahead capability is essential for handling flexible syntax where multiple valid rules might start with the same token.
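
The precedence ladder and one-token lookahead described above can be sketched with a small hand-written recursive descent parser. This is an illustration only: it does not use Chevrotain and is not taken from src/parser/parser.ts. The rule names follow the text (AddExpression, MultExpression, PrimaryExpression), while ExprToken, Expr, tokenizeExpr, parseExpression, and show are hypothetical names chosen for this example.

```typescript
// Hypothetical token and node shapes; the real ARES definitions may differ.
type ExprToken = { kind: "num" | "ident" | "op" | "eof"; text: string };
type Expr =
  | { type: "Num"; value: number }
  | { type: "Ident"; name: string }
  | { type: "Binary"; op: string; left: Expr; right: Expr };

// Minimal tokenizer for this sketch: numbers, identifiers, + - * / ( ).
// Characters outside that set are silently skipped.
function tokenizeExpr(src: string): ExprToken[] {
  const tokens: ExprToken[] = [];
  const re = /(\d+)|([A-Za-z_]\w*)|([-+*\/()])/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(src)) !== null) {
    if (m[1] !== undefined) tokens.push({ kind: "num", text: m[1] });
    else if (m[2] !== undefined) tokens.push({ kind: "ident", text: m[2] });
    else tokens.push({ kind: "op", text: m[3] });
  }
  tokens.push({ kind: "eof", text: "" });
  return tokens;
}

function parseExpression(src: string): Expr {
  const tokens = tokenizeExpr(src);
  let pos = 0;
  // One-token lookahead: inspect the next token without consuming it.
  const peek = (): ExprToken => tokens[pos];
  const next = (): ExprToken => tokens[pos++];
  const isOp = (...ops: string[]): boolean =>
    peek().kind === "op" && ops.indexOf(peek().text) !== -1;

  // Lowest rung: addition and subtraction. Operands come from the rung above.
  function addExpression(): Expr {
    let left = multExpression();
    while (isOp("+", "-")) {
      const op = next().text;
      left = { type: "Binary", op, left, right: multExpression() };
    }
    return left;
  }

  // Middle rung: multiplication and division bind more tightly.
  function multExpression(): Expr {
    let left = primaryExpression();
    while (isOp("*", "/")) {
      const op = next().text;
      left = { type: "Binary", op, left, right: primaryExpression() };
    }
    return left;
  }

  // Top rung: literals, names, and parenthesized sub-expressions.
  function primaryExpression(): Expr {
    const t = next();
    if (t.kind === "num") return { type: "Num", value: Number(t.text) };
    if (t.kind === "ident") return { type: "Ident", name: t.text };
    if (t.kind === "op" && t.text === "(") {
      const inner = addExpression();
      if (!isOp(")")) throw new Error("expected ')'");
      next();
      return inner;
    }
    throw new Error("unexpected token '" + t.text + "'");
  }

  const result = addExpression();
  if (peek().kind !== "eof") throw new Error("unexpected trailing input");
  return result;
}

// Render the tree with explicit parentheses so the grouping is visible.
function show(e: Expr): string {
  if (e.type === "Num") return String(e.value);
  if (e.type === "Ident") return e.name;
  return "(" + show(e.left) + " " + e.op + " " + show(e.right) + ")";
}

console.log(show(parseExpression("a + b * c"))); // prints (a + (b * c))
```

Note that addExpression is entered first but finishes last: it cannot build its node until multExpression has returned a complete operand, which is why the multiplication ends up deeper in the tree.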

Intuition

Think of the grammar like the blueprint for a building. The blueprint defines exactly where the walls (statements) must be placed and how they must connect to the foundation (the main program). It also specifies which parts must be built first for the house to be stable. You can't put the roof on before the walls, just like you can't join two values before you have identified what they are. The grammar ensures that the final building (your program) is strong and follows all the rules of architecture.

Implementation details

The ARES grammar is defined in src/parser/parser.ts. It uses Chevrotain, a parser-building toolkit for JavaScript and TypeScript, to turn these rules into a working parser.

  • Expressions: The grammar handles everything from simple numbers to chained index and call expressions like matrix[i][j](args).
  • Unifying syntax: The system understands multiple ways of writing the same thing. For example, it treats let x = 1 and int x is 1 as the same type of command internally.
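
As a rough sketch of the unifying-syntax idea, the function below tries the keyword-led forms first and maps both let x = 1 and int x is 1 onto one node type, falling back to a general statement otherwise. It works on whitespace-separated words rather than real tokens, and the names LetNode, GeneralStatement, parseStatementSketch, and the declaredType field are assumptions made for this example, not the actual ARES AST.

```typescript
// Hypothetical node shapes; the real ARES AST nodes may differ.
type LetNode = {
  type: "Let";
  target: string;
  declaredType: string | null; // "int" for the typed form, null for plain let
  value: string;               // right-hand side kept as raw text in this sketch
};
type GeneralStatement = { type: "General"; text: string };

function parseStatementSketch(line: string): LetNode | GeneralStatement {
  const words = line.trim().split(/\s+/);

  // Keyword-led forms are checked first, before the general fallback.
  // Form 1: let <name> = <value>
  if (words.length >= 4 && words[0] === "let" && words[2] === "=") {
    return {
      type: "Let",
      target: words[1],
      declaredType: null,
      value: words.slice(3).join(" "),
    };
  }

  // Form 2: int <name> is <value>
  if (words.length >= 4 && words[0] === "int" && words[2] === "is") {
    return {
      type: "Let",
      target: words[1],
      declaredType: "int",
      value: words.slice(3).join(" "),
    };
  }

  // Fallback: anything else is treated as a general statement.
  return { type: "General", text: words.join(" ") };
}

console.log(parseStatementSketch("let x = 1").type);  // prints Let
console.log(parseStatementSketch("int x is 1").type); // prints Let
```

Because both spellings produce the same node type, later stages only need to handle one kind of declaration.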

Complexity

Parsing a script is very fast: it runs in linear time, O(N). This means the time it takes to understand your code grows at the same rate as the size of your script. ARES can parse even the largest projects in a fraction of a second.

Trace example

This is what happens when the system reads x = 1 + 2 * 3:

  1. Multiplication first: The system looks at the "Multiplicative" level and groups 2 * 3 together first.
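  2. Addition next: It then moves to the "Additive" level and combines 1 with the grouped multiplication. The parser only builds the tree at this stage; no arithmetic is performed yet.
  3. Assignment last: Finally, it identifies the "Let" or "Is" command and attaches the finished expression as the value to be stored in the variable x.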
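
The tree implied by this trace can be written out as a plain object. The node names below (Assign, Binary, Num) and the helper completionOrder are assumptions for illustration; 03_ast_nodes.md describes the actual node types. A post-order walk finishes children before their parent, so it reproduces the order of the three steps.

```typescript
// Hypothetical node shapes used only for this trace.
type TraceNode =
  | { type: "Num"; value: number }
  | { type: "Binary"; op: "+" | "*"; left: TraceNode; right: TraceNode }
  | { type: "Assign"; target: string; value: TraceNode };

// Tree for: x = 1 + 2 * 3
const traceTree: TraceNode = {
  type: "Assign",
  target: "x",
  value: {
    type: "Binary",
    op: "+",
    left: { type: "Num", value: 1 },
    right: {
      type: "Binary",
      op: "*",
      left: { type: "Num", value: 2 },
      right: { type: "Num", value: 3 },
    },
  },
};

// Post-order walk: record each operator once both of its operands are done.
function completionOrder(node: TraceNode, out: string[] = []): string[] {
  if (node.type === "Binary") {
    completionOrder(node.left, out);
    completionOrder(node.right, out);
    out.push(node.op);
  } else if (node.type === "Assign") {
    completionOrder(node.value, out);
    out.push("=");
  }
  return out;
}

console.log(completionOrder(traceTree).join(" ")); // prints * + =
```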

Related entities

  • 01_lexical_rules.md: Defines the individual words that the grammar combines into statements.
  • 03_ast_nodes.md: Explains the tree structure that the grammar builds once it understands your commands.