Rust-Based Expression Compiler with AST, DAG, and LLVM IR Generation
This is a Rust command-line compiler that tokenizes, parses, and translates arithmetic expressions into LLVM Intermediate Representation (IR). It extends a recursive-descent parser to build both an Abstract Syntax Tree (AST) and a Directed Acyclic Graph (DAG) for common subexpression elimination. Finally, it emits LLVM IR that can be lowered to assembly with llc and then linked and run with clang.
- Breadth-first traversal of the AST, printed with one level per line for easy visualization.
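A level-by-level printer of this kind can be sketched with a queue that is drained one level at a time. The `Node` type and the `leaf`, `node`, and `bfs_levels` helpers below are hypothetical names chosen for illustration; the project's actual `ast.rs` may be structured differently.

```rust
use std::collections::VecDeque;

/// Hypothetical tree node: a label plus ordered children.
pub struct Node {
    pub label: String,
    pub children: Vec<Node>,
}

/// Build a node with no children.
pub fn leaf(label: &str) -> Node {
    Node { label: label.to_string(), children: Vec::new() }
}

/// Build an interior node.
pub fn node(label: &str, children: Vec<Node>) -> Node {
    Node { label: label.to_string(), children }
}

/// Return one string per tree level, with labels separated by spaces.
pub fn bfs_levels(root: &Node) -> Vec<String> {
    let mut lines = Vec::new();
    let mut queue: VecDeque<&Node> = VecDeque::new();
    queue.push_back(root);
    while !queue.is_empty() {
        // Everything currently in the queue belongs to the same level.
        let level_size = queue.len();
        let mut labels = Vec::with_capacity(level_size);
        for _ in 0..level_size {
            let current = queue.pop_front().expect("queue holds level_size nodes");
            labels.push(current.label.as_str());
            for child in &current.children {
                queue.push_back(child);
            }
        }
        lines.push(labels.join(" "));
    }
    lines
}
```

For the tree of `a + b * c`, built as `node("+", vec![leaf("a"), node("*", vec![leaf("b"), leaf("c")])])`, this yields the three lines `+`, `a *`, and `b c`.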
- Recursive-Descent Parser implementing a top-down grammar with left-recursion removed:
EXPR → TERM EXPRDASH
EXPRDASH → + TERM EXPRDASH | ε
TERM → FACTOR TERMDASH
TERMDASH → * FACTOR TERMDASH | ε
FACTOR → IDENTIFIER | NUMBER | ( EXPR )
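Each nonterminal in the grammar above maps naturally onto one function. The following is a minimal sketch under assumed names (`Tok`, `Ast`, `Parser`), not the project's `parser.rs`. The `*_dash` functions receive the left operand built so far, which keeps `+` and `*` left-associative even though the grammar is right-recursive.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Tok { Ident(String), Number(i64), Plus, Star, LParen, RParen }

#[derive(Debug, Clone, PartialEq)]
pub enum Ast {
    Ident(String),
    Number(i64),
    Add(Box<Ast>, Box<Ast>),
    Mul(Box<Ast>, Box<Ast>),
}

pub struct Parser<'a> { toks: &'a [Tok], pos: usize }

impl<'a> Parser<'a> {
    pub fn new(toks: &'a [Tok]) -> Self { Parser { toks, pos: 0 } }

    fn peek(&self) -> Option<&Tok> { self.toks.get(self.pos) }

    /// Parse a whole expression and reject trailing tokens.
    pub fn parse(mut self) -> Result<Ast, String> {
        let ast = self.expr()?;
        if self.pos == self.toks.len() {
            Ok(ast)
        } else {
            Err(format!("unexpected trailing token {:?}", self.toks[self.pos]))
        }
    }

    /// EXPR → TERM EXPRDASH
    fn expr(&mut self) -> Result<Ast, String> {
        let left = self.term()?;
        self.expr_dash(left)
    }

    /// EXPRDASH → + TERM EXPRDASH | ε
    fn expr_dash(&mut self, left: Ast) -> Result<Ast, String> {
        if self.peek() == Some(&Tok::Plus) {
            self.pos += 1;
            let right = self.term()?;
            self.expr_dash(Ast::Add(Box::new(left), Box::new(right)))
        } else {
            Ok(left) // ε production
        }
    }

    /// TERM → FACTOR TERMDASH
    fn term(&mut self) -> Result<Ast, String> {
        let left = self.factor()?;
        self.term_dash(left)
    }

    /// TERMDASH → * FACTOR TERMDASH | ε
    fn term_dash(&mut self, left: Ast) -> Result<Ast, String> {
        if self.peek() == Some(&Tok::Star) {
            self.pos += 1;
            let right = self.factor()?;
            self.term_dash(Ast::Mul(Box::new(left), Box::new(right)))
        } else {
            Ok(left) // ε production
        }
    }

    /// FACTOR → IDENTIFIER | NUMBER | ( EXPR )
    fn factor(&mut self) -> Result<Ast, String> {
        match self.peek().cloned() {
            Some(Tok::Ident(name)) => { self.pos += 1; Ok(Ast::Ident(name)) }
            Some(Tok::Number(n)) => { self.pos += 1; Ok(Ast::Number(n)) }
            Some(Tok::LParen) => {
                self.pos += 1;
                let inner = self.expr()?;
                if self.peek() == Some(&Tok::RParen) {
                    self.pos += 1;
                    Ok(inner)
                } else {
                    Err("expected ')'".to_string())
                }
            }
            other => Err(format!("unexpected token {:?}", other)),
        }
    }
}
```

Calling `Parser::new(&tokens).parse()` on the tokens of `a + b * c` produces an `Add` whose right child is a `Mul`, reflecting the higher precedence that the TERM level gives to `*`.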
- Scanner recognizes identifiers ([A-Za-z]+), numbers ([0-9]+), +, *, (, ), and skips whitespace.
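A scanner for exactly this token set can be written as a single loop over the input characters. The sketch below is illustrative only: the `Token` enum and `tokenize` function are assumed names, and the error handling (returning a `String` message) is one possible choice rather than what `scanner.rs` necessarily does.

```rust
/// Token kinds from the description: identifiers, numbers, +, *, (, ).
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Ident(String),
    Number(i64),
    Plus,
    Star,
    LParen,
    RParen,
}

/// Convert source text into tokens, skipping whitespace.
/// Any character outside the token set produces an error message.
pub fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_alphabetic() {
            // Identifier: [A-Za-z]+
            let start = i;
            while i < chars.len() && chars[i].is_ascii_alphabetic() {
                i += 1;
            }
            tokens.push(Token::Ident(chars[start..i].iter().collect()));
        } else if c.is_ascii_digit() {
            // Number: [0-9]+
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let text: String = chars[start..i].iter().collect();
            let value = text
                .parse::<i64>()
                .map_err(|e| format!("bad number {}: {}", text, e))?;
            tokens.push(Token::Number(value));
        } else {
            let tok = match c {
                '+' => Token::Plus,
                '*' => Token::Star,
                '(' => Token::LParen,
                ')' => Token::RParen,
                other => {
                    return Err(format!("unexpected character '{}' at offset {}", other, i));
                }
            };
            tokens.push(tok);
            i += 1;
        }
    }
    Ok(tokens)
}
```

With this sketch, `tokenize("a + 12*(b)")` yields seven tokens, while `tokenize("a - b")` returns an error because `-` is not part of the language.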
- LLVM IR Codegen:
  - Dynamically detects all unique variables and emits a function signature such as:
    define i64 @foo(i64 %a, i64 %b, i64 %c) { ... }
  - Emits typed 3-address instructions (add i64, mul i64, etc.)
  - Produces .ll files that can be compiled with:
    llc -march=arm64 test1.ll
    clang test.c test1.s -o test1.out
    ./test1.out
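The codegen steps above (collect the unique variables, build the signature, emit one typed instruction per operator) can be sketched as a post-order walk over the expression tree. The `Expr` type, the `emit_function` name, and the `%t0`, `%t1`, ... temporary naming are assumptions made for this example. Since the scanner only accepts letters in identifiers, such temporaries cannot collide with a source variable.

```rust
pub enum Expr {
    Var(String),
    Num(i64),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Collect variable names in order of first appearance, without duplicates.
fn collect_vars(e: &Expr, out: &mut Vec<String>) {
    match e {
        Expr::Var(v) => {
            if !out.contains(v) {
                out.push(v.clone());
            }
        }
        Expr::Num(_) => {}
        Expr::Add(l, r) | Expr::Mul(l, r) => {
            collect_vars(l, out);
            collect_vars(r, out);
        }
    }
}

/// Emit instructions for `e` into `body` and return the operand naming its value.
fn emit(e: &Expr, body: &mut Vec<String>, next: &mut usize) -> String {
    match e {
        Expr::Var(v) => format!("%{}", v),
        Expr::Num(n) => n.to_string(),
        Expr::Add(l, r) | Expr::Mul(l, r) => {
            let op = if matches!(e, Expr::Add(..)) { "add" } else { "mul" };
            // Post-order: operands first, then the instruction that uses them.
            let lhs = emit(l, body, next);
            let rhs = emit(r, body, next);
            let tmp = format!("%t{}", *next);
            *next += 1;
            body.push(format!("  {} = {} i64 {}, {}", tmp, op, lhs, rhs));
            tmp
        }
    }
}

/// Produce the text of a complete LLVM IR function returning the value of `e`.
pub fn emit_function(name: &str, e: &Expr) -> String {
    let mut vars = Vec::new();
    collect_vars(e, &mut vars);
    let params: Vec<String> = vars.iter().map(|v| format!("i64 %{}", v)).collect();
    let mut body = Vec::new();
    let mut next = 0;
    let result = emit(e, &mut body, &mut next);
    body.push(format!("  ret i64 {}", result));
    format!(
        "define i64 @{}({}) {{\n{}\n}}\n",
        name,
        params.join(", "),
        body.join("\n")
    )
}
```

For `a + b * c` and the name `foo`, the sketch emits a `mul i64 %b, %c` into `%t0`, then an `add i64 %a, %t0` into `%t1`, and finally `ret i64 %t1`, inside `define i64 @foo(i64 %a, i64 %b, i64 %c)`.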
- Programming Languages/Technologies: Rust, Cargo, C, Makefile
- Compiler Backend: LLVM v21
- Build Automation: Makefile (multi-test automation + SDK detection)
- Key Modules:
  scanner.rs → Lexical analysis
  parser.rs → Recursive-descent parsing
  ast.rs → AST data structure + BFS printer
  dag.rs → DAG construction for optimization
  llvm_codegen.rs → LLVM IR emitter
# Build project
cd pa2
cargo build
# Run single test
cargo run ../test1.exp
# Generate assembly & execute via clang
llc -march=arm64 ../test1.ll
clang ../test.c ../test1.s -o test1.out
./test1.out
# Run all regression tests
make all
- Implement full value-numbering optimization in DAG generation
- Add expression evaluator or constant folding pass
- Visualize AST/DAG with Graphviz for compiler debugging
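As a starting point for the value-numbering item, one common approach is to intern every DAG node in a hash map so that structurally identical subexpressions receive the same id. The sketch below is only one possible design and is not taken from `dag.rs`; `DagNode`, `Dag`, and `intern` are hypothetical names. It also orders the operands of `+` and `*`, which is valid because both are commutative on integers, so `a + b` and `b + a` share a node.

```rust
use std::collections::HashMap;

/// A DAG node; operator nodes refer to operands by their value number (index).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum DagNode {
    Var(String),
    Num(i64),
    Add(usize, usize),
    Mul(usize, usize),
}

#[derive(Default)]
pub struct Dag {
    /// Nodes in creation order; the index is the value number.
    pub nodes: Vec<DagNode>,
    /// Reverse lookup from node contents to value number.
    index: HashMap<DagNode, usize>,
}

impl Dag {
    /// Return the value number of `node`, reusing an identical existing node.
    pub fn intern(&mut self, node: DagNode) -> usize {
        // Canonicalize commutative operators by sorting their operand ids.
        let node = match node {
            DagNode::Add(l, r) if l > r => DagNode::Add(r, l),
            DagNode::Mul(l, r) if l > r => DagNode::Mul(r, l),
            other => other,
        };
        if let Some(&id) = self.index.get(&node) {
            return id;
        }
        let id = self.nodes.len();
        self.nodes.push(node.clone());
        self.index.insert(node, id);
        id
    }
}
```

Building `(a + b) * (b + a)` bottom-up through `intern` creates four nodes in total (`a`, `b`, one shared `Add`, and the `Mul`), so a code generator walking `nodes` in order would emit the addition only once.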
- Kelvin Ihezue