Part 2 of the Compiler Design & Implementation series. Parsing is where the compiler determines the validity of an input program’s syntax. Conceptually, the parser takes the flat stream of tokens produced by the scanner and gives it structure — a tree that captures the grammatical relationships between language constructs.
What This Post Covers
- Context-Free Grammars — Productions, derivations, and the Chomsky hierarchy
- Parse Trees vs. Abstract Syntax Trees — What information we keep and what we discard
- Top-Down Parsing — Recursive descent, LL(1) grammars, and FIRST/FOLLOW sets
- Bottom-Up Parsing — Shift-reduce, LR parsing, and why most production compilers use LALR(1)
- Error Recovery — Panic mode, phrase-level recovery, and producing useful error messages
- Implementation — Building a recursive descent parser in Rust for the C0 language
The Parser’s Job
The scanner doesn’t understand structure. It sees if, (, x, >, 0, ), { as a flat sequence of tokens. The parser’s job is to recognize that this sequence forms an if-statement, with a condition and a body — and to reject sequences that don’t form valid programs.
In progress — full implementation coming soon.