Skip to content

Relational parsing

In 2020, Herman introduced his new parsing technique called relational parsing Herman (2020). Similar ANTLR and other parsing techniques, relational parsing simulates possible runs of PDAs. It inductively computes the relations between languages of PDA configurations and the ways of reaching them when reading consecutive input symbols. These languages can be computed from the previous one by a constant number of simple, well-known language operations: union, concatenation and Brzozowski derivatives. These languages, called atomic languages, are precomputed from the grammar. They are regular and are represented as NFAs. For the languages computed during parsing, a DAG structure is used. As all cyclicity is embedded within the NFAs of atomic languages themselves, we never need to add edges to existing DAG nodes.

In addition to introducing the relational parsing algorithm, Herman also introduces a technique for context-free memoization. It exploits the typical structure of the DAG representation, splitting it into a stack of smaller components. Each input symbol can be handled by only examining the top element of this stack and can therefore be memoized. This simplifies the parsing of many input symbols to a dictionary lookup and a few operations on a stack

Consider it Parsed, 2023

Related:

Papers