Parsers timeline
Algorithms
List of algorithms (based on this page ↗):
- CYK: Cocke–Younger–Kasami
- bottom-up, Dynamic programming, chart parser
- supports any context-free grammar in Chomsky Normal Form
- Earley
- Dynamic programming, chart parser
- supports any context-free grammar
LL
left-to-right, leftmost derivation (top-down), “recursive descent”
- LL(k)
k
tokens of lookahead- very expensive (when introduced)
- LL(1)
- exactly one token lookahead
- much more efficient
- uses
FIRST()
andFOLLOW()
to build parse tables
- Efficient LL(k)
k
tokens of lookaheadk
is gradually increased w/ backtracking as a fallback- basis of original
ANTLR
- GLL: Generalised LL
LL
analogue toGLR
- maintains multiple parsing descriptors on a stack
- LL(*)
LL
analogue toLAR(m)
- regular lookaheads w/ cyclic DFAs
- basis of
ANTLR 3
- ALL(*): Adaptive LL(*)
- parallel regular lookaheads
- dynamic; adapts to input sentences
- simulates augmented recursive transition networks (ATNs)
- basis of
ANTLR 4
- LLLR: a Combination of LL and LR
LR
left-to-right, rightmost derivation (“bottom-up”), “shift/reduce”
- LR(k)
- allows duplicated states
- fewer conflicts; much larger parse tables
- SLR: Simple LR
- simple
LR
:LR(0)
states and their transitions FOLLOW()
used to compute lookaheads
- simple
- LALR
- similar to
SLR
but w/ smaller lookaheads - equivalently, a simplified version of canonical
LR
- more complicated lookahead calculation
- basis of yacc/bison
- similar to
- GLR: Generalised LR(k)
- generalized
LR(k)
: returns all valid parses - spawns parallel parsing processes on conflicts
- graph-structured stack (
GSS
) - more efficient than Earley
- generalized
- LAR(m)
- regular lookaheads w/ cyclic DFAs
- GLR*
- SGLR: Scannerless GLR
- RIGLR: Reduction Incorporated GLR
- RNGLR: Right nulled GLR
- BRNGLR: Binary Right Nulled GLR
- IGLR: Incremental GLR
- MSGLR: Modular SGLR
- ISGLR: Incremental Scannerless GLR
PEG
- PEG: parsing expression grammar
- similar to CFG-based grammars, but alternation is inherently ordered
- often implemented with “packrat” parsers that memoize partial results
- recursive descent with backtracking
- PEGLL: PEG + GLL