Scannerless parser

A scannerless parser is a parser that has no separate scanning (lexical analysis) phase to tokenize the input before parsing it. Traditional parsing is a two-step process: first, a scanner (or lexer) breaks the input into tokens, and then a parser analyzes the token sequence according to a set of grammar rules. A scannerless parser instead recognizes lexical and syntactic structure in a single step, working directly on the characters of the input.
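To make the single-step idea concrete, here is a minimal sketch of a hand-written scannerless parser. The toy grammar (sums of non-negative integers separated by `+`) and the function names `skip_ws`, `parse_number`, and `parse_sum` are assumptions made for illustration, not taken from any particular tool. Note that there is no token list anywhere: whitespace and digits are handled by ordinary grammar rules that read characters.

```python
# Scannerless recursive-descent parser for sums such as "12 + 3+ 40".
# Assumed toy grammar (illustrative only):
#   sum    <- ws number (ws '+' ws number)* ws EOF
#   number <- [0-9]+
#   ws     <- ' '*

DIGITS = "0123456789"


def skip_ws(text, pos):
    """Whitespace is an ordinary grammar rule here, not a lexer concern."""
    while pos < len(text) and text[pos] == " ":
        pos += 1
    return pos


def parse_number(text, pos):
    """Match [0-9]+ directly on characters; return (value, new_pos)."""
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        raise SyntaxError(f"expected digit at offset {pos}")
    return int(text[start:pos]), pos


def parse_sum(text):
    """Parse the whole input and return the list of operands."""
    pos = skip_ws(text, 0)
    value, pos = parse_number(text, pos)
    operands = [value]
    pos = skip_ws(text, pos)
    while pos < len(text) and text[pos] == "+":
        pos = skip_ws(text, pos + 1)
        value, pos = parse_number(text, pos)
        operands.append(value)
        pos = skip_ws(text, pos)
    if pos != len(text):
        raise SyntaxError(f"unexpected character at offset {pos}")
    return operands
```

In a traditional design, `parse_sum` would instead consume a list of tokens produced beforehand by a lexer; here the same function decides both where a number ends and how numbers combine.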

Examples

Here are a few examples of algorithms, formalisms, and tools associated with scannerless parsing:

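  1. Generalized LR (GLR) parsers: GLR is a parsing algorithm rather than a scannerless technique in itself, but its scannerless variant, SGLR, applies it directly to characters.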

  2. SDF (Syntax Definition Formalism): A formalism for defining syntax, often associated with the Meta-Environment, an environment for generating programming environments. SDF promotes scannerless parsing by integrating lexical and context-free syntax into a single specification.

  3. Rascal: A meta-programming language whose syntax definition formalism is derived from SDF and likewise promotes scannerless parsing.

  4. PEG (Parsing Expression Grammar): PEGs can be used with a lexer, but they are inherently capable of scannerless parsing because their rules can be written directly over characters.

  5. ANTLR: ANTLR is primarily known as a parser generator that produces a separate lexer and parser, but it can be set up for scannerless-style parsing if needed, for example by treating each character as a token.

  6. Elkhound: A parser generator based on the GLR parsing algorithm that can be used for scannerless parsing.
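As an illustration of the PEG entry above, the following sketch shows how a character-level grammar can handle a job usually left to the lexer: telling the keyword `if` apart from an identifier such as `iffy`. The combinators (`lit`, `char_in`, `seq`, `many`, `not_`) and the helper `classify` are hypothetical names written for this example, not the API of any PEG library, and the lowercase-only alphabet is a simplifying assumption.

```python
# Minimal PEG-style matchers. Each matcher takes (text, pos) and returns
# the new position on success or None on failure. All names are hypothetical.

def lit(s):
    """Match the literal string s."""
    return lambda text, pos: pos + len(s) if text.startswith(s, pos) else None


def char_in(chars):
    """Match one character drawn from chars."""
    return lambda text, pos: (
        pos + 1 if pos < len(text) and text[pos] in chars else None
    )


def seq(*parts):
    """Match each part in order, failing if any part fails."""
    def match(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return match


def many(part):
    """Greedy zero-or-more repetition (part must consume input)."""
    def match(text, pos):
        while True:
            nxt = part(text, pos)
            if nxt is None:
                return pos
            pos = nxt
    return match


def not_(part):
    """Not-predicate: succeed without consuming input if part fails."""
    return lambda text, pos: pos if part(text, pos) is None else None


ident_char = char_in("abcdefghijklmnopqrstuvwxyz")
# keyword_if <- 'if' !ident_char
keyword_if = seq(lit("if"), not_(ident_char))
# identifier <- !keyword_if ident_char+
identifier = seq(not_(keyword_if), ident_char, many(ident_char))


def classify(word):
    """Report which rule matches the entire word."""
    if keyword_if(word, 0) == len(word):
        return "keyword"
    if identifier(word, 0) == len(word):
        return "identifier"
    return "no match"
```

The not-predicate plays the role of a lexer's longest-match rule: `keyword_if` refuses to match when more identifier characters follow, so `iffy` falls through to `identifier`.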