# About

- Playground parsing algorithm is based on PwZ: Parsing with zippers.
- There are extensions to the orginal algorithm, for example, Kleene star, Lookaheads etc.
- There are may be bugs

### Notation

Status | Note | Notation | Expression type | |
---|---|---|---|---|

π’ | concatenation | ` ` (space) | Seq | |

π’ | unordered choice | | | Alt | |

π’ | token | `"x"` | Tok | |

π’ | 1 | symbol | `x` | - |

π’ | Kleene star | `*` | Rep | |

π‘ | 2 | Kleene plus | `+` | |

π‘ | 3 | optional | `?` | |

π΄ | 4 | quantifiers | `{n,m}` | |

π’ | 5 | any character | `.` | Any |

π‘ | 6 | range | `[a-z]` | |

π’ | 6 | set of characters | `[abc]` | |

π’ | 7 | negation of set | `[^abc]` | - |

π’ | 6 | escape sequences | `"\n"` | |

π‘ | 8 | lexeme | `lex(x)` | Lex |

π‘ | 9 | ignored | `ign(x)` | Ign |

π‘ | 10 | positive lookahead | `&` | Pla |

π‘ | negative lookahead | `!` | Nla | |

π’ | 11 | end of file | `!.` | Eof |

π‘ | 12 | ordered choice | `/` | |

π΄ | 13 | intersection | ||

π΄ | 14 | negation | ||

π΄ | codepoints | `\u{hhhh}` | ||

π΄ | character classes | `\w, \d` |

- Symbol expressed as a property of Node (Expression)
- Kleene plus implemented as
`A+ = A A*`

.**TODO**: implement as`Rep`

with min, max - Optional implemented as
`A? = "" | A`

.**TODO**: implement as`Rep`

with min, max - Quantifiers β.
**TODO**: implement as`Rep`

with min, max - Matches any character of length 1. It doesnβt match EoF ("")
- For now implemented as
`Tok`

.**TODO**: support multiple ranges, like`[a-zA-Z_]`

- Implemented as property (
`invert`

) of the`Tok`

`Lexeme`

makes parser scannerless.- I have doubt about notation. For now I use
`lex(x)`

. Other options:- Separate section?
- At rule level?
`lex: S -> "a";`

,`S:lex -> "a";`

- At symbol level?
`S -> lex("a")`

,`S -> lex:"a"`

- I have doubt about notation. For now I use
- Ignored symbols consume input and output empty strings. Useful when tree compaction used (all empty nodes are removed).
- For now implemented as separate Expression type, but could be as well implemented as property of Node (Expression).
- I have doubt about notation. For now I use
`ign(x)`

. Maybe`~x`

?

- Lookaheads similar to PEG operators
**TODO**: There are probably bugs in implementation (especially when lookaheads are recursive)- Allows to express some
**context-sensitive**grammars, like $a^nb^nc^n$

- End of file matched only if token is empty string (which can appear only in the end of file
`str[pos] || ""`

). The difference from`Tok`

is that after matching it doesnβt stop traversing, instead it continues as if it was empty`Seq`

- Ordered choice operator from PEG simulated with negative lookahead:
`A / B = A | !A B`

- I think I need to use backtracking in order to implement effective ordered choice

- Intersection. Original Brzozowski paper had this, but when Might proposed PwD he omitted it. It could be trivially implemented in PwD
- In PwZ it is a bit more tricky. Should work the same as
`Alt`

except it matches only if all branches match - Other formalism that proposed to use intersection is conjuctive grammar
- Allows to express some
**context-sensitive**grammars, like $a^nb^nc^n$

- In PwZ it is a bit more tricky. Should work the same as
- Negation makes sense for matching (negation of character, complement of a language, negative lookahead), but not for parsing. Which tree you suppose to return?