Skip to content

Interesting languages

“Hardest” languages to test parsers. Most examples taken from Expressive power of LL(k) Boolean grammars

TODO:

LR(0)

{anbnn1}\{a^n b^n | n \geq 1\}

on the other hand aa^* is regular, but not LR(0)

Linear context-free language

“Balanced brackets”:

{anbnn0}\{a^n b^n | n \geq 0\}

And variations of the above:

{anbncln,l0}\{a^n b^n c^l | n, l \geq 0\} {ambmanbnm,n0}\{a^m b^m a^n b^n | m, n \geq 0\}

Classical context sensitive (LinConjLL)

Classical context sensitive example:

{anbncnn0}\{a^nb^nc^n | n \geq 0\}
S → A&C
A → aA | D
D → bDc | ε
C → aCc | B
B → bB | ε

Context-sensitive, Indexed

{anbncn}\{a^nb^nc^n \} {anbn2cn0}\{a^nb^{n^2}c^n 0\} {a2nn1}\{a^{2^n} | n \geq 1\}
S → ACaB
Ca → aaC
CB → DB|E
aD → Da
AD → AC
aE → Ea
AE → ε

“copy langauge”:

{www{a,b}}\{ ww | w \in \{a, b\}^* \} {wwww{a,b}}\{ www | w \in \{a, b\}^* \}

Context-sensitive, not indexed

{an!}\{ a^{n!} \}

From: R.H. Gilman, A shrinking lemma for indexed languages. Theoret. Comput. Sci. 163 (1996) 277–281

{(abn)n}\{ (ab^n)^n \}

PEG

Same example used by Ford to show that PEG can handle context sensitive grammar.

A ← aAb/ε
B ← bBc/ε
S ← &(A!b)a∗ B!.

Actually it can’t handle exactly this language, but it still can handle other context sensitive language. See https://github.com/SRI-CSL/PVSPackrat/issues/3

{a2nn1}\{a^{2^n} | n \geq 1\}

Other context sensitive

{ann is prime}\{a^{n} | n \text{ is prime}\}

TAG (Tree Adjoining Grammar)

{anbmcndmn,m0}\{ a^n b^m c^n d^m | n, m \geq 0 \}

MG (Minimalist Grammar)

{anbncnenfnn0}\{ a^n b^n c^n e^n f^n | n \geq 0 \}

CFLL, but not in LinCFLL

{anbncsn0,s{a,b}}\{a^n b^n cs | n \geq 0, s \in \{a, b\}\}
S → X &¬T
T → X &¬Aca&¬Acb
A → aAb | ε
X → aX | bX | cX | ε.

Strongly non-left-recursive Boolean P-complete language

an1jnban2jn1ba2j3ba1j2ba^{n−1−j_n} b a^{n−2−j_{n−1}} b \ldots a^{2−j_3} b a^{1−j_2} b
S → E&¬AbS&¬CS
A → aA | ε
C → aCAb | b
E → aE | bE | ε.

Strongly non-left-recursive conjunctive language

{wcww{a,b}}\{wcw | w \in \{a, b\}^*\}
S → C &D
C → XCX | c
X → a | b
D → aA&aD | bB&bD | cE
A → XAX | cEa
B → XBX | cEb
E → aE | bE | ε.

Jeż language

{a4nn0}\{a^{4n} | n \geq 0\}
A1 → A1 A3 & A2 A2 | a
A2 → A1 A1 & A2 A6 | aa
A3 → A1 A2 & A6 A6 | aaa
A6 → A1 A2 & A3 A3.

LL(1) linear Boolean language

{ambn0mn}\{a^m b^n | 0 \leq m \leq n\}
S → aS&¬A | bB
A → aAb | ε
B → bB | ε

Boolean

{anb2nn1}\{ a^n b^{2^n} | n \geq 1\}

Context sensitive with backreferences

(a*)b\1b\1

anbanbana^nba^nba^n