Skip to content

Ambiguity

a word or expression that can be understood in two or more possible ways

Merriam Webster

In regards to parsing ambiguity means that there is more than one parse tree for the same string.

For example:

e := e "+" e | e "-" e | e "*" e | e "/" e | num
num := "1"

For string 1 + 1 + 1 we can produce two parse trees:

pt1 11 e 12 e 11->12 13 + 11->13 14 e 11->14 18 1 12->18 15 e 14->15 16 + 14->16 17 e 14->17 19 1 15->19 20 1 17->20 1 e 2 e 1->2 3 + 1->3 4 e 1->4 5 e 2->5 6 + 2->6 7 e 2->7 8 1 4->8 9 1 5->9 10 1 7->10

Note: this grammar would work only for parser which can handle Left recursion and ambigious grammars, which is not always the case. For example, classical PEG can’t do either.

Why ambiguity is a problem

  1. You need only one interpretation of a string in order to create practical parser for programming language
  2. Ambigious parsing requires more computation power (higher time complexity) and more space. Number of possible parse trees grows according to Catalan Number (it grows very fast)

Dealing with ambiguity

SPPF

First of all we can take care about space. Instead of producing list of separate arrays we can re-use shared parts across different trees. This data structure is called SPPF.

Rewrite grammar

Second approach is to rewrite grammar to unambiguous version. But:

  • this is manual process
  • this will (most likely) change the shape of the tree - due to introfuction of new none-terminals

For example, grammar above can be changed to

e := md
md := md "*" pm | md "/" pm | pm
pm := pm "+" num | pm "-" num | num
num := "1"
pt1 1 e 2 md 1->2 3 pm 2->3 4 pm 3->4 5 + 3->5 10 1 3->10 7 pm 4->7 8 + 4->8 12 1 4->12 11 1 7->11

As the result:

We added only one level of precedence - MD (*, /) more than PM (+, -). Real-life languages can use 7 or more levels. Imagine how much more convoluted grammar and tree would become.

Disambiguating rules

Third approach is to add disambiguating rules on top of ambigious grammar. Typical example of such rules would be precedence and associativity. Something like this:

precedence 2: "*", "/"
precedence 1: "+", "-"
left-associative: "*", "/", "+", "-"

For example precedence and/or associativity supported by:

Other

Another well-known example of ambiguity is “dangling else”.