Homework 1 Solutions

(3.2a) Given the grammar G=

A --> Ba | bC
B --> d | eBf
C --> gC | g

determine which of the following strings are in L(G). Construct parse trees for those that are.

bg  bffd  bggg  edfa  eedffa  faae  defa

(3 points) Notice that strings in L(G) must end in either a or g; hence the strings bffd and faae cannot be in the language. Also notice that the only string starting with d is da, so defa is not in the language either. The remaining strings have the following parse trees (hence they are all in L(G)):

   A          A             A             A
  / \        / \           / \           / \
 /   \      /   \         /   \         /   \
b     C    b     C       B     a       B     a
      |         /|      /|\           /|\
      |        / |     / | \         / | \
      g       g  C    e  B  f       e  B  f
                /|       |            /|\
               / |       |           / | \
              g  C       d          e  B  f
                 |                     |
                 |                     |
                 g                     d

(3.5) Given the grammar

S --> aS | SB | d
B --> Bb | c

the parse tree below shows the derivation of the string aadccb. Show that this grammar is ambiguous by exhibiting a different parse tree for the same string.

   S
  / \
 /   \
a     S
     / \
    /   \
   a     S
        / \
       /   \
      S     B
     / \   / \
    /  |   |  \
   S   B   B   b
   |   |   |
   |   |   |
   d   c   c

(2 points) There are actually six different parse trees for aadccb; here are the other five:

   S            S            S            S            S
  /|           /|            |\           |\           |\
 / |          / |            | \          | \          | \
a  S         a  S            S  B         S  B         S  B
   |\           |\          /|  |\       /|  |\       /|  |\
   | \          | \        / |  | \     / |  | \     / |  | \
   S  B         S  B      a  S  B  b   a  S  B  b   S  B  B  b
  /|  |\       /|  |\       /|  |         |\  \     |\  \  \
 / |  | \     / |  | \     / |  |         | \  \    | \  \  \
a  S  B  b   S  B  B  b   a  S  c         S  B  c   a  S  c  c
   |\  \     |\  \  \        |\          /|  |        /|
   | \  \    | \  \  \       | \        / |  |       / |
   S  B  c   a  S  c  c      S  B      a  S  c      a  S
   |  |         |            |  |         |            |
   |  |         |            |  |         |            |
   d  c         d            d  c         d            d

(3.9) Our usual expression grammar
```
E --> E+T | E-T | T
T --> T*F | T/F | F
F --> (E) | i
```
can't be used as is in a predictive parser because it has left recursions. Someone had the idea that he could remove left recursions from this grammar by changing it to
```
E --> T+E | T-E | T
T --> F*T | F/T | F
F --> (E) | i
```
Show that this isn't such a great idea by (a) drawing parse trees for a-b-c using both the original and the revised grammar and (b) using the trees as a guide to determine the results of the corresponding computations if a=3, b=5, and c=8.
1. (2 points)
```
      E           E
     /|\         /|\
    / | \       / | \
   E  -  T     T  -  E
  /|\    |     |    /|\
 / | \   |     |   / | \
E  -  T  F     F  T  -  E
|     |  |     |  |     |
|     |  |     |  |     |
T     F  c     a  F     T
|     |           |     |
|     |           |     |
F     b           b     F
|                       |
|                       |
a                       c
```
2. (1 point) The first parse tree, corresponding to the original grammar, groups the left subtraction before the right, so it computes (a-b)-c; if a=3, b=5, and c=8, then this evaluates to (3-5)-8=-10. The second parse tree, corresponding to the revised grammar, groups the right subtraction before the left, so it computes a-(b-c); if a=3, b=5, and c=8, then this evaluates to 3-(5-8)=6. Therefore, the revised grammar does not produce the same results; it treats subtraction as a right-associative operation instead of left-associative, and would not generate the correct code.
(3.11) Find the FIRST and FOLLOW sets for the grammar
```
S --> ABC
A --> a | Cb | e
B --> c | dA | e
C --> e | f
```
(Note the difference between the terminal e and the empty string e.)
(4 points)
```
FIRST(ABC) = {a,c,d,e,f}
FIRST(a) = {a}   FIRST(Cb) = {e,f}   FIRST(e) = {e}
FIRST(c) = {c}   FIRST(dA) = {d}     FIRST(e) = {e}
FIRST(e) = {e}   FIRST(f) = {f}

FOLLOW(S) = {$}
FOLLOW(A) = {c,d,e,f}
FOLLOW(B) = {e,f}
FOLLOW(C) = {b,$}
```
Observe that this grammar is not LL(1), because both A --> Cb and A --> e are possible when the next symbol is either e or f (these symbols are in both FIRST(Cb) and FOLLOW(A)). The grammar is not ambiguous, however, because looking ahead one more symbol will always find a b in the first case and the end marker $ in the second.