CIS505 Assignment 2

10 points; due September 20

We will develop an object-oriented language in stages. We start with the syntax of the assignment language from Exercise 1, P ::= CL CL ::= C;* (zero or more C separated by ; ) C ::= L = E | if E : CL else CL end | print E E ::= N | E1 OP E2 | L OP ::= + | - L ::= I N ::= a nonempty string of digits I ::= a nonempty string of letters, not a keyword but this time implemented by a heap-storage virtual machine, like that in Chapter 2, Section 2.2.

To start, you get for free the complete system for the above language. Its parser is generated from a tool, PLY, which built the parser from the grammar rules. You can look inside $a23pars.py$ to see what I mean. There is documentation at $http://www.dabeaz.com/ply/$.)

When you download and run the interpreter on a program, you will see it behaves like this:

Type program; OK to do it on multiple lines; terminate with  !
  as the first symbol on a line by itself:
x = 2;
print x;
y = (x + 1);
x = (y + y)
!
Parse tree:
[[], [['=', 'x', '2'], ['print', ['deref', 'x']], ['=', 'y', ['+', ['deref', 'x'], '1']], ['=', 'x', ['+', ['deref', 'y'], ['deref', 'y']]]]]
Execution:
2
namespace = h0
heap = {
  h0 : {'x': 2}
}
Successful termination.
namespace = h0
heap = {
  h0 : {'y': 3, 'x': 6}
}
Press Enter key to terminate
Notice that (i) The parse tree has an empty list of declarations (since the core language doesn't have declarations). (ii) The global variables are stored in a namespace in the heap, and the handle to that namespace is remembered by the interpreter. (The handles are not Greek letters but $h0$, $h1$, etc.) (iii) The $print E$ command prints $E$'s value along with a "dump" of the current storage. This will help you debug your work.

What you will do

You will extend the language with declared variables and parameterized procedures. Procedures can own private declarations. (Objects and classes are added in Assignment 3.) First complete Part A of this exercise. Once you do so, work Part B. (If you don't finish Part B, it is not a disaster, but be certain to do Part A!)

Part A: declarations and procedure calls

First, we add integer declarations and procedures with parameters. A sample program in the extended language looks like this:
int x = 2;
proc p(y, z):
   print y;  x = (y - z);  q(z);  z = 0
end;
proc q(y): x = (x + (y - 1)); print y;  end;
print x;
p(9, (x+1));
Here is the execution of the above program. It uses an activation stack, which you must add to the interpreter. Declared procedures are saved as closures.
Type program; OK to do it on multiple lines; terminate with  !
  as the first symbol on a line by itself:

int x = 2;
proc p(y, z):
   print y;  x = (y - z);  q(z);  z = 0
end;
proc q(y): x = (x + (y - 1)); print y;  end;
print x;
p(9, (x+1));
!
Parse tree:
[[['int', 'x', '2'], ['proc', 'p', ['y', 'z'], [], [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'], ['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']]], ['proc', 'q', ['y'], [], [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]], ['print', ['deref', 'y']]]]], [['print', ['deref', 'x']], ['call', 'p', ['9', ['+', ['deref', 'x'], '1']]]]]
Execution:
2
activation stack = ['h0']
heap = {
  h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 2}
  h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'], ['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']],
        'params': ['y', 'z'], 
        'type': 'proc', 
        'link': 'h0'}
  h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 
        'params': ['y'], 
        'type': 'proc', 
        'link': 'h0'}
}
9
activation stack = ['h0', 'h3']
heap = {
  h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 2}
  h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 
        'params': ['y', 'z'], 
        'type': 'proc', 
        'link': 'h0'}
  h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 
        'params': ['y'], 
        'type': 'proc', 
        'link': 'h0'}
  h3 : {'y': 9, 'parentns': 'h0', 'z': 3}
}
3
activation stack = ['h0', 'h3', 'h4']
heap = {
  h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 8}
  h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 
        'params': ['y', 'z'], 
        'type': 'proc', 
        'link': 'h0'}
  h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 
        'params': ['y'], 
        'type': 'proc', 
        'link': 'h0'}
  h3 : {'y': 9, 'parentns': 'h0', 'z': 3}
  h4 : {'y': 3, 'parentns': 'h0'}
}

Successful termination.
activation stack = ['h0']
heap = {
  h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 8}
  h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 
        'params': ['y', 'z'], 
        'type': 'proc', 
        'link': 'h0'}
  h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 
        'params': ['y'], 
        'type': 'proc', 
        'link': 'h0'}
  h3 : {'y': 9, 'parentns': 'h0', 'z': 0}
  h4 : {'y': 3, 'parentns': 'h0'}
}
Press Enter key to terminate
The implementation uses an activation stack (implemented as a Python list) of handles to namespaces (Python dictionaries). A namespace holds declared variables and procedures. Procedures are saved as closures, modelled here by dictionaries. When a procedure is called, a new namespace is allocated, the parameter-argument bindings are stored in the new namespace, the $parentns$ link is set, and the namespace's handle is pushed onto the activation stack. When a procedure completes, its handle is popped from the activation stack. See Section 5.2.1 of the lecture notes for how this works.

IMPORTANT: the execution did no garbage collection of unneeded namespaces. I won't stop you if you want to erase unneeded namespaces, but it isn't required.

Here's the extended language's grammar:

P : Program                      E : Expression
CL : CommandList                 L : LefthandSide
C : Command                      I : Variable
DL : DeclarationList             N : Numeral
D : Declaration

P  ::=  DL  CL

DL ::=  D;*
   that is, a sequence of zero or more Ds, separated by  ;
   ( DL ::=  D  |  D ; DL  |  empty  )

D  ::=  int I = E  |  proc I ( IL ) : CL end 

CL ::=  C;*
C  ::=  L = E  |  if E : CL1 else CL2 end  |  print E  |  L ( EL )

EL ::=  E,*
E  ::=  N  |  ( E1 OP E2 )  |  L
OP ::=  +  |  -

L ::= I
IL ::=  I,*
I ::=  strings of letters, not including keywords

N ::=  string of digits

Interpreter input format

The input to the interpreter is the list-based operator tree constructed by the parser. The syntax goes like this:

PTREE ::=  [DLIST, CLIST]

DLIST ::=  [ DTREE* ]
           where  DTREE*  means zero or more DTREEs
DTREE ::=  ["int", ID, ETREE]  |  ["proc", ID, ILIST, [], CLIST]
           (note: the [] in the "proc" operator tree will be used in Part B)

CLIST ::=  [ CTREE* ]
CTREE ::=  ["=", LTREE, ETREE]  |  ["if", ETREE, CLIST, CLIST]
        |  ["print", ETREE]  |  ["call", LTREE, ELIST]

ELIST ::=   [ ETREE* ]
ETREE ::=  NUM  |  [OP, ETREE, ETREE] |  ["deref", LTREE]  
      where  OP ::=  "+"  | "-"

LTREE ::=  ID

ILIST ::= [ ID* ]
ID    ::=  a nonempty string of letters

NUM   ::=  a nonempty string of digits
I have all this already built for you, just like I did for Assignment 1.

Do Part A in these steps:

Here are the steps you take to do this assignment:

  1. Go to the CIS505 assignment page and download the folder, $Ex2$. It contains the scanner ($a23lex$), parser ($a23parse$), parser-generator library ($ply$), heap ($heapmodule$), starting interpreter ($interpret$), and driver ($run$). Double-click the $run$ icon or type $python run.py$ to run the prototype system and test the example at the top of this sheet.

  2. Within $heapmodule.py$, replace the active-namespace variable, $ns$, by an activation stack, and write functions that push a handle onto the activation stack, pop the stack, check if the stack is empty, and return the top handle. Modify $activeNS$, $initializeHeap$, and $printHeap$ so that they use the stack (and not $ns$). Test that the system still executes the starter test case ok. (Hint: read the notes at http://people.cis.ksu.edu/~schmidt/505f14/pythonstrucs.html to see how to use a Python list like a stack.)

    Next, in $interpret.py$, add integer declarations to $interpretDTREE$. For $int I = E$, do this: compute the value of $E$ in the active namespace (the namespace whose handle lies at the top of the activation stack). Make certain that variable $I$ is not already declared in the active namespace. (If it is already declared, then it's an error that prints a message and stops execution.) Finally, bind $I$ to $E$'s value in the active namespace. It will be simplest if you write a new function, $declare$, add it to $heapmodule.py$, and call it to declare a new variable.

    IMPORTANT: modify the interpreter to enforce that the variable that appears in position $L$ of $L = E$ is already declared. (Otherwise, it's an error that prints a message and stops execution.) It will be simplest if you modify the $update$ function in $heapmodule.py$ to do the check.

  3. Next, implement procedure declarations in $interpretDTREE$: For $proc I(IL): CL end$, do this: bind $I$ to the handle of a newly allocated closure (the closure is implemented as a dictionary) that holds $IL$, $CL$, and the handle to the active namespace. (See the CIS505 notes, Section 5.2.1, for details. Study carefully the example at the top of this assignment sheet.) Make certain that variable $I$ is not already declared in the active namespace (Otherwise, it's an error that prints a message and stops execution).

  4. Now, implement procedure call in $interpretCTREE$: For $L(EL)$, do these steps, as described in the CIS505 notes: (i) Compute the meaning of $L$, verify that the meaning is the handle to a procedure closure, and extract from that closure these parts: $IL$, $CL$, and $parentns$ link. (If $L$ is not bound to a handle of a proc closure, it's an error that stops execution.) (ii) evaluate $EL$ to a list of values (iii) Allocate a new namespace. (iv) Within the new namespace, bind $parentns$ to the handle extracted from the closure; bind the values from $EL$ to the corresponding names in $IL$. (Make certain that the number of arguments in $EL$ equals the number of parameters in $IL$. Otherwise, it's an error that prints a message and stops execution). (v) Push the new namespace's handle onto the activation stack, execute $CL$, and upon completion pop the activation stack.

    Since the intepreter uses multiple namespaces, you must implement a smarter lookup algorithm: within $interpretLTREE$, when searching for a variable, $V$, look in the active namespace first; if $V$ is not there, then look for $V$ in the $parentns$ namespace. (See the CIS505 lecture notes, Section 5.2.1.)

Remember to document appropriately your modified interpreter.

Part B: multiple levels of declarations

Don't start this unless Part A is completely finished correctly.

C, Java, and C# artificially restrict programs to have just two levels of declaration nesting: local and global. This is done so the compiler can work with a simple storage layout. Other object languages have no such restriction. In Part B, you extend the language so there are multiple levels of declarations.

Here is an example: procedure $p$ owns local declarations, $z$ and $q$, and it can reference the global variable $x$ (and itself!). Within $q$, there will be three levels of declarations, which you see when $q$ is called: int x = 2; proc p(y): int z = 3; proc q(m) : print m; x = ((m + y) + z) end; q((x + z)); end; p(9); Here is the execution: Parse tree: [[['int', 'x', '2'], ['proc', 'p', ['y'], [['int', 'z', '3'], ['proc', 'q', ['m'], [], [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]]]], [['call', 'q', [['+', ['deref', 'x'], ['deref','z']]]]]]], [['call', 'p', ['9']]]] Execution: 5 activation stack = ['h0', 'h2', 'h4'] heap = { h0 : {'parentns': 'nil', 'p': 'h1', 'x': 2} h1 : {'body': [['call', 'q', [['+', ['deref', 'x'], ['deref', 'z']]]]], 'params': ['y'], 'type': 'proc', 'link': 'h0', 'locals': [['int', 'z', '3'], ['proc','q', ['m'], [], [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]]]]} h2 : {'y': 9, 'parentns': 'h0', 'z': 3, 'q': 'h3'} h3 : {'body': [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]], 'params': ['m'], 'type': 'proc', 'link':'h2', 'locals': []} h4 : {'parentns': 'h2', 'm': 5} } Successful termination. activation stack = ['h0'] heap = { h0 : {'parentns': 'nil', 'p': 'h1', 'x': 17} h1 : {'body': [['call', 'q', [['+', ['deref', 'x'], ['deref', 'z']]]]], 'params': ['y'], 'type': 'proc', 'link': 'h0', 'locals': [['int', 'z', '3'], ['proc','q', ['m'], [], [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]]]]} h2 : {'y': 9, 'parentns': 'h0', 'z': 3, 'q': 'h3'} h3 : {'body': [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]], 'params': ['m'], 'type': 'proc', 'link':'h2', 'locals': []} h4 : {'parentns': 'h2', 'm': 5} } Press Enter key to terminate When $q$ is called, its namespace, $h4$, holds its parameter, $m$, and a link to $h2$, which holds $y$, $z$, and $q$, and a link to $h0$, which holds $x$ and $p$. All these variables are visible to $q$'s code.

The language's syntax for declarations now looks like this:

D ::=  int I = E  |  proc I ( IL ) : DL CL end
The local declarations $DL$ are evaluated when the procedure is called, just before the procedure's $CL$ is computed. The operator trees for declarations look like this:
DTREE ::=  ["int", ID, ETREE]  
        |  ["proc", ID, ILIST, DLIST, CLIST]  
           # Note:  DLIST holds the local decs; it might be empty, too.
You revise the code in $interpretDTREE$ so that it builds a closure that saves the local declarations. When the procedure is called in $interpretCTREE$, you evaluate the local declarations just before you execute the procedure's commands. These steps are easy.

The interesting part is improving variable lookup in $interpretLTREE$ so that it uses a loop to follow the $parentns$ links to locate nonlocal variables. Be careful.

Testing

The $Ex2$ folder contains a file of test cases that you must use. Use at least these tests to check your implementation. You might devise additional tests to see if the interpreter detects program errors and prints appropriate messages.

IMPORTANT: For each successful test case, you must copy the test case and all its output from the command window and paste it into a file named $tests.txt$. You will not receive credit for the tests unless you submit them in the $tests.txt$ file.

If you do not know how to paste into a command window or how to copy and paste from a command window, please ask me! It is not hard to learn.

Submission and grading

Make a folder, $YourFirstName.YourLastName2$, and place in it your modified versions of $heapmodule.py$ and $interpret.py$ and also $tests.txt$. (See the note just above about the tests!) Zip the folder into a $.zip$ file and submit the $.zip$ file to the CIS505 site on K-State Online. The teaching assistants will study your work and your tests and apply some additional tests before scoring the submission.