We will develop an object-oriented language in stages.
We start with the syntax of the assignment language from Exercise 1,
To start, you get for free the complete system for the above language. Its parser is generated from a tool, PLY, which built the parser from the grammar rules. You can look inside $a23pars.py$ to see what I mean. There is documentation at $http://www.dabeaz.com/ply/$.)
When you download and run the interpreter on a program, you will see it behaves like this:
Type program; OK to do it on multiple lines; terminate with ! as the first symbol on a line by itself: x = 2; print x; y = (x + 1); x = (y + y) ! Parse tree: [[], [['=', 'x', '2'], ['print', ['deref', 'x']], ['=', 'y', ['+', ['deref', 'x'], '1']], ['=', 'x', ['+', ['deref', 'y'], ['deref', 'y']]]]] Execution: 2 namespace = h0 heap = { h0 : {'x': 2} } Successful termination. namespace = h0 heap = { h0 : {'y': 3, 'x': 6} } Press Enter key to terminateNotice that (i) The parse tree has an empty list of declarations (since the core language doesn't have declarations). (ii) The global variables are stored in a namespace in the heap, and the handle to that namespace is remembered by the interpreter. (The handles are not Greek letters but $h0$, $h1$, etc.) (iii) The $print E$ command prints $E$'s value along with a "dump" of the current storage. This will help you debug your work.
You will extend the language with declared variables and parameterized procedures. Procedures can own private declarations. (Objects and classes are added in Assignment 3.) First complete Part A of this exercise. Once you do so, work Part B. (If you don't finish Part B, it is not a disaster, but be certain to do Part A!)
int x = 2; proc p(y, z): print y; x = (y - z); q(z); z = 0 end; proc q(y): x = (x + (y - 1)); print y; end; print x; p(9, (x+1));Here is the execution of the above program. It uses an activation stack, which you must add to the interpreter. Declared procedures are saved as closures.
Type program; OK to do it on multiple lines; terminate with ! as the first symbol on a line by itself: int x = 2; proc p(y, z): print y; x = (y - z); q(z); z = 0 end; proc q(y): x = (x + (y - 1)); print y; end; print x; p(9, (x+1)); ! Parse tree: [[['int', 'x', '2'], ['proc', 'p', ['y', 'z'], [], [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'], ['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']]], ['proc', 'q', ['y'], [], [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]], ['print', ['deref', 'y']]]]], [['print', ['deref', 'x']], ['call', 'p', ['9', ['+', ['deref', 'x'], '1']]]]] Execution: 2 activation stack = ['h0'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 2} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'], ['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} } 9 activation stack = ['h0', 'h3'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 2} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} h3 : {'y': 9, 'parentns': 'h0', 'z': 3} } 3 activation stack = ['h0', 'h3', 'h4'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 8} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} h3 : {'y': 9, 'parentns': 'h0', 'z': 3} h4 : {'y': 3, 'parentns': 'h0'} } Successful termination. activation stack = ['h0'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 8} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} h3 : {'y': 9, 'parentns': 'h0', 'z': 0} h4 : {'y': 3, 'parentns': 'h0'} } Press Enter key to terminateThe implementation uses an activation stack (implemented as a Python list) of handles to namespaces (Python dictionaries). A namespace holds declared variables and procedures. Procedures are saved as closures, modelled here by dictionaries. When a procedure is called, a new namespace is allocated, the parameter-argument bindings are stored in the new namespace, the $parentns$ link is set, and the namespace's handle is pushed onto the activation stack. When a procedure completes, its handle is popped from the activation stack. See Section 5.2.1 of the lecture notes for how this works.
IMPORTANT: the execution did no garbage collection of unneeded namespaces. I won't stop you if you want to erase unneeded namespaces, but it isn't required.
Here's the extended language's grammar:
P : Program E : Expression CL : CommandList L : LefthandSide C : Command I : Variable DL : DeclarationList N : Numeral D : Declaration P ::= DL CL DL ::= D;* that is, a sequence of zero or more Ds, separated by ; ( DL ::= D | D ; DL | empty ) D ::= int I = E | proc I ( IL ) : CL end CL ::= C;* C ::= L = E | if E : CL1 else CL2 end | print E | L ( EL ) EL ::= E,* E ::= N | ( E1 OP E2 ) | L OP ::= + | - L ::= I IL ::= I,* I ::= strings of letters, not including keywords N ::= string of digits
The input to the interpreter is the list-based operator tree constructed by the parser. The syntax goes like this:
PTREE ::= [DLIST, CLIST] DLIST ::= [ DTREE* ] where DTREE* means zero or more DTREEs DTREE ::= ["int", ID, ETREE] | ["proc", ID, ILIST, [], CLIST] (note: the [] in the "proc" operator tree will be used in Part B) CLIST ::= [ CTREE* ] CTREE ::= ["=", LTREE, ETREE] | ["if", ETREE, CLIST, CLIST] | ["print", ETREE] | ["call", LTREE, ELIST] ELIST ::= [ ETREE* ] ETREE ::= NUM | [OP, ETREE, ETREE] | ["deref", LTREE] where OP ::= "+" | "-" LTREE ::= ID ILIST ::= [ ID* ] ID ::= a nonempty string of letters NUM ::= a nonempty string of digitsI have all this already built for you, just like I did for Assignment 1.
Here are the steps you take to do this assignment:
Next, in $interpret.py$, add integer declarations to $interpretDTREE$. For $int I = E$, do this: compute the value of $E$ in the active namespace (the namespace whose handle lies at the top of the activation stack). Make certain that variable $I$ is not already declared in the active namespace. (If it is already declared, then it's an error that prints a message and stops execution.) Finally, bind $I$ to $E$'s value in the active namespace. It will be simplest if you write a new function, $declare$, add it to $heapmodule.py$, and call it to declare a new variable.
IMPORTANT: modify the interpreter to enforce that the variable that appears in position $L$ of $L = E$ is already declared. (Otherwise, it's an error that prints a message and stops execution.) It will be simplest if you modify the $update$ function in $heapmodule.py$ to do the check.
Since the intepreter uses multiple namespaces, you must implement a smarter lookup algorithm: within $interpretLTREE$, when searching for a variable, $V$, look in the active namespace first; if $V$ is not there, then look for $V$ in the $parentns$ namespace. (See the CIS505 lecture notes, Section 5.2.1.)
Remember to document appropriately your modified interpreter.
C, Java, and C# artificially restrict programs to have just two levels of declaration nesting: local and global. This is done so the compiler can work with a simple storage layout. Other object languages have no such restriction. In Part B, you extend the language so there are multiple levels of declarations.
Here is an example: procedure $p$ owns local declarations, $z$ and $q$,
and it can reference the global variable $x$ (and itself!).
Within $q$, there will be three levels
of declarations, which you see when $q$ is called:
The language's syntax for declarations now looks like this:
D ::= int I = E | proc I ( IL ) : DL CL endThe local declarations $DL$ are evaluated when the procedure is called, just before the procedure's $CL$ is computed. The operator trees for declarations look like this:
DTREE ::= ["int", ID, ETREE] | ["proc", ID, ILIST, DLIST, CLIST] # Note: DLIST holds the local decs; it might be empty, too.You revise the code in $interpretDTREE$ so that it builds a closure that saves the local declarations. When the procedure is called in $interpretCTREE$, you evaluate the local declarations just before you execute the procedure's commands. These steps are easy.
The interesting part is improving variable lookup in $interpretLTREE$ so that it uses a loop to follow the $parentns$ links to locate nonlocal variables. Be careful.
IMPORTANT: For each successful test case, you must copy the test case and all its output from the command window and paste it into a file named $tests.txt$. You will not receive credit for the tests unless you submit them in the $tests.txt$ file.
If you do not know how to paste into a command window or how to copy and paste from a command window, please ask me! It is not hard to learn.