We will develop an object-oriented language in stages.
We start with the syntax of the assignment language from Exercise 1,
P ::= CL
CL ::= C;* (zero or more C separated by ; )
C ::= L = E | if E : CL else CL end | print E
E ::= N | E1 OP E2 | L
OP ::= + | -
L ::= I
N ::= a nonempty string of digits
I ::= a nonempty string of letters, not a keyword
but this time implemented by a heap-storage virtual machine, like that in Chapter 2, Section 2.2.
To start, you get for free the complete system for the above language. Its parser is generated from a tool, PLY, which built the parser from the grammar rules. You can look inside a23pars.py to see what I mean. There is documentation at http://www.dabeaz.com/ply/.)
When you download
and run the interpreter on a program, you will see it behaves like this:
===================================================
Type program; OK to do it on multiple lines; terminate with !
as the first symbol on a line by itself:
x = 2;
print x;
y = (x + 1);
x = (y + y)
!
Parse tree:
[[], [['=', 'x', '2'], ['print', ['deref', 'x']], ['=', 'y', ['+', ['deref', 'x'], '1']], ['=', 'x', ['+', ['deref', 'y'], ['deref', 'y']]]]]
Execution:
2
namespace = h0
heap = {
h0 : {'x': 2}
}
Successful termination.
namespace = h0
heap = {
h0 : {'y': 3, 'x': 6}
}
Press Enter key to terminate
===================================================
Notice that
(i)
The parse tree has an empty list of declarations (since the
core language doesn't have declarations).
(ii) The global variables are stored in a namespace
in the heap, and the handle to that namespace is remembered
by the interpreter. (The handles are not Greek letters but
h0, h1, etc.)
(iii) The print E command prints E's value along
with a "dump" of the current storage. This will help you
debug your work.
You will extend the language with declared variables and parameterized procedures. Procedures can own private declarations. (Objects and classes are added in Assignment 3.) First complete Part A of this exercise. Once you do so, work Part B. (If you don't finish Part B, it is not a disaster, but be certain to do Part A!)
=================================================== int x = 2; proc p(y, z): print y; x = (y - z); q(z); z = 0 end; proc q(y): x = (x + (y - 1)); print y; end; print x; p(9, (x+1)); ===================================================Here is the execution of the above program. It uses an activation stack, which you must add to the interpreter. Declared procedures are saved as closures.
=================================================== Type program; OK to do it on multiple lines; terminate with ! as the first symbol on a line by itself: int x = 2; proc p(y, z): print y; x = (y - z); q(z); z = 0 end; proc q(y): x = (x + (y - 1)); print y; end; print x; p(9, (x+1)); ! Parse tree: [[['int', 'x', '2'], ['proc', 'p', ['y', 'z'], [], [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'], ['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']]], ['proc', 'q', ['y'], [], [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]], ['print', ['deref', 'y']]]]], [['print', ['deref', 'x']], ['call', 'p', ['9', ['+', ['deref', 'x'], '1']]]]] Execution: 2 activation stack = ['h0'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 2} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'], ['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} } 9 activation stack = ['h0', 'h3'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 2} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} h3 : {'y': 9, 'parentns': 'h0', 'z': 3} } 3 activation stack = ['h0', 'h3', 'h4'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 8} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} h3 : {'y': 9, 'parentns': 'h0', 'z': 3} h4 : {'y': 3, 'parentns': 'h0'} } Successful termination. activation stack = ['h0'] heap = { h0 : {'q': 'h2', 'parentns': 'nil', 'p': 'h1', 'x': 8} h1 : {'body': [['print', ['deref', 'y']], ['=', 'x', ['-', ['deref', 'y'],['deref', 'z']]], ['call', 'q', [['deref', 'z']]], ['=', 'z', '0']], 'params': ['y', 'z'], 'type': 'proc', 'link': 'h0'} h2 : {'body': [['=', 'x', ['+', ['deref', 'x'], ['-', ['deref', 'y'], '1']]],['print', ['deref', 'y']]], 'params': ['y'], 'type': 'proc', 'link': 'h0'} h3 : {'y': 9, 'parentns': 'h0', 'z': 0} h4 : {'y': 3, 'parentns': 'h0'} } Press Enter key to terminate ===================================================The implementation uses an activation stack (implemented as a Python list) of handles to namespaces (Python dictionaries). A namespace holds declared variables and procedures. Procedures are saved as closures, modelled here by dictionaries. When a procedure is called, a new namespace is allocated, the parameter-argument bindings are stored in the new namespace, the parentns link is set, and the namespace's handle is pushed onto the activation stack. When a procedure completes, its handle is popped from the activation stack. See Section 5.2.1 of the lecture notes for how this works.
IMPORTANT: the execution did no garbage collection of unneeded namespaces. I won't stop you if you want to erase unneeded namespaces, but it isn't required.
Here's the extended language's grammar:
===================================================
P : Program E : Expression
CL : CommandList L : LefthandSide
C : Command I : Variable
DL : DeclarationList N : Numeral
D : Declaration
P ::= DL CL
DL ::= D;*
that is, a sequence of zero or more Ds, separated by ;
( DL ::= D | D ; DL | empty )
D ::= int I = E | proc I ( IL ) : CL end
CL ::= C;*
C ::= L = E | if E : CL1 else CL2 end | print E | L ( EL )
EL ::= E,*
E ::= N | ( E1 OP E2 ) | L
OP ::= + | -
L ::= I
IL ::= I,*
I ::= strings of letters, not including keywords
N ::= string of digits
===================================================
The input to the interpreter is the list-based operator tree constructed by the parser.
The syntax goes like this:
===================================================
PTREE ::= [DLIST, CLIST]
DLIST ::= [ DTREE* ]
where DTREE* means zero or more DTREEs
DTREE ::= ["int", ID, ETREE] | ["proc", ID, ILIST, [], CLIST]
(note: the [] in the "proc" operator tree will be used in Part B)
CLIST ::= [ CTREE* ]
CTREE ::= ["=", LTREE, ETREE] | ["if", ETREE, CLIST, CLIST]
| ["print", ETREE] | ["call", LTREE, ELIST]
ELIST ::= [ ETREE* ]
ETREE ::= NUM | [OP, ETREE, ETREE] | ["deref", LTREE]
where OP ::= "+" | "-"
LTREE ::= ID
ILIST ::= [ ID* ]
ID ::= a nonempty string of letters
NUM ::= a nonempty string of digits
===================================================
I have all this already built for you, just like I did for Assignment 1.
Here are the steps you take to do this assignment:
Next, in interpret.py, add integer declarations to interpretDTREE. For int I = E, do this: compute the value of E in the active namespace (the namespace whose handle lies at the top of the activation stack). Make certain that variable I is not already declared in the active namespace. (If it is already declared, then it's an error that prints a message and stops execution.) Finally, bind I to E's value in the active namespace. It will be simplest if you write a new function, declare, add it to heapmodule.py, and call it to declare a new variable.
IMPORTANT: modify the interpreter to enforce that the variable that appears in position L of L = E is already declared. (Otherwise, it's an error that prints a message and stops execution.) It will be simplest if you modify the update function in heapmodule.py to do the check.
Since the intepreter uses multiple namespaces, you must implement a smarter lookup algorithm: within interpretLTREE, when searching for a variable, V, look in the active namespace first; if V is not there, then look for V in the parentns namespace. (See the CIS505 lecture notes, Section 5.2.1.)
Remember to document appropriately your modified interpreter.
C, Java, and C# artificially restrict programs to have just two levels of declaration nesting: local and global. This is done so the compiler can work with a simple storage layout. Other object languages have no such restriction. In Part B, you extend the language so there are multiple levels of declarations.
Here is an example: procedure p owns local declarations, z and q,
and it can reference the global variable x (and itself!).
Within q, there will be three levels
of declarations, which you see when q is called:
int x = 2;
proc p(y):
int z = 3;
proc q(m) : print m; x = ((m + y) + z) end;
q((x + z));
end;
p(9);
Here is the execution:
Parse tree:
[[['int', 'x', '2'], ['proc', 'p', ['y'], [['int', 'z', '3'], ['proc', 'q', ['m'], [], [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]]]], [['call', 'q', [['+', ['deref', 'x'], ['deref','z']]]]]]], [['call', 'p', ['9']]]]
Execution:
5
activation stack = ['h0', 'h2', 'h4']
heap = {
h0 : {'parentns': 'nil', 'p': 'h1', 'x': 2}
h1 : {'body': [['call', 'q', [['+', ['deref', 'x'], ['deref', 'z']]]]], 'params': ['y'], 'type': 'proc', 'link': 'h0', 'locals': [['int', 'z', '3'], ['proc','q', ['m'], [], [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]]]]}
h2 : {'y': 9, 'parentns': 'h0', 'z': 3, 'q': 'h3'}
h3 : {'body': [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]], 'params': ['m'], 'type': 'proc', 'link':'h2', 'locals': []}
h4 : {'parentns': 'h2', 'm': 5}
}
Successful termination.
activation stack = ['h0']
heap = {
h0 : {'parentns': 'nil', 'p': 'h1', 'x': 17}
h1 : {'body': [['call', 'q', [['+', ['deref', 'x'], ['deref', 'z']]]]], 'params': ['y'], 'type': 'proc', 'link': 'h0', 'locals': [['int', 'z', '3'], ['proc','q', ['m'], [], [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]]]]}
h2 : {'y': 9, 'parentns': 'h0', 'z': 3, 'q': 'h3'}
h3 : {'body': [['print', ['deref', 'm']], ['=', 'x', ['+', ['+', ['deref', 'm'], ['deref', 'y']], ['deref', 'z']]]], 'params': ['m'], 'type': 'proc', 'link':'h2', 'locals': []}
h4 : {'parentns': 'h2', 'm': 5}
}
Press Enter key to terminate
When q is called, its namespace, h4, holds its parameter, m,
and a link to h2, which holds y, z, and q, and a
link to h0, which holds x and p. All these variables are visible to q's
code.
The language's syntax for declarations now looks like this:
===================================================
D ::= int I = E | proc I ( IL ) : DL CL end
===================================================
The local declarations DL are evaluated when the procedure is called,
just before the procedure's CL is computed.
The operator trees for declarations look like this:
===================================================
DTREE ::= ["int", ID, ETREE]
| ["proc", ID, ILIST, DLIST, CLIST]
# Note: DLIST holds the local decs; it might be empty, too.
===================================================
You revise the code in interpretDTREE
so that it builds a closure
that saves the local declarations. When the
procedure is called in interpretCTREE,
you evaluate the local declarations just before
you execute the procedure's commands. These steps are easy.
The interesting part is improving variable lookup in interpretLTREE so that it uses a loop to follow the parentns links to locate nonlocal variables. Be careful.
IMPORTANT: For each successful test case, you must copy the test case and all its output from the command window and paste it into a file named tests.txt. You will not receive credit for the tests unless you submit them in the tests.txt file.
If you do not know how to paste into a command window or how to copy and paste from a command window, please ask me! It is not hard to learn.