The previous chapter showed how to implement a language's syntax with a parser and the semantics with an interpreter. The former converts a sequence of textual words into an operator tree, which we represent as a nested list. The latter traverses the operator tree and updates internal data structures (e.g., the namespace).
The intepreter in the previous chapter is overly naive. In this chapter we study two more realistic interpreter architectures: one for C-like languages and one for object-oriented languages. Here are the key features of each:
=================================================== P : Program CL : CommandList C : Command E : Expression D : Declaration L : LefthandSide I : Variable N : Numeral P ::= CL CL ::= C | C ; CL C ::= L = E | while E : C end | print L | E ::= N | ( E1 + E2 ) | L | & L L ::= I | * L N ::= string of digits I ::= strings of letters, not including the keywords, while, print ===================================================There is a new syntax domain, LefthandSide, which names the phrases that can appear on the left-hand side of an assignment.
To understand this language, we look at the architecture
of its interpreter (virtual machine). There are three components:
the controller ("processor unit") which are the interpreter's functions, an environment ("symbol table") data structure, and a memory vector.
Data is stored in the memory's cells, and the location numbers of the cells
are saved in the environment. Here is a sample configuration:
The layout shows that variable x names location 2 in the memory;
y names location 0, and z names location 1.
In the memory itself, location 0 holds 5, location 1 holds 0, etc.
Perhaps the configuration was constructed by these commands:
y = 5;
z = 0;
x = (6 + y)
Since there are no declarations in our little language, the environment is
updated and memory cells
are allocated when new variables are mentioned. For example, the first
command, y = 5, stores 5 in newly
allocated memory cell 0
and y is bound to location 0 in the environment.
Similarly, z names newly allocated location 1, and location 1 holds 0.
Finally, 6 is added to the int that is found at location 0 and stored
in newly allocated locaton 2.
But the same environment-memory layout is
constructed by the following commands,
which manipulate location numbers as well as ordinary ints:
y = 5;
z = &y; # store in z's cell the location named by y
x = (6 + *z) # add 6 to the int stored at the location stored in z's location
# and store the sum in the location named by x (whew!)
The first command works like before, but the second stores
y's location (0) into z's location ---
read &y as ``y's location.'' Some people say that
``z points to y.''
The third command adds 6 to the integer found at location 0:
Pointers are powerful, because they let you use locations the same way you use ints. (You can even do arithmetic with them.) They are used a lot in systems programming, and it is critical that you understand them.
Let's study the interpreter that implements pointers.
Here are the operator trees the interpreter uses;
they are the same as in
the last chapter, plus the two new forms:
===================================================
PTREE ::= [ CTREE+ ]
where CTREE+ means one or more CTREEs
CTREE ::= ["=", LTREE, ETREE] | ["while", ETREE, CLIST] | ["print", VAR]
ETREE ::= NUM | ["+", ETREE, ETREE] | ["&", LTREE] | LTREE
LTREE ::= VAR | ["*", LTREE]
NUM ::= string of digits
VAR ::= string of letters but not "while" or "print" or "end"
===================================================
For example, the PTREE for the earlier example,
y = 5; z = &y; x = (6 + *z), looks like this:
[["=", "y", "5"],
["=", "z", ["&", "y"]],
["=", "x", ["+", "6", ["*", "z"]]]
}
Here is the coded interpreter --- the entire virtual machine, controller plus data structures. Read the comments at the top and then read the coding of interpretLTREE, which calculates the meaning of an assignment's left-hand side, which computes to a location number.
Next, study interpretETREE to see how ["&", LTREE]
computes to a meaning different from just LTREE. This is
crucial, because the former computes to a location number and the
latter computes to the number stored at the location number.
It may look strange, but that's how it's done in C!
===================================================
"""Interpreter for a C-like mini-language with pointers.
There are two crucial data structures:
"""
memory = [] # a global variable that models primary storage.
# It is a list (array), where the indexes to the array are
# meant to model addresses. For example, if
# memory = [5,0,11], then location 2 of memory holds 11.
env = {} # a global variable that holds the program's variable names
# and the locations that each denotes.
# It is a Python hash table (dictionary). For example, if
# env = {'x'=2, 'y'=0, 'z'=1}, then variable x names location 2
# in memory, and memory[2] == 11. This is how computer
# storage is used in an implementation of C.
def interpretCLIST(p) :
"""pre: p is a program represented as a CLIST ::= [ CTREE+ ]
where CTREE+ means one or more CTREEs
post: memory holds all the updates commanded by program p
"""
for command in p :
interpretCTREE(command)
def interpretLTREE(x) :
"""pre: x is an L-value represented as an LTREE:
LTREE ::= VAR | ["*", LTREE]
post: ans holds the location named by x
returns: ans
"""
if isinstance(x, str) : # a VAR ?
if x in env :
ans = env[x] # look up its location
else : # it's a brand new var, so allocate a memory cell for it:
memory.append("err") # add a cell at the end of memory
env[x] = len(memory) - 1 # remember the location
ans = env[x]
else : # a pointer dereference, ["*", LTREEy]
y = interpretLTREE(x[1]) # get value of LTREEy
ans = memory[y] # dereference it and return the location therein
return ans
def interpretCTREE(c) :
"""pre: c is a command represented as a CTREE:
CTREE ::= ["=", LTREE, ETREE] | ["print", LTREE] | ["while", ETREE, CLIST]
post: memory holds all the updates commanded by c
"""
operator = c[0]
if operator == "=" : # assignment command, ["=", LTREE, ETREE]
lval = interpretLTREE(c[1])
exprval = interpretETREE(c[2]) # evaluate the right-hand side
memory[lval] = exprval # do the assignment
elif operator == "print" : # print command, ["print", VAR]
num = interpretETREE(c[1])
print num
elif operator == "while" : # while command
expr = c[1]
body = c[2]
while (interpretETREE(expr) > 0) :
interpretCLIST(body)
else : # error
crash("invalid command")
def interpretETREE(e) :
"""pre: e is an expression represented as an ETREE:
ETREE ::= NUMERAL | LTREE | ["&", LTREE] | [OP, ETREE1, ETREE2]
where OP is either "+" or "-"
post: ans holds the value of e
"""
if isinstance(e, str) and e.isdigit() : # a NUMERAL
ans = int(e)
elif isinstance(e, list) and (e[0] == "+" or e[0] == "-"): # [OP, E1, E2]
op = e[0]
ans1 = interpretETREE(e[1])
ans2 = interpretETREE(e[2])
if op == "+" :
ans = ans1 + ans2
elif op == "-" :
ans = ans1 - ans2
else :
crash("illegal arithmetic operator")
elif isinstance(e, list) and e[0] == "&" : # ["&", LTREE]
ans = interpretLTREE(e[1]) # return the location named by the LTREE
else: # is an LTREE that must be dereferenced:
x = interpretLTREE(e)
ans = memory[x]
return ans
def crash(message) :
"""pre: message is a string
post: message is printed and interpreter stopped
"""
print message + "! crash! core dump: ", memory
raise Exception # stops the interpreter
def main(program) :
global memory, env # these variables are global to main
memory = [] # reset them both
env = {}
try :
interpretCLIST(program)
except Exception :
print "Due to error, interpreter must quit prematurely. Sorry."
print "final env =", env
print "final memory =", memory
===================================================
Exercise:
Test the interpreter with these programs:
x = 3
main([["=", "x", "3"]])
x = 2; p = &x; y = *p
main([["=", "x", "2"],
["=", "p", ["&", "x"]],
["=", "y", ["*", "p"]]
])
x = 1; p = &x; *p = 2
main([["=", "x", "1"],
["=", "p", ["&", "x"]],
["=", ["*", "p"], "2"]
])
x = 0; y = (6 + &x); *x = 999; print y
main([["=", "x", "0"],
["=", "y", ["+", "6", ["&", "x"]]],
["=", ["*", "x"], "999"],
["print", "y"]
])
Exercise: In C, arrays are laid out in memory
as a sequence of cells. For example, these commands,
x = 4 ;
r = [0,0,0,0] ;
y = 99
would generate this environment and memory:
env = {"x": 0, "r": 1, y: 5}
memory = [4, 0, 0, 0, 0, 99]
This makes an array lookup, like r[E], work like this:
declare y : int ; declare z : * int ; y = 5 ; z = &y ; declare x : int ; x = (6 + *z)(In C, declarations are written more tersely, e.g., int y, but we use the declare keyword to get your attention.) The declarations indicate that only z can store locations in its memory cell. The program as written will operate correctly, but in contrast, this one will fail:
declare x : int ; x = 0 ; declare y : int ; y = (6 + &x)because an int is added to a location. This program also fails:
declare x : int ; x = 0 ; *x = 999because x's location never holds a location.
Here is the language with declarations:
===================================================
P ::= CL
CL ::= C | C ; CL
C ::= L = E | while E : CL end | print L | declare I : T
T ::= int | * T
E ::= N | ( E1 + E2 ) | L | & L
L ::= I | * L
N ::= string of digits
I ::= strings of letters, not including the keywords, while, print
PTREE ::= [ CTREE+ ]
where CTREE+ means one or more CTREEs
CTREE ::= ["=", LTREE, ETREE] | ["while", ETREE, CLIST] | ["print", VAR]
| ["dec", VAR, TYPE]
TYPE ::= [ "ptr"* , "int" ]
where "ptr"* means zero or more occurrences of "ptr"
(Note: it is bad luck for us that C uses * as well to mean "ptr"!)
ETREE ::= NUM | ["+", ETREE, ETREE] | ["&", LTREE] | LTREE
LTREE ::= VAR | ["*", LTREE]
NUM ::= string of digits
VAR ::= string of letters but not "while" or "print" or "end" or "declare" or "int"
===================================================
The interpreter remembers the
type tags, in
the environment. The tags are checked every time a variable is used.
Here is a picture of the configuration generated by the first
example program in this section:
Since z is a pointer variable, its type tag, *int,
is saved internally as ["ptr", "int"].
Now, read the interpreter's code to see how the declarations
create type tags and how interpretCTREE uses them to check the correctness
of assignments and how interpretETREE uses them to check correctness
of additions:
===================================================
"""Interpreter for a C-like mini-language with variables, pointers,
declarations and loops. There are two crucial data structures:
"""
memory = [] # a global variable that models primary storage.
# It is a list (array), where the indexes to the array are
# meant to model addresses. For example, if
# memory = [5,0], then location 0 of memory holds 5.
env = {} # a global variable that holds the program's variable names
# and the type,location that each denotes.
# It is a Python hash table (dictionary). For example, if
# env = {'x'= (["int"],0), 'p' = (["ptr", "int"], 1},
# then variable x names location 2, which holds an int,
# and p names location 1, which holds a pointer to an int;
# that is, location 1 holds a location that holds an int.
def interpretCLIST(p) :
"""pre: p is a program represented as a CLIST ::= [ CTREE+ ]
where CTREE+ means one or more CTREEs
post: memory holds all the updates commanded by program p
"""
for command in p :
interpretCTREE(command)
def interpretLTREE(x) :
"""pre: x is an L-value represented as an LTREE:
LTREE ::= VAR | ["*", LTREE]
returns: datatype,location pair named by x
"""
if isinstance(x, str) : # a VAR ?
if x in env :
ans = env[x] # look up its location
else : # undeclared
crash("variable " + x + " undeclared")
else : # a pointer dereference, ["*", LTREE]
datatype, loc = interpretLTREE(x[1])
if datatype[0] == "ptr" :
ans = (datatype[1:], memory[loc]) # dereference it
else :
crash("variable not a pointer")
return ans
def interpretCTREE(c) :
"""pre: c is a command represented as a CTREE:
CTREE ::= ["=", LTREE, ETREE] | ["print", LTREE]
| ["while", ETREE, CLIST] | ["dec", VAR, TYPE]
where TYPE ::= [ "ptr"*, "int" ]
and "ptr"* means zero or more occurrences of "ptr"
separated by commas
post: memory holds all the updates commanded by c
"""
operator = c[0]
if operator == "=" : # assignment command, ["=", LTREE, ETREE]
type1,lval = interpretLTREE(c[1])
type2,exprval = interpretETREE(c[2])
if type1 == type2 :
memory[lval] = exprval # do the assignment
else :
crash("incompatible types for assignment")
elif operator == "print" : # print command, ["print", VAR]
type,num = interpretETREE(c[1])
print num
elif operator == "while" : # while command
expr = c[1]
body = c[2]
type,val = interpretETREE(expr)
while (val > 0) :
interpretCLIST(body)
elif operator == "dec" : # declaration
x = c[1]
if x in env :
crash("variable " + x + " redeclared")
else :
memory.append("err") # add a cell at the end of memory
env[x] = (c[2], len(memory) - 1) # save type and location for x
else : # error
crash("invalid command")
def interpretETREE(e) :
"""pre: e is an expression represented as an ETREE:
ETREE ::= NUMERAL | [OP, ETREE1, ETREE2] | ["&", LTREE] | LTREE
where OP is either "+" or "-"
post: ans holds the datatype and value of e
returns: ans, a datatype,value pair
"""
if isinstance(e, str) and e.isdigit() : # a NUMERAL
ans = (["int"], int(e))
elif isinstance(e, list) and (e[0] == "+" or e[0] == "-"): # [OP, E1, E2]
op = e[0]
type1, ans1 = interpretETREE(e[1])
type2, ans2 = interpretETREE(e[2])
if type1 == ["int"] and type2 == ["int"] :
if op == "+" :
ans = (["int"], ans1 + ans2)
elif op == "-" :
ans = (["int"], ans1 - ans2)
else :
crash("illegal arithmetic operator")
else :
crash("cannot do arithmetic on non-ints")
elif isinstance(e, list) and e[0] == "&" : # ["&", LTREE]
type0,val0 = interpretLTREE(e[1])
ans = (["ptr"] + type0, val0)
else: # is an LTREE that must be dereferenced:
type0,loc0 = interpretLTREE(e)
ans = (type0, memory[loc0])
return ans
def crash(message) :
"""pre: message is a string
post: message is printed and interpreter stopped
"""
print message + "! crash! core dump: ", memory
raise Exception # stops the interpreter
def main(program) :
global memory, env # these variables are global to main
memory = [] # reset them both
env = {}
interpretCLIST(program)
print "final env =", env
print "final memory =", memory
===================================================
When you read interpretETREE, you see that the meaning of
an expression must be a pair: a data type and a value.
For example, 3 computes to (["int"], 3).
The type tag ensures that the 3 is assigned only to a variable
whose type is ["int"].
A pointer to an int will have type ["ptr", "int"]. For example,
if we have declare x : int, which
allocates location 0 for x, then
["&", "x"] evaluates to (["ptr", "int"], 0).)
Exercises:
declare x: int; x = 3; declare y: int; y = (x + 1) main([["dec", "x", ["int"]], ["=", "x", "3"], ["dec", "y", ["int"]], ["=", "y", ["+", "x", "1"]] ]) declare x: int; declare p: *int; x = 2; p = &x; declare y: int; y = *p main([["dec", "x", ["int"]], ["dec", "p", ["ptr", "int"]], ["=", "x", "2"], ["=", "p", ["&", "x"]], ["dec", "y", ["int"]], ["=", "y", ["*", "p"]] ]) declare x: int; declare p: *int; x = 1; p = &x; x = *x main([["dec", "x", ["int"]], ["dec", "p", ["ptr", "int"]], ["=", "x", "1"], ["=", "p", ["&", "x"]], ["=", "x", ["*", "x"]] ])
declare x : int ; declare y : * int ; declare z : * * int ; z = &y ; y = &x ; **z = 99What happens?
declare x: int ; x = 3 ; while x: declare y: int; y = 1; x = (x - y); print x end ; print xWhat happens? Is the declaration of y in the loop body legal in C, even when the loop repeats ? (Yes.) Repair the interpreter so that it is legal here, too.
C ::= . . . | declare I : T = ERevise both the parser and interpreter.
C ::= . . . | proc I : CNote: if you worked the previous exercise, you have basically worked this one, too, in this format:
C ::= . . . | declare I : proc = CAdd procedure types to the type tags:
T ::= int | proc | * TRevise the interpreter so that a variable can point to procedure code:
declare x : int ; proc p : x = (x + 1) ; declare y : * proc ; y = &p ; call *yC lets you do this. (Have you heard of a ``function pointer''?)
Here is an example object-oriented program that creates two objects:
x = 7;
y = new {f, g};
y.f = x;
y.g = new {r};
y.g.r = y.f;
The two global variables are x and y, the former an int and the
latter an object with two fields, f and g. A second object
is created and assigned to y's g field.
Although this program
lacks classes and methods, it is easy to guess what should be allocated in the
heap. The virtual machine consists of a controller (the interpreter's functions), the heap,
and a cell that remembers the address ("handle") of the namespace that holds the
interpreted program's global variables:
The program's namespace (global variables)
lives in the heap at location α,
and the two objects are saved at
β and γ. We use the term, ``handle'', to talk about the
location of an object, e.g., y's handle is β.
Here is the syntax:
===================================================
P : Program
CL : CommandList
C : Command
E : Expression
F : FieldNames
L : LefthandSide
I : Variable
N : Numeral
P ::= CL
CL ::= C | C ; CL
C ::= L = E | if E : CL1 else CL2 end | print L |
E ::= N | ( E1 + E2 ) | L | new { F }
F ::= I | I , F
L ::= I | L . I
N ::= string of digits
I ::= strings of letters, not including keywords
===================================================
The new constructions are new (allocates a new object)
and field indexing, represented by ..
In a later chapter, we add methods, classes, etc.
The operator trees are
===================================================
PTREE ::= CLIST
CLIST ::= [ CTREE+ ]
where CTREE+ means one or more CTREEs
CTREE ::= ["=", LTREE, ETREE] | ["if", ETREE, CLIST, CLIST]
| ["print", LTREE]
ETREE ::= NUM | ["+", ETREE, ETREE] | ["deref", LTREE] | ["new", OB ]
OB ::= [ ID+ ]
where ID+ means one or more IDs
LTREE ::= [ ID+ ]
NUM ::= a nonempty string of digits
ID ::= a nonempty string of letters
===================================================
Study the forms of ETREE; they are important.
Also, note that an LTREE
is a sequence of variable/field names that must be traversed
to find a location.
For example, the CTREEs for
a = new {f,g}; a.f = 3; a.g = (1 + a.f) look like this:
[ ["=", ["a"], ["new", ["f","g"]]],
["=", ["a", "f"], "3"],
["=", ["a", "g"], ["+", "1", ["deref", ["a", "f"]]]] ]
Here is the interpreter; it uses a heap and a var, ns, that remembers
the handle to the namespace with the global variables:
picture.
===================================================
"""Interpreter for object-oriented language.
"""
### THE HEAP:
heap = {}
heap_count = 0 # how many objects stored in the heap
"""The program's heap is a dictionary that maps handles to namespaces.
An object is a namespace.
heap : { (HANDLE : NAMESPACE)+ }
where HANDLE = a string of digits
NAMESPACE = a dictionary that maps var names to values
Example:
heap = { "0": {"x":7, "y":"1", "z":"2"},
"1": {"f":"nil", "g":5},
"2": {"r":12} }
heap_count = 3
is an example heap, where handle "0" names a namespace
whose x field holds int 7, "y" field holds handle "1"
(a handle to the object at heaploc "1"), "z" holds handle "2".
Heaploc "1" holds a two-field namespace;
Heaploc "2" holds a namespace with a single field, "r"
The above example heap was generated from this sample program:
x = 7;
y = new {f, g};
y.g = 5;
z = new {r};
z.r = (y.g + x);
"""
### NAMESPACE (environment): is the handle to the namespace in the heap
# that holds the program's global variables
ns = "" # This will be initialized in the main function, interpretPTREE, as
# ns = allocateNS() See below.
### MAINTENANCE FUNCTIONS FOR THE heap:
def clearTheHeap():
"""resets the heap to be empty"""
global heap_count, heap
heap_count = 0
heap = {}
def allocateNS() :
"""allocates a new, empty namespace in the heap and returns its handle"""
global heap_count
newloc = str(heap_count)
heap[newloc] = {}
heap_count = heap_count + 1
return newloc
def dereference(lval) :
"""looks up the value of lval in the heap
param: lval is a pair, (handle,fieldname),
returns: Say that lval = (h,f). The function extracts the
object at heap[h], indexes it with f, and returns
the value found there: return heap[h][f]
For example, for the heap at the top of this program,
dereference(("0","x")) returns 7
dereference(("1","f")) returns "nil"
"""
ans = "nil"
loc,f = lval
if isinstance(loc, str) and loc.isdigit() : # a legal heaploc
ans = heap[loc][f]
else :
crash("dereference error; lval =" + str(lval))
return ans
def store(lval, rval) :
"""stores the rval into the heap at lval
params: lval -- a pair, (handle, field), as described above
rval -- an int or a handle
Say that lval = (h,f). The function finds the object at h
and saves rval at index f within that object: heap[h][f] = rval.
For example, for the heap at the top of this program,
store(("1","f"), 98) would replace "nil" by 98.
"""
loc,qual = lval
if isinstance(loc, str) and loc.isdigit() : # a legal heaploc
heap[loc][qual] = rval
else : crash("store error; lval =" + str(lval))
############ Interpreter functions:
# See the end of program for the driver function, interpretPTREE
def interpretCLIST(clist) :
"""pre: clist is a program represented as a CLIST ::= [ CTREE+ ]
where CTREE+ means one or more CTREEs
post: memory holds all the updates commanded by program p
"""
for command in clist :
interpretCTREE(command)
def interpretCTREE(c) :
"""pre: c is a command represented as a CTREE:
CTREE ::= ["=", LTREE, ETREE] | ["if", ETREE, CLIST, CLIST2]
| ["print", LTREE]
post: heap holds the updates commanded by c
"""
operator = c[0]
if operator == "=" : # , ["=", LTREE, ETREE]
lval = interpretLTREE(c[1])
rval = interpretETREE(c[2])
store(lval, rval)
elif operator == "print" : # ["print", LTREE]
v = dereference(interpretLTREE(c[1]))
print v
elif operator == "if" : # ["if", ETREE, CLIST1, CLIST2]
test = interpretETREE(c[1])
if test != 0 :
interpretCLIST(c[2])
else :
interpretCLIST(c[3])
else : # error
crash("invalid command")
def interpretETREE(etree) :
"""interpretETREE computes the meaning of an expression operator tree.
ETREE ::= NUM | ["+", ETREE, ETREE] | ["deref", LTREE]
| ["new", OB ]
OB = [ ID* ]
post: updates the heap as needed and returns the etree's value,
either an int, a handle, or "nil"
"""
ans = "nil"
if isinstance(etree, str) and etree.isdigit() : # NUM
ans = int(etree)
elif etree[0] == "+" : # ["+", ETREE, ETREE]
ans1 = interpretETREE(etree[1])
ans2 = interpretETREE(etree[2])
if isinstance(ans1,int) and isinstance(ans2, int) :
ans = ans1 + ans2
else : crash("addition error --- nonint value used")
elif etree[0] == "deref" :
lval = interpretLTREE(etree[1])
ans = dereference(lval)
elif etree[0] == "new" :
handle = allocateNS()
fields = etree[1]
for f in fields :
store((handle,f), "nil")
ans = handle
else : crash("invalid expression form")
return ans
def interpretLTREE(ltree) :
"""interpretLTREE computes the meaning of a lefthandside operator tree.
LTREE ::= [ ID+ ] where + means "one or more"
Here are some example ltree values:
["x"] # x
["x", "g"] # x.g
["x", "g", "h"] # x.g.h
Returns a pair, (handle, field), where handle is the location
in the heap where an object/namespace is stored,
and field is an ID that indexes into the object.
Look again at the example heap at the very top of this program.
For that heap, here is what the function returns:
["x"] # returns ("0","x")
["x", "f" ] # returns ("1", "f")
"""
id = ltree[0]
lval = (ns, id) # we start from the main ns
indexlist = ltree[1:]
while indexlist != [] :
fieldname = indexlist[0]
indexlist = indexlist[1:]
handle = dereference(lval)
lval = (handle, fieldname)
return lval
def crash(message) :
"""pre: message is a string
post: message is printed and interpreter stopped
"""
print message + "! crash! ns=", ns, "heap=", heap
raise Exception # stops the interpreter
# MAIN FUNCTION ####
def interpretPTREE(tree) :
"""interprets a complete program tree
pre: tree is a PTREE ::= [ CLIST ]
post: final values are deposited in memory and env
"""
global ns
# initialize heap and ns:
clearTheHeap()
ns = allocateNS()
interpretCLIST(tree)
print "final namespace =", ns
print "final heap =", heap
raw_input("Press Enter key to terminate")
===================================================
Exercise:
Here are three sample programs. Use the interpreter on each and
study the final configurations:
x = 2; y = new {f, g}; y.f = x; y.g = new {h}; print y.f interpretPTREE( [ ["=", ["x"], "2"], ["=", ["y"], ["new", ["f","g"]]], ["=", ["y","f"], ["deref",["x"]]], ["=", ["y","g"], ["new", ["h"]]], ["print", ["y", "f"]] ] )
x = 7; y = new {f,g}; y.f = x; y.g = r; y.g.r = y.f interpretPTREE( [ ["=", ["x"], "7"], ["=", ["y"], ["new", ["f","g"]]], ["=", ["y","f"], ["deref",["x"]]], ["=", ["y","g"], ["new", ["r"]]], ["=", ["y","g","r"], ["deref",["y", "f"]]], ])
y = new {f,g}; x = y; y.h = 5; # the object grows as needed! print y interpretPTREE( [ ["=", ["y"], ["new", ["f","g"]]], ["=", ["x"], ["deref", ["y"]]], ["=", ["y", "h"], "5"], ["print", ["y"]] ])
Exercise:
Notice there is no trouble in reusing a global variable name as a field name:
x = 2;
y = new {x, y};
y.x = x; y.y = y; x = y.x
The indexing makes clear which variable should be consulted.
x = 2; y = new {a = x; b = a}This says that y's field a is set to x's value, and field b is set to a's value.
We will use this syntax for
initialization commands:
E ::= . . . | new { F* }
where F* means zero of more occurrences of F, separated by ;
F ::= I = E
Revise the parser and interpreter
for these new constructions. How does your semantics
operate on the example? (That is, does the semantics locate
the value of x when it is initializing a?
Does it locate the value of a when it is initializing b?
Make certain that it does. Hint: in the interpreter, replace
variable ns by a stack of handles to namespaces.
Use the stack when computing the semantics of a LefthandSide.
x = 2; y = new {x = super.x; y = this.x}states that y's field x is initialized with the value of global variable x and field y is initialized with the value of field x in this object being constructed.
Revise the syntax of Lefthandside to read:
L ::= I | L . I | this | super
and modify parser and interpreter to implement the new constructions.
E ::= N | ( E1 + E2 ) | L | new { CL }That is, when a new object is allocated, the commands, CL, execute. Here is an example:
x = 7; y = { if x : a = x else b = 0 end; c = x; print c }This code sets x to 7 and sets y to {a:7, c:7}. You see that we are not far from modern objects, which contain a mix of declarations and executable constructor code.