There is a paradigm of programming that dispenses with assignments and does computation with only expressible and denotable values --- answers are computed and named but never updated. This means loops and assignments are useless. Computation in this paradigm, the functional paradigm, proceeds like algebra, where one does equals-for-equals substitution to compute the answer that is bound (once and forever) to a variable name.
Examples of functional programming languages are Lisp, Scheme, ML, Ocaml, and Haskell. We study their principles in this chapter.
=================================================== E: Expression N: Numeral E ::= N | E1 + E2 | E1 - E2 | E1 * E2 | E1 / E2 N ::= sequences of digits ===================================================Programs in this language are expressions, and computation does equals-for-equals substitution of answers for subexpressions, like this:
(2 * (4 - 3)) + 5 == (2 * 1) + 5 == 2 + 5 == 7You used equational laws, like 4 - 3 == 1 and 2 * 1 == 2, on expressions of numerals to compute the answer. This program's output (answer) is 7, because there are no subexpressions left to compute.
When you learned algebra, you learned that expressions can be
named. The language of algebra is arithmetic extended with
expression abstracts: The expressibles are ints, the denotables
are the expressibles, and there are no storables:
===================================================
P: AlgebraProgram
D: Definition
E: Expression
P ::= given D solve E
D ::= I = E | D1 and D2
E ::= N | E1 + E2 | ... | I
===================================================
Here is an example program in our ``algebra language'':
given
y = 2 * x
and
z = y - 1
solve
y * z
The computation proceeds by calculation and substitution
of answers for subexpressions until no more substitutions can be done:
y * z == y * (y - 1) # substitute for z
== (y * y) - (1 * y) # apply distributive law for *
== (2 * x * 2 * x) - (1 * 2 * x) # substitute for y
== (2 * 2 * x * x) - (1 * 2 * x) # apply commutative law for *
== (2 * 2 * x * x) - (2 * x) # 1 * 2 == 2
== (4 * x * x) - (2 * x) # 2 * 2 == 4
In algebra, we apply calculation laws that apply to expressions
(e.g. E1 * E2 == E2 * E1),
and we calculate on unknowns (e.g., x).
We also replace a name, like y, with the expression it names ---
the equation, y = 2 * x, defines a new law we can use.
More generally in algebra, a Definition is any equation that contains one or more variables, e.g., 3 + y = (2 * x) + (2 + 1) is a definition. The calculation laws of algebra let us compute that y = 2 * x and also x = y / 2, as needed.
Can we make algebra ``grow up'' into a full-fledged programming language that has the same flavor?
Here is a core language that resembles core Lisp;
the expressibles are atoms and lists.
For the moment, there are no denotables and no storables:
===================================================
E: Expression
A: Atom (words)
E ::= A | nil | cons E1 E2 | head E | tail E | ifnil E1 then E2 else E3
A ::= strings of letters
===================================================
An atom is a string, like "abc".
nil is a list with nothing in it.
In Python and ML, this is [].
We will occasionally use [] for nil in our calculations.
(We will follow a classic Lisp convention and treat
nil and "nil" as the same value(!)
If this sounds wierd to you, don't worry about it....)
cons places the value denoted by E1 on the front of the list denoted by E2, e.g., cons "b" (cons "a" nil) evaluates to a two-element list, ["b","a"]. (You can think of cons as ``stack push.'') We will use the ["b","a"] notation in our calculations.
In ML, cons is written ::, for example, "b" :: ( "a" :: [] ) assembles ["b","a"].
head returns the front element of a list, e.g., head (cons "b" (cons "a" nil)) computes to "b". (It is like ``stack top.'') In ML, head is spelled hd.
tail returns a new list missing the former front element, e.g., tail (cons "b" (cons "a" nil)) is cons "a" nil, which is the one-element list, ["a"]. (It is like ``stack pop.'') In ML, tail is spelled tl.
The ifnil-expression is a control structure that chooses to compute E2 or E3 depending on whether or not E1 computes to an empty list. (Note: when E1 computes to a non-nil atom, this is not an empty list, so the else-arm is selected.)
This language does ``list arithmetic,'' e.g.,
ifnil (cons "a" nil)
then nil
else head (tail (cons "b" (cons "c" nil)))
== head (tail (cons "b" (cons "c" nil)))
== head (cons "c" nil) == "c"
The previous example used these equational laws:
===================================================
ifnil nil then E1 else E2 == E1
ifnil (cons E1 E2) then E3 else E4 == E4
head (cons E1 E2) == E1
tail (cons E1 E2) == E2
===================================================
So we have an ``arithmetic'' for lists, based on these equations.
If we use ML-style [...] notation for list values, we can rewrite the
above expression as a computer would do it:
ifnil (cons "a" nil)
then nil
else head (tail (cons "b" (cons "c" nil)))
== ifnil ["a"] then nil else head (tail (cons "b" (cons "c" nil)))
== head (tail (cons "b" (cons "c" nil))) # chose the else-arm
== head (tail (cons "b" ["c"])) # evaluate arguments first
== head (tail ["b","c"])
== head ["c"] == "c"
This next example shows we can embed a conditional expression inside
an expression:
cons "a" (ifnil nil then nil else (cons "b" nil))
== cons "a" nil == ["a"]
The language lets us mix atoms and lists
within the same list, e.g., this list
cons (cons "b" (cons "c" nil)) (cons "a" (cons nil nil))
computes to the data structure,
[["b","c"], "a", []].
In contrast, the
ML language has a data-typing system that makes lists behave
like Java's arrays --- all the elements
must have the same data type --- we cannot mix atoms and lists in
ML.
Finally, Lisp allows cons to glue together a list to an atom or an atom to an atom, like cons "a" "b" and cons nil "a". These data structures are called dotted pairs, and you can think of them like Python tuples.
The above examples display the underlying principle of the functional paradigm: computation is expression simplification, like in algebra, with equals-for-equals substitution of subexpressions within a program.
The interactive version of ML uses a weak sequential control: After an
expression is computed to its answer, the answer is saved
in a temporary variable, named it. The expression
that evaluates next can reference it, and once that expression
finishes, its answer becomes the new value of
it. Here is an example of a sequence of three expressions
to simpify, separated by semicolons:
cons "a" nil; head it; cons it (cons it nil)
This compound expression computes to ["a","a"], because the
first expression, cons "a" nil, computes to ["a"],
so the next expression, head it, computes to "a" (because
it names ["a"], and the last expression computes to
["a","a"] (because it names "a").
=================================================== D: Definition E: Expression A: Atom I: Identifier E ::= A | nil | cons E1 E2 | head E | tail E | ifnil E1 then E2 else E3 | let D in E end | I D ::= I = E ===================================================I = E binds identifier I to expression E. Now, whenever we mention I, it means E and can be substituted by E, equals for equals. I is not a location in storage, it is a constant value, set just once, just like ``final variables'' in Java, just like variables in algebra.
Here is an example:
let x = nil in
let y = cons "a" x in
cons x (tail y)
This looks like the algebra example we saw at the start of the
chapter, and it computes as expected:
let x = nil in
let y = cons "a" x in
cons x (tail y)
== let x = nil in
cons x (tail (cons "a" x)) # substitute for y
== cons nil (tail (cons "a" nil)) # substitute for x
== ... == cons nil nil # == [[]]
The order that we substitute for occurrences
of x and y does not matter, just like in algebra.
The equational law for the let-construction is written by
logicians like this:
===================================================
let I = E1 in E2 == [E1/I]E2
===================================================
where [E1/I]E2 represents expression E2 changed so that all its
occurrences of I are changed to E1.
In the above example, we did this:
let x = nil in
let y = cons "a" x in
cons x (tail y)
== let x = nil in [(cons "a" x)/y](cons x (tail y)
== let x = nil in cons x (tail (cons "a" x))
== [nil/x](cons x (tail (cons "a" x))) == cons nil (tail (cons"a" nil))
== cons nil nil
but we could have done the substitutions in the other order, too:
let x = nil in
let y = cons "a" x in
cons x (tail y)
== [x/nil](let y = cons "a" x in cons x (tail y))
== let y = cons"a" nil in cons nil (tail y)
== [cons "a" nil/y](cons nil (tail y))
== cons nil (tail (cons "a" nil)) == cons nil nil
Because let definitions can be embedded anywhere, we can
write complex expressions like this:
tail (let x = nil in cons x (let y = "a" in cons y x))
It is difficult to see how
an interpreter for the language will maintain a namespace or a stack
of namespaces to remember which names are visible to which subphrases
of the expression. We will study this soon.
For now, we apply the algebra laws stated above and simplify the
expression to its answer:
===================================================
tail (let x = nil in cons x (let y = "a" in cons y x))
== tail (let x = nil in cons x (cons "a" x))
== tail (cons nil (cons "a" nil))
== cons "a" nil # == ["a"]
===================================================
Here is a more delicate example, where we redefine x in nested
blocks. (Perhaps we shouldn't allow this, but it is similar
to writing a procedure that contains a local variable of the same
name as a global variable.)
When we try substitution, we get into trouble:
===================================================
let x = "a" in
let y = cons x nil in
let x = nil in
cons y x
===================================================
If we substitute for y first,
we appear to get
let x = "a" in
let x = nil in
cons (cons x nil) x
This is incorrect --- it confuses the two definitions of x and
there is no way we will obtain the correct answer,
cons (cons "a" nil) nil.
The example displays a famous problem that dogged
19th-century logicians. The solution is, when we substitute
for an identifier, we never allow a clash of two definitions ---
we must rename the inner definition, if necessary.
In the earlier example, if we substitute for y first,
we must eliminate the clash between the two defined xs by renaming the
inner one:
===================================================
let x = "a" in
let y = cons x nil in
let x' = nil in
cons y x'
===================================================
Now the subsitution can proceed with no problem.
How can a computer implementation do this without rebuilding the
program while it computes on it?
We consider this question next.
Here is the syntax of operator trees we will use; it closely matches
the source syntax:
===================================================
ETREE ::= ATOM | ["nil"] | ["cons", ETREE, ETREE] | ["head", ETREE]
| ["tail", ETREE] | ["ifnil", ETREE1, ETREE2, ETREE3]
| ["let", DTREE, ETREE] | ["ref", ID]
DTREE ::= [ID, ETREE]
ATOM ::= a string of letters
ID ::= a string of letters, not including keywords
===================================================
For example, the expression,
let x = (cons "abc" nil) in
let y = ifnil x nil (head x) in
(cons y x)
has this operator tree:
["let", ["x", ["cons", "abc", ["nil"]],
["let", ["y", ["ifnil", ["ref", "x"],
["nil"],
["head", ["ref", "x"]]]],
["cons", ["ref", "y"], ["ref", "x"]]]]]
The interpreter will process the expression's operator tree
in the standard way, opening its subparts and computing their
meanings. In the process, each occurrence of a "cons" operator
will create a new object in the interpreter's heap --- a cons cell
(``cons-pair''). After all, a list is built using nil and
multiple cons operations, and this is how the list is assembled
in the heap: from "nil" and cons cells.
When a let definition is processing, the binding is saved in a
newly created namespace, which is linked to the currently active
namespace, using the "parentns" link shown in the previous
chapter for modelling procedure calls and superclasses in object languages.
Here is a drawing of the heap storage for the above example
when the interpreter has finished computing the final answer, cons "abc" (cons "abc" nil):
The final answer is actually the handle, δ,
but this is ``pretty printed'' as the list structure,
as cons "abc" (cons "abc" nil). (Can you see why?)
The namespace used to evaluate the phrase,
cons y x is γ, which holds the binding for y and
a link to the namespace that holds the binding for x.
Notice that the final answer shares the cons cell at α that
is used by x. This sharing of substructure is safe
because there is no assignment to alter objects already stored in the heap.
This is an advantage of using functional languages
over imperative languages for object management --- there is less
copying of data.
Here is the code for the interpreter:
===================================================
"""Interpreter for functional language with cons and simple let.
Here is the syntax of operator trees interpreted:
ETREE ::= ATOM | ["nil"] | ["cons", ETREE, ETREE] | ["head", ETREE]
| ["tail", ETREE] | ["ifnil", ETREE1, ETREE2, ETREE3]
| ["let", DTREE, ETREE] | ["ref", ID]
DTREE ::= [ID, ETREE]
ATOM ::= a string of letters
ID ::= a string of letters, not including keywords
"""
### HEAP:
heap = {}
heap_count = 0 # how many objects stored in the heap
"""The program's heap --- a dictionary that maps handles
to namespace or cons-pair objects.
heap : { (HANDLE : NAMESPACE or CONSPAIR)+ }
where
HANDLE = a string of digits
NAMESPACE = {IDENTIFIER : DENOTABLE} + {"parentns" : HANDLE or "nil"}
that is, each namespace must have a "parentns" field
CONSPAIR = (DENOTABLE, DENOTABLE)
DENOTABLE = HANDLE or ATOM or "nil"
ATOM = string of letters
Example:
heap = { "0": {"w": "nil", "parentns": "nil"},
"1": {"x":"ab", "parentns": "0"},
"2": {"z":"3", "parentns":"1"},
"3": ("ab","nil"),
"4": {"y": "5", "parentns":"2"},
"5": ("3","3")
}
heap_count = 6
is an example heap, where handles "0","1","2","4" name namespaces
which hold definitions for w,x,z,y, due to let expressions.
Handles "3" and "5" name cons-cells that are constructed due to
a use of cons.
This example heap might have been constructed by this expression:
let w = nil in
let x = "ab" in
let z = cons x w in
let y = cons z z in
...
The values stored are w = [], x = "ab", z = ["ab"], y = [["ab"], "ab"]
"""
### ASSOCIATED MAINTENANCE FUNCTIONS FOR THE heap:
def isHandle(v) :
"""checks if v is a legal handle into the heap"""
return isinstance(v, str) and v.isdigit() and int(v) < heap_count
def allocate(value) :
"""places value into the heap with a new handle
param: value - a namespace or a pair
returns the handle of the newly saved value
"""
global heap_count
newloc = str(heap_count)
heap[newloc] = value
heap_count = heap_count + 1
return newloc
def deref(handle):
""" looks up a value stored in the heap: returns heap[handle]"""
return heap[handle]
### MAINTENANCE FUNCTIONS FOR NAMESPACES:
def lookupNS(handle, name) :
"""looks up the value of name in the namespace named by handle
If name isn't found, looks in the parentns and keeps looking....
params: a handle and an identifier
returns: the first value labelled by name in the chain of namespaces
"""
if isHandle(handle):
ns = deref(handle)
if not isinstance(ns, dict):
crash("handle does not name a namespace")
else :
if name in ns :
ans = ns[name]
else : # name not in the most local ns, so look in parent:
ans = lookupNS(ns["parentns"], name)
else :
crash("invalid handle: " + name + " not found")
return ans
def storeNS(handle, name, value) :
"""stores name:value into the namespace saved at heap[handle]
"""
if isHandle(handle):
ns = deref(handle)
if not isinstance(ns, dict):
crash("handle does not name a namespace")
else :
if name in ns :
crash("cannot redefine a bound name in the current scope")
else :
ns[name] = value
#########################################################################
# See the end of program for the driver function, interpretPTREE
def evalETREE(etree, ns) :
"""evalETREE computes the meaning of an expression operator tree.
ETREE ::= ATOM | ["nil"] | ["cons", ETREE, ETREE] | ["head", ETREE]
| ["tail", ETREE] | ["ifnil", ETREE1, ETREE2, ETREE3]
| ["let", DTREE, ETREE] | ["ref", ID]
post: updates the heap as needed and returns the etree's value,
"""
def getConspair(h):
"""dereferences handle h and returns the conspair object stored
in the heap at h
"""
if isHandle(h):
ob = deref(h)
if isinstance(ob, tuple): # a pair object ?
ans = ob
else :
crash("value is not a cons pair")
else :
crash("value is not a handle")
return ans
ans = "error"
if isinstance(etree, str) : # ATOM
ans = etree
else :
op = etree[0]
if op == "nil" :
ans = op
elif op == "cons" :
arg1 = evalETREE(etree[1], ns)
arg2 = evalETREE(etree[2], ns)
ans = allocate((arg1,arg2)) # store new conspair in heap
elif op == "head" :
arg = evalETREE(etree[1], ns)
ans, tail = getConspair(arg)
elif op == "tail" :
arg = evalETREE(etree[1], ns)
head, ans = getConspair(arg)
elif op == "ifnil" :
arg1 = evalETREE(etree[1], ns)
if arg1 == "nil" :
ans = evalETREE(etree[2], ns)
else :
ans = evalETREE(etree[3], ns)
elif op == "let" :
newns = evalDTREE(etree[1], ns) # make new ns of new definitions
ans = evalETREE(etree[2], newns) # use new ns to eval ETREE
# at this point, newns isn't used any more, so forget it!
elif op == "ref" :
ans = lookupNS(ns, etree[1])
else : crash("invalid expression form")
return ans
def evalDTREE(dtree, ns) :
"""computes the meaning of a new definition and stores
the ID, meaning binding in a new namespace, chained to ns
DTREE ::= [ID, ETREE]
returns a handle to the new namespace
"""
name = dtree[0]
expr = dtree[1]
value = evalETREE(expr, ns)
newns = allocate({name:value, "parentns": ns})
return newns
###########################
def crash(message) :
"""pre: message is a string
post: message is printed and interpreter stopped
"""
print message + "! crash! ns=", ns, "heap=", heap
raise Exception # stops the interpreter
def prettyprint(value):
if isHandle(value):
v = deref(value)
if isinstance(v, tuple):
ans = "(cons " + prettyprint(v[0]) + " " + prettyprint(v[1]) +")"
else :
ans = "HANDLE TO DICTIONARY AT " + v
else :
ans = value
return ans
# MAIN FUNCTION ###########################################################
def evalPGM(tree) :
"""interprets a complete program
pre: tree is an ETREE
post: final values are deposited in heap
"""
global heap, heap_count
# initialize heap and ns:
heap = {}
heap_count = 0
ans = evalETREE(tree, "nil")
print "final answer =", ans
print "pretty printed answer =", prettyprint(ans)
print "final heap ="
print heap
raw_input("Press Enter key to terminate")
===================================================
Install this code and run it with Python.
Try at least these test cases:
evalPGM( ["cons", "abc", ["nil"]] )
evalPGM( ["head", ["cons", "abc", ["nil"]]] )
evalPGM( ["ifnil", ["nil"], ["head", ["cons", "abc", ["nil"]]], "def"] )
evalPGM( ["let", ["x", "abc"], ["cons", ["ref", "x"], ["nil"]]] )
evalPGM( ["let", ["x", ["cons", "abc", ["nil"]]],
["cons", ["ref", "x"], ["ref", "x"]]])
evalPGM( ["let", ["x", ["cons", "abc", ["nil"]]],
["let", ["y", "nil"], ["cons", ["ref", "x"], ["ref", "y"]]] ])
Study the heap layouts as well as the answers computed for each case.
let val x = ["abc", "de"] val y = cons "f" x in head (let val z = [x, y] in tail z)This computes x == ["abc", "de"], y == ["f", "abc", "de"], and z == [["abc", "de"], ["f", "abc", "de"]] to obtain the final answer, "f".
Here is the storage layout that our revised interpreter will
compute for this example:
Now, ML-lists are saved as heap objects, as well as namespaces
and cons cells. (Cons cells are retained because they provide
a quick, simple way of attaching a new element to the front of
a list object already stored in the heap --- see handle γ
in the picture, which represents the list, ["f", "abc", "de"].
The language's syntax is extended like this:
===================================================
E ::= A | nil | [ E* ]
where E* means zero or more E phrases, separated by commas
| cons E1 E2 | head E | tail E
| ifnil E1 then E2 else E3 | let D in E end | I
D ::= val I = E | D1 D2
===================================================
Now we can write lists like [v0, ..., vm] and
blocks like let val I1 = E1 ... val Im = En in E.
Here is the modified interpreter. Its heap stores namespaces,
cons cells, and lists. Its code is almost identical
to the one in the previous subsection but is reproduced in its entirely
so that you can read the revised comments at the top and
you do not have to cut-and-paste code fragments to assemble it.
(Look for the changes to evalETREE, evalDTREE, and prettyprint.)
===================================================
"""Interpreter for functional language with cons, lists, and
multi-definition let.
Here is the syntax of operator trees interpreted:
ETREE ::= ATOM | ["nil"] | ["cons", ETREE, ETREE] | ["head", ETREE]
| ["tail", ETREE] | ["list", ETREE*]
| ["ifnil", ETREE1, ETREE2, ETREE3]
| ["let", DTREE, ETREE] | ["ref", ID]
DTREE ::= [ [ID, ETREE]+ }
ATOM ::= a string of letters
ID ::= a string of letters, not including keywords
"""
### HEAP:
heap = {}
heap_count = 0 # how many objects stored in the heap
"""The program's heap --- a dictionary that maps handles
to namespaces or cons-cells or multicell lists.
heap : { (HANDLE : NAMESPACE or CONSPAIR or LIST)+ }
where
HANDLE = a string of digits
NAMESPACE = {IDENTIFIER : DENOTABLE} + {"parentns" : HANDLE or "nil"}
that is, each namespace must have a "parentns" field
CONSPAIR = (DENOTABLE, DENOTABLE)
LIST = [ DENOTABLE* ]
DENOTABLE = HANDLE or ATOM or "nil"
ATOM = string of letters
Example:
heap = { "0": {"x":"ab", "y":"2", "z":"1", "parentns": "nil"},
"1": ["ab", "ab"],
"2": ("cd", "1"),
"3": {"w": "4", "parentns": "0"},
"4": ["ab","2","1"]
}
heap_count = 5
This example heap might have been constructed by this expression:
let x = "ab"
z = [x, x]
y = cons "cd" z
in ...
let w = [x,y,z] in ...
"""
### ASSOCIATED MAINTENANCE FUNCTIONS FOR THE heap:
def isHandle(v) :
"""checks if v is a legal handle into the heap"""
return isinstance(v, str) and v.isdigit() and int(v) < heap_count
def allocate(value) :
"""places value into the heap with a new handle
param: value - a namespace or a pair
returns the handle of the newly saved value
"""
global heap_count
newloc = str(heap_count)
heap[newloc] = value
heap_count = heap_count + 1
return newloc
def deref(handle):
"""returns heap[handle]"""
return heap[handle]
### MAINTENANCE FUNCTIONS FOR NAMESPACES:
def lookupNS(handle, name) :
"""looks up the value of name in the namespace named by handle
If name isn't found, looks in the parentns
params: a handle and an identifier
returns: the first value labelled by name in the chain of namespaces
"""
if isHandle(handle):
ns = deref(handle)
if not isinstance(ns, dict):
crash("handle does not name a namespace")
else :
if name in ns :
ans = ns[name]
else : # name not in the most local ns, so look in parent:
ans = lookupNS(ns["parentns"], name)
else :
crash("invalid handle: " + name + " not found")
return ans
def storeNS(handle, name, value) :
"""stores name:value into the namespace at heap[handle]
"""
if isHandle(handle):
ns = deref(handle)
if not isinstance(ns, dict):
crash("handle does not name a namespace")
else :
if name in ns :
crash("cannot redefine a bound name in the current scope")
else :
ns[name] = value
##########################################################################
### Here is the syntax of operator trees interpreted by this interpreter:
"""
ETREE ::= ATOM | ["nil"] | ["cons", ETREE, ETREE] | ["head", ETREE]
| ["tail", ETREE] | ["list", ETREE*]
| ["ifnil", ETREE1, ETREE2, ETREE3]
| ["let", DTREE, ETREE] | ["ref", ID]
DTREE ::= [ [ID, ETREE]+ }
ATOM ::= a string of letters
ID ::= a string of letters, not including keywords
"""
#########################################################################
# See the end of program for the driver function, interpretPTREE
def evalETREE(etree, ns) :
"""evalETREE computes the meaning of an expression operator tree.
ETREE ::= ATOM | ["nil"] | ["cons", ETREE, ETREE] | ["head", ETREE]
| ["tail", ETREE] | ["list" ETREE*]
| ["ifnil", ETREE1, ETREE2, ETREE3]
| ["let", DTREE, ETREE] | ["ref", ID]
post: updates the heap as needed and returns the etree's value,
"""
def getCons(whichpart, h):
"""dereferences handle h and returns the conspair component
in the heap at h
parms: whichpart - is "head" or "tail"
h - handle to the conspair/list to be opened
returns: the value of whichpart of the conspair/list at h
"""
if isHandle(h):
ob = deref(h)
if isinstance(ob, tuple): # a pair object ?
if whichpart == "head" :
ans = ob[0]
elif whichpart == "tail" :
ans = ob[1]
elif isinstance(ob,list) and ob != [] : # a nonempty list object ?
# break the list into its head,tail pair:
if whichpart == "head" :
ans = ob[0]
elif whichpart == "tail" :
ans = allocate(ob[1:]) # must allocate new ob for tail )-:
else :
crash("value is not a cons pair")
else :
crash("value is not a handle")
return ans
# end function getCons
ans = "error"
if isinstance(etree, str) : # ATOM
ans = etree
else :
op = etree[0]
if op == "nil" :
ans = op
elif op == "cons" :
arg1 = evalETREE(etree[1], ns)
arg2 = evalETREE(etree[2], ns)
ans = allocate((arg1,arg2)) # store new conspair in heap
elif op == "head" :
arg = evalETREE(etree[1], ns)
ans = getCons("head", arg)
elif op == "tail" :
arg = evalETREE(etree[1], ns)
ans = getCons("tail", arg)
elif op == "list" : # ["list", ETREE* ]
args = etree[1:]
newlist = []
for e in args: # eval each of the list elements and collect them
newlist.append(evalETREE(e, ns))
ans = allocate(newlist) # store the new list into the heap
elif op == "ifnil" :
arg1 = evalETREE(etree[1], ns)
if arg1 == "nil" or (isHandle(arg1) and deref(arg1) == []) :
ans = evalETREE(etree[2], ns)
else :
ans = evalETREE(etree[3], ns)
elif op == "let" :
newns = evalDTREE(etree[1], ns) # make new ns of new definitions
ans = evalETREE(etree[2], newns) # use new ns to eval ETREE
# at this point, newns isn't used any more, so forget it!
elif op == "ref" :
ans = lookupNS(ns, etree[1])
else : crash("invalid expression form")
return ans
def evalDTREE(dtree, ns) :
"""computes the meaning of a sequence of new definitions and stores
the ID, meaning bindings in a new namespace
DTREE ::= [ [ID, ETREE]+ ]
returns a handle to the new namespace
"""
newns = allocate({"parentns": ns}) # create the new ns in the heap
for bindingpair in dtree : # add all the new bindings to the new ns
name = bindingpair[0]
expr = bindingpair[1]
value = evalETREE(expr, newns)
storeNS(newns, name, value)
return newns
###########################
def crash(message) :
"""pre: message is a string
post: message is printed and interpreter stopped
"""
print message + "! crash! ns=", ns, "heap=", heap
raise Exception # stops the interpreter
def prettyprint(value):
if isHandle(value):
v = deref(value)
if isinstance(v, tuple):
subans1 = prettyprint(v[0])
subans2 = prettyprint(v[1])
if subans2[0] == "[" :
if subans2 == "[]" :
ans = "[" + subans1 + "]"
else :
ans = "[" + subans1 + ", " + subans2[1:]
else :
ans = "(cons " + subans1 + " " + subans2 + ")"
elif isinstance(v, list):
ans = "["
for arg in v :
ans = ans + prettyprint(arg) + ", "
if ans[len(ans)-1] == " " :
ans = ans[:len(ans)-2]
ans = ans + "]"
else :
ans = "HANDLE TO DICTIONARY AT " + v
elif value == "nil":
ans = "[]"
else : ans = value
return ans
# MAIN FUNCTION ###########################################################
def evalPGM(tree) :
"""interprets a complete program
pre: tree is an ETREE
post: final values are deposited in heap
"""
global heap, heap_count
# initialize heap and ns:
heap = {}
heap_count = 0
ans = evalETREE(tree, "nil")
print "final answer =", ans
print "pretty printed answer =", prettyprint(ans)
print "final heap =", heap
raw_input("Press Enter key to terminate")
===================================================
Install the interpreter and test it on these examples:
evalPGM( ["list", "abc", "de"])
evalPGM( ["cons", "abc", ["list", "abc", "de"] ] )
evalPGM( ["tail", ["tail", ["list", "abc", "de"]] ])
evalPGM( ["let", [["x", ["list", "abc", "de"]]],
["head", ["cons", ["ref", "x"], ["ref", "x"]]]] )
evalPGM( ["let", [["x", ["cons", "abc", ["list"]]],
["y", "nil"]],
["cons", ["ref", "x"], ["ref", "y"]]] )
evalPGM( ["let", [["x", ["list", "abc", ["nil"]]]],
["y", ["ref", "x"]]],
["cons", ["ref", "x"], ["ref", "y"]]] )
Definitions can be parameterized;
the result are functions:
===================================================
E ::= ... | [ E* ] | let D in E | I | I(E*)
D ::= val I = E | fun I1(I2*) = E | D1 D2
===================================================
where E* means zero or more expressions, separated by commas
and I* means zero or more identifiers, separated by commas.
When a function is defined, its code (expression)
is saved. When the function is called, its arguments
bind to its parameters, and the function's code evaluates.
Here is a small example:
===================================================
let fun second(list) = let val rest = tail list
in head rest
in
second(["a", "b", "c"])
===================================================
The argument binds
to the parameter name when the function is called.
The rewriting might go like this:
(We have a better way to do it in a moment!)
===================================================
second(["a","b","c"])
== let val list = ["a","b","c"] in # bind the argument to the parameter
let val rest = tail list
in head rest
== let val rest = tail ["a","b","c"] in # subsititue for list
head rest
== let val rest = ["b","c"]
in head rest
== head ["b","c"] == "b"
===================================================
We will have much more to say about functions in a moment.
The previous example calculation did not follow precisely the style we used so far. In particular, we should substitute the code for second into the position where second is referenced. Let's see where this leads us.
The definition,
fun second(list) = let val rest = tail list
in head rest
can be written as an ordinary val-definition like this, by moving
the parameter name to the right of the equals sign:
val second = (list) let val rest = tail list
in head rest
It is a tradition to place the word, lambda, in front of the
parameter name, (list), so that the reader can identify it clearly:
===================================================
val second = lambda list : let val rest = tail list
in head rest
===================================================
second is the same function, just formatted a little differently.
Now it is clear that second is the name
of the function code,
lambda list : let val rest = tail list in head rest.
Both the parameter name and the body are the
function code.
Let's rework the algebraic calculation of the example in this format:
===================================================
let val second = lambda list: let val rest = tail list in head rest
in second(["a","b","c"])
===================================================
We begin by substituting for second:
===================================================
(lambda list: let val rest = tail list in head rest)(["a","b","c"])
== let val rest = tail ["a","b","c"] in head rest # substitute ["a","b","c"] for list
== let val rest = ["b","c"] in head rest
== head ["b","c"] == "b"
===================================================
The expression, lambda list: ..., is the function code, divorced
from its name. We copied the function code into the position where
the function is referenced --- called. This is how substitution is
supposed to work: you replace an identifier by the value it names.
The binding of argument to parameter is
defined by this equational rule:
===================================================
(lambda PARAM: BODY)(ARG) == [ARG/PARAM]BODY
===================================================
That is, (lambda PARAM: BODY)(ARG) == let PARAM = ARG in BODY.
Now, everything works smoothly.
The syntax we use for functions and function call can look like this:
===================================================
E ::= ... | let D in E | I | E1(E2*) | lambda I*: E
D ::= val I = E | D1 D2
===================================================
A function body, lambda I*: E, is an expression,
just like nil or cons(E1,E2), because it can appear embedded
within an expression:
cons "a" (lambda x: [x,x])(nil)
== cons "a" [nil, nil] == ["a", nil, nil]
The nameless function is
is called the lambda abstraction --- it is a kind of
abstract, a naming device, for the parameter.
The lambda abstraction has a long, rich history, extending to the
debates in 19th-century philosophy that let to the development
of modern set theory and predicate logic. They also happen to be
quite useful for computation!
Now the language's characteristic domains go as follows: the expressible values are atoms, lists, and functions on expressibles. The denotable values are exactly the expressibles. There are still no storable (updateable) values.
As before a closure object is a pair, consisting of function-code-plus-parent-namespace-pointer. A couple of pictures will make this clear.
For this sample program:
===================================================
let val second = lambda list: head (tail list) #(i)
in let val x = ["a","b","c"]
in cons second(x) x # (ii)
===================================================
The heap layout at point (i), where second is called and computes its answer, "b",
looks like this:
second's value, saved in namespace α, is the closure object
at handle β. The closure remembers the code for the function
along with a link to the namespace of global names that can be
seen by the function.
When second was called, namespace ε was created to hold its
parameter, list, and ε becomes the current namespace.
Note that
x, saved in namespace γ, is not visible
to second's code.
Once second returns its answer, "b" (the head of the sublist at
handle φ), we reach point (ii), and
the current namespace reverts to γ, which lets us compute the
final answer, the list named by handle &eta (that is, ["b","a","b","c"]):
Notice that the objects at ε and φ are no longer accessible
to the main program, and a garbage-collector program can safely erase
the ``garbage objects.''
A function can restart itself by looking up its own definition; this is called recursion. (You can see this in the previous picture: the code for second can refer to itself if it wishes.) A recursive call executes exactly the same way as any other function call; no new machinery is needed.
By using recursion, we can write repetitive computations in the functional
language.
Here is an example that uses an input() command to read a sequence
of values from the user and assemble them into a list,
in reverse order:
===================================================
let fun makeList(ans) = # ans binds to a list argument
let val v = input() # input() asks the user to type a value
in ifnil v then ans
else makeList(cons v ans) # cons the input value to ans
# and repeat the function
in makeList(nil) # start the recursions with an empty list
===================================================
An equational calculation of
the example proceeds like this. (Say that the user will enter input
values "a", "b", and nil to the program. This will construct
the list, ["b","a"].)
===================================================
makeList(nil)
== let val ans = [] in # bind argument to parameter name
let val v = input()
in ifnil v then ans
else makeList(cons v ans)
== let val v = input() in
ifnil v then []
else makeList(cons v [])
# the user types input "a" :
== let v = "a" in
ifnil v then []
else makeList(cons v [])
== ifnil "a" then []
else makeList(cons "a" [])
== makeList(cons "a" [])
== makeList(["a"]) == let val ans = ["a"] in
let val v = input()
in ifnil v then ans
else makeList(cons v ans)
== let val v = input()
in ifnil v then ["a"]
else makeList(cons v ["a"])
# the user types input "b":
== let val v = "b"
in ifnil v then ["a"]
else makeList(cons v ["a"])
== makeList(cons "b" ["a"])
# restart, again:
== makeList(["b","a"]) == let val ans = ["b","a"] in
let val v = input()
in ifnil v then ans
else makeList(cons v ans)
== let val v = input()
in ifnil v then ["b","a"]
else makeList(cons v ["b","a"])
# user types nil:
== let val v == nil
in ifnil v then ["b","a"]
else ...
== ["b","a"]
== ["b","a"]
== ["b","a"]
===================================================
Each time makeList appears in the calculation, we substitute its
code (actually, its lambda abstraction --- see the previous section!)
for it.
Here is another example, a function that checks if value v
is found within a list,
list. If yes, then v is returned as the answer, else
nil is returned:
===================================================
fun member(v, list) = ifnil list then nil
else if v == head(list)
then v
else member(v, tail list)
===================================================
The function restarts to repeatly search inside all the tails of
the list until it finds v (or fails when there are no more
tails left to search).
Here is a function that glues (appends) two lists together:
===================================================
fun append(list1, list2) = ifnil list1
then list2
else let val rest = append(tail list1, list2)
in cons (head list1) rest
===================================================
append uses the trick of extracting all the head elements of all the
tails within list1 to build the answer. This is a crucial technique
to understand, so here is an example:
===================================================
let val x = ["a","b"]
in append(x, ["c"])
== append(["a","b"], ["c"])
== let val list1 = ["a","b"]
val list2 = ["c"] in
ifnil list1 then ... else ...
== let val rest = append(["b"],["c"]) == let val list1 = ["b"]
in cons("a", rest) val list2 = ["c"] in
ifnil list1 then ... else ...
== let val rest = append([],["c"]) == let val list1 = []
in cons "b" rest
val list2 = ["c"] in
ifnil list1 then ...
== ["c"]
== let val rest = ["c"]
in cons "b" rest
== cons "b" ["c"]
== ["b", "c"]
== let val rest = ["b","c"]
in cons "a" rest
== cons "a" ["b","c"]
== ["a", "b", "c"]
===================================================
This example starts function append three times to move one by one the elements
in the first argument list into the second argument list.
Here is a function that uses append to build a new list that reverses the order of elements
in an argument list:
===================================================
fun reverse(list) = ifnil list
then nil
else let val rest = reverse(tail list)
in append(rest, [head list])
===================================================
A simpler coding of reverse uses an extra parameter (an ``accumulator parameter'') to
collect the eventual answer:
===================================================
let fun reverse2(x, ans) = ifnil x
then ans
else reverse2(tail x, cons (head x) ans)
in
fun reverse(list) = reverse2(list, nil)
===================================================
You may find these little list games artificial and even a bit frustrating, but all of mathematics is based upon them! The reason is that a number, like 4, is coded as a list, ["a", "a", "a", "a"]. Function append defines addition on base-one numbers, and you should code functions that do multiplication and subtraction. If you have more enthusiasm, use lists to encode base-two numbers, where 4 is represented by the list, ["one", "zero", "zero"]. You should code the standard arithmetic operations for base-two numbers and design a CPU and list-base memory for an emulated computer.
In this section, we add data typing to the language's syntax and see how it encourages us to define new types of data structure also.
Here is the syntax of types of the core language:
===================================================
T: Type
T ::= Atom | T List
===================================================
We will require that every expressible value must conform to
a specific type.
For example,
"a" has type Atom
cons("a", nil), which evaluates to ["a"], has type Atom List
cons(nil, cons("a", nil), cons("a", cons("b", nil))),
which evaluates to [[], ["a"], ["a","b"]] has type (Atom List)List
These requirements can be formalized by logic-rule typing laws, defined
on the operators of the language:
===================================================
A : Atom E1 : T E2 : T List
------------------------
cons(E1, E2) : T List
E : T List E : T List
----------- ---------------
head(E) : T tail(T) : T List
E1 : T List E2 : T' E3 : T'
---------------------------------
ifnil E1 then E2 else E3 : T'
===================================================
A question remains: What is the type of nil?
The answer is, T' List, for any type T' whatsoever.
So, nil has ``many types,'' depending on where it is inserted
in an expression. (See the earlier examples.)
For this reason, its ``typing rule'' goes:
===================================================
nil : T List
===================================================
where you can fill in T as you wish.
If we have types, we can
can have ``type abstracts,'' where we invent names for types
we define. This idea was used brilliantly by Rod Burstall in
the language, Hope, and adapted by Luca Cardelli into ML.
Here is an example of a type abstract that defines
a data type of binary trees that hold atoms at their nodes and leaves:
===================================================
type Tree = leaf of Atom | node of Atom * Tree * Tree
===================================================
The identifiers, leaf and node, are constructors,
for constructing trees,
like cons is a constructor for constructing lists.
Here are some expressions that have data type Tree:
===================================================
leaf("a"), which represents a leaf tree, "a"
node("b", leaf("a"), leaf("c")), which represents the node tree, "b"
/ \
"a" "c"
let val tree1 = leaf("a")
tree2 = node("b", tree1, tree1)
in node("c", tree1, tree2) which represents "c"
/ \
"a" "b"
/ \
"a" "a"
or, for that matter, represents "c"
/ \
+ "b"
| / \
| | |
"a"<--<-+ where leaf("a") is shared
===================================================
Because the type is defined in terms of itself (that is, trees can hold
other, smaller trees), it is called inductively defined.
This means we can assemble trees of arbitrary depth, just like
we can construct lists of arbitrary length.
Indeed, the lists already built into the core language
are inductively defined, somewhat like this:
type List = nil | cons of Expressible * List
where Expressible = Atom + List
User-defined types define schemas for data-structure building,
much like classes do for object-oriented programming. Here are some
types for a library:
type Item = book of Atom * Atom # book(title,author)
| dvd of Atom # dvd(title)
type Holding = key of Atom * Item # key(idnumber, item)
type Database = Holding List
A major strength of functional languages is the ability to
construct complex inductively defined data structures and process them.
If you review the chapter on grammars, interpreters, and parsers,
you see inductively-defined data types everywhere.
=================================================== type Tree = leaf of Atom | node of Atom * Tree * Tree ===================================================We might write a function that collects into a list the atoms in a Tree:
=================================================== fun collect(tree) = if isinstance(tree, leaf(x)) then [x] # x binds to the atom held within leaf(x) else if isinstance(tree, node(x, y, z)) then let val left = collect(y) # x, y, z bind to the components of val right = collect(z) # the node in cons x append(left, right) ===================================================The isinstance operator tries to match its first argument to the pattern in its second argument, and if successful, binds the names in the pattern to the components within the first argument. But there is a prettier way to write this, by merging isinstance with the parameter name, tree, like this:
=================================================== fun collect(leaf(x)) = [x] | collect(node(x,y,z)) = cons x append(collect(y), collect(z)) ===================================================There is a tremendous amount of computation embedded within these two lines of coding! Here is an example:
=================================================== collect(node("a", leaf("b"), leaf("c"))) == let val x = "a" val y = leaf("b") val z = leaf("c") in # these three bindings are made by the pattern match cons x append(collect(y), collect(z)) == cons "a", append( collect(leaf("b")), == let val x = "b" in [x] == ["b"] collect(leaf("c")) == let x = "c" in [x] == ["c"] ) == cons("a", append( ["b"] ["c"] ) == ... == ["b","c"] ) == cons("a", ["b","c"]) == ["a", "b", "c"] ===================================================
In the Chapter on BNF, we saw a grammar for arithmetic operator trees
and we saw a translator that mapped operator trees into postfix
representation. Here is what this all looks like when written with a inductively defined
type:
===================================================
type Optree = numeral of Atom # e.g., numeral("3")
| expr of Atom * Optree * Optree # e.g., expr("+", tree1, tree2)
fun postfix(numeral(n)) = [n]
| postfix(expr(operator, arg1, arg2) =
let val ans1 = postfix(arg1)
val ans2 = postfix(arg2) in
append(ans1, append(ans2, [operator]))
===================================================
For example,
the operator tree
for 2 + (3 * 4) would be
tree0 = expr("+", numeral("2"), expr("*", numeral("3"), numeral("4"))),
so
postfix(tree0) computes to the postfix translation,
["2", "3", "4", "*", "+"].
Our use of Python lists as operator trees is merely an approximation of an inductively
defined type, Optree!
The computation-as-algebra concept fails with imperative languages,
because identifiers denote variables whose
stored values change from
line to line. For example, if we have an imperative
program, like this one,
int x;
int y;
x = 0;
if ... {
x = x + 1 }
else {
x = 2
}
y = x;
it makes absolutely no sense to use x = 0 to
``substitute'' 0 for ``all occurrences''
of x in the rest of the program.
Instead, we must execute the program with an instruction
counter and primary storage --- there is no way to understand
the program without the storage it manipulates, and
there is no way to ``calculate''
the program's meaning by substitution.
In summary, imperative languages work with storage structures that are incrementally updated, for example, a data base that is shared by many programs over a period of time. Or, a graphics system that maintains the storage structure that represents the pixels on a display.
In contrast, functional languages solve self-contained problems that assemble inputs into a data structure and then process the data structure into an answer. An example is a compiler, which translates an input program into a tree data structure and then processes the tree into output code. Or, a batch payroll system that converts a file of payroll information into a file of paycheks. Or, a library of numerical functions that compute answers to questions in physics or biology.
When you face a problem that can be solved by equational-style calculation, solve it with a functional language.