Copyright © 2010 David Schmidt

Chapter 7:
The functional paradigm: From an arithmetic core to ML


7.1 Arithmetic and algebra
7.2 A core language for list structures
    7.2.1 Control structures
7.3 Expression abstracts
7.4 Interpreter architecture for functional languages
7.5 Expression parameters
    7.5.1 Lambda abstractions
    7.5.2 Adding closures to the interpreter
    7.5.3 Recursively defined functions
7.6 Inductively defined types
    7.6.1 Parameter patterns
    7.6.2 Map, filter, reduce
7.7 Conclusion


There is a paradigm of programming that dispenses with assignments and does computation with only expressible and denotable values --- answers are computed and named but never updated. This means loops and assignments are useless. Computation in this paradigm, the functional paradigm, proceeds like algebra, where one does equals-for-equals substitution to compute the answer that is bound (once and forever) to a variable name.

Examples of functional programming languages are Lisp, Scheme, ML, Ocaml, and Haskell. We study their principles in this chapter.


7.1 Arithmetic and algebra

The first programming language you ever learned, arithmetic, is a functional language --- its expressibles are the nonnegative ints, it has no denotables, and it has no storables. Here's its syntax:
===================================================

E: Expression
N: Numeral

E ::=  N  |  E1 + E2  |  E1 - E2  |  E1 * E2  |  E1 / E2
N ::=  sequences of digits

===================================================
Programs in this language are expressions, and computation does equals-for-equals substitution of answers for subexpressions, like this:
(2 * (4 - 3)) + 5  ==  (2 * 1) + 5  ==  2 + 5  == 7
You used equational laws, like 4 - 3 == 1 and 2 * 1 == 2, on expressions of numerals to compute the answer. This program's output (answer) is 7, because there are no subexpressions left to compute.

When you learned algebra, you learned that expressions can be named. The language of algebra is arithmetic extended with expression abstracts: The expressibles are ints, the denotables are the expressibles, and there are no storables:

===================================================

P: AlgebraProgram
D: Definition
E: Expression

P ::=  given D solve E
D ::=  I = E  |  D1 and D2
E ::=  N  |  E1 + E2 | ... | I

===================================================
Here is an example program in our ``algebra language'':
given
   y = 2 * x
   and
   z = y - 1
solve 
   y * z
The computation proceeds by calculation and substitution of answers for subexpressions until no more substitutions can be done:
y * z == y * (y - 1)                    # substitute for z
      == (y * y) - (1 * y)              # apply distributive law for *
      == (2 * x * 2 * x) - (1 * 2 * x)  # substitute for y
      == (2 * 2 * x * x) - (1 * 2 * x)  # apply commutative law for * 
      == (2 * 2 * x * x) - (2 * x)      # 1 * 2 == 2
      == (4 * x * x) - (2 * x)          # 2 * 2 == 4
In algebra, we apply calculation laws that apply to expressions (e.g. E1 * E2 == E2 * E1), and we calculate on unknowns (e.g., x). We also replace a name, like y, with the expression it names --- the equation, y = 2 * x, defines a new law we can use.

More generally in algebra, a Definition is any equation that contains one or more variables, e.g., 3 + y = (2 * x) + (2 + 1) is a definition. The calculation laws of algebra let us compute that y = 2 * x and also x = y / 2, as needed.

Can we make algebra ``grow up'' into a full-fledged programming language that has the same flavor?


7.2 A core language for list structures

Algebra is for playing ``number games.'' Many computing problems are ``data-structure games,'' where we assemble and disassemble data structures like stacks, queues, trees, and tables. Rather than using storage locations to hold the elements that we extract and insert, we can calculate directly on the entire data structure. This is the motivation for the first functional programming language, Lisp, a language for building lists and trees of words.

Here is a core language that resembles core Lisp or core ML; the expressibles are atoms and lists. For the moment, there are no denotables and no storables:

===================================================

E: Expression
A: Atom (words)

E ::=  A  |  nil  | E1 :: E2  |  hd E  |  tl E |  ifnil E1 then E2 else E3
A ::= strings of letters

===================================================
An atom is a string, like "abc". nil is a list with nothing in it. In Python and ML, this is []. We will occasionally use [] for nil in our calculations. (We will follow a classic Lisp convention and treat nil and "nil" as the same value(!) If this sounds wierd to you, don't worry about it....)

E1 :: E2 places the value denoted by E1 on the front of the list denoted by E2, e.g., "b" :: ("a" :: nil) evaluates to a two-element list, ["b","a"]. (You can think of :: as ``stack push.'') We will use the ["b","a"] notation in our calculations.

In Lisp, :: is written cons (e.g., cons "b" (cons "a" nil) assembles ["b","a"]). For this reason, we read :: as "cons".

hd (read as "head") returns the front element of a list, e.g., hd ("b" :: ("a" :: nil)) computes to "b". (It is like ``stack top.'')

tl (read as "tail") returns a new list missing the former front element, e.g., tl ("b" :: ("a" :: nil)) is "a" :: nil, which is the one-element list, ["a"]. (It is like ``stack pop.'')

The ifnil-expression is a control structure that chooses to compute E2 or E3 depending on whether or not E1 computes to an empty list. (Note: when E1 computes to a non-nil atom, this is not an empty list, so the else-arm is selected.) The ifnil conditional expression is not built into ML nor Lisp; we use it here to make simple examples. (In ML, you would say, if E1 = nil then E2 else E3, and in general, if B then E2 else E3, where B is a boolean-typed expression.)

This language does ``list arithmetic,'' e.g.,

ifnil ("a" :: nil)
  then  nil
  else  hd (tl ("b" :: ("c" :: nil)))

== hd (tl ("b" :: ("c" :: nil)))

== hd ("c" :: nil)  ==  "c"
The previous example used these equational laws:
===================================================

ifnil nil then E1 else E2  ==  E1

ifnil (E1 :: E2) then E3 else E4  ==  E4

hd (E1 :: E2)  ==  E1

tl (E1 :: E2)  ==  E2

===================================================
So we have an ``arithmetic'' for lists, based on these equations.

If we use ML-style [...] notation for computed list values, we can rewrite the above expression as a computer would do it:

ifnil ("a" :: nil)
  then nil
  else hd (tl ("b" :: ("c" :: nil)))

== ifnil ["a"] then nil else hd (tl ("b" :: ("c" :: nil)))

== hd (tl ("b" :: ("c" :: nil)))   # chose the else-arm

== hd (tl ("b" :: ["c"]))  # evaluates arguments first

== hd (tl ["b","c"])

== hd ["c"] == "c"
This next example shows we can embed a conditional expression inside an expression:
"a" :: (ifnil nil then nil else ("b" :: nil))

== "a" :: nil  ==  ["a"]
Our language lets us mix atoms and lists within the same list, e.g., this list
("b" :: ("c" :: nil)) :: ("a" :: (nil :: nil))
computes to the data structure, [["b","c"], "a", []].

If you try this example in ML, you will find that ML's data-typing system complains --- all the elements of an ML-list must have the same data type, and we cannot mix atoms and lists.

Finally, Lisp allows :: (that is, cons) to glue together a list to an atom or an atom to an atom, like cons "a" "b" and cons nil "a". These data structures are called dotted pairs, and you can think of them like Python tuples.

The above examples display the underlying principle of the functional paradigm: computation is expression simplification, like in algebra, with equals-for-equals substitution of subexpressions within a program.


7.2.1 Control structures

When we compute upon arithmetic expressions, we don't think of control structures. This is because control structures are primarily used to sequence the updates to a file or storage. But the previous examples showed how an if-structure lets us choose which list expression to compute, and it will be useful to control which expressions are evaluated when we process input data or input parameters.

The interactive version of ML uses a weak sequential control: After an expression is computed to its answer, the answer is saved in a temporary variable, named it. The expression that evaluates next can reference it, and once that expression finishes, its answer becomes the new value of it. Here is an example of a sequence of three expressions to simpify, separated by semicolons:

"a" :: nil;  hd it;  it :: (it :: nil)
This compound expression computes to ["a","a"], because the first expression, "a" :: nil, computes to ["a"], so the next expression, hd it, computes to "a" (because it names ["a"], and the last expression computes to ["a","a"] (because it names "a").


7.3 Expression abstracts

Let's give names to expressions. (These are expression abstracts.)
===================================================

D: Definition
E: Expression
A: Atom
I: Identifier

E ::=  A  |  nil  |  E1 :: E2  |  hd E  |  tl E 
    |  ifnil E1 then E2 else E3  |  let D in E end  |  I

D ::=  val I = E 

===================================================
val I = E binds identifier I to expression E. Now, whenever we mention I, it means E and can be substituted by E, equals for equals. I is not a location in storage, it is a constant value, set just once, just like ``final variables'' in Java, just like variables in algebra.

Here is an example:

let val x = nil  in
    let val y = "a" :: x  in
    x :: (tl y) end
end
This looks like the algebra example we saw at the start of the chapter, and it computes as expected:
let val x = nil  in
let val y = "a" :: x  in
x :: (tl y)  

== let val x = nil  in
   x :: (tl ("a" :: x))      # substitute for  y 

== nil :: (tl ("a" :: nil))  # substitute for x

==  ... == nil :: nil   #  ==  [[]]
The order that we substitute for occurrences of x and y does not matter, just like in algebra. The equational law for the let-construction is written by logicians like this:
===================================================

let val I = E1 in E2 end  ==  [E1/I]E2

===================================================
where [E1/I]E2 represents expression E2 changed so that all its occurrences of I are changed to E1. In the above example, we did this:
let val x = nil  in
let val y = "a" :: x  in
x :: (tl y)

== let val x = nil in [("a" :: x)/y](x :: (tl y))

== let val x = nil in x :: (tl ("a" :: x))

== [nil/x](x :: (tl ("a" :: x)))  ==  nil :: (tl ("a" :: nil))

== nil :: nil
but we could have done the substitutions in the other order, too:
let val x = nil  in
let val y = "a" :: x  in
x :: (tl y)

== [x/nil](let y = "a" :: x  in  x :: (tl y))

== let val y = "a" :: nil in nil :: (tl y)

== [("a" :: nil)/y](nil :: (tl y)) 

== nil :: (tl ("a" :: nil))  ==  nil :: nil

Because let definitions can be embedded anywhere, we can write complex expressions like this:

tl (let x = nil in x :: (let y = "a" in y :: x end) end)
It is difficult to see how an interpreter for the language will maintain a namespace or a stack of namespaces to remember which names are visible to which subphrases of the expression. We will study this soon. For now, we apply the algebra laws stated above and simplify the expression to its answer:
===================================================

tl (let x = nil in x :: (let y = "a" in y :: x end) end)

== tl (let x = nil in x :: ("a" :: x) end)

== tl (nil :: ("a" :: nil))

== "a" :: nil   # == ["a"]

===================================================

Here is a more delicate example, where we redefine x in nested blocks. (Perhaps we shouldn't allow this, but it is similar to writing a procedure that contains a local variable of the same name as a global variable.) When we try substitution, we get into trouble:

===================================================

let val x = "a"  in
    let val y = x :: nil  in
        let val x = nil  in
           y :: x
        end
    end
end

===================================================
If we substitute for y first, we appear to get
let val x = "a"  in
   let val x = nil  in
      (x :: nil) :: x
   end
end
This is incorrect --- it confuses the two definitions of x and there is no way we will obtain the correct answer, ("a" :: nil) :: nil.

The example displays a famous problem that dogged 19th-century logicians. The solution is, when we substitute for an identifier, we never allow a clash of two definitions --- we must rename the inner definition, if necessary. In the earlier example, if we substitute for y first, we must eliminate the clash between the two defined xs by renaming the inner one:

===================================================

let val x = "a"  in
    let val y = x :: nil  in
        let val x' = nil  in
           y :: x'
        end
    end
end

===================================================
Now the subsitution can proceed with no problem. How can a computer implementation do this without rebuilding the program while it computes on it? We consider this question next.


7.4 Interpreter architecture for functional languages

We can give an interpreter semantics to the functional language as it stands. We start from the interpreter for object languages, since lists are objects. We also build namespaces and will create a new namespace for each occurrence of a let construction.

Here is the syntax of operator trees we will use; it closely matches the source syntax:

===================================================

ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]
DTREE ::=  [ID, ETREE]
ATOM  ::=  a string of letters
ID    ::=  a string of letters, not including keywords

===================================================
For example, the expression,
let val x = "abc" :: nil  in
let val y = ifnil x then nil else hd x in
y :: x  end end
has this operator tree:
["let", ["x", ["cons", "abc", ["nil"]],
        ["let", ["y", ["ifnil", ["ref", "x"],
                                ["nil"],
                                ["head", ["ref", "x"]]]],
                ["cons", ["ref", "y"], ["ref", "x"]]]]]
The interpreter will process the expression's operator tree in the standard way, opening its subparts and computing their meanings. In the process, each occurrence of a cons operator, ::, creates a new object in the interpreter's heap --- a cons cell (``cons-pair''). After all, a list is built using nil and multiple cons operations, and this is how the list is assembled in the heap: from "nil" and cons cells. When a let definition is processing, the binding is saved in a newly created namespace, which is linked to the currently active namespace, using the "parentns" link shown in the previous chapter for modelling procedure calls and superclasses in object languages.

Here is a drawing of the heap storage for the above example when the interpreter has finished computing the final answer, "abc" :: ("abc" :: nil). The final answer is actually the handle, δ, but this is ``pretty printed'' as "abc" :: ("abc" :: nil) or as ["abc", "abc"]. (Can you see why?)


The namespace used to evaluate the phrase, y :: x, is γ, which holds the binding for y and a link to the namespace that holds the binding for x. Notice that the final answer shares the cons cell at α that is used by x. This sharing of substructure is safe because there is no assignment to alter objects already stored in the heap. This is an advantage of using functional languages over imperative languages for object management --- there is less copying of data.

Here is the code for the interpreter:

===================================================

"""Interpreter for functional language with  cons  and  simple let.

Here is the syntax of operator trees interpreted:

ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]

DTREE ::=  [ID, ETREE]

ATOM  ::=  a string of letters
ID    ::=  a string of letters, not including keywords

"""

### HEAP:

heap = {}
heap_count = 0  # how many objects stored in the heap

"""The program's heap --- a dictionary that maps handles
                          to namespace or cons-pair objects.

      heap : { (HANDLE : NAMESPACE or CONSPAIR)+ }
        where 
         HANDLE = a string of digits
         NAMESPACE = {IDENTIFIER : DENOTABLE} + {"parentns" : HANDLE or "nil"} 
            that is, each namespace must have a "parentns" field
         CONSPAIR = (DENOTABLE, DENOTABLE)
         DENOTABLE = HANDLE or ATOM or "nil"
         ATOM = string of letters

   Example:
     heap = { "0": {"w": "nil", "parentns": "nil"},
              "1": {"x":"ab", "parentns": "0"},
              "2": {"z":"3", "parentns":"1"},
              "3": ("ab","nil"), 
              "4": {"y": "5",  "parentns":"2"},
              "5": ("3","3")
            }
     heap_count = 6
        is an example heap, where handles "0","1","2","4"  name  namespaces
        which hold definitions for  w,x,z,y,  due to  let  expressions.
        Handles "3" and "5" name cons-cells that are constructed due to
        a use of  cons.
        This example heap might have been constructed by this expression:
         let w = nil  in
         let x = "ab" in
         let z = x :: w in
         let y = z :: z in
          ...

       The values stored are  w = [],  x = "ab",  z = ["ab"], y = [["ab"], "ab"]
"""

### ASSOCIATED MAINTENANCE FUNCTIONS FOR THE  heap:

def isHandle(v) :
    """checks if  v  is a legal handle into the heap"""
    return isinstance(v, str) and v.isdigit() and  int(v) < heap_count


def allocate(value) :
    """places  value  into the heap with a new handle
       param:  value - a namespace or a pair
       returns the handle of the newly saved value
    """
    global heap_count
    newloc = str(heap_count)
    heap[newloc] =  value
    heap_count = heap_count + 1
    return newloc


def deref(handle):
    """ looks up a value stored in the heap: returns  heap[handle]"""
    return heap[handle]


### MAINTENANCE FUNCTIONS FOR NAMESPACES:


def lookupNS(handle, name) :
    """looks up the value of  name  in the namespace named by  handle
       If  name  isn't found, looks in the  parentns and keeps looking....
       params: a handle and an identifier 
       returns: the first value labelled by  name  in the chain of namespaces 
    """
    if isHandle(handle):
        ns = deref(handle)
        if not isinstance(ns, dict):
            crash("handle does not name a namespace")
        else :
            if name in ns :
                ans = ns[name]
            else :   # name not in the most local ns, so look in parent:
                ans = lookupNS(ns["parentns"], name)   
    else :
        crash("invalid handle: " + name + " not found")
    return ans


def storeNS(handle, name, value) :
    """stores  name:value  into the namespace saved at  heap[handle]
    """
    if isHandle(handle):
        ns = deref(handle)
        if not isinstance(ns, dict):
            crash("handle does not name a namespace")
        else :
            if name in ns :
                crash("cannot redefine a bound name in the current scope")
            else : 
                ns[name] = value


#########################################################################

# See the end of program for the driver function,  interpretPTREE


def evalETREE(etree, ns) :
    """evalETREE computes the meaning of an expression operator tree.

      ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]

       post: updates the heap as needed and returns the  etree's value,
    """

    def getConspair(h):
        """dereferences handle  h  and returns the conspair object stored
           in the heap at  h
        """
        if isHandle(h):
            ob = deref(h)
            if isinstance(ob, tuple):  # a pair object ?
                ans = ob
            else :
                crash("value is not a cons pair")
        else :
            crash("value is not a handle")
        return ans


    ans = "error"
    if isinstance(etree, str) :  #  ATOM
        ans = etree 
    else :
        op = etree[0]
        if op == "nil" :
            ans = op

        elif op == "cons" :
            arg1 = evalETREE(etree[1], ns)
            arg2 = evalETREE(etree[2], ns)
            ans = allocate((arg1,arg2))   # store new conspair in heap

        elif op == "head" :
            arg = evalETREE(etree[1], ns)
            ans, tail = getConspair(arg)
 
        elif op == "tail" :
            arg = evalETREE(etree[1], ns)
            head, ans = getConspair(arg)

        elif op == "ifnil" :
            arg1 = evalETREE(etree[1], ns)
            if arg1 == "nil" :
                ans = evalETREE(etree[2], ns)
            else :
                ans = evalETREE(etree[3], ns)

        elif op == "let" :
            newns = evalDTREE(etree[1], ns)  # make new ns of new definitions
            ans = evalETREE(etree[2], newns)  # use new ns to eval ETREE
            # at this point,  newns  isn't used any more, so forget it!

        elif op == "ref" :
            ans = lookupNS(ns, etree[1])

        else :  crash("invalid expression form")
    return ans


def evalDTREE(dtree, ns) :
    """computes the meaning of a new definition and stores
       the  ID, meaning  binding in a new namespace, chained to  ns
           DTREE ::=  [ID, ETREE]
       returns a handle to the new namespace
    """
    name = dtree[0]
    expr = dtree[1]
    value = evalETREE(expr, ns)
    newns = allocate({name:value, "parentns": ns})
    return newns


###########################

def crash(message) :
    """pre: message is a string
       post: message is printed and interpreter stopped
    """
    print message + "! crash! ns=", ns, "heap=", heap
    raise Exception   # stops the interpreter


def prettyprint(value):
    if isHandle(value):
        v = deref(value)
        if isinstance(v, tuple):
           ans = "(cons " + prettyprint(v[0]) + " " + prettyprint(v[1]) +")"
        else :
           ans = "HANDLE TO DICTIONARY AT " + v
    else :
        ans = value
    return ans
        

# MAIN FUNCTION ###########################################################

def evalPGM(tree) :
    """interprets a complete program 
       pre: tree is an ETREE
       post: final values are deposited in heap
    """
    global heap, heap_count
    # initialize heap and ns:
    heap = {}
    heap_count = 0

    ans = evalETREE(tree, "nil")
    print "final answer =", ans
    print "pretty printed answer =", prettyprint(ans)
    print "final heap ="
    print  heap

    raw_input("Press Enter key to terminate")

===================================================
Install this code and run it with Python. Try at least these test cases:
evalPGM( ["cons", "abc", ["nil"]] )

evalPGM( ["head", ["cons", "abc", ["nil"]]] )

evalPGM( ["ifnil", ["nil"], ["head", ["cons", "abc", ["nil"]]], "def"] )

evalPGM( ["let", ["x", "abc"], ["cons", ["ref", "x"], ["nil"]]] )

evalPGM( ["let", ["x", ["cons", "abc", ["nil"]]],
         ["cons", ["ref", "x"], ["ref", "x"]]])

evalPGM( ["let", ["x", ["cons", "abc", ["nil"]]],
         ["let", ["y", "nil"], ["cons", ["ref", "x"], ["ref", "y"]]] ])
Study the heap layouts as well as the answers computed for each case.

Interpreter with ML-lists and multiple definitions

We can make the functional language look more like ML if we allow ML-style lists, that is, expressions of form, [v0, v1, ..., vm], and also let-constructions that define multiple bindings at once. Here is an example:
let val x = ["abc", "de"]
    val y = "f" :: x
in
    hd (let val z = [x, y] in tl z)
end
This computes x == ["abc", "de"], y == ["f", "abc", "de"], and z == [["abc", "de"], ["f", "abc", "de"]] to obtain the final answer, "f".

Here is the storage layout that our revised interpreter will compute for this example:


Now, entire ML-lists are saved as objects, along with namespaces and cons cells. (Cons cells are retained because they provide a quick, simple way of attaching a new element to the front of a list object already stored in the heap --- see handle γ in the picture, which represents the list, ["f", "abc", "de"].

The language's syntax is extended like this:

===================================================

E ::=  A  |  nil  |  [ E* ]
                  where  E*  means zero or more  E  phrases, separated by commas
    |  E1 :: E2  |  hd E  |  tl E 
    |  ifnil E1 then E2 else E3  |  let D in E end  |  I

D ::=  val I = E  |  D1 D2

===================================================
Now we can write lists like [v0, ..., vm] and blocks like let val I1 = E1 ... val Im = En in E.

Here is the modified interpreter. Its heap stores namespaces, cons cells, and lists. Its code is almost identical to the one in the previous subsection but is reproduced in its entirely so that you can read the revised comments at the top and you do not have to cut-and-paste code fragments to assemble it. (Look for the changes to evalETREE, evalDTREE, and prettyprint.)

===================================================

"""Interpreter for functional language with cons, lists, and
     multi-definition let.
   Here is the syntax of operator trees interpreted:

ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]   |  ["list",  ETREE*]
        |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]

DTREE ::=  [ [ID, ETREE]+ }

ATOM  ::=  a string of letters
ID    ::=  a string of letters, not including keywords

"""

### HEAP:

heap = {}
heap_count = 0  # how many objects stored in the heap

"""The program's heap --- a dictionary that maps handles
                          to namespaces or cons-cells or multicell lists.

      heap : { (HANDLE : NAMESPACE or CONSPAIR or LIST)+ }
        where 
         HANDLE = a string of digits
         NAMESPACE = {IDENTIFIER : DENOTABLE} + {"parentns" : HANDLE or "nil"} 
            that is, each namespace must have a "parentns" field
         CONSPAIR = (DENOTABLE, DENOTABLE)
         LIST = [ DENOTABLE* ]
         DENOTABLE = HANDLE or ATOM or "nil"
         ATOM = string of letters

   Example:
     heap = { "0": {"x":"ab", "y":"2", "z":"1", "parentns": "nil"},
              "1": ["ab", "ab"],
              "2": ("cd", "1"), 
              "3": {"w": "4", "parentns": "0"},
              "4": ["ab","2","1"]
            }
     heap_count = 5

     This example heap might have been constructed by this expression:
         let x = "ab"
             z = [x, x]
             y = cons "cd" z
         in ...
            let w = [x,y,z]  in ...
             
"""

### ASSOCIATED MAINTENANCE FUNCTIONS FOR THE  heap:

def isHandle(v) :
    """checks if  v  is a legal handle into the heap"""
    return isinstance(v, str) and v.isdigit() and  int(v) < heap_count


def allocate(value) :
    """places  value  into the heap with a new handle
       param:  value - a namespace or a pair
       returns the handle of the newly saved value
    """
    global heap_count
    newloc = str(heap_count)
    heap[newloc] =  value
    heap_count = heap_count + 1
    return newloc


def deref(handle):
    """returns  heap[handle]"""
    return heap[handle]


### MAINTENANCE FUNCTIONS FOR NAMESPACES:


def lookupNS(handle, name) :
    """looks up the value of  name  in the namespace named by  handle
       If  name  isn't found, looks in the  parentns
       params: a handle and an identifier 
       returns: the first value labelled by  name  in the chain of namespaces 
    """
    if isHandle(handle):
        ns = deref(handle)
        if not isinstance(ns, dict):
            crash("handle does not name a namespace")
        else :
            if name in ns :
                ans = ns[name]
            else :   # name not in the most local ns, so look in parent:
                ans = lookupNS(ns["parentns"], name)   
    else :
        crash("invalid handle: " + name + " not found")
    return ans


def storeNS(handle, name, value) :
    """stores  name:value  into the namespace at  heap[handle]
    """
    if isHandle(handle):
        ns = deref(handle)
        if not isinstance(ns, dict):
            crash("handle does not name a namespace")
        else :
            if name in ns :
                crash("cannot redefine a bound name in the current scope")
            else : 
                ns[name] = value


##########################################################################
### Here is the syntax of operator trees interpreted by this interpreter:

"""

ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]   |  ["list",  ETREE*]
        |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]

DTREE ::=  [ [ID, ETREE]+ }

ATOM  ::=  a string of letters
ID    ::=  a string of letters, not including keywords
"""

#########################################################################

# See the end of program for the driver function,  interpretPTREE


def evalETREE(etree, ns) :
    """evalETREE computes the meaning of an expression operator tree.

      ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["list" ETREE*]
        |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]

       post: updates the heap as needed and returns the  etree's value,
    """

    def getCons(whichpart, h):
        """dereferences handle  h  and returns the conspair component 
           in the heap at  h
           parms: whichpart - is "head" or "tail"
                  h - handle to the conspair/list to be opened
           returns:  the value of  whichpart  of the conspair/list at  h
        """
        if isHandle(h):
            ob = deref(h)
            if isinstance(ob, tuple):  # a pair object ?
                if whichpart == "head" :
                    ans = ob[0]
                elif whichpart == "tail" :
                    ans = ob[1]
            elif isinstance(ob,list) and  ob != [] : # a nonempty list object ?
                # break the list into its head,tail pair:
                if whichpart == "head" :
                    ans = ob[0]
                elif whichpart == "tail" :
                    ans = allocate(ob[1:])  # must allocate new ob for tail )-:
            else :
                crash("value is not a cons pair")
        else :
            crash("value is not a handle")
        return ans
    # end function getCons

    ans = "error"
    if isinstance(etree, str) :  #  ATOM
        ans = etree 
    else :
        op = etree[0]
        if op == "nil" :
            ans = op

        elif op == "cons" :
            arg1 = evalETREE(etree[1], ns)
            arg2 = evalETREE(etree[2], ns)
            ans = allocate((arg1,arg2))   # store new conspair in heap

        elif op == "head" :
            arg = evalETREE(etree[1], ns)
            ans = getCons("head", arg)
 
        elif op == "tail" :
            arg = evalETREE(etree[1], ns)
            ans  = getCons("tail", arg)

        elif op == "list" :   # ["list", ETREE* ]
            args = etree[1:]
            newlist = []
            for e in args:   # eval each of the list elements and collect them
                newlist.append(evalETREE(e, ns))
            ans = allocate(newlist)   # store the new list into the heap

        elif op == "ifnil" :
            arg1 = evalETREE(etree[1], ns)
            if arg1 == "nil" or (isHandle(arg1) and deref(arg1) == []) :
                ans = evalETREE(etree[2], ns)
            else :
                ans = evalETREE(etree[3], ns)

        elif op == "let" :
            newns = evalDTREE(etree[1], ns)  # make new ns of new definitions
            ans = evalETREE(etree[2], newns)  # use new ns to eval ETREE
            # at this point,  newns  isn't used any more, so forget it!

        elif op == "ref" :
            ans = lookupNS(ns, etree[1])

        else :  crash("invalid expression form")
    return ans



def evalDTREE(dtree, ns) :
    """computes the meaning of a sequence of new definitions and stores
       the  ID, meaning  bindings in a new namespace
           DTREE ::=  [ [ID, ETREE]+ ]
       returns a handle to the new namespace
    """
    newns = allocate({"parentns": ns})  # create the new ns in the heap
    for bindingpair in dtree :  # add all the new bindings to the new ns
        name = bindingpair[0]
        expr = bindingpair[1]
        value = evalETREE(expr, newns)
        storeNS(newns, name, value)
    return newns

###########################

def crash(message) :
    """pre: message is a string
       post: message is printed and interpreter stopped
    """
    print message + "! crash! ns=", ns, "heap=", heap
    raise Exception   # stops the interpreter


def prettyprint(value):
    if isHandle(value):
        v = deref(value)
        if isinstance(v, tuple):
           subans1 = prettyprint(v[0])
           subans2 = prettyprint(v[1])
           if subans2[0] == "[" :
               if subans2 == "[]" :
                   ans = "[" + subans1 + "]"
               else :
                   ans = "[" + subans1 + ", " + subans2[1:]
           else :
               ans = "(cons " + subans1 + " " + subans2 + ")"
        elif isinstance(v, list):
           ans = "["
           for arg in v :
               ans = ans + prettyprint(arg) + ", "
           if ans[len(ans)-1] == " " :
               ans = ans[:len(ans)-2]
           ans = ans + "]"
        else :
           ans = "HANDLE TO DICTIONARY AT " + v
    elif value == "nil":
        ans = "[]"
    else : ans = value
        
    return ans

# MAIN FUNCTION ###########################################################

def evalPGM(tree) :
    """interprets a complete program 
       pre: tree is an ETREE
       post: final values are deposited in heap
    """
    global heap, heap_count
    # initialize heap and ns:
    heap = {}
    heap_count = 0

    ans = evalETREE(tree, "nil")
    print "final answer =", ans
    print "pretty printed answer =", prettyprint(ans)
    print "final heap =", heap

    raw_input("Press Enter key to terminate")

===================================================
Install the interpreter and test it on these examples:
evalPGM( ["list", "abc", "de"])

evalPGM( ["cons", "abc", ["list", "abc", "de"] ] )

evalPGM( ["tail", ["tail", ["list", "abc", "de"]] ])

evalPGM( ["let", [["x", ["list", "abc", "de"]]],
                 ["head", ["cons", ["ref", "x"], ["ref", "x"]]]] )

evalPGM( ["let", [["x", ["cons", "abc", ["list"]]],
                  ["y", "nil"]],
                 ["cons", ["ref", "x"], ["ref", "y"]]] )

evalPGM( ["let", [["x", ["list", "abc", ["nil"]]]],
                  ["y", ["ref", "x"]]],
                 ["cons", ["ref", "x"], ["ref", "y"]]] )


7.5 Expression parameters

Definitions can be parameterized; the result are functions:

===================================================

E ::=  ... |  [ E* ]  |  let D in E end  |  I  |  I(E*)
D ::=  val I = E  |  fun I1(I2*) = E  |  D1 D2

===================================================
where E* means zero or more expressions, separated by commas and I* means zero or more identifiers, separated by commas.

When a function is defined, its code (expression) is saved. When the function is called, its arguments bind to its parameters, and the function's code evaluates. Here is a small example:

===================================================

let fun second(list) = let val rest = tl list
                       in  hd rest
in
second(["a", "b", "c"])

===================================================

The argument binds to the parameter name when the function is called. The rewriting might go like this: (We have a better way to do it in a moment!)

===================================================

second(["a","b","c"])

== let val list = ["a","b","c"] in    # bind the argument to the parameter
   let val rest = tl list
   in  hd rest

== let val rest = tl ["a","b","c"]  in   # subsititue for list
   hd rest

== let val rest = ["b","c"]
   in  hd rest

== hd ["b","c"]  == "b"

===================================================
We will have much more to say about functions in a moment.


7.5.1 Lambda abstractions

The previous example calculation did not follow precisely the style we used so far. In particular, we should substitute the code for second into the position where second is referenced. Let's see where this leads us.

The definition,

fun second(list) = let val rest = tl list
                   in  hd rest  end
can be written as an ordinary val-definition like this, by moving the parameter name to the right of the equals sign:
val second = (list) let val rest = tl list  
                    in  hd rest end
It is a tradition to place the word, lambda, in front of the parameter name, (list), so that the reader can identify it clearly:
===================================================

val second = lambda list : let val rest = tl list
                           in  hd rest end

===================================================
second is the same function, just formatted a little differently. Now it is clear that second is the name of the function code, lambda list : let val rest = tl list in hd rest end. Both the parameter name and the body are the function code. This construction is called a lambda abstraction or an anonymous function.

(Important: in ML, the lambda abstraction is coded, fn list => let val rest = tl list in hd rest end.)

Let's rework the algebraic calculation of the example with the lambda abstraction:

===================================================

let val second = lambda list: let val rest = tl list in hd rest end
in  second(["a","b","c"]) end

===================================================
We begin by substituting for second:
===================================================

(lambda list: let val rest = tl list in hd rest end)(["a","b","c"])

== let val rest = tl ["a","b","c"] in hd rest end  # substitute ["a","b","c"] for list

== let val rest = ["b","c"] in hd rest end

== hd ["b","c"]  ==  "b"

===================================================
The expression, lambda list: ..., is the function code, divorced from its name. We copied the function code into the position where the function is referenced --- called. This is how substitution is supposed to work: you replace an identifier by the value it names. The binding of argument to parameter is defined by this equational rule:
===================================================

(lambda PARAM: BODY)(ARG)  ==  [ARG/PARAM]BODY

===================================================
That is, (lambda PARAM: BODY)(ARG) == let val PARAM = ARG in BODY end. Now, everything works smoothly.

The syntax we use can even be revised into this minimal form:

===================================================

E ::=  ...  |  let D in E end  |  I  |  E1(E2*)  |  lambda I*: E
D ::=  val I = E  |  D1 D2

===================================================
A function body, lambda I*: E, is an expression, just like nil or E1 ::E2, because it can appear embedded within an expression:
"a" :: (lambda x: [x,x])(nil) 

== "a" :: [nil, nil]  == ["a", nil, nil]
The nameless function is called a lambda abstraction because it is a kind of abstract, a naming device, for the parameter. The lambda abstraction has a long, rich history, extending to the debates in 19th-century philosophy that let to the development of modern set theory and predicate logic. They also happen to be quite useful for computation!

Now the language's characteristic domains go as follows: the expressible values are atoms, lists, and functions on expressibles. The denotable values are exactly the expressibles. There are still no storable (updateable) values.


7.5.2 Adding closures to the interpreter

We can easily add lambda abstractions to the interpreters in the previous section --- we use closure objects, just like we did when we implemented procedures in an object language.

As before a closure object is a pair, consisting of function-code-plus-parent-namespace-pointer. A couple of pictures will make this clear. For this sample program:

===================================================

let val second = lambda list: hd (tl list)  #(i)
in  let val x = ["a","b","c"]
    in  second(x) :: x   # (ii)
end end

===================================================
The heap layout at point (i), where second is called and computes its answer, "b", looks like this:

second's value, saved in namespace α, is the closure object at handle β. The closure remembers the code for the function along with a link to the namespace of global names that can be seen by the function. When second was called, namespace ε was created to hold its parameter, list, and ε becomes the current namespace. Note that x, saved in namespace γ, is not visible to second's code.

Once second returns its answer, "b" (the head of the sublist at handle φ), we reach point (ii), and the current namespace reverts to γ, which lets us compute the final answer, the list named by handle &eta (that is, ["b","a","b","c"]):


Notice that the objects at ε and φ are no longer accessible to the main program, and a garbage-collector program can safely erase the ``garbage objects.''


7.5.3 Recursively defined functions

A function can restart itself by looking up its own definition; this is called recursion. (You can see this in the previous picture: the code for second can refer to itself if it wishes.) A recursive call executes exactly the same way as any other function call; no new machinery is needed.

By using recursion, we can write repetitive computations in the functional language. Here is an example that uses an input() command to read a sequence of values from the user and assemble them into a list, in reverse order:

===================================================

let fun makeList(ans) =        #  ans  binds to a list argument
        let val v = input()    #  input()  asks the user to type a value
        in  ifnil v then ans
            else makeList(v :: ans)   # cons the input value to  ans
        end                           # and repeat the function
in  makeList(nil)  end         # start the recursions with an empty list

===================================================
(Note: ML has a more complicated way of handling interactive input --- sorry!)

An equational calculation of the example proceeds like this. (Say that the user will enter input values "a", "b", and nil to the program. This will construct the list, ["b","a"].)

===================================================

makeList(nil)

== let val ans = []  in   # bind argument to parameter name
   let val v = input() 
   in  ifnil v then ans
       else makeList(v :: ans)

== let val v = input()  in
   ifnil v then []
           else  makeList(v :: [])

# the user types input "a" :

== let v = "a"  in
   ifnil v then []
           else  makeList(v :: [])

== ifnil "a" then []
             else makeList("a" :: [])

== makeList("a" :: [])


== makeList(["a"])  ==  let val ans = ["a"] in    
                        let val v = input()
                        in  ifnil v then ans
                            else makeList(v :: ans)

                    ==  let val v = input() 
                        in  ifnil v then ["a"]
                            else makeList(v :: ["a"])

                    # the user types input "b":

                    ==  let val v = "b"
                        in  ifnil v then ["a"]
                            else makeList(v :: ["a"])

                    == makeList("b" :: ["a"])

                    # restart, again:

                    == makeList(["b","a"]) == let val ans = ["b","a"] in
                                              let val v = input() 
                                              in  ifnil v then ans
                                                  else makeList(v :: ans)
                                               
                                           == let val v = input() 
                                              in  ifnil v then ["b","a"]
                                                  else makeList(v :: ["b","a"])
                                             # user types nil:

                                             == let val v == nil 
                                                in  ifnil v then ["b","a"]
                                                    else ...

                                             == ["b","a"]
                   == ["b","a"]
== ["b","a"]

===================================================
Each time makeList appears in the calculation, we substitute its code (actually, its lambda abstraction --- see the previous section!) for it.

Here is another example, a function that checks if value v is found within a list, list. If yes, then v is returned as the answer, else nil is returned:

===================================================

fun member(v, list) = ifnil list then nil
                      else if v == hd(list) 
                           then  v
                           else  member(v, tl list)

===================================================
The function restarts to repeatly search inside all the tails of the list until it finds v (or fails when there are no more tails left to search).

Here is a function that glues (appends) two lists together:

===================================================

fun append(list1, list2) = ifnil list1 
                           then list2
                           else let val rest = append(tl list1, list2)
                                in  (hd list1) :: rest end

===================================================
append uses the trick of extracting all the head elements of all the tails within list1 to build the answer. This is a crucial technique to understand, so here is an example:
===================================================

let val x = ["a","b"]
in  append(x, ["c"])

== append(["a","b"], ["c"])

== let val list1 = ["a","b"]
       val list2 = ["c"] in
   ifnil list1 then ... else ...

== let val rest = append(["b"],["c"]) == let val list1 = ["b"] 
   in  ("a" :: rest)                         val list2 = ["c"] in
                                         ifnil list1 then ... else ...

                                      == let val rest = append([],["c"]) == let val list1 = []
                                         in  "b" :: rest
                                                                                val list2 = ["c"] in
                                                                            ifnil list1 then ...
 
                                                                         == ["c"]
                                      == let val rest = ["c"]
                                         in  "b" :: rest

                                      == "b" :: ["c"]

                                      == ["b", "c"]
== let val rest = ["b","c"]
   in  "a" :: rest

== "a" :: ["b","c"]

== ["a", "b", "c"]

===================================================
This example starts function append three times to move one by one the elements in the first argument list into the second argument list.

Here is a function that uses append to build a new list that reverses the order of elements in an argument list:

===================================================

fun reverse(list) = ifnil list
                    then nil
                    else let val rest = reverse(tl list)
                         in  append(rest, [hd list])

===================================================
A simpler coding of reverse uses an extra parameter (an ``accumulator parameter'') to collect the eventual answer:
===================================================

let fun reverse2(x, ans) = ifnil x
                           then ans
                           else reverse2(tl x,  (hd x) :: ans)
in 
fun reverse(list) = reverse2(list, nil)

===================================================

You may find these little list games artificial and even a bit frustrating, but all of mathematics is based upon them! The reason is that a number, like 4, is coded as a list, ["a", "a", "a", "a"]. Function append defines addition on base-one numbers, and you should code functions that do multiplication and subtraction. If you have more enthusiasm, use lists to encode base-two numbers, where 4 is represented by the list, ["one", "zero", "zero"]. You should code the standard arithmetic operations for base-two numbers and design a CPU and list-base memory for an emulated computer.


7.6 Inductively defined types

The language we have developed has two types of data: atoms and lists. (If we allow lambda abstractions, there is a third type --- functions.) Some functional languages (e.g., Lisp and Scheme) allow arbitrary combinations of values, e.g., lists that mix atoms and lists, and allow operations on all possible values (e.g., equality comparison of an atom versus a list). Other languages, like ML, Ocaml, and Haskell, require a Pascal- or Java-like type system, where each value has a specific type, each list's elements have one and the same type, and only values of the same type can be compared for equality.

In this section, we add data typing to the language's syntax and see how it encourages us to define new types of data structure also.

Here is the syntax of types of the core language:

===================================================

T: Type

T ::=  Atom  |  T List

===================================================
We will require that every expressible value must conform to a specific type. For example,
"a"  has type  Atom

"a" :: nil,  which evaluates to  ["a"],   has type  Atom List

nil :: (("a" :: nil) :: ("a" :: ("b" :: nil))),
  which evaluates to  [[], ["a"], ["a","b"]]   has type (Atom List)List
These requirements can be formalized by logic-rule typing laws, defined on the operators of the language:
===================================================


A : Atom               E1 : T    E2 : T List
                     ------------------------
                         E1 :: E2 : T List

E : T List             E : T List
-----------          ---------------
hd E : T             tl T : T List


E1 : T List   E2 : T'   E3 : T'
---------------------------------
 ifnil E1 then E2 else E3 : T'

===================================================
A question remains: What is the type of nil? The answer is, T' List, for any type T' whatsoever. So, nil has ``many types,'' depending on where it is inserted in an expression. (See the earlier examples.) For this reason, its ``typing rule'' goes:
===================================================

nil : T List

===================================================
where you can fill in T as you wish.

If we have types, we can can have ``type abstracts,'' where we invent names for types we define. This idea was used brilliantly by Rod Burstall in the language, Hope, and adapted by Luca Cardelli into the modern version of ML, now called SML ("Standard ML"). Here is an example of a type abstract that defines a data type of binary trees that hold atoms at their nodes and leaves:

===================================================

datatype Tree = Leaf of Atom  |  Node of Atom * Tree * Tree

===================================================
The identifiers, Leaf and Node, are constructors, for constructing trees, just like :: is a constructor for constructing lists. Here are some expressions that have data type Tree:
===================================================

Leaf("a"),  which represents a leaf tree,  "a"

Node("b", Leaf("a"), Leaf("c")), which represents the node tree,  "b"
                                                                 /   \
                                                                "a"  "c"
let val tree1 = Leaf("a") 
    val tree2 = Node("b", tree1, tree1) 
in  Node("c", tree1, tree2) end,   which represents   "c"
                                                     /   \
                                                   "a"    "b"
                                                         /  \
                                                       "a"  "a"

   or, for that matter, represents    "c"
                                     /   \ 
                                    +   "b"
                                    |   /  \
                                    |   |  |
                                   "a"<--<-+    where Leaf("a") is shared


===================================================
Because the type is defined in terms of itself (that is, trees can hold other, smaller trees), it is called inductively defined. This means we can assemble trees of arbitrary depth, just like we can construct lists of arbitrary length. Indeed, the lists already built into the core language are inductively defined, somewhat like this:
datatype EList = Nil |  Cons of Expressible * EList

where  Expressible = Atom + List

User-defined types define schemas for data-structure building, much like classes do for object-oriented programming. Here are some types for a library's database:

datatype Item =  Book of Atom * Atom    # Book(title,author)
              |  Dvd of Atom            # Dvd(title)

datatype Holding = Key of Atom * Item   # Key(idnumber, item)

type Database = Holding List
A major strength of functional languages is the ability to construct complex inductively defined data structures and process them. If you review the chapter on grammars, interpreters, and parsers, you see inductively-defined data types everywhere.


7.6.1 Parameter patterns

Every data structure should have control structures that are oriented towards processing the data structure. For the functional language, the control structure is embedded as argument patterns in the functions we write. For example, for
===================================================

datatype Tree = Leaf of Atom  |  Node of Atom * Tree * Tree

===================================================
We might write a function that collects into a list the atoms in a Tree. Here is a half-ML-style, half-Python-style coding:
===================================================

fun collect(tree) =
     if isinstance(tree, Leaf(x))
        then [x]      # x binds to the atom held within  leaf(x)
     else if isinstance(tree, Node(x, y, z))
        then let val left = collect(y)      # x, y, z  bind to the components of
                 val right = collect(z)     #  the node
             in  x :: append(left, right)

===================================================
The isinstance operator tries to match its first argument to the pattern in its second argument, and if successful, binds the names in the pattern to the components within the first argument. But there is a prettier way to write this, by replacing isinstance with parameter patterns:
===================================================

fun collect(Leaf(x)) = [x]
  | collect(Node(x,y,z)) = x :: (collect(y) @ collect(z))

===================================================
(Important: in ML, the append function is built-in and is represented by @, that is, list1 @ list2 means append(list1, list2).)

There is a tremendous amount of computation embedded within these two lines of coding! Here is an example:

===================================================

collect(Node("a", Leaf("b"), Leaf("c")))

== let val x = "a" 
       val y = Leaf("b") 
       val z = Leaf("c") in    # these three bindings are made by the pattern match
   x :: (collect(y) @ collect(z))

== "a" :: (
            collect(Leaf("b")), == let val x = "b" in [x]  == ["b"]
          @ collect(Leaf("c"))  == let x = "c" in [x]  == ["c"]
          )

== "a" :: (
            ["b"]
          @ ["c"]
          )  ==  ["b","c"]

== "a" :: ["b","c"]  ==  ["a", "b", "c"]

===================================================

In the Chapter on BNF, we saw a grammar for arithmetic operator trees and we saw a translator that mapped operator trees into postfix representation. Here is what this all looks like when written with a inductively defined type and parameter patterns:

===================================================

datatype Optree = Numeral of Atom                 # e.g., Numeral("3")
                | Expr of Atom * Optree * Optree  # e.g., Expr("+", t1, t2)

fun postfix(Numeral(n)) = [n]
  | postfix(Expr(operator, arg1, arg2)) =
       let val ans1 = postfix(arg1)
           val ans2 = postfix(arg2) in
       ans1 @ (ans2 @ [operator]) end

===================================================
For example, the operator tree for 2 + (3 * 4) would be tree0 = Expr("+", Numeral("2"), Expr("*", Numeral("3"), Numeral("4"))), so postfix(tree0) computes to the postfix translation, ["2", "3", "4", "*", "+"]. Our use of Python lists as operator trees is merely an approximation of an inductively defined type, Optree!


7.6.2 Map, filter, reduce

In an earlier chapter, we learned that data structures should possess their own custom control structures. These structures are often classified as map (apply some operation to each and every element of the data structure), filter (apply some boolean predicate to extract a subcollection of those elements that make the predicate True), and reduce (supply all the elements in the structure to an operation that ``totals up'' an answer).

It is easy to use parameter patterns to write exactly these structures for an inductively defined data type. Here are three sample instances of these control structures for binary trees of atoms:

===================================================

datatype Tree = Leaf of Atom  |  Node of Atom * Tree * Tree

fun map(f, Leaf(a)) =  Leaf(f(a))
  | map(f, Node(a, t1, t2)) = Node(f(a), map(f, t1), map(f, t2))

fun filter(b, Leaf(a)) = if b(a) then [a] else []
  | filter(b, Node(a, t1, t2)) = (if b(a) then [a] else [])
                                 @ filter(f, t1) @ filter(f, t2)

fun reduce(r, startvalue, Leaf(a)) = r(a, startvalue)
  | reduce(r, startvalue, Node(a, t1, t2))
                 = reduce(r, reduce(r, r(a, startvalue), t1), t2)

===================================================
Notice that f, b, and r are functions that are arguments to the control structures.

In any case, ML has built-in versions of map and reduce (two variants: foldl and foldr) for linear lists.


7.7 Conclusion

A functional language lets you compute by substituting equals-for-equals, like you learned when you learned arithmetic and algebra. This works because every identifier denotes a constant value, bound once and forever. It also means that the heap-base implementation can do massive sharing of substructures within a program.

The computation-as-algebra concept fails with imperative languages, because identifiers denote variables whose stored values change from line to line. For example, if we have an imperative program, like this one,

int x;
int y;
x = 0;
if ... {
    x = x + 1 }
else {
    x = 2
}
y = x;
it makes absolutely no sense to use x = 0 to ``substitute'' 0 for ``all occurrences'' of x in the rest of the program. Instead, we must execute the program with an instruction counter and primary storage --- there is no way to understand the program without the storage it manipulates, and there is no way to ``calculate'' the program's meaning by substitution.

In summary, imperative languages work with storage structures that are incrementally updated, for example, a data base that is shared by many programs over a period of time. Or, a graphics system that maintains the storage structure that represents the pixels on a display.

In contrast, functional languages solve self-contained problems that assemble inputs into a data structure and then process the data structure into an answer. An example is a compiler, which translates an input program into a tree data structure and then processes the tree into output code. Or, a batch payroll system that converts a file of payroll information into a file of paycheks. Or, a library of numerical functions that compute answers to questions in physics or biology.

When you face a problem that can be solved by equational-style calculation, solve it with a functional language.