Chapter 7:
The functional paradigm: From an arithmetic core to ML

7.1 Arithmetic and algebra
7.2 A core language for list structures
    7.2.1 Control structures
7.3 Expression abstracts
7.4 Interpreter architecture for functional languages
7.5 Expression parameters
    7.5.1 Lambda abstractions
    7.5.2 Adding closures to the interpreter
7.6 Recursively defined functions
    7.6.1 Application: Maintaining a table and undoing its updates
    7.6.2 Inductively defined datatypes
    7.6.3 Map, filter, reduce
7.7 Conclusion

There is a paradigm of programming that dispenses with traditional assignment. As a result, computation in this paradigm, the functional paradigm, looks somewhat like algebra, where one does equals-for-equals substitution to compute the answer that is bound (once and forever) to a name.

The reason one might take seriously this approach is because many many errors arise due to updating a cell in the wrong order or due to updating a shared cell or due to a race condition. The problem has appeared recently in multi-core processors, where there are huge problems with synchronizing all the processors' caches with primary storage.

Examples of functional programming languages are Lisp, Scheme, ML, Ocaml, and Haskell. We study their principles in this chapter.

The rewriting machine: A new computational model

Imperative languages were designed for von Neuman-style machines, where part of memory holds the program and part holds a data table whose cells are repeatedly updated: +-----+ | CPU | (controller) +-----+ | V +---------------+-+-+- ... +-+-+ | program codes | | | | | | +---------------+-+-+- ... +-+-+ (data table saved in cells)
The program updates the table's cells over and over until the instructions are finished. The von Neumann machine is based on a theoretical model known as the Universal Turing machine, which looks much like the picture above.

But not all computations are table driven. Think about arithmetic! Here is a "program" --- (3 * (4 + 5) --- and its computation:

(3 * (4 + 5)) =  (3 * 9) =  27

Here, computation rewrites the program until it can be rewritten no more. There are no ``cells'' to update.

The rewriting rules for arithmetic can be stated in algebra-equation style, using a formalism called a Post system. If we represent (nonnegative) numbers in Base 1 format (e.g., 0 is 0, 1 is s0, 2 is ss0, 3 is sss0, etc.), then only four equations are needed to define addition and multiplication:

(0 + N) =  N
(sM + N) =  (M + sN)

(0 * N) =  0
(sM * N) =  (N + (M * N))

M and N are algebra-style variables, e.g., the second rule lets us compute (s0 + ss0) = (0 + sss0) (where M matches 0 and N matches ss0), then the first rule lets us compute (0 + sss0) = sss0. That is, we computed (1 + 2) equals 3.

As an exercise, compute (2 * 3), that is (ss0 * sss0):

(ss0 * sss0) =  (sss0 + (s0 * sss0))  
             =  (sss0 + (sss0 + (0 * sss0)))
             =  (sss0 + (sss0 + 0))
             =  (sss0 + (ss0 + s0)) 
             =  (sss0 + (s0 + ss0)) =  (sss0 + (0 + sss0)) =  (sss0 + sss0)
             =  . . .  =  ssssss0

A computer based on this approach might look like this: +----------------------------------+ | hard-wired equations for * and + | (in real life, an ALU holds this wiring) +----------------------------------+ | V +------------------------+ | arithmetic expression | +------------------------+
The machine repeatly scans the arithmetic expression, searching for a phrase that can be rewritten by one of the equations for * and +. There is no instruction counter, nor data cells. The arithmetic-expression part is best represented as an operator tree, which is easy to traverse, match, and rewrite:
At the end, unneeded subtrees remain as garbage, but these can be erased by a garbage collector. This layout will lead to a beautiful solution to storage sharing.

Based on the arithmetic example, you can readily imagine a ``universal rewriting machine,'' which operates with user-defined equations and an expression:

  +-------------------------+
  | equation-rewrite engine |
  +-------------------------+
      |            |
      V            V
+------------+-----------------+
| equations  | expression tree |
+------------+-----------------+

The machine repeatedly matches equations in the ''equations'' part to subtrees in the ``expression'' part and does rewriting. There is no sequential code, no instruction counter, no data cells --- only equations and a tree that is constantly reconfigured. This is a different paradigm than the Turing/von Neuman machine, but it is equally computationally powerful.

There is an even more exotic version of rewriting, called the lambda calculus, where there is only one operator, λ, and programs and data are coded just with it and algebra variables. Here are some codings:

0:   (λs(λz z))
1:   (λs(λz (s z)))
2:   (λs(λz (s (s z))))
     and so on

M + N :  ((M plusOne) N) 
      where   plusOne :   (λn(λs(λz (s ((n s) z)))))

M * N :  ((M +) N)

There is just one rewrite equation for λ: ((λx M) N) = [N/x]M where [N/x]M is the replacement of all (free) occurrences of x by N in M
The lambda-calculus is equally computationally powerful to Post systems and Turing/von Neuman machines. We will examine it later in the chapter.

Now we are ready to learn the functional programming paradigm, which does computing as program rewriting. As shown above, the rewriting can be implemented as a machine that traverses and constructs operator trees.

7.1 Arithmetic and algebra

The first programming language you ever learned, arithmetic, is a functional language --- its expressibles are the nonnegative ints, it has no denotables, and it has no storables (updatables). Here's its syntax:

===================================================

E: Expression
N: Numeral

E ::=  N  |  E1 + E2  |  E1 - E2  |  E1 * E2 
N ::=  sequences of digits

===================================================

Programs in this language are expressions, and computation does equals-for-equals substitution of answers for subexpressions. For the program, (2 * (4 - 3)) + 5, here is its computation: (2 * (4 - 3)) + 5 = (2 * 1) + 5 = 2 + 5 = 7
We used equational laws, like 4 - 3 = 1 and 2 * 1 = 2, on the operators and numerals to compute the answer. This program's output (answer) is 7, because there are no subexpressions left to compute.

When you learned algebra, you learned that expressions can be named. The language of algebra is arithmetic extended with expression abstracts:

===================================================

P: AlgebraProgram
D: Definition
E: Expression

P ::=  given D solve E
D ::=  I = E  |  D1 and D2
E ::=  N  |  E1 + E2 | E1 * E2 | I

===================================================

The computation laws for algebra are interesting and important. Here are the ones you learn in abstract algebra: ((A + B) + C) = (A + (B + C)) (i) (A + B) = (B + A) (ii) (A + 0) = A (iii) ((A * B) * C) = (A * (B * C)) (iv) (A * B) = (B * A) (v) (A * 1) = A (vii) (A * (B + C)) = ((A * B) + (A * C)) (viii)
These laws can be applied in any order to a program. Here is an example program: given y = 2 * x and z = y + 1 solve 2 * z
We apply the equations to 2 * z to compute a phrase that has the same meaning as the starting program but has a more direct representation: 2 * z = 2 * (y + 1) # substitute for z = (2 * y) + (2 * 1) # (viii) = (2 * y) + 2 # 2 * 1 = 2 = (2 * (2 * x)) + 2 # substitute for y = ((2 * 2) * x) + 2 # (iv) = (4 * x) + 2 # 2 * 2 = 4
Since we do not know the value of x, we stop here.

7.2 A core language for list structures

Algebra is for playing ``number games.'' Many computing problems are ``data-structure games,'' where we assemble and disassemble data structures like stacks, queues, trees, and tables. Rather than using storage locations to hold the elements that we extract and insert, we can calculate directly on the entire data structure. This is the motivation for the first functional programming language, Lisp.

Here is a core language that resembles core Lisp or core ML; the expressibles are atoms and lists. For the moment, there are no denotables and no storables:

===================================================

E: Expression
A: Atom (words)

E ::=  A  |  nil  | E1 :: E2  |  hd E  |  tl E |  ifnil E1 then E2 else E3
A ::= strings of letters

===================================================

An atom is a string, like "abc".
nil is a list with nothing in it. In Python and ML, nil is written []. (We will follow a classic Lisp convention and treat nil and "nil" as the same value(!) If this sounds wierd to you, don't worry about it....)
E1 :: E2 places the value denoted by E1 on the front of the list denoted by E2:

"a" :: nil is a list of one element. (We can write this as ["a"] in ML.)
"b" :: ("a" :: nil) evaluates to a two-element list, where "b" is first and "a" second. (We can write this as ["b", "a"] in ML. (:: is a kind of ``stack push.'')
In Lisp, :: is written cons, e.g., (cons "b" (cons "a" nil)). For this reason, we read :: as "cons".
hd E (read as "head E") returns the front element of list, E, e.g., hd ("b" :: ("a" :: nil)) computes to "b". (It is like ``stack top.'')
tl E (read as "tail E") returns a new list missing the former front element, e.g., tl ("b" :: ("a" :: nil)) is "a" :: nil. (It is like ``stack pop.'')
ifnil E1 then E2 else E3 chooses to compute E2 or E3 depending on whether E1 computes to an empty list.
(In ML, you would say, if E1 = nil then E2 else E3, and in general, if B then E2 else E3, where B is a boolean-typed expression.)

What is amazing is that all of the above explanations boil down to these four equations, which define the language's semantics completely:

===================================================


hd (E1 :: E2)  =  E1                        (i)

tl (E1 :: E2)  =  E2                        (ii)

ifnil nil then E1 else E2  =  E1            (iii)

ifnil (E1 :: E2) then E3 else E4  =  E4     (iv)

===================================================

(Note: some people add this equation, so that the conditional can test on atoms as well as lists:
ifnil A then E1 else E2 = E2, if atom A is not nil.)

We have an ``arithmetic'' for lists, based on these equations. Here's one small example, a list that mixes atoms and lists:

"a" :: (tl (hd (("b" :: nil) :: ("c" :: nil))))

= "a" :: (tl ("b" :: nil))      # rule (i)

= "a" :: nil                    # rule (ii)

In the above, the start list is ("b" :: nil) :: ("c" :: nil), which we would write as [["b"], "c"] in ML. We then extract its head, ["b"] and then its tail, [], to which we cons "a", giving the answer, "["a"].

Here's another example:

ifnil ("a" :: nil)
  then  nil
  else  hd (tl ("b" :: ("c" :: nil)))

= hd (tl ("b" :: ("c" :: nil)))         # rule (iv)

= hd ("c" :: nil) =  "c"                # rules (ii) and (i)

The order in which we use the equations does not matter since there is no assignment command or sequencing control structure. The previous example could be worked like this: ifnil ("a" :: nil) then nil else hd (tl ("b" :: ("c" :: nil))) = ifnil ("a" :: nil) then nil else hd ("c" :: nil) = ifnil ("a" :: nil) then nil else "c" = "c"
We get the same answer with our equations, regardless of the order we apply them. (This is called the confluence or Noetherian property of a rewriting system.)

So far, our language doesn't look like much, but since we can build lists that mix atoms and other lists, we can easily model trees, tables, dictionaries, and indeed all the important data structures of computer science. And with one key extension (parameterized expression abstracts that can call themselves), we will achieve a language that has the same computing power as all known programming languages.

This next example shows we can embed a conditional expression inside an expression:

"a" :: (ifnil nil then nil else ("b" :: nil))   =   "a" :: nil

Finally, we can use :: to glue a list to an atom or an atom to an atom, e.g., "a" :: "b" and nil :: "a". These data structures are called dotted pairs, and they are like Python pairs.

7.2.1 Control structures

When we compute upon arithmetic expressions, we don't think of control structures. This is because control structures are primarily used to sequence the updates to a file or storage. But the previous examples showed how an if-structure lets us choose which list expression to compute.

The interactive version of ML uses a weak sequential control: After an expression is computed to its answer, the answer is saved in a temporary variable, named it. The expression that evaluates next can reference it, and once that expression finishes, its answer becomes the new value of it. Here is an example of a sequence of three expressions to simpify, separated by semicolons:

"a" :: nil;  hd it;  it :: (it :: nil)

In ML, this compound expression computes to ["a","a"], because the first expression, "a" :: nil, computes to ["a"], so the next expression, hd it, computes to "a" (because it names ["a"], and the last expression computes to ["a","a"] (because it names "a").

7.3 Expression abstracts

Let's give names to expressions. These are expression abstracts.

===================================================

D: Definition
E: Expression
A: Atom
I: Identifier

E ::=  A  |  nil  |  E1 :: E2  |  hd E  |  tl E 
    |  ifnil E1 then E2 else E3  |  let D in E end  |  I

D ::=  val I = E 

===================================================

val I = E binds identifier I to expression E. Now, whenever we mention I, it means E and can be substituted by E, equals for equals. I is not a location in storage, it is a constant value, set just once. (Java lets you declare a ``final variable'' that is initialized and fixed forever, e.g., final double pi = 3.14.159.)

We add one new rewriting equation to our semantics:

===================================================

let val I = E1 in E2 end  =  [ E1 / I ] E2

===================================================

where [ E1 / I ] E2 means that we substitute the phrase E1 for all (free) occurrences of name I within phrase E2.

Here is an example:

let val x = "a" :: nil  in
    let val y = "b" :: x  in
        x :: (tl y) end
end

This can compute as follows:

===================================================

let val x = "a" :: nil  in
    let val y = "b" :: x  in
    x :: (tl y) end  end

= let val y = "b" :: ("a" :: nil) in
    ("a" :: nil) :: (tl y) end  

= ("a" :: nil) :: (tl ("b" :: ("a" :: nil)))

= ("a" :: nil) :: ("a" :: nil)

===================================================

The answer displays as the list, [["a"], "a"], in ML.

As usual, the example can be worked in another order:

===================================================

let val x = "a" :: nil  in
    let val y = "b" :: x  in
    x :: (tl y) end  end

= let val x = "a" :: nil  in
      x :: (tl ("b" :: x) end

= let val x = "a" :: nil  in
      x :: x              end

= ("a" :: nil) :: ("a" :: nil)

===================================================

Sequencing does not matter when there are no assignments.

let definitions can be embedded, like this:

tl (let x = nil in x :: (let y = "a" in y :: x end) end)

which computes to

===================================================

= tl (let x = nil in x :: ("a" :: x) end)

= tl (nil :: ("a" :: nil))

= "a" :: nil

===================================================

Here is a more delicate example, where we redefine x in nested blocks. (This is similar to writing a procedure that contains a local variable of the same name as a global variable.)

===================================================

let val x = "a"  in
    let val y = x :: nil  in
        let val x = nil  in
           y :: x
        end
    end
end

===================================================

If we substitute carelessly, we get into trouble! Say we substitute for y first and appear to get let val x = "a" in let val x = nil in (x :: nil) :: x end end
This is incorrect --- it confuses the two definitions of x and there is no way we will obtain the correct answer, ("a" :: nil) :: nil.

The example displays a famous problem that dogged 19th-century logicians. The solution is, when we substitute for an identifier, we never allow a clash of two definitions --- we must rename the inner definition, if necessary. In the earlier example, if we substitute for y first, then we must eliminate the clash between the two defined xs by renaming the inner x:

===================================================

let val x = "a"  in
    let val y = x :: nil  in
        let val x' = nil  in
           y :: x'
        end
    end
end

===================================================

Now the substitution proceeds with no problem.

We can define substitutation and renaming precisely with equations. The equations cover all possible cases of substitution and only the last 3 equations are interesting:

===================================================

let val I = E1 in E2 end  =  [ E1 / I ] E2

where

[ E0 / I ] A  =  A           # atoms are left alone

[ E0 / I ] nil  =  nil       # so is nil


[ E0 / I ] E1 :: E2  =  [ E0 / I ]E1 :: [ E0 / I ]E2    # substitute into both parts

[ E0 / I ] hd E  =  hd [ E0 / I ]E    # substitute into the subexpression...

[ E0 / I ] tl E  =  tl [ E0 / I ]E

[ E0 / I ] ifnil E1 then E2 else E3  =
    ifnil  [ E0 / I ]E1  then  [ E0 / I ]E2  else  [ E0 / I ]E3

[ E0 / I ] let D in E end  =  let  [ E0 / I ]D  in  [ E0 / I ]E  end


[ E0 / I ] I  =  E0     # replace  I

[ E0 / I ] I' =  I'     if  I' not= I   # don't alter a var different than I


[ E0 / I ] val I' = E   =   val I' = [ E0 / I ]E   if  I not= I' and  I' does not appear in  E0
       # that is, if there is no name clash


[ E0 / I ] val I = E   =   val I = E    # do nothing because  I is redefined here


[ E0 / I ] val I' = E   =   val I'' = [ E0 / I ][ I'' / I' ]E  
       if  I not= I' and  I' appears in E0  and  I''  is a new name that does not appear in either  E0  or  E
            # if there is a name clash, rename  I'  to some new  I''

===================================================

The last equation makes clear that a name clash is repaired by inserting an extra substitution to replace the name that clashes.

A machine based on equation rewriting can apply the above substitution laws without problem. Still, the equations are defining an expression-tree traversal algorithm that it is easy to implement.

7.4 Interpreter architecture for functional languages

We now implement the functional language. Here is a great exercise:

Exercise

Build a program that repeatedly applies these five equations:

===================================================

ifnil nil then E1 else E2  =  E1

ifnil (E1 :: E2) then E3 else E4  =  E4

hd (E1 :: E2)  =  E1

tl (E1 :: E2)  =  E2

let val I = E1 in E2 end =  [ E1 / I ] E2

===================================================

to an operator tree in the functional language until an answer appears. You have built the tree-rewriting machine shown in the Introduction section! Such a machine would represent the program as an operator tree and would repeatedly search the tree, top down, to find a subtree that matched on an equation. For example, the program let val x = "a" :: nil in ifnil x then nil else (hd x) :: x end
would compute like this:
Since this is a data structures language, computation is moving links between substructures. This makes substitution (e.g., let val x = ... in ...) a matter of moving links and not copying code. Here, the answer tree is "a" :: ("a" :: nil). This is the approach used to implement the Haskell language.

There is another approach --- Recall from Chapter 1 that function interpret traversed an expression tree and computed its meaning by computing and combining the meanings of its subtrees. In picture form, interpret computed like this:

This traversal technique was first invented for computing Lisp programs, and we now use it here.

Our interpreter reads the programmer as an operator tree and uses heap storage. When the interpreter computes the meaning of a subtree, it constructs the meaning in the heap from two-celled objects, called cons cells. Each let subtree constructs a namespace. Once an object is constructed in heap storage, it is never altered. This allows massive, natural, sharing of data structures and eliminates sequencing, sharing, and race errors.

Here is how the earlier program computes:

Now, interpret "a" :: nil creates a cons cell in the heap:
The cell's handle binds to x in a new namespace for the let, and the namespace is used to interpret the meaning of ifnil:
Select the appropriate arm:
and compute the meaning of (hd x) :: x, which looks up x twice, does a head operation, and builds a cons cell to represent the final answer:
The answer, handle γ, is printed as "a" :: ("a" :: nil). Throughout the computation, α is used in multiple places to represent the structure, "a" :: nil. The sharing is safe because α's cons cell is never updated after it is first constructed.

Here is the syntax of operator trees; it closely matches the source syntax:

===================================================

ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DTREE, ETREE]  |  ["ref", ID]

DLIST ::=  [  [ID, ETREE]+ ]
        that is, a list of one or more  [ID, ETREE]  pairs

ATOM  ::=  a string of letters
ID    ::=  a string of letters, not including keywords

===================================================

For example, the expression, let val x = "abc" :: nil val y = ifnil x then nil else hd x in y :: x end
has this operator tree: ["let", [["x", ["cons", "abc", ["nil"]]], ["y", ["ifnil", ["ref", "x"], ["nil"], ["head", ["ref", "x"]] ]]], ["cons", ["ref", "x"], ["ref", "y"]]]

Here is the code for the interpreter:

===================================================

"""Interpreter for functional language with  cons  and  simple let.

Here is the syntax of operator trees interpreted:

ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DLIST, ETREE]  |  ["ref", ID]

DLIST ::=  [  [ID, ETREE]+ ]
        that is, a list of one or more  [ID, ETREE]  pairs

ATOM  ::=  a string of letters
ID    ::=  a string of letters, not including keywords

"""

### HEAP:

heap = {}
heap_count = 0  # how many objects stored in the heap

"""The program's heap --- a dictionary that maps handles
                          to namespace or cons-pair objects.

      heap : { (HANDLE : NAMESPACE or CONSPAIR)+ }
        where 
         HANDLE = a string of digits
         NAMESPACE = {IDENTIFIER : DENOTABLE} + {"parentns" : HANDLE or "nil"} 
            that is, each namespace must have a "parentns" field
         CONSPAIR = (DENOTABLE, DENOTABLE)
         DENOTABLE = HANDLE or ATOM or "nil"
         ATOM = string of letters

   Example:
     heap = { "0": {"w": "nil", "x": "ab", "parentns": "nil"},
              "1": {"z":"2", "parentns":"0"},
              "2": ("ab","nil"), 
              "3": {"y": "4",  "parentns":"1"},
              "4": ("2","2")
            }
     heap_count = 5
        is an example heap, where handles "0","1","3"  name  namespaces
        which hold definitions for  w,x,z,y,  due to  let  expressions.
        Handles "2" and "4" name cons-cells that are constructed due to
        a use of  cons.
        This example heap might have been constructed by this expression:
         let val w = nil 
             val x = "ab" in
         let z = x :: w in
         let y = z :: z in
          ...

       The values computed are  w = [],  x = "ab",  z = ["ab"], y = [["ab"], "ab"]
"""

### ASSOCIATED MAINTENANCE FUNCTIONS FOR THE  heap:

def isHandle(v) :
    """checks if  v  is a legal handle into the heap"""
    return isinstance(v, str) and v.isdigit() and  int(v) < heap_count


def allocate(value) :
    """places  value  into the heap with a new handle
       param:  value - a namespace or a pair
       returns the handle of the newly saved value
    """
    global heap_count
    newloc = str(heap_count)
    heap[newloc] =  value
    heap_count = heap_count + 1
    return newloc


def deref(handle):
    """ looks up a value stored in the heap: returns  heap[handle]"""
    return heap[handle]


### MAINTENANCE FUNCTIONS FOR NAMESPACES:


def lookupNS(handle, name) :
    """looks up the value of  name  in the namespace named by  handle
       If  name  isn't found, looks in the  parentns and keeps looking....
       params: a handle and an identifier 
       returns: the first value labelled by  name  in the chain of namespaces 
    """
    if isHandle(handle):
        ns = deref(handle)
        if not isinstance(ns, dict):
            crash("handle does not name a namespace")
        else :
            if name in ns :
                ans = ns[name]
            else :   # name not in the most local ns, so look in parent:
                ans = lookupNS(ns["parentns"], name)   
    else :
        crash("invalid handle: " + name + " not found")
    return ans


def storeNS(handle, name, value) :
    """stores  name:value  into the namespace saved at  heap[handle]
    """
    if isHandle(handle):
        ns = deref(handle)
        if not isinstance(ns, dict):
            crash("handle does not name a namespace")
        else :
            if name in ns :
                crash("cannot redefine a bound name in the current scope")
            else : 
                ns[name] = value


#########################################################################

# See the end of program for the driver function,  interpretPTREE


def evalETREE(etree, ns) :
    """evalETREE computes the meaning of an expression operator tree.

      ETREE ::=  ATOM  |  ["nil"]  |  ["cons", ETREE, ETREE] |  ["head", ETREE] 
        |  ["tail", ETREE]  |  ["ifnil", ETREE1, ETREE2, ETREE3]
        |  ["let", DLIST, ETREE]  |  ["ref", ID]

       post: updates the heap as needed and returns the  etree's value,
    """

    def getConspair(h):
        """dereferences handle  h  and returns the conspair object stored
           in the heap at  h
        """
        if isHandle(h):
            ob = deref(h)
            if isinstance(ob, tuple):  # a pair object ?
                ans = ob
            else :
                crash("value is not a cons pair")
        else :
            crash("value is not a handle")
        return ans


    ans = "error"
    if isinstance(etree, str) :  #  ATOM
        ans = etree 
    else :
        op = etree[0]
        if op == "nil" :
            ans = op

        elif op == "cons" :
            arg1 = evalETREE(etree[1], ns)
            arg2 = evalETREE(etree[2], ns)
            ans = allocate((arg1,arg2))   # store new conspair in heap

        elif op == "head" :
            arg = evalETREE(etree[1], ns)
            ans, tail = getConspair(arg)
 
        elif op == "tail" :
            arg = evalETREE(etree[1], ns)
            head, ans = getConspair(arg)

        elif op == "ifnil" :
            arg1 = evalETREE(etree[1], ns)
            if arg1 == "nil" :
                ans = evalETREE(etree[2], ns)
            else :
                ans = evalETREE(etree[3], ns)

        elif op == "let" :
            newns = evalDLIST(etree[1], ns)  # make new ns of new definitions
            ans = evalETREE(etree[2], newns)  # use new ns to eval ETREE
            # at this point,  newns  isn't used any more, so forget it!

        elif op == "ref" :
            ans = lookupNS(ns, etree[1])

        else :  crash("invalid expression form")
    return ans


def evalDLIST(dtree, ns) :
    """computes the meaning of a sequence of new definitions and stores
       the  ID, meaning  bindings in a new namespace
           DTREE ::=  [ [ID, ETREE]+ ]
       returns a handle to the new namespace
    """
    newns = allocate({"parentns": ns})  # create the new ns in the heap
    for bindingpair in dtree :  # add all the new bindings to the new ns
        name = bindingpair[0]
        expr = bindingpair[1]
        value = evalETREE(expr, newns)
        storeNS(newns, name, value)
    return newns


###########################

def crash(message) :
    """pre: message is a string
       post: message is printed and interpreter stopped
    """
    print message + "! crash! ns=", ns, "heap=", heap
    raise Exception   # stops the interpreter


def prettyprint(value):
    if isHandle(value):
        v = deref(value)
        if isinstance(v, tuple):
           ans = "(cons " + prettyprint(v[0]) + " " + prettyprint(v[1]) +")"
        else :
           ans = "HANDLE TO DICTIONARY AT " + v
    else :
        ans = value
    return ans
        

# MAIN FUNCTION ###########################################################

def evalPGM(tree) :
    """interprets a complete program 
       pre: tree is an ETREE
       post: final values are deposited in heap
    """
    global heap, heap_count
    # initialize heap and ns:
    heap = {}
    heap_count = 0

    ans = evalETREE(tree, "nil")
    print "final answer =", ans
    print "pretty printed answer =", prettyprint(ans)
    print "final heap ="
    print  heap

    raw_input("Press Enter key to terminate")


===================================================

Install the above code and run it with Python. Try at least these test cases: evalPGM( ["cons", "abc", ["nil"]] ) evalPGM( ["head", ["cons", "abc", ["nil"]]] ) evalPGM( ["ifnil", ["nil"], ["head", ["cons", "abc", ["nil"]]], "def"] ) evalPGM( ["let", [["x", "abc"]], ["cons", ["ref", "x"], ["nil"]]] ) evalPGM( ["let", [["x", ["cons", "abc", ["nil"]]]], ["cons", ["ref", "x"], ["ref", "x"]]]) evalPGM( ["let", [["x", ["cons", "abc", ["nil"]]], ["y", "nil"]], ["cons", ["ref", "x"], ["ref", "y"]]] )
Study the heap layouts as well as the answers computed for each case.

One important point: there is no activation stack in the interpreter. Instead, each interpretTREE function is parameterized on the handle of the namespace it should use for doing variable lookups. This technique accomplishes the same work of an activation stack without all the coding overhead --- the stack is ``threaded'' through the sequence of function calls.

7.5 Expression parameters

Definitions can be parameterized; the result are functions:

===================================================

E ::=  ... |  [ E* ]  |  let D in E end  |  I  |  I(E*)
D ::=  val I = E  |  fun I1(I2*) = E  |  D1 D2

===================================================

where E* means zero or more expressions, separated by commas and I* means zero or more identifiers, separated by commas.

When a function is defined, its code (expression) is saved. When the function is called, its arguments bind to its parameters, and the function's code evaluates. Here is a small example:

===================================================

let fun second(list) = let val rest = tl list
                       in  hd rest
in
second(tl("a" :: ("b" :: ( "c" :: nil))))

===================================================

The argument binds to the parameter name when the function is called. The rewriting might go like this: (We have a better way to do it in a moment!)

===================================================

second(tl("a" :: ("b" :: ( "c" :: nil))))

= let val list = tl("a" :: ("b" :: ( "c" :: nil))) in   # bind the arg to the param
  let val rest = tl list in
  hd rest

= let val rest = tl( tl("a" :: ("b" :: ( "c" :: nil)))) in
  hd rest

= hd ( tl( tl("a" :: ("b" :: ( "c" :: nil)))))

= ... = "c"

===================================================

Since there are no assignments, it is unimportant when we compute the value named by the parameter.

7.5.1 Lambda abstractions

The previous example did not implement the function call in precisely the style we have used so far. In particular, we should substitute the code for second into the position where second is referenced. Let's see where this leads us.

The definition,

fun second(list) = let val rest = tl list
                   in  hd rest  end

can be written as an ordinary val-definition like this, by moving the parameter name to the right of the equals sign: val second = (list) let val rest = tl list in hd rest end
It is a tradition to place the word, lambda, in front of the parameter name, (list), so that the reader can identify it clearly:

===================================================

val second = lambda list : let val rest = tl list
                           in  hd rest end

===================================================

second is the same function, just formatted a little differently. Now it is clear that second is the name of the function code, lambda list : let val rest = tl list in hd rest end. This construction is called a lambda abstraction or an anonymous function.

(Important: in ML, the lambda abstraction is coded, fn list => let val rest = tl list in hd rest end.)

The lambda expression comes with this semantic equation, which is a variant of the one we use for let:

===================================================

(lambda I : E1) E2  =  [ E2 / I ] E1

===================================================

Let's rework the previous example using the lambda abstraction:

let val second = lambda list: let val rest = tl list in hd rest end
    in  second(tl("a" :: ("b" :: ( "c" :: nil))))
end

We substitute for second and then bind the argument to the parameter: = (lambda list: let val rest = tl list in hd rest end)(tl("a" :: ("b" :: ( "c" :: nil))) = let val rest = tl (tl("a" :: ("b" :: ( "c" :: nil)))) in hd rest = hd( tl (tl("a" :: ("b" :: ( "c" :: nil)))) ) = ... = "c"
The expression, lambda list: ..., is the function code, divorced from its name. We copied the function code into the position where the function is called. This is how substitution is supposed to work: you replace an identifier by the value it names. Using the new equation, for lambda, we bind the argument to its parameter name, and everything works smoothly.

Here is the ``minimal form'' of our functional language:

===================================================

E ::=  ...  |  let D in E end  |  I  |  E1(E2*)  |  lambda I*: E
D ::=  val I = E  |  D1 D2

===================================================

A function body, lambda I*: E, is an expression, just like nil or E1 ::E2, because it can appear as part of val I = E or even anonymously within an expression, e.g., "a" :: (lambda x: (x :: x))(nil) = "a" :: (nil :: nil)
The nameless function is called ``lambda abstraction'' because it is a kind of abstract, a naming device, for the parameter. The lambda abstraction has a long, rich history, extending to the debates in 19th-century philosophy that let to the development of modern set theory and predicate logic. It also happens to be quite useful for computation!

Now the language's characteristic domains go as follows: the expressible values are atoms, lists, and functions on expressibles. The denotable values are exactly the expressibles. There are still no storable (updateable) values.

7.5.2 Adding closures to the interpreter

We easily add lambda abstractions to the interpreter in the previous section --- we use a closure object, just like we used to implement procedures in an object language.

As before, a closure object is a pair, consisting of function-code-plus-parent-namespace-pointer. A couple of pictures will make this clear. For this sample program:

===================================================

 . . .
let val second = lambda list: hd (tl list)  
in  let val x = "a" :: ("b" :: nil)
    in  second(x) :: x  
end end

===================================================

The heap layout at the point where second is called looks like this:
second's value, saved in namespace β, is the closure object at handle γ. The closure remembers the code for the function along with a link to its global names. When second is called, namespace τ is created to hold its parameter, list:
Once second returns its answer, "b" (because the tail of cons cell κ is ε, and the head of its cons cell is "b"), the current namespace reverts to δ, which lets us compute the answer, the handle to a cons cell that holds "b" and κ:
The answer is called it in ML. Here, it is the handle to a list. It is easy to recode the interpreter in the earlier section to handle this form of call. Note again that we don't require an activation stack --- we parameterize the interpretTREE functions on the handle of the current namespace used for lookups. That's it.

7.6 Recursively defined functions

A function can restart itself by looking up its own definition; this is called recursion. (You can see this in the previous picture: the code for second can refer to itself if it wishes.) A recursive call executes exactly the same way as any other function call; no new machinery is needed.

Recursion with parameters can substitute for assignment and iteration. This example,

x = 0
while x < 100 :
    print x
    x = x + 1
end

depends on destructive assignment in the loop body to work. But it is not critical to have destructive assignment once we have recursively defined functions. In ML, we can write let fun printloop(x) = if x < 100 then (print(Int.toString(x)); print("\n"); printloop(x+1)) else nil (* done -- do nothing *) in printloop(0) end
Parameter passing replaces destructive assignment. (ML has a print expression that can print a string and returns nil as its answer. There is also a ; operator that sequences one expression followed by another.)

Here is a useful ML function that reads a sequence of text lines from the keyboard and collects the lines into a list. The function reads one line and restarts itself to read more lines. It quits when it sees a "!":

===================================================

fun collectText() =
    let val txt = TextIO.inputLine TextIO.stdIn   
             in  if hd (explode txt) = #"!"     (* if head is "!" *)
                 then []                        (* then quit *)
                 else txt :: collectText()      (* else save and RESTART *)
    end

===================================================

Note: some implementations require this wordier version:

===================================================

(* collectText reads a sequence of text lines from the input terminal
   and assembles them into a list of strings.  It quits when it sees
   a "!" as the first character of a textline.  
   It returns the list of strings as its answer.
*)
fun collectText() =
    let val t = TextIO.inputLine TextIO.stdIn   (* read the line *)
    in  if isSome(t)                            (* if line is nonempty *)
        then let val txt = valOf(t)             (* then pull out its text *)
             in  if hd (explode txt) = #"!"     (* if head char is "!" *)
                 then []                        (* then quit *)
                 else txt :: collectText()      (* else save txt and RESTART *)
             end
        else []
    end;

===================================================

Try this function in ML, and you will see that it reads a sequence like hello there. 123 !
and returns the list, ["hello there.\n", "\n", "123\n"]. The function restarts itself each time to look for another line of user input.

The ML instructions for textual input are ugly, so here is the recursion pattern again, this time to build a list of the first k+1 powers of 2. For example, powers(3) computes and returns the list, [8,4,2,1]: The example shows how recursive calls assemble a data structure in stages, from ``back to front'':

===================================================

(* powers builds of list of powers of two from  2**k down to 2**0
   param:  k  the upper bound, a nonnegative int
   returns: a list, [2**k, 2**(k-1) ..downto.. 2**0]
*)
fun powers(k) =
    if k = 0
    then [1]                                 (* because 2**0 = 1 *)
    else let val answers = powers(k - 1) in 
         (* assert:  answers  holds [2**(k-1) ..downto.. 2**0]  *)
         (2 * (hd answers)) :: answers       (* cons  2**k  to  answers *)
         end
;

===================================================

Here is a sample calculation:

===================================================

powers(3)

= let val answers = powers(2) in
  in (2 * (hd answers)) :: answers 
  end

= let val answers = (let val answers = powers(1) 
                     in (2 * (hd answers)) :: answers
                     end)
  in (2 * (hd answers)) :: answers 
  end    


= let val answers = (let val answers = (let val answers = powers(0)
                                        in (2 * (hd answers)) :: answers
                                        end)
                     in (2 * (hd answers)) :: answers
                     end)
  in (2 * (hd answers)) :: answers
  end

===================================================

The recursive calls generate fresh copies of the function (in the implementation, new activation namespaces are constructed) and build on the answer from powers(0) to build the answer from powers(1), etc. To finish:

===================================================

= let val answers = (let val answers = (let val answers = [1]
                                        in (2 * (hd answers)) :: answers
                                        end)
                     in (2 * (hd answers)) :: answers
                     end)
  in (2 * (hd answers)) :: answers
  end

= let val answers = (let val answers = (2 * (hd [1])) :: [1]
                     in (2 * (hd answers)) :: answers
                     end)
  in (2 * (hd answers)) :: answers
  end

= let val answers = (let val answers = [2,1]
                     in (2 * (hd answers)) :: answers
                     end)
  in (2 * (hd answers)) :: answers
  end

= ...

= let val answers = [4,2,1]
    in (2 * (hd answers)) :: answers
  end

= [8,4,2,1]

===================================================

Perhaps we want the numbers in ascending order. We reverse a list in ML like this, again using recursion to build the answer in stages:

===================================================

(* reverse reverses the elements in a list.
   param: ns - a list, e.g, ["c","b","a"]
   returns: a list that holds the items of ns in reverse order, e.g., ["a","b","c"]
*)
fun reverse(ns) =
    if ns = []
    then  []
    else  reverse(tl ns) @ [hd ns]    (* in ML,  @  means list append *)

===================================================

So, reverse(powers(3)) computes to [1,2,4,8]. Calculate this with equations and run it on the computer.

The recursion was done with the tail of the argument list, that is, with a list that is one smaller than the original argument. In this way, we ``count down'' (disassemble) the list down to an empty one, which stops the recursions.

In ML, we can define functions on lists in an equational style, with parameter patterns, like this:

===================================================

(* reverse reverses the elements in a list.
   param: ns - a list, e.g, ["c","b","a"]
   returns: a list that holds the items of ns in reverse order, e.g., ["a","b","c"]
*)
fun reverse(nil) = []
|   reverse(n :: ns) = reverse(ns) @ [n]

===================================================

The if, hd, and tl are automatically computed by matching the structure of the argument to the two possible equations for computing the argument's answer.

Finally, people who like loops often write this variant of list reverse. The second parameter is called an accumulator, because it accumulates the answer in stages:

===================================================

(* reverseloop(ns, ans)  reverses the items in list  ns  and appends them
    to the end of  ans.
   params:  both  ns  and  ans  are lists.
   returns: a list that holds the elements of  ans  followed by
    the elements of  ns  in reverse order.
   To use the function to reverse a list,  x,  do this:  reverseloop(x, []).
*)
fun reverseloop(nil, ans) = ans
|   reverseloop(n::ns, ans) = reverseloop(ns, n :: ans)

===================================================

Using parameter patterns, we can easily write a function that searches a list for a value:

===================================================

(* member(v, xs)  searches list  xs  to see if  v  is a member in it.
   params:  v - a value;  xs - a list of values
   returns: true exactly when  v  is found in  xs
*)
fun member(v, nil) = false
|   member(v, (w::rest)) = if v = w  then true
                           else member(v, rest)

===================================================

The function searches the list from front to back.

7.6.1 Application: Maintaining a table and undoing its updates

The previous examples show that functional programs are adept at building and reconfiguring data structures. What's more, because there is no assignment, it means that whenever a data structure is built in heap storage, it stays the same, forever. At first, this seems like a drawback --- how do we maintain a table that must be repeatedly updated? But in reality the functional technique is preferred for maintaining such tables, because

An update to a table actually constructs a new cons cell that holds the update and a link to the old table.
If there is an error or a security breach, a sequence of updates are easily rolled back (undone) by reverting to the handle of the table as it looked before the updates were appended to it.

This approach is used in all text editors and data bases --- there are always cons-cell style updates until a "checkpoint" time is reached, at which time the complex, updated table is written to secondary storage, reconfigured, and rebuilt fresh in primary storage. It is also used in versioning systems (programs that maintain multiple edits of a family of files).

Here is a small example, where a user can issue update, lookup, and undo commands to a database that holds key,value pairs. Notice that the database is actually a handle to a list, assembled from cons cells, in the heap. An update adds a new cons cell to the database, and an undo resets the database's handle:

===================================================

(* A database of (key,value) pairs, modelled as a list of form,
    (k1,v1) :: ((k2,v2) :: ... :: nil))
*)

(* Auxiliary function update adds a new key,value pair to the database.
   The new pair cancels any existing pair with the same key.
   params:  key, value
   returns: the (handle to) the updated database
*)
fun update(key, value, database) = 
    let val newdatabase = (key, value) :: database
    in  newdatabase 
    end

(* Auxiliary function lookup finds the value corresponding to a key in  the database.
   params:  key, database
   returns: the value such that  (key,value)  lives in the database
*)
fun lookup(k, database) =
    if database = []
    then raise Empty   (* error --- empty database *)
    else let val (key,value) = hd database 
         in  if key = k    (* is desired value the most recent update? *)
             then value
             else lookup(k, tl database)    (* if not, look deeper... *)
         end


(* Main function  processTransaction is a "loop" that reads user requests and
   processes the database accordingly.   The requests are either:
      -- update key value
      -- lookup key
      -- undo most previous update
   Params:  database: (the handle to) the current database
            history: a list of handles to previous databases
*)
fun processTransaction(database, history) =
    (* 3 lines of ugly ML code to read one textline:  *)
    let val text = TextIO.inputLine TextIO.stdIn 
    in  if isSome(text)
        then let val request = valOf(text) in
             (* now, we decode and process the request:  *)
             ...extract command, key, value, etc. from  request...
             if command = "update"
             then let val newDatabase = update(key, value, database)
                  in  (print("update transaction\n");
                      processTransaction(newDatabase,  database :: history)
                      )
                  end
             else if command = "lookup"
                  then (print("lookup transaction\n");
                       print (lookup(key, database));
                       processTransaction(database, history)
                       )
             else if command = "undo"
                  then (print("undo transaction\n");
                       processTransaction(hd history, tl history)
                       )
             else raise Empty  (* bad command *)
             end
        else print "End of Session";
             ... code here to archive the database on disk ...
    end

===================================================

You can see that processTransaction loops, always remembering the handle to the current value of the database plus keeping a list of handles to previous versions of the database in case a rollback is necessary. The rollback step is beautifully simple: reset the handle to the current database back to the handle of the previous version of the database. This technique works because there is no assignment that might alter a value in the heap once the value is stored there!

Of course, Amazon or Google do not use simple list implementations of their databases. Instead, spelling trees ("tries") or hash tables are extended with cons-cell-style update. We look at trees and other structures in the next section.

7.6.2 Inductively defined datatypes

The language we have developed has two types of data: atoms and lists. (If we allow lambda abstractions, there is a third type --- functions.) Some functional languages (e.g., Lisp and Scheme) allow arbitrary combinations of values, e.g., lists that mix atoms and lists, and allow operations on all possible values (e.g., equality comparison of an atom versus a list).

Other languages, like ML and Haskell, use a Pascal/Java-like type system, where each value has a specific type; all the elements of a list must have the same type; and only values of the same type can be compared for equality.

Let's look at ML, which uses a type checker. Here is the syntax of types in ``core ML'':

===================================================

T: Type

T ::=  string  |  T list

===================================================

For example, "a" has type string "a" :: nil, which evaluates to ["a"], has type string list nil :: (("a" :: nil) :: ("a" :: ("b" :: nil))), which evaluates to [[], ["a"], ["a","b"]], has type (string list) list
These requirements are formalized by logic-rule typing laws, defined on the syntax of the language:

===================================================


A : string             E1 : T    E2 : T list
                     ------------------------
                         E1 :: E2 : T list

E : T list             E : T list
-----------          ---------------
hd E : T             tl T : T list


E1 : T list   E2 : T'   E3 : T'
---------------------------------
 if E1 = nil E1 then E2 else E3 : T'

===================================================

These laws are coded into the ML type checker/interpreter. When the ML interpreter examines an expression, it uses the laws to calculate the data type of the expression. It then computes the meaning of the expression. So, if you start the ML interpreter and type, - nil :: (("a" :: nil) :: ("a" :: ("b" :: nil)));
the response is val it = [[], ["a"], ["a","b"]] : (string list) list

One question remains: What is the type of nil? The answer is, T' list, for any type T' whatsoever. So, nil has ``many types,'' depending on where it is inserted in an expression. (See the earlier examples.) For this reason, its ``typing rule'' goes:

===================================================

nil : T List

===================================================

where you can fill in T as you wish. (ML would print val it = [] : 'a list, using 'a as the dummy type name.)

When you define an ML function, the type checker checks the function's code and calculates the type. The interpreter constructs the function's closure and shows the type, e.g.,

- fun double(n) = n * 2;

val double = fn : int -> int

The function's type is int -> int, that is, an int argument is required to produce an int answer.

Sometimes an function can be used with arguments of different types, e.g.,

- fun second(xs) = hd(tl xs);

val second = fn : 'a list -> 'a

The function can be used as second [1,2,3] or as second["a","b","c"] or as second( nil :: (("a" :: nil) :: ("a" :: ("b" :: nil))) ).

If we have types, we can can have ``type abstracts,'' where we give names to types. This idea was used brilliantly by Rod Burstall in the language, Hope and adapted by Luca Cardelli into the modern version of ML, now called SML ("Standard ML"). Here is a type abstract that defines a data type of binary trees that hold ints at their nodes:

===================================================

datatype IntTree = Leaf |  Node of int * IntTree * IntTree

===================================================

The names Leaf and Node are constructors, for constructing trees, just like nil and :: are constructors for lists. Here are some expressions that have data type IntTree:

===================================================

Leaf   which represents a leaf tree,   *

Node(2, Node(1, Leaf, Leaf), Node(5, Leaf, Leaf))  which represents   2
                                                                     / \
                                                                    1   5
                                                                   * * * *
let val t = Node(1, Leaf, Leaf) in
let val u = Node(5, t, t) in
in  Node(3, t, u)
end end                         which represents       3
                                                     /   \
                                                    1     5
                                                   * *   / \
                                                        1   1
                                                       * * * *
   or, for that matter, represents     3
                                     /   \ 
                               +--> 1     5
                               |   * *   / \
                               |________|__|

   because the implementation shares substructure.

===================================================

Because the type is defined in terms of itself (that is, trees can hold other, smaller trees), it is called inductively defined. This means we can assemble trees of arbitrary depth, just like we can construct lists of arbitrary length.

We can use parameter patterns defined by a datatype. Say that we use IntTree to build ordered binary trees. Here is a tree-search algorithm expressed with patterns:

===================================================

(* member(i, t)  searches ordered IntTree t for int i *)
fun member(i, Leaf) = false
|   member(i, Node(j, t1, t2)) = if i = j then true
                                 else if i < j  then member(i, left)
                                      else member(i, right)
;

val member = fn : int * IntTree -> IntTree

===================================================

and here is a function that collects the ints embedded in a tree:

===================================================

(* collect(t)  returns a list holding all the ints in  IntTree  t  *)
fun collect(Leaf) = []
|   collect(Node(i, left, right)) = collect(left) @ [i] @ collect(right)
;

val collect = fn : IntTree -> int list

===================================================

Perhaps most important, here is the function that inserts an int into an ordered tree:

===================================================

(* insert(i, t)  inserts  i  into ordered tree  t
   pre: i  is an int;   t  is an IntTree whose nodes are ordered
   post: returns an ordered IntTree containing t's values and  i
*)
fun insert(i, Leaf) = Node(i, Leaf, Leaf)
|   insert(i, Node(j, left, right)) =
             if i < j then Node(j, insert(i, left), right)
                      else Node(j, left, insert(i, right))
;

val insert = fn : int * IntTree -> IntTree

===================================================

Notice that an insertion constructs a tree with a new root node and a new "spine" along the path of insertion and with shared structure of all the parts unaffected by the insertion.

User-defined datatypes define schemas for data-structure building, much like classes do for object-oriented programming. Here are some types for a library's database, modelled as a list of entries of books and DVDs:

datatype Item =  Book of string * string   (* Book(title,author) *)
              |  Dvd of string             (* Dvd(title)         *)

datatype DBEntry = Entry of int * Item     (* Key(idnumber, item) *)

type Database = DBEntry list

Datatypes work great for defining tree-like structures that mix strings, ints, subtrees, lists, etc. If you review Chapter 1 on grammars, interpreters, and parsers, you see inductively-defined data types everywhere. For example, for this syntax of expressions:

EXPR ::=  DIGIT  |  - EXPR  |  ( EXPR + EXPR )

A parser might generate operator trees of this datatype:

===================================================

datatype ETree = Digit of char  |  Negation of Etree  |  Addition of Etree * Etree

===================================================

The function that interprets ETrees is short and sweet:

===================================================

(* interpretETree computes the int meaning of its ETree argument  *)
fun interpretETree(Digit c)  =  (ord(c) - ord(#"0"))
|   interpretETree(Negation(t))  =   -(interpretETree(t))
|   interpretETree(Addition(t1,t2))  =  interpretETree(t1) + interpretETree(t2)
;

val interpretETree = fn : ETree -> int

===================================================

Using ML-style datatypes, we can write a language's interpreter in about as many lines as the number of the language's operator-tree constructions. Compare this to the bulky interpreters we must write in Java or C or even in Python or Perl.

7.6.3 Map, filter, reduce

In an earlier chapter, we learned that data structures should possess their own custom control structures. These structures are often classified as map (apply some operation to each and every element of the data structure), filter (apply some boolean predicate to extract a subcollection of those elements that make the predicate True), and reduce (supply all the elements in the structure to an operation that ``totals up'' an answer).

It is easy to use parameter patterns to write exactly these structures for an inductively defined data type. Here are three sample instances of these control structures for binary trees of values:

===================================================

datatype 'a Tree = Leaf  |  Node of 'a * Tree * Tree

fun map(f, Leaf) =  Leaf
  | map(f, Node(a, t1, t2)) = Node(f(a), map(f, t1), map(f, t2))

fun filter(b, Leaf) = []
  | filter(b, Node(a, t1, t2)) = (if b(a) then [a] else [])
                                 @ filter(f, t1) @ filter(f, t2)

fun reduce(r, startvalue, Leaf) = startvalue
  | reduce(r, startvalue, Node(a, t1, t2)) =
              let v1 = reduce(r, startvalue, t1) in
              let v2 = reduce(r, v1, t2) in
              r(a, v2)
              end end

===================================================

Notice that f, b, and r are functions that are arguments to the control structures.

ML has built-in versions of map and reduce (two variants: foldl and foldr) for linear lists.

7.7 Conclusion

A functional language lets you compute by substituting equals-for-equals, like you learned when you learned arithmetic and algebra. This works because every identifier denotes a constant value, bound once and forever. It also means that the heap-base implementation can do massive sharing of substructures within a program.

The computation-as-algebra concept fails with imperative languages, because identifiers denote variables whose stored values change from line to line. For example, if we have an imperative program, like this one,

int x;
int y;
x = 0;
if ... {
    x = x + 1 }
else {
    x = 2
}
y = x;

it makes absolutely no sense to use x = 0 to ``substitute'' 0 for ``all occurrences'' of x in the rest of the program. Instead, we must execute the program with an instruction counter and primary storage --- there is no way to understand the program without the storage it manipulates, and there is no way to ``calculate'' the program's meaning by substitution.

In summary, imperative languages work with storage structures that are incrementally updated, for example, a voting table or a graphics system that maintains the pixels on a display.

In contrast, functional languages solve self-contained problems that assemble inputs into a data structure and then process the data structure into an answer. An example is a compiler, which translates an input program into a tree data structure and then processes the tree into output code. Or, a batch payroll system that converts a file of payroll information into a file of paycheks. Or, a library of numerical functions that compute answers to questions in physics or biology.

When you face a problem that can be solved by equational-style calculation, solve it with a functional language.

Chapter 7: The functional paradigm: From an arithmetic core to ML