Copyright © 2010 David Schmidt

Chapter 10:
Lambda calculus


10.1 Motivation: assignment is unnecessary
10.2 Lambda Calculus made precise
    10.2.1 Name clashes and paradoxes
    10.2.2 Exercises
10.3 Encoding arithmetic in lambda calculus
10.4 While-language translated into lambda-calculus
    10.4.1 Exercises


Many modern programming techniques stem from the lambda calculus, a notational system that was invented around 1930.


10.1 Motivation: assignment is unnecessary

In Python, we can write:
x = 2
def f(y): return y + x
x = x + 1
print f(99)
The def is actually an assignment:
x = 2
f = (y): return y + x
x = x + 1
f(99)
The rhs of f = is evaluated to a closure. There is a traditional notation for function-values:
x = 2
f = lambda y: y + x
x = x + 1
f(x)
It's used in Lisp, Scheme, Python, .... (In ML, we write, fn y => y + x.) You can also write a function call with an unnamed function:
x = 2
(lambda y: y + x)(x + 1)
Integer values don't need names; why should functions? (lambda y. y + x) is called a lambda abstraction or an abstraction for short. (From here on, we use a . and not a : to separate the parameter from the body. We will also use lam as a shortcut for lambda, but most textbooks use the Greek letter, λ.)

Lisp programmers quickly discover that assignments are unnecessary, e.g.,

x = 2
f = lam y. y + x
f(x + 1)
is "the same as":
(lam x.
    (lam f.
        f(x + 1)
    )(lam y. y + x)
)2
This is because of the beta-rule for binding of argument to parameter:
===================================================

beta:  (lam x. E1)E2  ==>  [E2/x]E1

===================================================
(Recall that [E2/x]E1 denotes substitution. There is lots more to say about this concept!)

The traditional ordering of assignments is mostly an illusion. We can compute the above program inside-out or outside-in or however we desire. Let's do it inside-out:

===================================================

(lam x.
    (lam f.
        f(x + 1)
    )(lam y. y + x)
)2

==>
(lam x.
   (lam y. y + x)(x + 1)
)2

==>
(lam x.
   (x + 1) + x
)2

==>
(2 + 1) + 2

===================================================
This is a critical insight from modelling with lambda-notation: the order in which you do substitution does not matter --- the final result is always the same.
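We can mimic the computation with Python's own lambda expressions (a sketch, using Python closures as stand-ins for abstractions; the name `result` is just for illustration):

```python
# The nested expression from the text, written with Python lambdas.
# Whatever order the substitutions are done in, the answer is (2 + 1) + 2:
result = (lambda x: (lambda f: f(x + 1))(lambda y: y + x))(2)
print(result)   # prints 5
```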

By the way, the original example at the beginning of this chapter:

===================================================

x = 2
def f(y): return y + x
x = x + 1
print f(99)

===================================================
looks like this in lambda notation:
===================================================

(lam x.
   (lam f.
      (lam x. (f 99 x) 
      ) (x + 1)
   )(lam y. lam x. y + x)
)(2)

===================================================
Function f requires the "current value" of "global variable" x.

Incredibly, the above codings can even do away with booleans and ints, replacing them by lambda-notation as well:

===================================================

[true]  is  lam t. lam e. t
[false] is  lam t. lam e. e

[0] is   lam s. lam z. z
[1] is   lam s. lam z. s z
[2] is   lam s. lam z. s (s z) 
  etc.

===================================================
Why these codings are sensible will be soon seen. Addition, multiplication, all of arithmetic can be defined on the above codings; we see how later in this chapter. Even repetition can be coded in lambda calculus, defined with this helper operator:
Y  is   lam f.(lam x. f(x x))(lam x. f(x x))
We will have lots more to say about all this.


10.2 Lambda Calculus made precise

Here is the syntax of the Lambda Calculus --- there are identifiers, abstractions, and applications:
===================================================

EXPR : Expression
IDEN : Identifier

    EXPR ::=  IDEN  |  ( lam IDEN . EXPR )  |  ( EXPR1 EXPR2 )

              where IDEN is any string-based identifier, e.g.,  'x' or 'y'

===================================================
It is standard to eliminate unneeded parens by writing a nested application of form (...((E1 E2) E3) ... En) as merely (E1 E2 ... En). Another standard abbreviation is to simplify (lam I1. (lam I2. ... (lam In. E) ...)) by (lam I1. lam I2. ... lam In. E). The outermost parens are also often dropped. (But the best way to write a lambda-expression is to draw its parse tree!)

The primary rewriting rule one uses to "compute upon" a lambda-expression is the beta-rule:

===================================================

    beta:  (lam I. E1)E2  ==>  [E2/I]E1

===================================================
where [E2/x]E1 is defined precisely on the syntax of E1 as follows:
===================================================

[E/I](lam I. E1) = (lam I. E1)
[E/I](lam J. E1) = (lam J. [E/I]E1),
                                if I != J and J is not free in E
[E/I](lam J. E1) = (lam K. [E/I][K/J]E1),
                                if I != J, J is free in E, and K is fresh
[E/I](E1 E2) = ([E/I]E1 [E/I]E2)
[E/I]I = E
[E/I]J = J,  if J != I

===================================================
The notion of "free in" can be defined equally precisely:
===================================================

I is free in I
I is free in (lam J. E)  iff  I != J and I is free in E
I is free in (E1 E2)  iff  I is free in E1  or  I  is free in E2

===================================================
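The two definitions above transcribe almost clause-for-clause into Python. This is a minimal sketch, assuming a hypothetical AST of tagged tuples --- ('var', I), ('lam', I, E), ('app', E1, E2) --- which is my own representation, not notation from the text:

```python
from itertools import count

_fresh = count()   # supply of fresh identifiers for the renaming case

def free_in(i, e):
    """I is free in E, transcribed from the three clauses above."""
    if e[0] == 'var':
        return e[1] == i
    if e[0] == 'lam':
        return e[1] != i and free_in(i, e[2])
    return free_in(i, e[1]) or free_in(i, e[2])

def subst(e, i, target):
    """[E/I]target, transcribed from the six clauses above."""
    if target[0] == 'var':
        return e if target[1] == i else target
    if target[0] == 'app':
        return ('app', subst(e, i, target[1]), subst(e, i, target[2]))
    j, body = target[1], target[2]
    if j == i:                                   # (lam I. E1) binds I: stop
        return target
    if not free_in(j, e):
        return ('lam', j, subst(e, i, body))
    k = 'fresh%d' % next(_fresh)                 # rename J to avoid capture
    return ('lam', k, subst(e, i, subst(('var', k), j, body)))
```

For example, computing [y/x](lam y. x) renames the binder first, so the free y being substituted in is not captured by the lam y.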

An expression that has form, ((lam I. E1) E2), is called a beta-redex (or just redex for short). It can be simplified with the beta-rule. An expression that has no redexes is in normal form --- it is an "answer". This expression is not in normal form:

(lam x.
    (lam f.
        f x
    )(lam y. (x  y))
)a
but we can apply the beta-rule to reduce it to its normal form:
(lam x. (lam f. f x)(lam y. (x y))) a

==> (lam f. f a)(lam y. (a y))

==> (lam y. (a y)) a  ==>  (a a)
It takes a lot of work to prove, but if an expression can be reduced to a normal form, then the normal form is unique --- the order in which we apply the beta-rule to the expression does not affect the final answer. But some expressions have no normal form, which we see later.

There are two other rewriting rules but they are less important to us:

===================================================

alpha:  (lam I. E) ==>  (lam J. [J/I]E),  if  J not free in E
eta:    (lam I. (E I)) ==>  E,  if  I not free in E

===================================================
IMPORTANT: every phrase written in the lambda calculus can be used as an operator and also as an argument. In a phrase like (E1 E2), or more generally, (E1 E2 E3 ... En), the E1 is the operator and E2 (and E3, etc.) are the arguments. But all of the Ei are just lambda expressions.

Look again at the codings that were proposed earlier for the values, [true] and [false]:

===================================================

[true]  is  lam t. lam e. t
[false] is  lam t. lam e. e

===================================================
We think of the two booleans as data values, but both can be used as operators: e.g., ([true] E2 E3) =>* E2 and ([false] E2 E3) =>* E3. The truth values are selector functions, and a standard if-then-else command, if E1 then E2 else E3, is really just E1 used as the operator: (E1 E2 E3).
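We can watch the selector behavior with Python lambdas (a sketch; TRUE and FALSE are just the codings of [true] and [false], written in Python):

```python
TRUE  = lambda t: lambda e: t     # lam t. lam e. t
TRUE  = lambda t: lambda e: t     # lam t. lam e. t
FALSE = lambda t: lambda e: e     # lam t. lam e. e

# (E1 E2 E3): the boolean itself acts as the if-then-else.
print(TRUE("then-part")("else-part"))    # prints then-part
print(FALSE("then-part")("else-part"))   # prints else-part
```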

The lambda-calculus was the inspiration for Smalltalk, which treated all objects as both data and operators (that can be sent messages).


10.2.1 Name clashes and paradoxes

The lambda calculus was invented by Alonzo Church, a mathematician-philosopher at Princeton. He was interested in studying name clashes. Here is a programming example of a name clash:

===================================================

x = 5

def f(y): return y + x  end  // what is the value of  x  here?  3?  5?

def g(x): return f(x)   end  // oops --- another  x,  and it's sent to  f...

print g(3)                   // what prints here?  6?  8?

===================================================
What should print --- 6 or 8? In C and Python, it's 8, but in Lisp, it's 6. How might we explain these two outputs? (Hint: use copy-rule semantics of procedure call and see where the two x's get declared.)

Like Church did, we can analyze the situation with lambda-notation:

(lam x.
    (lam f.
        (lam g.
            g(3)
        )(lam x. f x)
    )(lam y. y + x)
)5
Do you bind 5 to x and do the subsequent substitutions ? Or, do you bind (lam y. y + x) to f and do the subsequent substitutions ? Does it matter? Should it matter? (NO)

The name clash between the two definitions of x confuses us. But we are not the first: these headaches started appearing in math proofs in the 19th century, and they were causing false results to be proved. They were the reason for the formal development of predicate logic. Here is a simplistic example of the bonehead reasoning that was appearing at that time:

===================================================

"In the domain of ints, every int has an int that is larger:
    forall x, exists y,  y > x  "

"So, 3, is an int, hence   exists y, y > 3"

"Let z be some int (no matter what), then   exists y, y > z"

"Let y be some int (no matter what), then   exists y, y > y"  (?!)

===================================================
Church realized that the issue here is scope --- "local variables" and "global variables." They occur in ordinary language as well as in computer script. Church used lambda as a hypothetical quantifier, e.g.,
 lam x. lam y. > x y
is the "scoping skeleton" of the above statement. You can study the substitutions on the "skeleton". Indeed, there are developments of predicate logic based on this idea:
===================================================

"forall x, exists y,  y > x"  is formally written like this:

FORALL(lam x. EXISTS(lam y. > y x ))

===================================================
FORALL and EXISTS are combinators (operators) that take abstractions as their arguments.

At the same time that scoping problems were found in math proofs, similar strange issues arose in set theory, which made central use of the "comprehension axiom":

===================================================

If  P  is a unary predicate on individuals, then  {x | P(x)}  defines a set.

===================================================
An example: _>0 is a unary predicate, hence {x | x > 0} defines a set.

We write e: S to assert that e belongs to set S. Clearly,

===================================================

e: {x | P(x)}  iff  P(e).       (*)

===================================================
Then, Bertrand Russell proposed this "set":
===================================================

Define  R = {s | not(s: s)}

===================================================
The "Russell set" is the collection of those sets who don't have themselves as members. Since sets can contain sets, this seems sensible. But answer this question: Does R belong to R ?

R: R  iff  not(R: R)     see (*) above

At this point, set theory collapsed, because the comprehension axiom generated this contradiction. Since number theory reduces to set theory, all of mathematics was in danger of collapse. (How this was resolved is another story --- types and classes --- but not the computer-science ones!)

Church was able to study Russell's Paradox: He modelled {x | P(x)} like this: lam x. P(x), and he modelled e: S like this: S(e).

Here is the Russell set, coded in lambda-calculus:

===================================================

R = lam s. not(s s)     where   not = lam b.(lam t. lam e. b e t)

===================================================
Here is Church's "answer" to the question, R: R ?
===================================================

R R

==>  (lam s. not(s s))(lam s. not(s s))

==>  not((lam s. not(s s))(lam s. not(s s)))   that is,  not(R R)

==>  not(not((lam s. not(s s))(lam s. not(s s))))  that is,  not(not(R R))

==>  etc., forever

===================================================
Church's formulation of sets refuses to answer the question, R R ! The expression, (R R), does not have a normal form --- no answer!

By the way, an even simpler "always-looping program" is

(lam x. (x x))(lam x. (x x))
All the problematic examples from set theory involved self-reflection, which in lambda-notation looks like self-application --- a function that takes itself as an argument. Church's student, Stephen Cole Kleene, studied this coding and discovered this strange expression that can clone code:
===================================================

Y == lam f. (lam z. f (z z))(lam z. f (z z))

===================================================
Let's try it: Say that we want to run this repetition in the lambda calculus:
def loop(arg) = ... loop(update(arg))...
that is, loop repeats itself, updating its argument at each "iteration". We write this lambda-calculus expression,
 
CODE =  lam L. lam arg. ... L(update arg) ...
where the recursive call becomes an argument (!), but then we use CODE with Y above, which "clones" CODE:
===================================================

Y CODE  =  (lam f. (lam z. f (z z))(lam z. f (z z))) CODE

=> (lam z. CODE (z z))(lam z. CODE (z z))

=> CODE ( (lam z. CODE (z z))(lam z. CODE (z z)) )

=  (lam L. lam arg. ... L(update arg) ...)((lam z. CODE (z z))(lam z. CODE (z z)))

=> lam arg.  ... ((lam z. CODE (z z))(lam z. CODE (z z)))(update arg) ...

                     the strange (lam z ...)(lam z ...)  part clones itself 
                     at the exact position where the CODE must repeat!
                     This will repeat as often as needed.

===================================================
Kleene used this pattern to define the partial-recursive functions (functions that can "loop"), which form the basis of modern computing science. We will return later to the uses of Y.
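We can try Kleene's pattern in Python. One caution: Python evaluates arguments eagerly, so Y exactly as written above would loop forever. The sketch below uses the eta-expanded variant, Z = lam f.(lam z. f(lam v. (z z) v))(lam z. f(lam v. (z z) v)), which delays the self-application; the factorial CODE is my own illustrative example, not from the text:

```python
# Call-by-value fixed-point combinator (eta-expanded form of Y).
Z = lambda f: (lambda z: f(lambda v: z(z)(v)))(lambda z: f(lambda v: z(z)(v)))

# CODE receives its own recursive call as the argument L, as in the text.
CODE = lambda L: lambda arg: 1 if arg == 0 else arg * L(arg - 1)

fact = Z(CODE)      # Z "clones" CODE at each recursive call
print(fact(5))      # prints 120
```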


10.2.2 Exercises

Question 1. Recall that a redex is an expression of the form, ((lam x. E1) E2).

Leftmost-outermost reduction repeatedly locates the leftmost occurrence of a redex and applies the β-rule to it. Here is a leftmost-outermost reduction step:

(lam x. (a x))((lam y. y) z)  =>  (a ((lam y. y) z))
The above expression holds two redexes, namely, (lam y. y) z and also (lam x. (a x))((lam y. y) z) --- the first one is embedded in the second. But the second one is "outermost" or "to the left of" the first one.

Sometimes leftmost-outermost reduction is called call-by-name evaluation. Apply leftmost-outermost reduction to these examples to rewrite them to their normal forms:

(i) (lam y. (lam z. z)(y y))((lam x. x) z)
(ii) (lam y. (lam z. a)(y y))(lam x. x x)


Question 2. Say that a redex is visible if it is not embedded within an abstraction expression, that is, the redex is not resting in the E part of some (lam x. E).

Innermost reduction repeatedly locates a redex that is visible and does not contain any other visible redexes and applies the β-rule to it. Here is an innermost reduction step, the only one that can be done:

(lam x. (lam a. a) x)((lam y. y) z)  =>  (lam x. (lam a. a) x) z 
There are three redexes, and two are visible, but only the redex, (lam y. y) z, is visible and contains no other visible redexes.

Sometimes innermost reduction is called call-by-value evaluation. Apply innermost reduction to the expressions (i) and (ii) above to compute the normal forms, if they exist.


10.3 Encoding arithmetic in lambda calculus

Church and Kleene showed that all of constructive mathematics can be coded just in the lambda-calculus. This means that any program written for a Turing machine can also be written in just the lambda-calculus. This means that any C-program or C#-program can also be written in just the lambda-calculus. The lambda-calculus is an alternative machine model to Turing-machines, von Neumann-machines, and Post systems.

We now see how computation on numbers and data structures are done purely in lambda calculus. In what follows, read [E] as "the translation of E into lambda-calculus".

Booleans

The true-false values are represented as lambda abstractions:
===================================================

[true] ==  lam t. lam f. t
[false] ==  lam t. lam f. f

===================================================
This "odd" representation has an important use: A boolean value is a true-false selector, e.g.,
([true] E1 E2)  =  ((lam t. lam f. t) E1 E2)  =>  ((lam f. E1) E2)  =>  E1
That is, as an operator, [true] acts as the control of an if-then-else statement! [false] works in a similar way. We will exploit this trick below.
===================================================

[not B] == NOT [B],
           where  NOT == lam b. b [false] [true]

===================================================
Example:
[not true] ==  NOT [true]
               ==  (lam b. b [false] [true]) [true]
               =>  [true] [false] [true]
               ==  (lam t. lam f. t) (lam t. lam f. f) (lam t. lam f. t)
               =>  (lam f.  (lam t. lam f. f)) (lam t. lam f. t)
               =>  (lam t. lam f. f)
               == [false]
NOT b is saying, ''if b then false else true''. In general, a conditional expression is just this:
===================================================

[if E1 E2 E3] ==  ([E1] [E2] [E3])

===================================================
When E1 rewrites to a normal-form boolean, then it will select either E2 or E3.

Here is conjunction:

===================================================

[E1 and E2] ==  AND [E1] [E2]

            where  AND ==  lam b. lam c. b c [false]

===================================================
that is, (AND e1 e2) is saying ''if e1 then e2 else false''. Now, you can code classical propositional logic.
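NOT and AND run as advertised in Python (a sketch; to_bool is an assumed helper that decodes a Church boolean by selecting between Python's True and False):

```python
TRUE  = lambda t: lambda e: t
FALSE = lambda t: lambda e: e

NOT = lambda b: b(FALSE)(TRUE)            # lam b. b [false] [true]
AND = lambda b: lambda c: b(c)(FALSE)     # lam b. lam c. b c [false]

to_bool = lambda b: b(True)(False)        # decode for printing

print(to_bool(NOT(TRUE)))           # prints False
print(to_bool(AND(TRUE)(TRUE)))     # prints True
print(to_bool(AND(TRUE)(FALSE)))    # prints False
```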

Nonnegative integers ("natural numbers")

The "Church numerals" go like this:
===================================================

[0] == lam s. lam z. z
[1] == lam s. lam z. s z
[2] == lam s. lam z. s (s z)
   ...
[i] == lam s. lam z. s (s ... (s z)...),   where  s  is repeated  i  times

===================================================
An int encodes how many times some action, s, should be applied to the base value, z. In this way, an int, n, can be used as a looping operator: ([n] f a) means, f(f(...(f(a))...)), for n uses of f. Or, if you like loops: ``ans = a; do n times: ans = f(ans) end; return ans''.

Example: apply negation two times to true:

[2] [not] [true] == (lam s. lam z. s (s z))[not] [true]
    => (lam z. [not] ([not] z)) [true]
    => [not]([not] [true])
    =>* [not] [false]               (see above for the detailed steps)
    =>* [true]
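Church numerals behave the same way when written with Python lambdas: a numeral is a loop operator that applies its first argument repeatedly (a sketch; the uppercase names are just the codings of [0], [2], [3]):

```python
ZERO  = lambda s: lambda z: z              # lam s. lam z. z
TWO   = lambda s: lambda z: s(s(z))        # lam s. lam z. s (s z)
THREE = lambda s: lambda z: s(s(s(z)))     # lam s. lam z. s (s (s z))

# ([n] f a) = f(f(...f(a)...)), n times:
print(THREE(lambda n: n + 10)(0))          # prints 30
print(TWO(lambda b: not b)(True))          # prints True: two negations
print(ZERO(lambda b: not b)(True))         # prints True: zero negations
```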
===================================================

[E =0] ==  EQ0 [E]

       where  EQ0 == lam n. n (lam z. [false]) [true]

===================================================
Example:
[2 =0] ==  EQ0 [2]
       ==  (lam n. n (lam z. [false]) [true]) [2]
       => [2] (lam z. [false]) [true]
       == (lam s. lam z. s (s z)) (lam z. [false]) [true]
       => (lam z. (lam z. [false]) ( (lam z. [false]) z)) [true]
       => (lam z. [false])  [true]  =>  [false]
(In order to code two-position relational operations, e.g., E1 > E2, we need a bit more machinery, given below.)
===================================================

[E +1] == SUCC [E]

       where   SUCC == lam n. lam s. lam z. s (n s z) 

===================================================
inserts one more occurrence of s. Example:
[3 +1] == SUCC [3]
           == (lam n. lam s. lam z. s (n s z)) [3]
           => lam s. lam z. s ( [3] s z )
           == lam s. lam z. s ((lam s. lam z. s (s (s z))) s z)
           => lam s. lam z. s ( (lam z. s (s (s z))) z )
           => lam s. lam z. s (s (s (s z)))
===================================================

[E1 + E2] ==  ADD [E1] [E2]
          where ADD = lam m. lam n. m SUCC n

===================================================
applies SUCC E1-many times to E2. Example:
[2 + 3] ==  ADD [2] [3] 
            == [2] SUCC [3]
            ==  (lam s. lam z. s (s z)) SUCC [3]
            =>  (lam z. SUCC (SUCC z) ) [3]
            =>  SUCC (SUCC [3])
            ==  SUCC (SUCC (lam s. lam z. s (s (s z))))
            =>* lam s. lam z. s (s (s (s (s z))))
Multiplication is an easy exercise, since it is repeated addition. It takes some effort to code subtraction-by-one (predecessor). (Use pairs, coded below.)
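SUCC and ADD can be checked with Python lambdas (a sketch; to_int is an assumed helper that decodes a numeral by applying ''add one'' n times to 0):

```python
TWO   = lambda s: lambda z: s(s(z))
THREE = lambda s: lambda z: s(s(s(z)))

SUCC = lambda n: lambda s: lambda z: s(n(s)(z))   # lam n. lam s. lam z. s (n s z)
ADD  = lambda m: lambda n: m(SUCC)(n)             # lam m. lam n. m SUCC n

to_int = lambda n: n(lambda k: k + 1)(0)          # decode a Church numeral

print(to_int(SUCC(THREE)))       # prints 4
print(to_int(ADD(TWO)(THREE)))   # prints 5
```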

Data structures (pairs)

Kleene, and later, McCarthy, realized that all data structures can be modelled as nested pairs:
===================================================

[pair E1 E2] == lam b. b [E1] [E2]

[fst E] ==  [E] [true]

[snd E] ==  [E] [false]

===================================================
Now, you can emulate lists, arrays, trees, etc., as nested pairs. (A list, [a,b,c], is a nested pair, [pair a (pair b (pair c nil))], where [nil] is some lambda expression that works with an EQNIL operation such that EQNIL [nil] =>* [true] and EQNIL [pair .. ..] =>* [false].) This is a good exercise for you to do: define NIL and EQNIL.
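The pair codings run directly in Python (a sketch; here [pair] is used in its curried operator form, lam e1. lam e2. lam b. b e1 e2, which appears again later in the chapter):

```python
TRUE  = lambda t: lambda e: t
FALSE = lambda t: lambda e: e

PAIR = lambda e1: lambda e2: lambda b: b(e1)(e2)
FST  = lambda p: p(TRUE)        # [fst E] == [E] [true]
SND  = lambda p: p(FALSE)       # [snd E] == [E] [false]

# a three-element "list" as nested pairs:
lst = PAIR("a")(PAIR("b")(PAIR("c")("nil")))
print(FST(lst))            # prints a
print(FST(SND(lst)))       # prints b
```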

We can use pairs to define predecessor:

===================================================

[E -1] ==  PRED [E]

   where  PRED == lam n. (n COUNTUP ([pair 0 0])) [true]

   and    COUNTUP == lam p. [pair] ([snd] p) (SUCC ([snd] p))

   # n COUNTUP [pair 0 0]
     counts upwards  n  times starting from:  0,0
                                              0,1
                                              1,2
                                               ... 
                                       to     n-1, n
     The desired answer is  n-1,  which is extracted using  [true].
     That is, we computed  n-1  by counting upwards from  0  up to  n-1.

===================================================
There is a trickier coding of PRED that doesn't use pairing:
===================================================

PRED ==  lam n. lam s. lam z. n (lam g. lam h.h (g s)) (lam u.z) (lam u.u)

===================================================
It builds an answer from argument n that holds one less occurrence of s.
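This trickier PRED can be checked with Python lambdas (a sketch; to_int is an assumed decoding helper, and the trailing backslash is just Python line continuation):

```python
to_int = lambda n: n(lambda k: k + 1)(0)
ZERO   = lambda s: lambda z: z
THREE  = lambda s: lambda z: s(s(s(z)))

# PRED == lam n. lam s. lam z. n (lam g. lam h. h (g s)) (lam u. z) (lam u. u)
PRED = lambda n: lambda s: lambda z: \
    n(lambda g: lambda h: h(g(s)))(lambda u: z)(lambda u: u)

print(to_int(PRED(THREE)))   # prints 2
print(to_int(PRED(ZERO)))    # prints 0: predecessor truncates at zero
```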

We use PRED for subtraction:

===================================================

[E1 - E2] ==  [E2] PRED [E1]

   # apply  PRED  E2  times to  E1

and

[E1 > E2] ==  [not ((E1 - E2) = 0)]

===================================================
Other arithmetic operations (division, modulo, etc.) follow the same patterns. Try to write them.


10.4 While-language translated into lambda-calculus

Say that we have an assignment language with this syntax:
===================================================

P : Program           E : Expression
C : Command           I : Identifier

P ::= C* return I
C  ::=  I = E  |  if E then C* else C* end
     |  def I1 ( I2* ) globalVars I* : P end  |  call I1 = I2 ( E* )
     |  while E do C* usingVars I*

===================================================
A program is a list of commands that finishes by returning the final answer. Functions can read the current values of global variables but cannot update them. Each loop must state which variables are assigned in its body. The syntax is a bit restrictive because I wanted a translation that generates lambda-expressions that look like the ones at the very beginning of this chapter. (There is a more general translation that builds lists of global variables that are read and updated by each command.) Here is an example program that stores 1+2+...+x into y:
      x = 3;
      def f(a) globalVar x : z = x + a; return z end;
      y = 0;
      while x > 0 do 
                call y = f(y); 
                call x = f(-1) 
      usingVars x, y; 
      return y

Translation definitions

In what follows, we use CL to stand for a list of commands. A list, C ; CL, has command C as its front command and CL as the rest --- semicolon is ''cons''. A command list is always ended with return I --- the "nil" command.

Here are the translations, stated on the command lists:

===================================================

[return I] == I


[x = E ; CL] ==  (lam x. [CL]) [E]

   #  the  CL  is the rest of the program --- the "continuation"


[if E then C1* else C2* end ; CL] == [E] [C1* ; CL] [C2* ; CL]

   # CL is copied and attached to both arms of the conditional.
   # This isn't beautiful but it is simple and correct.


[def I1 (I2) globalVar I3 : P end ; CL]
    ==  (lam I1. [CL] )(lam I2. lam I3. [P])
    
    # the meaning of  def I1  is a lambda abstraction that awaits an arg for  I2
    # and the current value of global variable  I3
    
    # Note: The above definition used just one parameter var and one global variable.
    # For multiple vars, you use multiple arguments,  lam I. lam J. lam K. ...,  etc.
    
    
[call I1 = I2 ( E ); CL ]
    ==  (lam I1. [CL] )(I2 [E] X),   where  X  is the variable named in  I2's  def:
                                            def I2 (...) globalVar X : ... end
                                          
                                          
[(while E do C* usingVar I); CL]
    ==  Y (lam w. lam I. [E] [C* ; (w I)] [CL]) I

    where   Y == lam f. (lam z. f (z z))(lam z. f (z z))
 
    # A while loop "unfolds" while it executes:
    #    while E do C  =>*  if E then (C; while E do C) else pass
 
This behavior is encoded with  Y,  which is an unfolder:
     Y F =>* F (Y F)

Y  is sometimes called a fixed-point combinator or
the "paradoxical combinator"  since it behaves like Russell's paradox.

Here, we desire this behavior:  
    Y W I  =>*  ([E] [C* ; (Y W I)] [CL])        
                     where W == (lam w. lam I. [E] [C* ; (w I)] [CL])

that is, the loop unfolds into a if-command with a fresh copy of the
loop embedded just after the loop-body's code.

Note: The above definition used just one var, I.  For multiple vars,
you use multiple arguments,  I J K ...,  etc.

===================================================
Here is an example program with its translation and execution:
===================================================

[x = 3;  while x > 0 do x = x -1 usingVar x;  return x]

=  (lam x. Y W x) [3]

    where W = lam w. lam x. [x > 0]
                            [x = x -1 ; (w x)]
                            [return x]

            = lam w. lam x. [x > 0]
                            ((lam x. (w x)) [x -1])
                            x
=>  Y W [3]

=>*  W (Y W) [3]

=>  (lam x. [x > 0] 
            ((lam x. (Y W x)) [x -1])  
            x
    ) [3]

=>*  [3 > 0] ((lam x. (Y W x)) [3 -1])  [3]

=>*  [true] ((lam x. (Y W x)) [3 -1]) [3]

=>*  (lam x. (Y W x)) [3 -1]

=>*  Y W [3 -1]

=>*  Y W [2]    # compare this to  Y W [3]  at loop entry

=> ... =>  Y W [0]

=>* [false] ((lam x. (Y W x)) [0 -1]) [0]

=>* [0]

===================================================
This is a simple but inelegant translation of imperative programs into lambda-calculus. A better one was used by Christopher Strachey and is the basis of his denotational-semantics methodology.


10.4.1 Exercises

These exercises develop the most important families of computable functions.

Question 3. Many simple operations on numbers are simple recursive; here's one:

def isEven(n):
    if n == 0: return True
    else: return not isEven(n - 1)

The general format of a simple-recursive function is this:

f(0) = j,  where  j  is some number
f(i + 1) = g(f(i)),  where  g is a (closed, nonrecursive or simple-recursive)
                     function that maps a number to a number.
An easy math induction proof shows, for any n >= 0, that f(n) = g(g(...g(j)...)), where g is repeated n times. For this reason, loops of the form, for N times do C, are simple-recursive functions.

We show that isEven is simple recursive by writing it in the equational format like this:

isEven(0) = true
isEven(n+1) = not(isEven(n))

We can easily encode simple-recursive functions in lambda-notation. So, define [isEven] as a lambda-expression using the standard codings of [true], [false], [eq0], and [not]. Then, use it to compute ([isEven] [3]).

Question 4. There is a larger, more computationally powerful class of functions called the primitive-recursive functions.

Let X be a tuple of m parameter names, e.g., (x1,x2,..,xm). An m+1-argument primitive-recursive ("p.r.") function is built in one of these ways:

  1. It is a constant function, e.g., f(X) = k; or a projection function, e.g., f(X) = X[i], for 0<i≤m.

  2. It is the composition of already-defined p.r. functions, e.g., f(X) = h(g1(X),...,gn(X)), for m-ary p.r. functions gi and n-ary p.r. function h.

  3. It is defined with this schema:
    f(X, 0) = h(X),    where  h  is an m-argument p.r. function

    f(X, n+1) = g(X, n, f(X, n)),
                       where  g  is an m+2-argument p.r. function
    
Here is an example, the unary summation function on nonnegative ints:
def sumTo(n):
    if n == 0: return 0
    else: return n + sumTo(n - 1)
and here is its presentation that shows it is primitive recursive in one argument:
sumTo(0) = 0

sumTo(i + 1) = ADD1(i, sumTo(i)),  

      where  ADD1(a,b) = (a + 1) + b ,
               which is a p.r. function in two arguments: it is
                 built by composition using  +  which is also a
                 p.r. function in two arguments.
                (Technically speaking,  a  and  b  are "projection functions"
                 and  1  is a "constant function")

All the fundamental operations of arithmetic are primitive recursive. (This is all of Peano arithmetic.) A primitive recursive function is guaranteed to terminate when called.

With math induction, we can prove, for all n >= 0, that a p.r. function of one argument coded in the recursive schema computes to

f(n) = g(n-1, g(n-2, ... g(0, h(0))...))
where there are n uses of g. We can also prove that f(n) computes exactly the same answer as this while-loop:
count = 0
ans = h(0)
while count != n do:
    ans = g(count, ans);  count = count + 1
print ans
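The equivalence can be checked in Python. This sketch codes the schema's loop as a function pr_loop (a hypothetical name, not from the text) and instantiates it for sumTo, where h(x) = 0 and g(i, ans) = (i + 1) + ans:

```python
def pr_loop(h, g, n):
    """Run the while-loop form of the one-argument p.r. schema."""
    count, ans = 0, h(0)
    while count != n:
        ans = g(count, ans)
        count = count + 1
    return ans

sumTo = lambda n: pr_loop(lambda x: 0, lambda i, ans: (i + 1) + ans, n)
print(sumTo(4))   # prints 10, i.e., 1 + 2 + 3 + 4
```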

There is a pattern for formatting a primitive-recursive function of one argument into a lambda-expression, by using pairs. Recall that

[pair] == lam e1. lam e2. lam b. b e1 e2

[fst] == lam p. p [true]

[snd] == lam p. p [false]
We see that
[fst]([pair] E1 E2) => E1
[snd]([pair] E1 E2) => E2
We will use [pair] to model a two-variable storage vector, ([pair] count ans), and use it to build the above while-loop!

Here is how we code the pattern for f above:

lam n. [snd] (n  F ([pair] [0] (h [0])))
                  where  F is coded   lam p. ([pair] ([succ]([fst] p)) (G p))
                  and  G  is the coding of function  g
([pair] [0] (h [0])) is the starting value of the storage vector; F is the coding of the loop body, and n is the control of the number of loop iterations.

Here is the complete translation of sumTo into the lambda-calculus:

[sumTo] ==
lam n.  [snd] (n  (lam p. [pair] ([succ]([fst] p)) ([ADD1] p)) ([pair][0][0]) )

        where  [ADD1] == lam q. ([succ]([fst] q)) [succ] ([snd] q)
It is not so easy to see, but
([ADD1] ([pair] [i] [j])) => [i+1+j]
Try a few examples and confirm this.
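Here is such a confirmation, run with Python lambdas (a sketch; church and to_int are assumed helpers of my own for building and decoding numerals, and PAIR/FST/SND are the curried pair codings from above):

```python
TRUE  = lambda t: lambda e: t
FALSE = lambda t: lambda e: e
PAIR  = lambda e1: lambda e2: lambda b: b(e1)(e2)
FST   = lambda p: p(TRUE)
SND   = lambda p: p(FALSE)
SUCC  = lambda n: lambda s: lambda z: s(n(s)(z))

# [ADD1] == lam q. ([succ]([fst] q)) [succ] ([snd] q)
ADD1 = lambda q: SUCC(FST(q))(SUCC)(SND(q))

church = lambda i: (lambda s: lambda z: z) if i == 0 else SUCC(church(i - 1))
to_int = lambda n: n(lambda k: k + 1)(0)

print(to_int(ADD1(PAIR(church(2))(church(3)))))   # prints 6, i.e., 2+1+3
```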

Compute ([sumTo] [3]).


Question 5. Many functions are not primitive recursive (e.g., Ackermann's function, as well as any partial function --- one that might not terminate on some inputs). The general form of while-loop falls in this group. With simple- and primitive-recursive functions, one knows exactly how many times the computation will repeat/unfold when it starts. This is not true with the general-recursive functions. A function is general recursive if it can be written with equations and recursions of any combination.

Here is a famous example of a general-recursive function that is not primitive recursive --- Ackermann's function, which takes two arguments to compute its answer:

ack 0 n = n+1
ack (m+1) 0 = ack m 1
ack (m+1) (n+1) = ack m (ack (m+1) n)
It is more common to code the equations like this:
def ack(m,n):
    """both  m  and  n  range over the nonnegative ints"""
    if m == 0: return n+1
    elif n == 0: return  ack(m-1, 1)
    else: return ack(m-1, ack(m, n-1))
It is possible to prove that this function always terminates, even though it cannot be formatted as primitive recursive.

Another famous example is Collatz's function, whose input is a positive int:

collatz(1) = 1
collatz(2 * i) =  collatz(i),   where int i > 0
collatz((2*i) + 1) = collatz((6 * i) + 4),  where  int i > 0
This can be coded as a loop!
read(x)
assert x > 0
while x > 1 :
    if x % 2 == 0 :  x = x / 2
    else :           x = 3 * x + 1
print(x)
Collatz conjectured, for every positive int, n, that collatz(n) == 1. But neither he nor anyone since has been able to prove this. It is a frustrating open problem in computing.

Examples like ack, collatz, and while-loops are coded in lambda-calculus with the Y operator:

[Y] =  lam f. (lam z. f (z z))(lam z. f (z z))
The key property of Y is that Y F => (K K), where K = (lam z. F (z z)), and then K K => F(K K) =>* F(F(K K)) =>* F(F(F(K K))) =>* ..., etc. In this way, (Y F) "clones itself" and feeds itself to F. We use this behavior to model a general recursive function, which "clones itself" each time it restarts with a recursive call.

Code this function in lambda calculus, using Y:

haltEven(n) = if isEven(n) then 0
                           else haltEven(n+2)
(Note: use the coding of isEven from Question 3.) Use your definition of [haltEven] to reduce these expressions:
([haltEven] [0])
([haltEven] [2])
([haltEven] [1])