PART 1: Motivation for studying the lambda calculus


In Python, we can write:

x = 2
def f(y): return y + x
g = f                # a function is a value; g now names the same function
print(g(99))         # prints 101


The  def  is actually an assignment; think of it like this (not legal
Python, but close):

x = 2
f = (y): return y + x
f(99)


The rhs of  f =  evaluates to a closure: a function-value packaged with
its surrounding variable bindings.
There is a traditional notation for function-values:


x = 2
f = lambda y: y + x
f(x)


This lambda notation is used in Lisp and Scheme.
This is also legal:

x = 2
(lambda y: y + x)(x)


Ints don't need names; why should functions?
(lambda y. y + x)  is called a  *lambda abstraction* or an
*abstraction* for short.   (From here on, we use a  .  and not a  :
to separate the parameter from the body.  We will also type  lam  as a
shortcut for  lambda.)


Lisp programmers quickly discover that assignments are unnecessary:

x = 2
f = lam y. y + x
f(x + 1)


is "the same as":

(lam x.
    (lam f.
        f(x + 1)
    )(lam y. y + x)
)2


This is because of the  beta-rule  for binding of argument to parameter:

beta:  (lam x. E1)E2  ==>  [E2/x]E1

(Recall that  [E2/x]E1  denotes substitution. There is lots more to
say about this concept!)

We can now see that the traditional ordering of assignments is just an
illusion!   We can compute the above program inside-out or outside-in
or however we desire.  Let's do it inside-out:

(lam x.
    (lam f.
        f(x + 1)
    )(lam y. y + x)
)2

==>

(lam x.
   (lam y. y + x)(x + 1)
)2

==>

(lam x.
   (x + 1) + x
)2

==>

(2 + 1) + 2
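
Here is a quick check in Python that the two versions agree (this is
just the program above, transcribed):

x = 2
f = lambda y: y + x
print(f(x + 1))                                               # prints: 5

print((lambda x: (lambda f: f(x + 1))(lambda y: y + x))(2))   # prints: 5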


This is a critical insight from modelling with  lambda-notation.

Incredibly, we can do away with ints and arithmetic, replacing everything
by lambda-notation:

[0] is   lam s. lam z. z
[1] is   lam s. lam z. s z
[2] is   lam s. lam z. s (s z) 
etc.

[True]  is  lam t. lam e. t
[False] is  lam t. lam e. e

[if e1 then e2 else e3]   is   e1 e2 e3,  that is,  ((e1 e2) e3)

Addition, multiplication, all of arithmetic can be defined as well.
Arbitrary recursion is defined with

Y  is   lam f.(lam x. f(x x))(lam x. f(x x))
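
Here is a quick sanity check of these encodings in Python (a sketch; the
names  succ,  add,  and  to_int  are ours, though the definitions are the
standard ones):

zero = lambda s: lambda z: z
one  = lambda s: lambda z: s(z)
two  = lambda s: lambda z: s(s(z))

true  = lambda t: lambda e: t
false = lambda t: lambda e: e

print(true("then-branch")("else-branch"))    # ((e1 e2) e3): prints then-branch

succ = lambda n: lambda s: lambda z: s(n(s)(z))               # successor
add  = lambda m: lambda n: lambda s: lambda z: m(s)(n(s)(z))  # addition

to_int = lambda n: n(lambda k: k + 1)(0)     # read a numeral back as a Python int
print(to_int(add(two)(succ(two))))           # prints: 5

One caution: Y cannot be typed into Python as-is --- Python evaluates
arguments eagerly, so  (lam x. f(x x))(lam x. f(x x))  unfolds forever.
A variant that does run in Python appears at the end of Part 2.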

More to come on this....



==============================================================

PART 2: The origin of the Lambda Calculus: Studying name clashes


Here is the syntax of the Lambda Calculus --- there are identifiers,
abstractions, and applications:

    EXPR ::=  IDEN  |  ( lam IDEN . EXPR )  |  ( EXPR1 EXPR2 )

              where IDEN is any string-based identifier, e.g.,  'x' or 'y'.

It is standard to eliminate unneeded parens by writing a nested
application of form  (...((E1 E2) E3) ... En)  as merely   E1 E2 ... En

Another standard abbreviation is to simplify 
(lam I1. (lam I2. ... (lam In. E) ...))  by  (lam I1. lam I2. ... lam In. E)

The outermost parens are also often dropped.

(The best way to write a lambda-expression is to draw its parse tree!)
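
For example, the numeral [2] from Part 1, with all parens restored, is
(lam s. (lam z. (s (s z)))), and its parse tree is:

lam s
  |
lam z
  |
 app
 /  \
s   app
    /  \
   s    z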



The primary rewriting rule one uses to "compute upon" a lambda-expression
is the beta-rule:

    beta:  (lam I. E1)E2  ==>  [E2/I]E1

where [E2/I]E1 is defined precisely on the syntax of E1 as follows:

[E/I](lam I. E1) = (lam I. E1)
[E/I](lam J. E1) = (lam J. [E/I]E1),
                                if I != J and J is not free in E
[E/I](lam J. E1) = (lam K. [E/I][K/J]E1),
                                if I != J, J is free in E, and K is fresh
[E/I](E1 E2) = ([E/I]E1 [E/I]E2)
[E/I]I = E
[E/I]J = J,  if J != I

The notion of "free in" can be defined equally precisely:

I is free in I
I is free in (lam J. E)  iff  I != J and I is free in E
I is free in (E1 E2)  iff  I is free in E1  or  I  is free in E2
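
Both definitions transcribe almost line-for-line into code.  Here is a
sketch in Python (the tuple encoding of EXPR and the helper  fresh  are
our choices): an IDEN is a string, an abstraction is ("lam", IDEN, EXPR),
and an application is ("app", EXPR1, EXPR2).

def free_in(i, e):
    if isinstance(e, str):                  # I is free in I
        return e == i
    if e[0] == "lam":                       # free in (lam J. E) iff I != J and I free in E
        _, j, body = e
        return i != j and free_in(i, body)
    _, e1, e2 = e                           # free in (E1 E2) iff free in E1 or in E2
    return free_in(i, e1) or free_in(i, e2)

counter = 0
def fresh():                                # an identifier not used anywhere so far
    global counter
    counter += 1
    return "_v" + str(counter)

def subst(e, i, target):                    # computes [E/I]target
    if isinstance(target, str):             # [E/I]I = E  and  [E/I]J = J
        return e if target == i else target
    if target[0] == "app":                  # [E/I](E1 E2) = ([E/I]E1 [E/I]E2)
        _, e1, e2 = target
        return ("app", subst(e, i, e1), subst(e, i, e2))
    _, j, body = target
    if j == i:                              # [E/I](lam I. E1) = (lam I. E1)
        return target
    if not free_in(j, e):                   # no capture possible
        return ("lam", j, subst(e, i, body))
    k = fresh()                             # rename J to a fresh K, then substitute
    return ("lam", k, subst(e, i, subst(k, j, body)))

# [y/x](lam y. x) must rename the binder rather than capture y:
print(subst("y", "x", ("lam", "y", "x")))   # prints: ('lam', '_v1', 'y')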



An expression of the form  ((lam I. E1) E2)  is called a beta-redex
(or just "redex" for short).


An expression that has no redexes is in *normal form* --- it is an "answer".
This expression is not in normal form:

(lam x.
    (lam f.
        f x
    )(lam y. (x  y))
)a

but we can apply the beta-rule to reduce it to its normal form:

(lam x. (lam f. f x )(lam y. (x y))) a

=> (lam f. f a)(lam y. (a y))

=> (lam y. (a y)) a  =>  (a a)


It takes a lot of work to prove (it is a consequence of the Church-Rosser
theorem), but if an expression can be reduced to a normal form, then the
normal form is unique --- the order in which we apply the beta-rule to the
expression does not affect the final answer.
But some expressions have no normal form, which we see later....


( There are two other rewriting rules but they are less important to us:

alpha:  (lam I. E) ==>  (lam J. [J/I]E),  if  J not free in E
eta:    (lam I. (E I)) ==>  E,  if  I not free in E
)



The lambda calculus was invented by Alonzo Church, a mathematician-philosopher
at Princeton.  He was interested in studying name clashes.
Here is a programming example of a name clash:

x = 5
def f(y): return y + x
def g(x): return f(x)
print(g(3))   # what prints?  6 or 8 ?

One expansion of the calls, copying each function's body in place
(this is dynamic scoping), yields 6:

x = 5
print g(3)

==>

x = 5
print (x = 3
       f(x) )

==>

x = 5
print (x = 3
       f(3) )

==>

x = 5
print (x = 3
       (y = 3
        y + x ))

==>

x = 5
print ( x = 3
       ( y = 3
         3 + 3 ) )

But a static-scoping implementation (Python is one) prints 8: the  x  in
f's  body always means the outer  x = 5.

We can analyze the situation with lambda-notation:

(lam x.
    (lam f.
        (lam g.
            g(3)
        )(lam x. f x)
    )(lam y. y + x)
)5

Do you bind 5 to x  and do the subsequent substitutions ?
Or, do you bind  (lam y. y + x) to f  and do the subsequent substitutions ?
Does it matter?  Should it matter?

There is a name clash between the two definitions of  x.
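
Capture-avoiding substitution (the fresh-K rule above) makes the order
irrelevant.  Binding 5 to x first gives:

(lam f. (lam g. g(3))(lam x. f x))(lam y. y + 5)

==> (lam g. g(3))(lam x. (lam y. y + 5) x)

==> (lam x. (lam y. y + 5) x)(3)  ==>  (lam y. y + 5)(3)  ==>  3 + 5

Binding (lam y. y + x) to f first forces a renaming, because x is free
in (lam y. y + x) and (lam x. f x) rebinds it (we write x' for the
fresh K):

(lam x. (lam g. g(3))(lam x'. (lam y. y + x) x'))5

==> (lam g. g(3))(lam x'. (lam y. y + 5) x')

==> (lam x'. (lam y. y + 5) x')(3)  ==>  (lam y. y + 5)(3)  ==>  3 + 5

Either way the answer is 8: the beta-rule with capture-avoiding
substitution behaves like static scoping.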

These headaches started appearing in math proofs in the 19th century,
and they were causing false results to be proved.   They were the
reason for the formal development of predicate logic.

Here is a simplistic example of the bonehead reasoning that was
appearing at that time:

"In the domain of ints, every int has an int that is larger:

     forall x, exists y,  y > x  "

"So, 3, is an int, hence   exists y, y > 3"

"Let z be some int (no matter what), then   exists y, y > z"

"Let y be some int (no matter what), then   exists y, y > y"  (?!)


Church realized that the issue here is scope.
He used  lambda  as a hypothetical quantifier, e.g.,

 lam x. lam y. > y x

is the "scoping skeleton" of the above statement.
You can study the substitutions on the "skeleton".

Indeed, there are developments of predicate logic based on this idea:

"forall x, exists y,  y > x"  is formally written like this:

FORALL(lam x. EXISTS(lam y. > y x ))

FORALL and EXISTS are combinators (operators) that take abstractions
as their arguments.


Similar strange issues arose in set theory, which made central
use of the "comprehension axiom":

If  P  is a unary predicate on individuals, then  {x | P(x)}  defines a set.


Example:  _>0  is a unary predicate, hence  {x | x > 0}  defines a set.

We write  e: S  to assert that  e  belongs to set S.

Clearly,   e: {x | P(x)}  iff  P(e).       (*)


Then, Bertrand Russell proposed this "set":

Define  R = {s | not(s: s)}

The "Russell set" is the collection of those sets who don't have themselves
as members.   Since sets can contain sets, this seems sensible.
But answer this question:  Does R belong to R ?

R: R  iff  not(R: R)     see (*) above


At this point, set theory collapsed, because the comprehension axiom
generated this contradiction.  Since number theory reduces to set theory,
all of mathematics was in danger of collapse.  (How this was resolved
is another story... types and classes --- but not the C.S. ones!)


Church was able to study Russell's Paradox: He modelled  {x | P(x)}
like this:   lam x. P(x)
and he modelled   e: S   like this:   S(e)

Here is the Russell set:

R = lam s. not(s s)     where   not = lam b.(lam t. lam e. b e t)
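
As a quick sanity check in Python (repeating the Church booleans from
Part 1; the capitalized names are ours, since  not  is reserved in Python):

TRUE  = lambda t: lambda e: t
FALSE = lambda t: lambda e: e
NOT   = lambda b: lambda t: lambda e: b(e)(t)

print(NOT(TRUE)("then")("else"))    # NOT swaps the branches: prints else
print(NOT(FALSE)("then")("else"))   # prints: then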

Here is his "answer" to the question,  R: R ?


R R

==>

(lam s. not(s s))(lam s. not(s s))

==>

not((lam s. not(s s))(lam s. not(s s)))   that is,   not(R R)

==>

not(not((lam s. not(s s))(lam s. not(s s))))  that is,  not(not(R R))

==>

etc., forever


That is, Church's formulation of sets refuses to answer the question,  R R !

The expression,  (R R), does not have a normal form --- no answer!


By the way, an even simpler always-looping program is

(lam x. (x x))(lam x. (x x))
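
Typed into Python, that is

(lambda x: x(x))(lambda x: x(x))    # raises RecursionError --- Python's report of the endless loop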

All the problematic examples from set theory involved self-reflection,
which in lambda-notation looks like self-application --- a function
that takes itself as an argument.


Kleene's study of the Russell set led him to discover
this strange expression that can clone code:

Y == lam f. (lam z. f (z z))(lam z. f (z z))

Let's try it:  Say that   blahblah...   is some expression that we want
to repeat over and over, like a loop body does.  We define this pattern,

CODE == lam z. blahblah... z ...

which contains the   blahblah...  part and the request to "clone"
(repeat again) at position  z.    We use it like this:


Y CODE 

== (lam f. (lam z. f (z z))(lam z. f (z z))) CODE

=> (lam z. CODE (z z))(lam z. CODE (z z))

=> CODE ( (lam z. CODE (z z))(lam z. CODE (z z)) )

=> blahblah... ( (lam z. CODE (z z))(lam z. CODE (z z)) ) ...

      as needed, the strange (lam z ...)(lam z ...)  part clones itself:

=> blahblah... ( CODE ((lam z. CODE (z z))(lam z. CODE (z z))) ) ...

=>  blahblah... ( blahblah...((lam z. CODE (z z))(lam z. CODE (z z)))...) ...

You can see that the  blahblah... part is repeating/unfolding!
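
We can run this pattern in Python, with one adjustment: Python evaluates
arguments eagerly, so Y itself would unfold forever (a RecursionError).
The standard fix is to eta-expand the self-application  (z z), delaying
it until an argument arrives.  A sketch (the names  Z  and  fact_body
are ours):

Z = lambda f: (lambda z: f(lambda v: z(z)(v)))(lambda z: f(lambda v: z(z)(v)))

# the loop body (the "blahblah...") receives its own clone as the parameter  self
fact_body = lambda self: lambda n: 1 if n == 0 else n * self(n - 1)

fact = Z(fact_body)
print(fact(5))    # prints: 120 --- recursion with no def and no named recursion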


Kleene used this pattern to define the partial recursive functions
(functions that can "loop"), which form the basis of modern computing science.

More to come...