Chapter 9: What is Computable?

Computers do calculation activities. They calculate on numbers (represented as 1s and 0s), on text (represented as numbers), on sounds (represented as numbers), on images (represented as numbers), etc. Are there some calculation activities that are beyond the reach of computers, that they cannot do? The answer is "yes".

MacCormick demonstrates this point with a complex presentation of Turing's Halting Problem. We will rework that material here.

The Starting Law of the Universe

You have to start somewhere. We start with this law:

Something is different from nothing.

That is,

Something ≠ Nothing

Mathematicians use 1 for Something and 0 for Nothing. So,

1 ≠ 0

All mathematics starts from here.

So does Western Thought. A sentence can be an imperative (e.g., "Do your homework!"), a question ("Are we there yet?"), or a declaration ("The sky is blue.") Here, we care about declarations. The declaration, "The earth is round" is a fact; it says something; it is True; it is 1. The declaration, "The moon is made of green cheese" is a falsehood; it says nothing; it is False; it is 0.

True ≠ False

No declaration can be a fact and a falsehood at the same time. A declaration like, "It is daytime." is a fact at some moments in the universe and a falsehood at others, but it is never both at the same moment. If we abuse English and write a declaration that is both a fact and a falsehood at the same moment, it is a paradox and must be discarded from the English language --- it is nonsensical; it cannot be. A paradox makes True = False, which violates the starting law of the universe.

The oldest known example of a paradox is the sentence, "I am lying." --- the declaration is a fact exactly when it is a falsehood (since "lying" means "stating a falsehood"). This is the Liar's Paradox. An alternative formulation is "This declaration is a falsehood". The Liar's Paradox is nonsensical; it cannot be.

Discussion questions

Here is a little story called the Barber's Paradox, due to Bertrand Russell. It will be the centerpiece of the next section:
There is a lone village on an island where the village barber cuts the hair of exactly all those people who do not cut their own hair.
Now, is this declaration a fact or a falsehood: The barber cuts her own hair.
The Liar's Paradox has various forms, and you can read them at http://brainden.com/paradoxes.htm. Consider this variant: A New Yorker comes to your house, rings your bell, and announces, "All New Yorkers are liars!". Is the declaration a paradox? Explain why it might not be.

Russell's Paradox

Paradoxes have infiltrated mathematics. Declarations in mathematics are called predicates; here are examples:

3 > 2 (a fact)

x = (x + 1) (a falsehood, regardless of x's value)

(x + y) = 5 (a contingency --- sometimes a fact, sometimes a falsehood, depending on the values of x and y)

Mathematics is the science of discovering predicates that are facts.

Mathematicians has always worked with numbers, but in the nineteenth century, mathematicians started using predicates to define sets. (Recall that a "set" is a collection --- a "grocery sack" --- of elements. Examples: {4,5,6} is a set of some numbers. {} is a set of no numbers.)

Here are examples of sets of numbers, defined with predicates. (Read the first one as, ''the set of elements, n, such that n > 0 is a fact.'')

{ n | n > 0 } (the positive numbers)

{ m | m + m = 0 } (there is only one element, 0)

{ m | not(m + m = 0) } (all numbers except 0)

{ x | x > 0 and x < 3 } (the numbers between 0 and 3)

In general,

A predicate that mentions variable x, call it P_x, 
  defines the set,  {x | P_x}

For example, n > 0 is a predicate that mentions variable n and it defines the set of positive numbers, {n | n > 0}.

The whole point of defining a set is to ask what's in it. The predicate, e ∈ S, is a fact exactly when element e is contained in set S. (Read as "e is in S".) For example, 5 ∈ {4,5,6} is a fact. When a set is defined via a predicate,

e ∈ {x | P_x}  is a fact exactly when we replace x in P_x with  e  and we find that  P_e  is a fact.

For example, 3 ∈ {n | n > 0} is a fact because, when we replace n in n > 0 by 3, we find that 3 > 0 is a fact. Similarly, 3 ∈ {m | not(m + m = 0)} is a fact because not(3 + 3 = 0) is a fact. Finally, 3 ∈ {m | m + m = 0} is a falsehood because 3 + 3 = 0 is a falsehood.

Mathematicians use sets to hold sets. Here are examples:

{ S | False } (the empty set, {})

{ S | 4 ∈ S } (sets containing 4)

{ S | not(S = {}) } (the nonempty sets)

{ S | True } (the set of all sets --- call this the universal set, U.)

{ S | S ∈ S } (those sets that contain themselves (!))

Some examples of membership predicates for the above sets:

{4,5,6} ∈ {} is a falsehood

{4,5,6} ∈ {S | 4 ∈ S} is a fact

{4,5,6} ∈ {S | not(S = {})} is a fact

{4,5,6} ∈ {S | S ∈ S} is a falsehood

{} ∈ U is a fact

{} ∈ {S | 4 ∈ S} is a falsehood

U ∈ U is a fact (!)

U ∈ {S | S ∈ S} is a fact

The last two facts intrigued Bertrand Russell, and he defined this set, the Russell set:

Let R be the name of the set, {S | not(S ∈ S)}

R is the collection of the sets that do not contain themselves. Clearly, {4,5,6} ∈ R is a fact, and U ∈ R is a falsehood. But what about R ∈ R?

RUSSELL'S PARADOX:

    Let  R  be  {S | not(S ∈ S)}
                        ↑
R ∈ R                   ↑ 
is a fact exactly when not(R ∈ R)
                       is a fact, that is, exactly when  R ∈ R is a falsehood

R ∈ R  is a paradox!

Russell's Paradox is generated by the combination of S ∈ S (self-inspection) and not ("flipping" the answer) --- a real disaster!

We have uncovered something that mathematics cannot do --- define a set of sets that don't hold themselves. Modern versions of set theory essentially disallow self-inspection to remove this form of paradox.

Exercises

Which of the following are facts? falsehoods? contingencies?

(2 + 1) = 4
(2 + x) = 4
x = x
Use a predicate to define the set of all sets that do not contain the number 4. Use a predicate to define the set of all sets that contain 3 or 4.
Review the Barber's Paradox in the Discussion Question at the end of the previous section.
- For each person, P, in the village, define this set:
  CutsP = {Q | Q's hair is cut by P}
- What does X ∈ CutsY mean to say? When is it that X ∈ CutsX, for person X?
- Fill this in: CutsBarber = { P | not(....) }. Explain why CutsBarber is the ``Russell set''.
We saw facts like 4 ∈ {4,5,6} and 4 ∈ {n | n > 0}. It looks silly to ask {n | n > 0} ∈ 4. Why?
In modern set theory, numbers like 0,1,2,... are themselves sets:

0 is the name (abbreviation) for {}
1 is the name (abbreviation) for {{}}, that is, {0}.
2 is the name (abbreviation) for {{}, {{}}}, that is, {0,1}.
3 is the name (abbreviation) for {0,1,2}, and so on.
With these meanings for numbers, are the following facts or falsehoods?
0 ∈ 1 2 ∈ 4
{0} ∈ 2
With these meanings for numbers, the predicate n < m is actually n ∈ m. Explain why.
Define set D = {S | (S ∈ S) or not(S ∈ S)}. Are these facts: U ∈ D? {} ∈ D? Is D ∈ D a paradox?

Computer programs can do self-inspection

Russell's paradox follows us into computer science and causes Turing's halting problem, and we'll see why. First, we note that self-inspection is an important activity in computing. On Page 182, MacCormick shows a picture of MS Word inspecting itself:

That is, a human loaded and started WIN-WORD.EXE on the computer's processor, and the human told the running program to load as the input, WIN-WORD.EXE.

Editor programs, like MS Word, are capable of inspecting all programs, including themselves. Compiler/interpreter/IDE programs are capable of inspecting/compiling/running most all programs, including themselves. Error-finding and performance-analyzing and correctness-checking programs are capable of inspecting/analyzing a wide range of programs, including themselves. A key feature of computer programs, one that leads us into artificial-intelligence-like areas, is that they are capable of inspecting other programs, including themselves.

How far can these inspections go?

Discussion question

Use MS Word or some other text editor (notepad++ or vim or emacs) to look the .exe files of MS Word, Chrome, gcc, etc. Say that you use MS Word to change the contents of the .exe file for MS word --- what would happen? (Note: don't really do this!)

Requirements, implementations, convergence and divergence

We want to develop carefully the answers to the previous questions:

An algorithm is a sequence of instructions for satisfying (implementing) a requirement (specification), like the ones we've seen in earlier chapters. A program is an algorithm written in a computer language, that is, the language that a computer processor can read and perform. Computer language consists of long sequences of 1s and 0s. For this reason, a computer program is one long, huge nonnegative number.
A computer processor runs a program by reading its instructions and doing them. A program requires an input (a number or a pair of numbers) of starting data to start its work. To keep it simple, the numbers are always nonnegative integers (0,1,2,...).
The computer processor does the calculations stated in the program. If all goes well, the program converges (halts, terminates) and emits an output, a number. If something goes wrong and no output is emitted, the program diverges (loops, freezes, goes lost, runs forever, hangs).

The above story is simplified, but even text editors, web browsers, etc., still follow these basic principles.

We will write requirements for programs rather precisely, in a set-like format, { input |-> output ... }, to define how a program should transform its input into its output. The output can be either of ↓answer (converges and emits answer) or ↑ (diverges).

Here are some examples:

A requirement for a program that squares an input number, n:
```
{ n |-> ↓(n * n) }
```
It is easy to implement this requirement with a program that obtains its input number n, multiplies it by itself, then emits the output number:
```
INPUT: a number,  n
OUTPUT: a number
ALGORITHM: 1. obtain n  and  calculate  answer = n * n
           2. output  answer
```
Say that the program we write is named square. We say
run(square)(3)↓9
to signify that the computer ran the program with input 3 and converged with output 9.
A requirement for a program that computes square root for some, not all, numbers (remember, our numbers are 0,1,2,3,...):
```
{ n |->  ↓m  iff  m * m = n
    |->  ↑  iff  there is no  m  such that  m * m = n }

Note: iff means "exactly when"
```
This requirement might be implemented by a program that obtains its input and guesses possible answers (0,1,2, etc.) until an output, m is found, if ever. (If not, the program runs forever.)
Say that sqrt is the name of the program we built as just described. Here are two uses of it: run(sqrt)(25)↓5; run(sqrt)(7)↑.
A requirement for a program that computes exponentiation from two input numbers:
```
{ b,e  |-> ↓(b*b*...*b), repeated  e  times,  iff  e > 0
       |-> ↓1  iff  e = 0 }
```
If exp is a program that implements the specification, then we have run(exp)(2,4)↓16 and run(exp)(3,0)↓1.

The requirements make clear that convergence to an answer is some-answer, something; divergence is no-answer, nothing.

Convergence ≠ Divergence

that is, for every computer program, p, and datum, d, run(p)(d)↓answer ≠ run(p)(d)↑, which is the ``first law'' of the computing universe.

The situation in computing is looking like the one in mathematics. We must be alert for paradoxes.

Exercises

Write a requirement, in the style seen above, for a program that tries to subtract 3 from its input and output the answer. But if the input is too small and 3 cannot be subtracted, the program diverges.
Write an algorithm for the square-root requirement seen earlier in this section.
Not all requirements are completely specified. Explain why this one is not:
```
{n |->  ↓1  iff  (n * n) > n }
```
Write an algorithm for this requirement, anyway. What does the algorithm do for those inputs not covered by the above requirement?

Not all requirements can be implemented. Why can't this one:

{ n |->  ↓0  iff  (n + n) = n
    |->  ↓1  iff  (n * n) = n
    |->  ↑   iff  neither of the above two cases apply to  n }

Modify the requirement so that there is an implementation.

Turing's Halting Problem

Remember that a program is an algorithm written in a computer language of 1s and 0s --- a computer program is one long, huge nonnegative number. This means a computer program (number) can be input to another computer program, which can choose to run the input. Here is a requirement that does this:


{ p |->  ↓(a + 1) iff  run(p)(0)↓a
    |->  ↑        iff  run(p)(0)↑ }

The requirement asks that we run input number/program p with datum 0 and add one to its answer, if there is one. There is an easy algorithm that does this:

INPUT: a program, p, coded as an integer
OUTPUT: a number or ↑
ALGORITHM: 
1. do run(p)(0) and wait....
2. if run(p)(0)↓a, then produce output a+1
   (if run(p)(0)↑, then we are still waiting, and there is nothing we can do)

Not all such requirements can be implemented. In particular, we can restate Russell's paradox as this requirement:

RUSSELL'S PARADOX REPHRASED AS A REQUIREMENT:

{ p |->  ↓1 iff  run(p)(p)↑
    |->  ↑  iff  run(p)(p)↓answer }

If some program, named Russell, implemented this requirement, then

run(Russell)(Russell)↓answer  iff  run(Russell)(Russell)↑

which cannot be.

Notice how the specification uses self-inspection (run(p)(p)) and then "flips" the answer --- in particular, ↑ is flipped to ↓1, which goes against intuition. (How can a computer program ever know that another computer program "has gone lost"?)

It is not surprising that the above requirement can never become a program, but its consequences are many. In particular, there is an incredibly useful program that we can now prove is impossible to implement:

TURING'S HALTING PROBLEM:

Is it possible to implement a program that can analyze another
program and its input and predict in advance
whether the program and input will halt (converge to an answer)?

{ p,d |->  ↓1 iff  run(p)(d)↓answer
      |->  ↓0 iff  run(p)(d)↑ }

Unfortunately, the answer is no:

If some program, named Turing, implemented the above requirement,
then we could build the Russell program, like this:

INPUT: a program, p, coded as an integer
OUTPUT: ↓1 or ↑
ALGORITHM: 
1. run(Turing)(p,p) and obtain the output answer, which must be 0 or 1
2. if answer = ↓1 then diverge: ↑
   if answer = ↓0 then produce output ↓1

This generates an implementation of Russell's paradox, which cannot be.

The halting problem states a requirement that can never be implemented by a computer program. Many other requirements are also impossible to implement:

whether a program is free of all run-time errors
whether two programs produce exactly the same outputs from the same inputs
whether a program implements a requirement correctly

It seems that almost all important features of computer programs can never be checked by other computer programs!

In practice, there are still text-editor, analyzer, and error-finding programs that can examine other programs, even themselves. But they cannot answer all the interesting questions about programs that we might like them to answer. For difficult requirements, such as the halting program, the best one might do is build a program that converges to a useful answer some of the time, like the sqrt program seen earlier does, and diverges lots of other times (because the program cannot discover an answer and "goes lost".)

Computing is "constructive mathematics," and since mathematics has its limitations, so does computing.

Exercises

We do not have all the technical tools to answer these problems precisely, but justify, as best as you can, why we will never be able to implement the following requirements:

{ p |->  ↓1  iff  for each input, n, there is some m such that  run(p)(n)↓m
    |->  ↓0  iff  there is some k such that  run(p(k)↑ }

( p,q  |->  ↓1  iff  for all inputs, n,  run(p)(n) = run(q)(n)
       |->  ↓0  iff  there is some k such that  run(p)(k) ≠ run(q)(k) }

We might be tempted to alter the requirement for the Halting Problem to read like this:
```
{ p,d |->  ↓1 iff  p ≠ d  and  run(p)(d)↓answer
      |->  ↓0 iff  p ≠ d  and  run(p)(d)↑
      |->  ↑  iff  p = d }
```
That is, we don't try to answer if we are asked the outcome of run(p)(p). Why is the above requirement still impossible to implement? (Hint: consider the second requirement in the previous question!)

Optional material: The Universal Turing Machine

Alan Turing was an English mathematician, employed by the British government to break the codes the Nazi armies used for communication. Turing and his team succeeded, and the Allies used the decoded Nazi communications to win World War II.

Turing was interested in mathematical games and machines for calculation. His Turing machine was a controller with a read-write head that traversed the squares of an unbounded roll of toilet paper, where only a 0 or a 1 can be written in a square:

Turing realized that the controller could be wired as a "processor" that understood these six instructions:


* move Left/Right one square
* erase the square (make it blank)
* write 0/1 on the square
* if square is blank/0/1, then goto Instruction #n
* goto Instruction #n
* stop

Here is a sample program that scans a binary number from left to right and flips all 0s to 1s and all 1s to 0s:


#1: if square is 0, then goto #4
#2: if square is 1, then goto #6
#3: if square is blank, then goto #9
#4: write 1
#5: goto #7
#6: write 0
#7: move Right
#8: goto #1
#9: stop

 ... => ...

The nine instructions have number codings, and the program can be copied onto a tape as one huge (Base-2) number to be read by the controller. The Universal Turing Machine was configured like this:

The Universal Turing Machine is the origin of

CPU
linear storage
stored program
binary machine language

Turing wrote programs for the Universal Turing machine, and he believed that all facts/knowledge/problems that can be calculated/deduced/solved by a machine can be done by a Universal Turing Machine. This is now known as the Church-Turing thesis. (The "Church" part refers to Alonzo Church, a Princeton mathematician who developed a symbol system, the lambda-calculus, which has the same computational powers as Turing's machines.)

Turing asked: Are there some problems whose solutions cannot be calculated by his machine? This led to his formulation of the Halting Problem, which was developed earlier in this note.

Von Neumann machines

Modern computers use architectures based on Universal Turing Machines. The adaptation is due to Eckart, Mauchley and von Neumann, at the University of Pennsylvania, where they built the ENIAC electronic computer:

The first electronic computers were "non universal" machines, where the computer program was wired into the CPU. The wires were moved around when a new program was desired. The ENIAC people implemented Turing's machine language, stored programs, and "universal" CPUs. That basic architecture, the von Neumann computer architecture, has survived to this day, because it is simple, relatively cheap to manufacture, and satisfies the Church-Turing thesis.

Exercise

Use the instructions for the Universal Turing Machine to write a program for it that will read a binary int and add one to it. The machine starts at the first blank square to the left of the int,

 V
| |1|0|1|1| |   and it moves to the right of the int, 
                till it finds the rightmost digit:
         V             
| |1|0|1|1| |   Then it adds one and does the needed carries 
                until it is finished:
     V                                                      V
| |1|1|0|0| |   Finally, it resets to its start position:  | |1|1|0|0| |

Optional material: The Lambda Calculus

Earlier, we saw that sets are defined in format, {x | P_x}, and a set is ``run'' by doing a membership check. For example, {n | n > 2} defines a set of numbers, and we ``run'' the set by supplying an argument, e.g., 3 ∈ {n | n > 2} => 3 > 2 => True. In this sense, sets are ``programs'' that can be ``run'' and produce True-False ``answers.'' That's the reason why we used set-like expressions to state requirements for programs.

About 100 years ago, Alonzo Church, a philosopher and mathematician at Princeton, saw the same pattern, and he developed a programming language based on sets. (This was before anyone had heard of programs and computers, by the way.) He called his language the lambda calculus.

Church made a set, (x | P_x}, into a program by writing it like this:

(λx (P_x))

and he made a membership check, like e ∈ S, look like this:

S(e)

(Stop --- remember functions from algebra class? A function, like squaring:

square(n) = n * n

is called/run like this: square(3). The answer is 3 * 3 => 9.)

Here is an example of a lambda-calculus program and running it:

Define Square =  (λn (n * n))

Run it: Square(3) =>  (λn (n * n))(3) =>  3 * 3 =>  9

The above example is ``sort of like'' 3 ∈ {n | n * n} but produces a number as an answer, not True/False. The above example is ``sort of like'' square(3) => 3 * 3 => 9, but the names, square and Square, aren't needed; we can just do this:

(λn (n * n))(3) => 3 * 3 => 9

But lambda-calculus can do lots more than sets and ordinary algebra functions can: A program that runs with two arguments takes the arguments one at a time:

Define Mult = (λm (λn (m * n)))
Run it:
Mult(3)(4) =>  (λm (λn (m * n)))(3)(4)
           =>  (λn (3 * n)))(4)
           =>  (3 * 4) =>  12

Lambda-calculus lets programs run with other programs as arguments, like this:


DoTwice =  (λp (λa (p(p(a)))))

DoTwice(Square)(3) =>  (λp (λa (p(p(a)))))(Square)(3)
                   =>  (λa (Square(Square(a))))(3)
                   =>  Square(Square(3))
                   =>  Square( (λn (n * n))(3) )
                   =>  Square(3 * 3) =>  Square(9)
                   =>  (λn (n * n))(9) =>  9 * 9 =>  81

And self-inspection is as easy to state as the earlier examples:


Identity =  (λx (x))

Identity(Identity) =>  (λx (x))(Identity)
                   =>  Identity =  (λx (x))

And we can code this version of Russell's Paradox:

Say that not is a program that behaves like this:
  not(True) =>  False            not(False) =>  True

Define Russell =  (λS (not(S(S))))

Run it:
Russell(Russell) =>  (λS (not(S(S))))(Russell)
                     =>  not(Russell(Russell))
                     =>  not((λS (not(S(S))))(Russell))
                     =>  not(not(Russell(Russell)))
                     =>  not(not(λS (not(S(S))))(Russell))
                     =>  not(not(not(Russell(Russell))))
                     ...

Will the "paradox" ever be decided as True or False? This is an example of divergence --- no answer. It avoids the paradox by refusing to answer! Divergence was not an option in mathematics, but it is an option in computing!

How is this coding different from the specification of Russell's Paradox seen earlier? (Hint: look at the definition of not.)

The previous example also shows the technique one uses to repeat an action over and over in the Lambda-calculus: if op must be repeated, then code (λS (op(S(S))))(λS (op(S(S)))). (There are some details missing; see the exercise at the end of this section.)

Church's Thesis is the claim that all computable activities can be written as lambda-calculus expressions. Later in time, many people proved that any program written in the language of the universal Turing machine can be written in lambda-calculus, and vice-versa. Hence, we have the Church-Turing thesis.

In 1958, John MacCarthy made the lambda-calculus into the computer language, LISP, which became the primary language for the writing of artificial-intelligence programs (an area that MacCarthy helped to invent).

Exercises

Compute these lambda-calculus expressions:
(λx (Square(x * x)))(2) (λx (λy (Square(y)))(x * x))(2) (λx (5))(Square(2)) (λx (Mult(x)(x)))(2 + 1) (λf (f(2)))(Square) (λf (f(3)))(Mult(2)) (λx (x(x)))(λx (x(x))) (λy (2))((λx (x(x)))(λx (x(x)))) DoTwice(DoTwice(Square))(2)
DoTwice(DoTwice(Square))
Pretend that

True is the name of this expression: (λx (λy (x)))
and False is the name of this expression: (λx (λy (y)))
Now, write an expression, named not, such that not(True) => False and not(False) => True:
not = (λb (......))
These codings of True/False act as selector functions: Verify that True(a)(b) => a and that False(a)(b) => b.
Next, pretend that we have this operation, eqZero, such that eqZero(0) => True, and eqZero(n) => False when n > 0. (It is possible to code eqZero with a lambda-calculus expression, but we won't do it here.)
Use eqZero to implement this requirement as a lambda-calculus expression:
```
{ n  |->  Square(n)  iff  n ≠ 0
     |->  ↓1         iff  n = 0  }
```
(Difficult:) If you worked the previous question, you may try this one. Here is a requirement:
{ n |-> ↓(0 + 1 + 2 + ... + n) }
We want a lambda-calculus expression, named Summation, that implements the requirement, e.g., Summation(4) => 10. Complete the definition:

Summation = Repeat (λop (λn (eqZero(n) (....) (....))))
where
Repeat = λop (λa (op(a(a)))(op(a(a))))
Repeat is using the pattern within Russell's paradox to repeat an operation over and over.