Computing Science: the study of how to solve problems efficiently


(It's not engineering, not technology --- it's science: the application of methodology and experimentation to obtain knowledge of a phenomenon.)
  1. Solvability: computability theory
  2. Efficiency: computational complexity theory
Modern computing technology is a consequence of computing science....


Solvability: Computability theory

In the 19th century, mathematicians were the computers; they computed with deduction laws. They computed proofs, which output new knowledge from input knowledge (hypotheses).

Have you taken a Geometry class? If so, you computed in the classical way. A tiny example: deduction laws for "greater-than":

===================================================

         E1 > E2                              E1 > E2    E2 > E3
(add):------------------    (transitivity): -----------------------
       E1 + N > E2 + N                             E1 > E3

===================================================
Two proofs using these laws:
===================================================

PROVE: |-  x + 2 > x                   PROVE: a > 0, a > b |-  a + a > b

1.   2 > 0          arithmetic fact    1.  a > 0          hypothesis
2.   x + 2 > x + 0  add 1              2.  a > b          hypothesis
3.   x + 2 > x      arithmetic on 2    3.  a + a > a + 0  add 1
                                       4.  a + 0 > b + 0  add 2
                                       5.  a + a > b + 0  transitivity 3,4
                                       6.  a + a > b      arithmetic on 5

===================================================
The last line of each proof is the "output", new knowledge generated from hypotheses.

Hilbert's Program

David Hilbert was perhaps the most influential mathematician of the 19th (and 20th!) centuries. Hilbert believed strongly in the computing approach to mathematics, like above, and in the 1920's he employed many mathematicians to realize "Hilbert's program":

all true facts of mathematics can be computed by proofs using deduction laws.

This implies that all of mathematics can be done by machines(!) that mechanically apply deduction laws in all possible combinations until all true facts are computed as answers.

Will this work out? Is human creativity unnecessary to mathematics?


Gödel's Incompleteness Proof: "The most important result of the 20th century" and the beginning of modern computing science

In 1931, an Austrian PhD student, Kurt Gödel ("Goedel"), showed that Hilbert's program was impossible --- he showed that mathematics was unsolvable --- there are true facts in math that cannot be computed mechanically.

In the process, Gödel invented the modern notions of machine coding, program, recursive computation, and program analyzer (a program that consumes other programs as inputs).

Greatly simplified, here's what Gödel did:

  1. He invented machine coding (like ASCII coding): he represented symbols like x, >, and 2 as numbers (in ASCII, these would be #120, #062, #050), so that a mathematics formula like x > 2 is machine-coded as 120,062,050 --- one long number.

  2. He machine-coded deduction laws, so that proofs are coded as huge numbers.

  3. He wrote computer programs, using logic formulas as a programming language, called the primitive-recursive functions. (It works a lot like Prolog!)

    He wrote programs that (i) check if a number decodes (parses) into a formula; (ii) check if a number decodes (parses) into a proof; (iii) check if a numbered proof proves a numbered formula.

So, every mathematical formula must be a (very huge) number, and every proof of a mathematics fact must be a (very huge) number, so all mathematics facts and their proofs are "embedded" in the nonnegative integers, 0, 1, 2, ....
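
For a concrete feel of machine coding, here is a small Python sketch (an illustration only, using ASCII codes as in the example above rather than Gödel's original coding; the function names are invented for this note). It packs a formula into one long index number and unpacks it again:

===================================================

def encode(formula):
    """machine-code a formula: each character becomes its 3-digit ASCII code,
       and the codes are concatenated into one long int, e.g., 'x>2' -> 120062050"""
    return int("".join("%03d" % ord(ch) for ch in formula))

def decode(number):
    """recover the formula from its index number by splitting it into 3-digit codes"""
    digits = str(number)
    digits = "0" * (-len(digits) % 3) + digits   # restore any leading zeros
    return "".join(chr(int(digits[i:i+3])) for i in range(0, len(digits), 3))

print(encode("x>2"))       # prints 120062050
print(decode(120062050))   # prints x>2

===================================================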

Gödel was ready to answer the big question: Will the deduction laws prove all the true facts of mathematics or just some of them? That is, is Hilbert's program possible?

Gödel studied the deduction laws for schoolboy (Peano) arithmetic: the nonnegative ints with +, *, >, =, and he proved there are true facts of arithmetic that cannot be proved by any (sound) set of deduction laws for arithmetic. Gödel did so by coding this formula, which he showed has some index number, #g:

#g : Formula #g cannot be proved

Formula #g says, ''I cannot be proved!'' Informally stated (see below for details!), here's how Gödel did it:
  1. Since formulas have integer indexes, a formula can look at an index number of another formula and parse and even evaluate the formula that the index represents. What if a formula looks at its own index number?

    Inspired by Russell's paradox (see the end of this paper), Gödel coded this formula in logic:

    One cannot prove the claim that Formula Number __ makes about its own index number.
    
    (Gödel used a free variable ("logical variable"), 'Z', to stand for the __ part: One cannot prove the claim that Formula Number Z makes about its own index number.) The above formula, "the not-provable formula", has an index number, say, #f.

  2. Now Gödel wrote in logic,
    One cannot prove the claim that Formula Number #f makes about its own index number.
    
    The "not-provable formula" is stating that the "not-provable formula" (itself!) is not provable. Gödel showed that this self-referencing claim, Formula #g, is logically equivalent (although not syntactically identical) to One cannot prove the claim that Formula Number #g makes --- that is, "Formula #g cannot be proved."
TECHNICAL NOTES: here are the predicates that lead to Formula #g, written Prolog-style, where we use no functors, only ints for arguments, just like Gödel:
/* These first three predicates must be coded as complete parsers for
   the grammar laws of predicate-logic formulas and their proofs, like
   you did for CIS505 Exercise 7. Gödel did this, first.         */
parseForm(N) :- "index (ascii) number  N  parses into a formula φN".
parseProof(N) :- "index (ascii) number  N  parses into a proof πN".

proves(P, N) :- parseForm(N),  
                parseProof(P),
                "φN is the last line of πP".

/* Here are some baby examples to help you understand the uses of 
   the index numbers of formulas and proofs:

   The formula,  5 > 2,  is a true fact of arithmetic.  Also, it can be proved
   using Peano's deduction laws.  Say that the (ascii) index number of  5 > 2
   is  #88.

   The formula,  Z > 2,  uses a free variable ("logical variable"), Z.
   The formula works like a function --- when we substitute for  Z,  it is
   like calling the function.  For example, if we substitute  1  for  Z,
   we get  1 > 2,  which is false.  (And, there is no proof of  1 > 2 !)
   Say that the index number of  Z > 2  is  #99.

   We can do this trick:  substitute index number, #99, for Z  in  Z > 2:
   99 > 2.  This is a true fact.  It also has a proof. 
   The formula has "examined" its own index!

   Consider this formula:  ∃p proves(p, Z).  The formula is a function
   that says "the formula whose index is  Z  has a proof using 
   Peano's deduction rules."  Say that this formula's index number is #777.

   Consider this call/substitution:  ∃p proves(p, 88) --- "formula
   #88 has a proof".  There is indeed a proof of Formula #88 (viz., 5 > 2), 
   so this formula is a true fact.
   Consider this call/substitution:  ∃p proves(p, 777)  --- WHAT DOES
   IT SAY??   Formula #777 is "examining" itself!   Can you guess: is the
   formula a true fact?   Is it false?   (It must be one or the other!)
   What is the consequence if the formula is true?  If it is false?
   
   Now we are ready to study the rest of Gödel's development:
*/
   

/* How to substitute Arg for free variable  Z  in formula #M to get formula #N:
   it's like a function call:  φN =  φM(Arg) */
substitute(Arg, M, N) :- parseForm(M),  parseForm(N),
                         "[Arg/Z]φM  =  φN".
                         /* this must be coded like a parser */
          
/*  Gödel wrote this formula that examines itself. It is like  
    (X X)  in lambda-calculus.  M is the index of formula  φX(X): */
self(X, M) :-  substitute(X, X, M).

/* How to say that a self-formula  Z  CANNOT be proved:
   Z  is a free ("logical") variable --- a "parameter" of a "function".
   In lambda-calculus, this clause is coded  (lam Z. F(Z Z)),
   where  F  is the  "not-provable"  part.    */
notProveSelf(Z) :-  not(∃P ∃N (self(Z, N), proves(P, N))).

/* The *right-hand side* of  notProveSelf  is a formula in logic, and
   it has an index number --- say it's  #f :  φ#f <=> notProveSelf(Z) 
   Here is the "I cannot be proved" formula,  notProveSelf(#f):

      not(∃P ∃N (self(#f, N), proves(P, N)))

   The formula has a *unique* index,  #g,  so self(#f, #g)  holds true.
   Hence,  ∃N self(#f, N)  holds true, and  N  *must* be  #g.  
   So, the above formula, Formula #g, is logically equivalent 
  (but not syntactically identical) to
      not(∃P proves(P, #g))).

   Note: in logic, there are many logically equivalent formulas
   that are not syntactically identical, e.g.,   A  <=>  A ∨ A.

   Formula #g says "I am true exactly when I am not provable".
   It is like this lambda-calculus expression:
     (lam X. F(X X))(lam X. F(X X)).
   Stephen Kleene discovered this pattern and used it in the lambda
   calculus and also in his Second Recursion Theorem for the μ-recursive functions.
*/
It's time to use Formula #g.

The deduction laws for arithmetic (Peano's deduction laws) build proofs of only true formulas --- the laws are sound. Now, if the deduction laws can build proofs for all true formulas of arithmetic, then the laws are complete. Can the deduction laws for arithmetic be both sound and complete? For the sake of discussion, assume they are --- what happens??

Alas, Formula #g is a logical bomb that blows up our assumption of completeness. Once again, we start with this logical equivalence:

Formula #g <=> Formula #g cannot be proved

Suppose the deduction laws can prove Formula #g; then, by the equivalence, Formula #g is false --- but sound laws never prove a false formula. So the laws cannot prove Formula #g; but then, by the equivalence again, Formula #g is true. The only way to escape the contradictions is to conclude that Formula #g is true and cannot be proved:

Any set of sound deduction laws that extend Peano arithmetic must be incomplete.


Turing machines and von Neumann machines

Gödel's work shattered Hilbert's program and it generated an intensive study of what machines might (and might not) be capable of solving. One person who took interest was Alan Turing.
Turing was an English mathematician, employed by the British government to break the codes the Nazi armies used for communication. Turing and his team succeeded, and the Allies used the decoded Nazi communications to win World War II.

Turing was interested in mathematical games, and he was interested in the machine codings of Gödel. He conceived a machine for working with the codes. His Turing machine was a controller with a read-write head that traverses the squares of an unbounded roll of toilet paper, where each square is blank or holds a single 0 or 1:

    [figure: the controller and its read-write head positioned over one square of the unbounded tape]
Turing realized that the controller could be "wired" as a "processor" that understood this instruction set:
move Left/Right one square
erase the square (make it blank)
write 0/1 on the square
if square is blank/0/1, then goto Instruction #n
goto Instruction #n
stop
Here is a sample program that scans a binary number from left to right and flips all 0s to 1s and all 1s to 0s:
#1: if square is 0, then goto #4
#2: if square is 1, then goto #6
#3: if square is blank, then goto #9
#4: write 1
#5: goto #7
#6: write 0
#7: move Right
#8: goto #1
#9: stop

These instructions are given number codings, so that a program written in the instructions could be copied on a tape and read by a controller.
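
To see the instruction set in action, here is a small Python interpreter for it (a sketch for illustration, not Turing's own formulation; the dict representation of programs and tapes is invented here), running the flip program above on the tape 1 0 1 1:

===================================================

def run(program, tape, position=0):
    """program: dict of numbered instructions; tape: dict from square index to '0'/'1'
       (squares never written are blank).  Returns the final tape as a string."""
    pc = 1                                     # start at instruction #1
    while True:
        op = program[pc]
        if op == "stop":
            break
        kind, arg = op
        square = tape.get(position, " ")       # blank if nothing has been written there
        if kind == "write":
            tape[position] = arg
        elif kind == "erase":
            tape.pop(position, None)
        elif kind == "move":
            position += 1 if arg == "Right" else -1
        elif kind == "goto":
            pc = arg
            continue
        elif kind == "if":                     # ("if", (symbol, target instruction))
            symbol, target = arg
            if square == symbol:
                pc = target
                continue
        pc += 1                                # otherwise fall through to the next instruction
    return "".join(tape.get(i, " ") for i in range(min(tape), max(tape) + 1))

# the flip program from the text: 0s become 1s, 1s become 0s, a blank means stop
flip = {1: ("if", ("0", 4)),  2: ("if", ("1", 6)),  3: ("if", (" ", 9)),
        4: ("write", "1"),    5: ("goto", 7),       6: ("write", "0"),
        7: ("move", "Right"), 8: ("goto", 1),       9: "stop"}

print(run(flip, {0: "1", 1: "0", 2: "1", 3: "1"}))   # prints 0100

===================================================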

The Universal Turing Machine was configured like this:

    [figure: the Universal Turing Machine --- the coded program and its data stored together on the tape, executed by one general-purpose controller]
Gödel's machine codings were used to represent a program as a long sequence of 0s and 1s. The Universal Turing Machine is the origin of
  1. CPU
  2. linear storage
  3. stored program
  4. binary machine language

Turing wrote programs for the Universal Turing machine, and all the programs of Gödel's primitive-recursive-function language were coded for the Turing machine. Turing believed that all facts/knowledge/problems that can be calculated/deduced/solved by a machine can be done by a Universal Turing Machine. This is now known as the Church-Turing thesis.

(The "Church" part refers to Alonzo Church, a Princeton mathematician who developed a symbol system, the lambda-calculus, which has the same computational powers as Turing's machines.)

Turing asked: Are there some "mechanical problems" whose solutions cannot be calculated by his machine?

Here is the "mechanical problem" Turing considered --- the first example of a program analyzer: Is there a program that can read the machine coding of any other program and correctly predict whether that program will loop forever when it runs?

This is called the halting problem.

Turing showed that the answer is NO --- there is no such program.

TECHNICAL ASIDE: Gödel showed that there is true mathematical knowledge that cannot be mechanically discovered; Turing asked whether there are machine problems that cannot be programmed. There is this analogy between Gödel's approach and Turing's:

    Gödel                      Turing
  --------------------------------------------
    logical formula       ==   program
    proof building        ==   execution
    formula has a proof   ==   program halts when executed
    formula has no proof  ==   program loops when executed
Gödel's counterexample was the formula, "I cannot be proved"; Turing's will be the program, "I halt exactly when I loop."

Turing showed that it is impossible to build a program analyzer that predicts whether a program will halt when executed.

  1. If such an analyzer, H, existed, it would have a machine coding, say, #H, and it would have to behave like this:
    H(#P) = print 1, if  program #P halts when it runs
            print 0, if  program #P loops when it runs
    
    (If Program #P itself needs input data, the input data number is appended on the tape to the end of int #P.)
  2. We use analyzer program #H to build this ``evil-twin program'', E:
    E(#P) = start program #H with #P on the same tape as the input.  
            Wait for the answer.
            If the answer's 1, then  LOOP!
            If the answer's 0, then  print 1
    
  3. Now, what is the outcome of running program E with its own coding, #E?
    E(#E) loops exactly when H(#E) prints 1, that is, when E(#E) halts! And E(#E) halts, printing 1, exactly when H(#E) prints 0, that is, when E(#E) loops! Both cases are impossible --- program H cannot exist, because it would let us easily build the evil-twin program E and produce the logical (and physical!) impossibility just described.
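
Here is the same argument as a Python sketch. The function halts below stands for the hypothetical analyzer H (it is given two arguments, the program text and its input, which is the standard presentation; the text above appends the input to #P on the tape), and evil_twin plays the role of program E:

===================================================

def halts(program_text, input_text):
    """the hypothetical analyzer H: True if the program halts on that input,
       False if it loops forever.  (Turing proved no such code can be written.)"""
    raise NotImplementedError

def evil_twin(program_text):
    """the evil twin E: ask H about the program run on its own text, then misbehave"""
    if halts(program_text, program_text):   # H answers 1 ("it halts") ...
        while True:                         # ... so LOOP forever!
            pass
    else:                                   # H answers 0 ("it loops") ...
        return 1                            # ... so halt, printing 1

# Feeding evil_twin its own source text (#E) recreates the contradiction:
# it loops exactly when H says it halts, and halts exactly when H says it loops.
# Whatever H answers is wrong, so analyzer H cannot exist.

===================================================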
The halting problem specifies a problem that can never be solved by a machine. As a consequence, many other program-analysis problems are readily proved impossible to implement:
  1. whether a program is free of all run-time errors
  2. whether two programs produce exactly the same outputs from the same inputs
  3. whether a program matches a logical requirement of correctness behavior.
None of these problems are solvable in the general case, but computer scientists have identified subcases of the above problems and have developed programs/tools/logics that can do useful program analysis. This is what I do for my paychecks (in addition to teaching, doing paperwork and administration, etc.).

Von Neumann machines

Modern computers use architectures based on Universal Turing Machines. The adaptation is due to Eckert, Mauchly, and von Neumann, at the University of Pennsylvania, where they built the ENIAC electronic computer.

The first electronic computers were "non-universal" machines, where the computer program was wired into the CPU. The wires were moved around when a new program was desired. The ENIAC people implemented Gödel's and Turing's machine language, stored programs, and "universal" CPUs.

Your computers and electronic gadgets are the offspring of Gödel's and Turing's thoughts.

The developments described here crystallized as computability theory.

Classes of solvable problems

Gödel and Turing showed that many critical math and programming problems will never be solved by machine. But many can be, and the classes of solvable problems have been intensively studied.

First, think of a problem as a math function: you give an input argument to the function, and the function tells you an answer (output). The function defines what you must solve/implement. Now, which math functions can be mechanically computed by programs? Here are two famous groups that can:

  1. the primitive-recursive functions (Gödel): These are functions on nonnegative integer inputs that are computed by 0, +1, calling other primitive-recursive functions, and recursive call (mathematical induction) with arguments that are smaller by one. This class of functions includes most of arithmetic, and their programs always halt --- an input will always produce an output.

  2. the μ-recursive functions (S.C. Kleene): These are the primitive-recursive functions plus one more operation, minimalization: (μX P(X)) (``find the least nonnegative int, X, such that true-false property, P(X), holds true''). This class of functions matches exactly the functions computed by programs written for Turing machines. The programs are not guaranteed to halt.
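
To make the two classes concrete, here is a small Python sketch (an illustration of the ideas, not Gödel's or Kleene's notation; the function names are invented here). add and mult use only zero, +1, and recursion on an argument that shrinks by one, in the spirit of the primitive-recursive functions; least is minimalization, the extra operator of the μ-recursive functions, and it loops forever if no witness exists:

===================================================

def add(x, y):
    """primitive recursion on y:  add(x, 0) = x,  add(x, y+1) = add(x, y) + 1"""
    return x if y == 0 else add(x, y - 1) + 1

def mult(x, y):
    """primitive recursion on y:  mult(x, 0) = 0,  mult(x, y+1) = add(mult(x, y), x)"""
    return 0 if y == 0 else add(mult(x, y - 1), x)

def least(p):
    """minimalization (μX p(X)): search 0, 1, 2, ... for the first X with p(X).
       Halts if such an X exists; loops forever otherwise."""
    x = 0
    while not p(x):
        x = x + 1
    return x

# example: 2+3, 4*5, and the least int whose square exceeds 50
print(add(2, 3), mult(4, 5), least(lambda x: mult(x, x) > 50))   # prints 5 20 8

===================================================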
Within these two groups lie many subclasses: (i) within the primitive-recursive functions lies the Grzegorczyk hierarchy of solvability; (ii) within the μ-recursive functions lies the modern time-complexity hierarchy; see the next section.


Efficiency: Computational Complexity Theory

Say that a problem can be solved by a computer (that is, it is one of the μ-recursive functions) --- how good is the solution (program)? How fast? How much storage space is needed? Is the solution the best possible for solving the problem?

These questions are studied within computational complexity theory. Their answers define subclasses of the μ-recursive functions. We learn about this topic by studying some simple algorithms that work with arrays.

Linear-time algorithms

Say you have a database that is an array of int keys connected to data records. The keys aren't arranged in any special way, so finding a data record by its key requires a linear search of an array of ints, e.g., find key 8 in array, [6, 10, 2, 4, 12, 16, 8, 14]:

Step 1:  V
        [6, 10, 2, 4, 12, 16, 8, 14]

Step 2:      V 
        [6, 10, 2, 4, 12, 16, 8, 14]
...

Step 7:                       V
        [6, 10, 2, 4, 12, 16, 8, 14]
We search the array from front to back. How fast is this algorithm? Without worrying about speeds of chips, we measure the speed of an array algorithm by how many array lookups/updates it must do.

Obviously, with an array of length 8, at most 8 lookups are needed; in general, if the array has length N, at most N lookups are needed. This is the worst-case time complexity, and in this case it is linearly proportional to the length of the array. We say that sequential search is a linear-time algorithm, order-N.

Here is a table of how a linear-time algorithm performs:

===================================================

array size, N        worst-case time complexity, in terms of lookups
----------------------------------------------
64                   64
512                  512
10,000               10,000

===================================================
A linear-time algorithm looks "slow" to its user but is tolerable if the data structure is not "too large".

When you write a one-loop program that counts through the elements of an array, your algorithm runs in linear time.
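
Coded as a one-loop Python function (a sketch; keys stands for the array of database keys):

===================================================

def linear_search(keys, wanted):
    """scan the array front to back; at most len(keys) lookups"""
    for i in range(len(keys)):
        if keys[i] == wanted:      # one lookup per loop iteration
            return i               # found: return its position
    return -1                      # not found, after N lookups

print(linear_search([6, 10, 2, 4, 12, 16, 8, 14], 8))   # prints 6  (positions count from 0)

===================================================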

Log-time algorithms

What if the array of keys was sorted so that the keys were in ascending order? In this case, we can use a "dictionary lookup" algorithm, called binary search, which jumps into the middle of the array, looks at the key there, and then jumps into the middle of the left half or the right half, depending on whether the desired key is less than or greater than the key in the middle:
Find 8 in  [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]:

Step 1: look in middle:     V
           [2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22]

Step 2: look in middle of left half:     V
                                  [2, 4, 6, 8, 10 ... ]

Step 3: look in middle of left's right subhalf:    V
                                            [ ..., 8, 10, ... ]
The subarray searched shrinks by half each time there is a lookup, and the key is quickly found --- in the worst case, at most log₂N lookups are needed to find the key. (Recall that log₂N = M means that 2^M = N.) This is log-time, order log-N. Here's a table:
===================================================

array size, N   linear-time  log-time   
----------------------------------------------
64               64          6
512              512         8
10,000           10,000      12

===================================================
The speed-up is spectacular; this algorithm is markedly faster in theory and in practice. In practice, regardless of how a database is configured, the search operation must be order log-time or faster.

When you write an algorithm that "discards" half of a data structure during each processing step, you are writing a log-time algorithm.
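
Here is binary search as a Python sketch; the array must already be sorted:

===================================================

def binary_search(keys, wanted):
    """keys is sorted ascending; at most about log2(len(keys)) lookups"""
    low, high = 0, len(keys) - 1
    while low <= high:
        middle = (low + high) // 2     # jump into the middle of the remaining piece
        if keys[middle] == wanted:
            return middle
        elif keys[middle] < wanted:
            low = middle + 1           # discard the left half
        else:
            high = middle - 1          # discard the right half
    return -1

print(binary_search([2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22], 8))   # prints 3

===================================================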

Quadratic-time algorithms

How much time does it take to sort an array of keys into ascending order? A standard technique, called selection sort, works like this:
Input array                     Output array
---------------------------    --------------------
[6, 10, 2, 4, 12, 16, 8, 14]    []

[6, 10, 4, 12, 16, 8, 14]       [2]

[6, 10, 12, 16, 8, 14]          [2, 4]

[10, 12, 16, 8, 14]             [2, 4, 6]

 . . .

[16]                            [2, 4, 6, 8, 10, 12, 14]

[]                              [2, 4, 6, 8, 10, 12, 14, 16]
The algorithm scans the N-element array N-1 times, and 1/2(N^2 - N) lookups are done; add to this the N updates to the output array. This is a quadratic-time algorithm, order N^2. Here is the comparison table:
===================================================

array size, N   linear     log₂N   N^2         1/2(N^2 + N)
-------------------------------------------------------
64               64        6     4096         2064
512              512       8     262,144      131,200
10,000           10,000    12    100,000,000  50,002,500

===================================================
(Note there isn't a gross difference between the numbers in the last two columns. That's why selection sort is called a quadratic-time algorithm.)

A quadratic-time algorithm runs slowly on even moderately sized data structures. It cannot be used for any interactive system --- you run this kind of algorithm while you go to lunch. When you write a loop-in-a-loop to process an array, you are writing a quadratic algorithm.
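
Here is selection sort as a Python sketch of the input-array/output-array picture above (a real implementation would sort in place, but the loop-in-a-loop shape is the same):

===================================================

def selection_sort(keys):
    """repeatedly extract the smallest remaining key --- a loop inside a loop,
       so roughly N*N/2 lookups for an array of length N"""
    remaining = list(keys)             # the input array (copied, so keys is untouched)
    output = []                        # the output array
    while remaining:
        smallest = remaining[0]
        for k in remaining:            # inner loop: scan for the smallest key
            if k < smallest:
                smallest = k
        remaining.remove(smallest)     # delete it from the input array
        output.append(smallest)        # append it to the output array
    return output

print(selection_sort([6, 10, 2, 4, 12, 16, 8, 14]))   # prints [2, 4, 6, 8, 10, 12, 14, 16]

===================================================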

The standard sorting algorithms are quadratic-time, but there is a clever version of sorting, called "quicksort", that is based on a divide-and-conquer strategy, using a kind of binary search to replace one of the nested loops in the standard sorting algorithms. Quicksort runs in order N*log₂N time (on average):

===================================================

N      log₂N     N^2          N*log₂N
-------------------------------------------------------
64     6       4096         384
512    8       262,144      4096
10,000 12      100,000,000  120,000

===================================================
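
Here is a short quicksort sketch (the simplest version, which copies lists rather than sorting in place; its N*log₂N behaviour is the average case --- an unlucky choice of pivots degrades it to quadratic):

===================================================

def quicksort(keys):
    """divide and conquer: pick a pivot key, split the rest into smaller and
       larger keys, sort the two pieces recursively, and glue the results"""
    if len(keys) <= 1:
        return keys
    pivot = keys[0]
    smaller = [k for k in keys[1:] if k <= pivot]
    larger  = [k for k in keys[1:] if k >  pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([6, 10, 2, 4, 12, 16, 8, 14]))   # prints [2, 4, 6, 8, 10, 12, 14, 16]

===================================================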

Intractable problems and their algorithms

Say you have a maze, and you want to calculate the shortest path of all the paths from the maze's entry to its exit. Or say you have a deck of cards, and you want to print all the shuffles of the deck. Or you have a roadmap of cities, and you want to calculate the shortest tour that lets you visit each city (this is the travelling salesman problem):
[figure: a roadmap of cities and a tour that visits each one --- the travelling salesman's input]
These problems require that you calculate all possible combinations of a data structure before you select the best one. Many important industrial and scientific problems, which ask for optimal solutions to complex constraints, are just the problems stated here.

The problems stated above are solvable --- indeed, here is a Python function that calculates all the permutations (shuffles) of a deck of size-many cards:

===================================================

def permutations(size) :
    """computes a list of all permutations of [1,2,..upto..,size]
       param: size, a nonnegative int
       returns:  answer, a list of all the permutations defined above
    """
    assert size >= 0
    if size == 0 :   
        answer = [[]]  # there is only the "empty permutation" of a size-0 deck
    else :
        sublist = permutations(size - 1)  # compute perms of [1,..upto..,size-1]
        answer = []   # build the answer for  size-many  cards in a deck
        for perm in sublist:  
            # insert  size  in all possible positions of each  perm:
            for i in range(len(perm)+1) :
                answer.append( perm[:i] + [size] + perm[i:] )
    return answer

===================================================
It uses recursion and a nested loop. For example, permutations(3) computes [[3, 2, 1], [2, 3, 1], [2, 1, 3], [3, 1, 2], [1, 3, 2], [1, 2, 3]]. Try this function on your computer and see how large an argument you can supply before the computer refuses to respond.

The solution to the above is written in just a few lines of Prolog, which is an ideal language for calculating all combinations that satisfy a set of constraints:

===================================================

/* permutation(Xs, Zs)  holds true if list Zs is a reordering of list Xs */
permutation([], []).
permutation(Xs, [Z|Zs]) :- select(Z, Xs, Ys),  permutation(Ys, Zs).

/* select(X, HasAnX, HasOneLessX)  "extracts" X from  HasAnX, giving HasOneLessX.  
   It's a built-in Prolog operation, but here's the code:  */
select(X, [X|Rest], Rest).
select(X, [Y|Ys], [Y|Zs]) :- select(X, Ys, Zs).

/* This query computes all the permutations and saves them in  Answer:  */
?- findall(Perm, permutation([0,1,2,...,size], Perm), Answer).

===================================================

One way of sorting a deck of cards is by computing all the shuffles (permutations) and keeping the one that places the cards in order. But this strategy will be very slow! How slow?

Permutation problems, like all-N-shuffles and travelling-salesman-to-N-cities, require 1*2*3*..upto..*N data-structure lookups/updates to compute their answers. This number is abbreviated as N!, called factorial N, and it grows huge quickly:

===================================================

N      log₂N     N^2          N*log₂N   N!
-------------------------------------------------------
64     6       4096         384       126886932185884164103433389335161480802865516174545192198801894375214704230400000000000000
512    8       262,144      4096      347728979313260536328304591754560471199225065564351457034247483155161041206635254347320985033950225364432243311021394545295001702070069013264153113260937941358711864044716186861040899557497361427588282356254968425012480396855239725120562512065555822121708786443620799246550959187232026838081415178588172535280020786313470076859739980965720873849904291373826841584712798618430387338042329771801724767691095019545758986942732515033551529595009876999279553931070378592917099002397061907147143424113252117585950817850896618433994140232823316432187410356341262386332496954319973130407342567282027398579382543048456876800862349928140411905431276197435674603281842530744177527365885721629512253872386613118821540847897493107398381956081763695236422795880296204301770808809477147632428639299038833046264585834888158847387737841843413664892833586209196366979775748895821826924040057845140287522238675082137570315954526727437094904914796782641000740777897919134093393530422760955140211387173650047358347353379234387609261306673773281412893026941927424000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
10,000 12      100,000,000  120,000   ????????????????????????????????????????????????????????????????????????????????

===================================================
No computer, no matter how fast, can use an algorithm that requires factorial time. A problem that has no faster solution than this is called intractable --- unsolvable in practice.

There is a way to improve the performance of permutation problems: remember partial solutions and reuse them. You can do this in the travelling-salesman problem, where the distances over subpaths in the graph can be saved in a table and reused when computing longer paths. (This technique is called dynamic programming.) The resulting algorithm uses about 2^N lookups --- exponential time:

===================================================

N      log₂N     N^2          N*log₂N   2^N
-------------------------------------------------------
64     6       4096         384       18446744073709551616
512    8       262,144      4096      13407807929942597099574024998205846127479365820592393377723561443721764030073546976801874298166903427690031858186486050853753882811946569946433649006084096
10,000 12      100,000,000  120,000   ????????????????????????????????????????????????????????????????????????????????

===================================================
Exponential-time algorithms are still too slow to use in practice. At best, you start one and come back in a day. (I am not kidding; people do this.)
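
Here is a sketch of the remember-and-reuse idea for the travelling salesman, written in Python (this is the Held-Karp algorithm, named here for reference; dist is assumed to be a matrix of pairwise distances, and the tour starts and ends at city 0). The table best holds one entry per (set of cities, last city) pair, which is where the roughly 2^N cost comes from:

===================================================

from itertools import combinations

def shortest_tour(dist):
    """dist[i][j] = distance from city i to city j.  Returns the length of the
       shortest tour that starts at city 0, visits every city once, and returns."""
    n = len(dist)
    # best[(S, j)]: shortest path from city 0 through exactly the cities in set S,
    # ending at city j.  Partial answers are remembered here and reused later.
    best = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for j in S:
                best[(S, j)] = min(best[(S - {j}, k)] + dist[k][j]
                                   for k in S if k != j)
    everyone = frozenset(range(1, n))
    return min(best[(everyone, j)] + dist[j][0] for j in range(1, n))

# four cities with these pairwise distances; the shortest tour has length 80
dist = [[0, 10, 15, 20], [10, 0, 35, 25], [15, 35, 0, 30], [20, 25, 30, 0]]
print(shortest_tour(dist))   # prints 80

===================================================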

Computational complexity theory is the study of the fundamental time and space requirements for solving important problems.


Gödel returns: P = NP?

To prove the incompleteness of arithmetic, Gödel constructed a twisted counterexample. Nonetheless, there are important claims that appear to be true and appear not to have proofs of their truth. In computer science, there is a prime example: the P = NP claim.

First, here is a diagram that summarizes the previous section, where algorithms are organized based on their execution speed for an input set of size N:
[diagram: the time classes, from fastest to slowest --- log₂N, N, N*log₂N, N^2, 2^N, N! --- with an arrow labelled "dynamic programming" from the N!-time class to the 2^N-time class and an arrow labelled "P = NP?" between the exponential-time and polynomial-time classes]
The arrow labelled dynamic programming indicates that some N!-time algorithms can be optimized to 2^N-time, moving the problems they solve to a faster time class. There is another arrow, labelled P = NP?, that we consider here:

A program whose execution consumes N^k-time, where N is the size of the input data set and k > 0, is called a (deterministic) polynomial time algorithm, or P-time for short. If a "mechanical problem" has a P-time solution, the problem itself is called P-time.

If a P-time problem is a "general case" (that is, its algorithm can be used to solve all other P-time problems), then it is P-time complete.

Sorting is a P-time problem; so are parsing and compiling. Here are examples of P-time complete problems: computing whether a circuit built with AND-OR gates outputs a 1 from its inputs; computing the first possible maximal path in a graph. As a rule, P-time problems have "fast enough" solutions on a computer.

Intractable problems, like the travelling salesman problem, have exponential-time (2^N-time) algorithms. There is a technical explanation of exponential-time algorithms in terms of non-deterministic polynomial time --- NP-time for short.

The travelling-salesman problem is an NP-time problem and is general enough to solve all other NP-time problems --- it is NP-complete. Another NP-complete problem is checking whether an arbitrary formula in propositional logic is satisfiable (that is, whether it computes to True for some assignment of truth values to its variables). Many optimization problems (bin packing, maze searching, flight scheduling) are NP-time problems, and there would be huge financial gain if someone could show how to optimize an NP-complete problem into a polynomial-time algorithm (showing that NP-time is really just a subclass of P-time and that NP-time problems therefore possess "fast enough" solutions).

That is,

Does P-time equal NP-time? Or not?
No one has been able to prove or refute the claim that NP-time problems are distinct from P-time problems! It is strongly believed that P ≠ NP is a true fact of computing science; some researchers even suspect that no proof of it can be given in a sound logic.
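
To make the satisfiability example concrete, here is a brute-force checker in Python (a sketch; the formula is represented as a Python function from an assignment of truth values to True/False). It tries all 2^N assignments of N variables --- exponential time. A way of always answering this question in polynomial time would show P = NP:

===================================================

from itertools import product

def satisfiable(formula, variables):
    """brute-force satisfiability: try every assignment of True/False to the variables"""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if formula(assignment):
            return assignment    # found a satisfying assignment
    return None                  # none of the 2^N assignments works: unsatisfiable

# example formula:  (a or b) and (not a or c)
f = lambda v: (v['a'] or v['b']) and ((not v['a']) or v['c'])
print(satisfiable(f, ['a', 'b', 'c']))   # prints {'a': False, 'b': True, 'c': False}

===================================================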


Postlude: Russell's paradox

Gödel's and Turing's constructions are similar in that they both rely on self-reflection --- the ability of a program or definition to discuss itself. This idea is best learned via Russell's paradox.

Since antiquity, there has been the "Liar's Paradox", where a person says, ``I am lying'' or ``This sentence is false.'' These claims use self-reference ("I", "this") and negation. It is difficult to assign a true-false interpretation to either of these two assertions. (Here, it's critical that negation --- not --- computes a unique opposite. Real life and some mathematics treat negation differently. What is the weather when ``It is not sunny''? Clouds? Rain? Fog? Night?)

In the 19th century, mathematicians started working with sets, defining them in terms of membership properties, like this:

===================================================

Pos = { n | n > 0 }

W3 = { s | s is a string of length 3 }

EvenPos = { n |  n > 0  and  n / 2  has remainder 0 }

===================================================
You can ask if a value, v, is a member of a set, S, like this: v ∈ S, e.g.,
3 ∈ Pos  holds True

"xyz" ∈ Pos  holds False,  which means  not("xyz" ∈ Pos)  holds True

"xyz" ∈ W3  holds True
You can use ∈ in the membership property, too:
===================================================

EvenPos = { n |  n ∈ Pos  and  n / 2  has remainder 0 }

===================================================
A computer person thinks of a set as a kind of "function" that is "called" when you ask it a membership question, e.g., 3 ∈ Pos is like calling Pos(3) and expecting back an answer of True or False.

Sets can hold other sets, which makes the concept really interesting. Consider this example:

===================================================

NE = { S | not(S = {}) }

===================================================
NE collects those values --- including sets --- that aren't the empty set. For example,
Pos ∈ NE

2 ∈ NE

but   {} ∈ NE   holds False
Now, is it that
NE ∈ NE   ?
Well, yes, because NE is a nonempty set. So, NE contains itself as well. The membership question just stated is an example of self-reflection. All sets can do self-reflection, e.g., we can ask whether Pos ∈ Pos and get the reasonable answer that not(Pos ∈ Pos) holds True.

Bertrand Russell observed that this convenient way of defining sets via membership properties was flawed in the sense that some sets cannot be defined. Here is Russell's paradox, a clever adaptation of the liar's paradox: define the set of all sets that are not members of themselves,

R = { S | not(S ∈ S) }

and ask: is R ∈ R? Either answer contradicts R's own membership property. (Exercise 2 below asks you to explain why.)

Russell's paradox was the first simple example that showed how a "programming language" can be limited in what it can "compute". The paradox underlies the classic results of unsolvability due to Gödel, Turing, and Church.

Russell explained his paradox to non-mathematicians as the "Barber's Paradox":

There is a village with exactly one barber, who lives in the village. There is a firm rule: the barber cuts the hair of exactly those villagers who do not cut their own hair.
Question: Who cuts the barber's hair?


Exercises

  1. Use the instructions for the Universal Turing Machine to write a program for it that will read a binary int and add one to it. The machine starts at the first blank square to the left of the int,
     V
    | |1|0|1|1| |
    
    and it moves to the right of the int, till it finds the rightmost digit:
             V             
    | |1|0|1|1| |
    
    Then it adds one and does the needed carries until it is finished:
             V
    | |1|1|0|0| |
    
    Finally, it resets to its start position:
     V
    | |1|1|0|0| |
    

  2. Explain as precisely as you can why neither True nor False can be the answer to the question: is R ∈ R?
    Next, study Reference 2 at the end of this paper and match its explanation to the Prolog code I wrote to express Gödel's proof. Also answer this question: "What is ω-incompleteness?"

  3. Run these tests with the permutations function given earlier:
    permutations(0)
    permutations(1)
    permutations(2)
    permutations(3)
    permutations(4)
    Notice that each answer is larger than the previous. How much larger? Well, the size of the answer for permutations(4) is 1*2*3*4 = 24, which is 4 times larger than the answer for permutations(3) (which is sized 1*2*3). And, the size of the answer to permutations(5) will be 5 times larger (and take 5 times as long to compute!) than the one for permutations(4).

    Finish this exercise as follows:

    Compute permutations(5), permutations(6), etc., as far as your computer lets you go. Time each computation. While you wait for each answer to compute and then print, calculate how long the computation will take and how large each answer will be. Remember that each next test case, permutations(i+1), will be i+1 times slower and its answer is i+1 times larger than its predecessor, permutations(i).

    How far did you progress with your experiments until your computer failed to respond? Is the computer lost? tired? worn out? What if you bought a new computer that is twice as fast or even ten times as fast as the one you are now using --- would it help here? Is it realistic to compute all the permutations of an ordinary 52-card deck of cards so that you can prepare for your next trip to Las Vegas? Can't computers do things like this?

  4. If a list of ints is mixed up, we can sort it by finding a permutation of it that is ordered. An int list is ordered if its ints are arranged from smallest to highest, e.g., [1, 2, 4, 5, 6] is ordered, but [6, 5, 4, 1, 2] is not. Write this function in Python:
    def isOrdered(nums) :
        """isOrdered looks at the int list,  nums,  and returns True if 
           nums is ordered.  It returns False if nums is not ordered."""
    
    Use your function with this one:
    def sortList(nums) :
        """sortList  finds the sorted permutation of int list  nums  by brute force"""
        allPermutationsOfIndexes = permutations(len(nums))
        for indexpermutation in allPermutationsOfIndexes :
            possibleanswer = []
            for i in indexpermutation :
                possibleanswer.append(nums[i-1])
            print("Possible sorted version is", possibleanswer)
            if isOrdered(possibleanswer) :
                print("Found it!")
                return possibleanswer
    
    Find a better (provably faster!) way of sorting a list of ints.

    By the way, the above ugly Python code can be coded as a one-liner in Prolog! :

    /* sorted(Xs, Ys)  holds true when  Ys  is the ordered variant of Xs */
    sorted(Xs, Ys) :- permutation(Xs, Ys), ordered(Ys).
    
    /* ...where  ordered(Xs)  holds true if the elements in Xs are ordered by  < */
    ordered([]).
    ordered([X]).
    ordered([X,Y|Rest]) :- X =< Y,  ordered([Y|Rest]).
    

  5. You can code Russell's paradox on the computer and see what happens! Recall that a set, S, is a kind of True-False function and that n ∈ S is coded as a function call, S(n).

    Start with these Python codings of the sets listed earlier:

    def Pos(n) :
        return isinstance(n, int) and n > 0
    
    def W3(s) :
        return isinstance(s, str) and len(s) == 3
    
    def Empty(v) : return False
    
    Try these: Pos(2); Pos("abc"); W3("abc"); W3(Pos); Pos(Pos); Empty(2); Empty(W3).

    A function like Pos can be coded in Python with a lambda abstraction, where the parameter is associated with the code body:

    Pos = lambda n : isinstance(n, int) and n > 0
    
    The function is called the same way, e.g., Pos(3) and Pos(Pos). Here are more examples:
    EvenPos = lambda n : Pos(n) and  n % 2 == 0
    W3 = lambda s : isinstance(s, str) and len(s) == 3
    
    Empty = lambda S : False
    E2 = lambda S : S != S
    
    U = lambda S : True
    NE = lambda S : S != Empty
    
    R = lambda S : not(S(S))
    
    import collections.abc
    R2 = lambda S : isinstance(S, collections.abc.Callable) and not(S(S))
    
    Retry the previous examples and also these:
    EvenPos(Pos)
    U(Empty)
    NE(Empty)
    NE(E2) # what happened ?!  How does Python check equality of functions?
    NE(NE)
    R(NE)
    R(Empty)
    R(W3)
    R(R)  # have we 'escaped' Russell's paradox??
    R(3)
    R2(3)
    R2(R2)
    


References

  1. Herbert Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.
  2. Peter Suber. Gödel's Proof. Earlham College, http://legacy.earlham.edu/~peters/courses/logsys/g-proof.htm, 1997.
  3. Nigel J. Cutland. Computability. Cambridge University Press, 1980.
  4. Robert Sedgewick. Algorithms, 2d ed. Addison-Wesley, 1988.
  5. Niklaus Wirth. Algorithms + Data Structures = Programs. Prentice-Hall, 1976.


David Schmidt   das at ksu.edu     This work is licensed under a Creative Commons License.