Copyright © 2008 David Schmidt

Chapter 00:
Introduction: Why bother?




That's a good question! After all, you can hack a spreadsheet program or build an interactive game by writing a lot of code, experimenting with it, and patching it. After a while, the program you wrote does more or less what you wanted.

But imagine if the rest of the world worked that same way --- would you want to drive a car or fly an airplane that was ``hacked together''? How about travelling in a bus across a bridge that had already fallen down a few times and was repeatedly patched until it seemed to hold?

Perhaps these analogies are a bit extreme, but professional scientists and engineers rely on planning, design, and calculation so that they are certain the products they want to build will work before anyone starts building them. Professionals rely on an intellectual foundation to plan, design, and calculate. For example,

If you develop significant expertise in software engineering, perhaps you will work at a firm or lab that develops safety-critical software, that is, software upon which people's money or safety or lives depend. (An example is the flight-control software that lives in the nose of a jet and flies it. Another example is the navigation software in a satellite that talks to the GPS device in someone's car.) Software of this nature has to be working correctly from the beginning --- there is no freedom to hack-and-patch the code once it is in use. Software engineers must use algebra and logic to plan and calculate how the software will behave before the software is built and installed.

This story is not an idle one: As you probably know, computer processor chips are planned out in a programming language that looks a lot like C. When Intel designed its first Pentium chip, there was a programming error in one of the chip's coded lookup tables. The coding was burned into hardware, and millions of chips were manufactured. The error was quickly detected --- the chip did not always perform floating-point division correctly. As a result, Intel lost a lot of money recalling the faulty chips and manufacturing a patched replacement. These days, Intel uses techniques for validating chip designs much like the ones you will learn in this course.

If you have taken the CIS501 course, Software Architecture, you know that large systems can be drawn out, or ``blueprinted,'' with diagrams that show the components and how they connect together by means of method calls, event broadcast, and message passing. What we will learn in this course is lower level and more basic --- we will learn how to calculate how the lines of coding in each component compute internal knowledge as they convert inputs into outputs.

To understand the idea, let's think about electronics. When an electronic device, like a TV-set or radio, is designed, the parts of the device and their wirings are drawn out in a diagram called a schematic. Here is a schematic of a vacuum-tube guitar amplifier, the kind used by recording studios to produce a warm sound with good sustain:

[Figure: schematic of a vacuum-tube guitar amplifier]


Notice that the wires to the vacuum tubes (the globes labelled V1 through V5) are labelled with voltages, and there is a table in the lower left corner of the schematic that lists the correct resistances that will hold at each of the wires (``pins'') that connect to the tubes.

The voltage and resistance calculations are both an analysis and a prediction of how the circuit should behave. The numbers were calculated with mathematics and algebra, and if the electronics parts are working correctly, then these voltage, amperage, and resistance levels must occur --- the foundations of electronics (math and algebra) demand it.

When the circuit is built, the actual levels are measured with a multimeter and compared to the calculations; if there is a discrepancy, this is a signal that some part within the circuit is faulty.
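
The calculate-then-measure routine can be sketched in a few lines of Python. This is only an illustration: the two-resistor voltage divider, the component values, the 5% tolerance, and the function names are all assumptions invented for the example, not values taken from the amplifier schematic.

```python
def divider_voltage(v_in, r1, r2):
    """Predicted voltage across r2 in a two-resistor voltage divider (hypothetical circuit)."""
    return v_in * r2 / (r1 + r2)

def within_tolerance(expected, measured, tolerance=0.05):
    """True if measured differs from expected by at most the given fraction of expected."""
    return abs(measured - expected) <= tolerance * abs(expected)

# The prediction is calculated before the circuit is built (assumed values).
expected = divider_voltage(9.0, 1000.0, 2000.0)
print(expected)

# After building, each multimeter reading is compared to the prediction.
print(within_tolerance(expected, 5.9))   # small difference: the parts look fine
print(within_tolerance(expected, 3.0))   # large difference: suspect a faulty part
```

The prediction comes from algebra alone, so a reading that falls outside the tolerance points to a faulty part rather than to a faulty calculation.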

A computer program is a ``circuit'' that ``runs on'' knowledge, and when we design the parts (lines) of a computer program, we should include ``knowledge checks'' that assert the amount of knowledge computed by the program at various points. We will learn how to write and insert such knowledge checks, called assertions, into programs and use the laws of symbolic logic to prove that the assertions will hold true.

You will see many examples of ``program schematics'' in the upcoming chapters. Here are two. First, this little code fragment apparently selects the larger of two integers and prints it:

x = readInt()
y = readInt()
if x > y :
    max = x
else :
    max = y
print max

Think of the program as a ``circuit'' whose lines are ``wired'' together in sequence. Instead of voltage, information or knowledge ``flows'' from one line to the next. Here is the program's ``schematic'' where the internal ``knowledge levels'' are written in symbolic logic and are inserted within the lines of the program, enclosed by set braces, """{ ... }""":
===================================================

x = readInt()
y = readInt()
if x > y :
    """{ 1. x > y     premise  }"""
    max = x
    """{ 1. x > y     premise
      2. max == x  premise
      3. max >= x  algebra 2
      4. max >= y  algebra 1 3
      5. max >= x  ^  max >= y   ^i 3 4
    }"""
else :
    """{ 1. ~(x > y)       premise
      2. y >= x       algebra 1
    }"""
    max = y
    """{ 1. max == y    premise
      2. y >= x      premise
      3. max >= y    algebra 1
      4. max >= x    algebra 1 2
      5. max >= x  ^  max >= y   ^i 4 3
    }"""

"""{ 1. max >= x  ^  max >= y   premise  }"""
print max

===================================================
The last annotation, """{ max >= x ^ max >= y }""", is a symbolic-logic statement that max is guaranteed to be greater than or equal to both inputs. We now know that, once the program is implemented, it will have this logical property on every run.
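
As a rough sketch, the final annotation can also be written as a run-time check with Python's assert statement. The function name larger and the variable name result are assumptions introduced here (result avoids hiding Python's built-in max), and the inputs are passed as arguments instead of being read with readInt(). Unlike the logical proof, which covers all inputs, the assert checks only the inputs that are actually tried.

```python
def larger(x, y):
    """Return a value that is greater than or equal to both x and y."""
    if x > y:
        result = x   # here x > y holds, so result >= x and result >= y
    else:
        result = y   # here y >= x holds, so result >= y and result >= x
    # Run-time version of the final annotation: result >= x ^ result >= y
    assert result >= x and result >= y
    return result

# Try a few sample input pairs; the assert is checked on each call.
for x, y in [(3, 5), (5, 3), (4, 4), (-2, -7)]:
    print(larger(x, y))
```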

Here is a second example, a complete analysis of a function that squares all the integers in an array that is passed to it as its argument:

===================================================

def square(a):
    """Updates array  a  in place so that each of its ints is squared"""
    """{ pre  isArray(a)
      post  forall 0 <= i < len(a), a[i] == a_in[i] * a_in[i]
    }"""
    x = 0
    """{ 1. forall 0 <= _i < len(a), a[_i] == a_in[_i]  premise
      2. forall 0 <= i < len(a), a[i] == a_in[i]  substindex 1
      3. x == 0                                   premise
      4. forall x <= i < len(a), a[i] == a_in[i]  subst 3 2
      5. forall 0 <= i < x, a[i] == a_in[i] * a_in[i]   foralli 3
      6. return 4 5
    }"""
    while x != len(a) :
        """{ invariant  (forall 0 <= i < x, a[i] == a_in[i] * a_in[i]) and (forall x <= i < len(a), a[i] == a_in[i])
          modifies x, a
        }"""
        """{ 1. (forall 0 <= i < x, a[i] == a_in[i] * a_in[i]) and (forall x <= i < len(a), a[i] == a_in[i])     premise
          2. forall 0 <= i < x, a[i] == a_in[i] * a_in[i]    ande 1
          3. forall x <= i < len(a), a[i] == a_in[i]         ande 1
          4. return 2 3
        }"""
        assert x >= 0  # this is an invariant
        assert x < len(a)  # this is an invariant
        a[x] = a[x] * a[x]
        """{ 1.  a[x] == a_old[x] * a_old[x]                        premise
          2. forall 0 <= i < x, a[i] == (a_in[i] * a_in[i])      premise
          3. forall x <= i < len(a_old), a_old[i] == a_in[i]     premise
          4. a_old[x] == a_in[x]                                 foralle x 3
          5. a[x] == a_in[x] * a_in[x]                           subst 4 1
          6. forall 0 <= i < (x+1), a[i] == (a_in[i] * a_in[i])  foralli 2 5
          7. forall (x+1) <= i < len(a), a[i] == a_in[i]         premise
          8. return 6 7
        }"""
        x = x + 1
        """{ 1. forall 0 <= i < (x_old+1), a[i] == (a_in[i] * a_in[i]) premise
          2. forall (x_old + 1) <= i < len(a), a[i] == a_in[i]     premise
          3. x == x_old + 1                                       premise
          4. forall 0 <= i < x, a[i] == (a_in[i] * a_in[i])       subst 3 1
          5. forall x <= i < len(a), a[i] == a_in[i]              subst 3 2
          6. return 4 5
        }"""

    """{ 1. (forall 0 <= i < x, a[i] == (a_in[i] * a_in[i])) and (forall x <= i < len(a), (a[i] == a_in[i]))     premise
      2. not (x != len(a))                      premise
      3. x == len(a)                            algebra 2
      4. forall 0 <= i < x, a[i] == (a_in[i] * a_in[i])   ande 1
      5. forall 0 <= i < len(a), a[i] == (a_in[i] * a_in[i])   subst 3 4
    }"""

===================================================
You are not expected to understand the above, but the function's precondition and postcondition list the requirements of what comes in and the guarantees of what goes out. In this case, ``what goes out'' is an array whose elements are squared --- it is guaranteed to work, because it was analyzed the same way an electronics engineer analyzes a circuit.
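
For comparison, here is a minimal sketch of the same function with the invariant and the postcondition written as run-time assert checks in place of proofs. The local copy a_in is an assumption added for the sketch; it plays the role of the a_in that the annotations use for the array's incoming value. These checks confirm the property only for the arrays actually passed in, whereas the proof above covers every array.

```python
def square(a):
    """Updates list a in place so that each of its ints is squared."""
    a_in = list(a)   # snapshot of the incoming values (added for the checks)
    x = 0
    while x != len(a):
        # Loop invariant: entries before x are squared, entries from x on are unchanged.
        assert all(a[i] == a_in[i] * a_in[i] for i in range(0, x))
        assert all(a[i] == a_in[i] for i in range(x, len(a)))
        a[x] = a[x] * a[x]
        x = x + 1
    # Postcondition: every entry is the square of its incoming value.
    assert all(a[i] == a_in[i] * a_in[i] for i in range(len(a)))

data = [1, -2, 3, 0]
square(data)
print(data)  # [1, 4, 9, 0]
```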