Time complexity measures

First, here is some review:

Sorting

Sorting is the process of ordering the elements in an array so that they are in ascending order with respect to the elements' keys.

Selection sort orders an array's elements by repeatedly finding the least element in the unsorted segment of the array and exchanging that element with the leftmost element of the unsorted segment. See Section 8.7.3 for a detailed example.
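The description above can be sketched in Java roughly as follows. This is only an illustration under our own assumptions: the class name SelectionSortSketch, the method name, and the use of an int array are hypothetical choices, and this is not the code from Section 8.7.3.

```java
// Illustrative sketch only; names are hypothetical, not taken from Section 8.7.3.
final class SelectionSortSketch {
    /** Sorts the array in place into ascending order. */
    static void selectionSort(int[] a) {
        // Invariant: everything to the left of index `start` is already sorted.
        for (int start = 0; start < a.length - 1; start++) {
            // Find the index of the least element in the unsorted segment a[start..].
            int least = start;
            for (int i = start + 1; i < a.length; i++) {
                if (a[i] < a[least]) {
                    least = i;
                }
            }
            // Exchange it with the leftmost element of the unsorted segment.
            int tmp = a[start];
            a[start] = a[least];
            a[least] = tmp;
        }
    }
}
```

The two nested loops are the reason this method takes a number of steps proportional to the square of the array's length.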

Insertion sort moves each array element to the left until it finds its proper place in the sorted left end of the array. Again, Section 8.7.3 gives details and examples.
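Here is a comparable sketch of insertion sort, again with hypothetical names and an int array assumed for simplicity; Section 8.7.3 remains the reference for the version used in these notes.

```java
// Illustrative sketch only; names are hypothetical, not taken from Section 8.7.3.
final class InsertionSortSketch {
    /** Sorts the array in place into ascending order. */
    static void insertionSort(int[] a) {
        // Invariant: a[0..i-1] is the sorted left end of the array.
        for (int i = 1; i < a.length; i++) {
            int element = a[i];
            int j = i - 1;
            // Shift larger elements one cell to the right to open a gap.
            while (j >= 0 && a[j] > element) {
                a[j + 1] = a[j];
                j--;
            }
            // Drop the element into its proper place in the sorted left end.
            a[j + 1] = element;
        }
    }
}
```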

Binary search

The reason for sorting an array is so that we can search the array ``quickly.''

An unsorted array is searched by a naive linear search that scans the array elements one by one until the desired element is found. If the array is sorted, we can employ binary search, which brilliantly halves the size of the search space each time it examines one array element. Binary search is described and illustrated in Section 8.7.4.
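To make the difference concrete, here is a minimal Java sketch of both searches that also counts how many array elements each one examines. The class SearchSketch and its static counter named examined are assumptions made for this illustration; they do not come from Section 8.7.4.

```java
// Illustrative sketch only; names are hypothetical, not taken from Section 8.7.4.
final class SearchSketch {
    /** Number of array elements examined by the most recent search. */
    static int examined;

    /** Scans the elements one by one; returns the key's index, or -1 if absent. */
    static int linearSearch(int[] a, int key) {
        examined = 0;
        for (int i = 0; i < a.length; i++) {
            examined++;
            if (a[i] == key) {
                return i;
            }
        }
        return -1;
    }

    /** Requires an array sorted in ascending order; returns the key's index, or -1. */
    static int binarySearch(int[] sorted, int key) {
        examined = 0;
        int low = 0;
        int high = sorted.length - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;  // middle of the remaining search space
            examined++;
            if (sorted[mid] == key) {
                return mid;
            } else if (sorted[mid] < key) {
                low = mid + 1;   // discard the left half
            } else {
                high = mid - 1;  // discard the right half
            }
        }
        return -1;
    }
}
```

For a sorted array holding 0, 1, ..., 127, searching for 127 examines all 128 elements with linearSearch but only 8 elements with binarySearch.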

Binary search is much faster than linear search. Here are some comparisons:

              NUMBER OF ARRAY ELEMENTS EXAMINED:
array size   |     linear search       binary search
             |      (avg. case)         (worst case)
--------------------------------------------------------
        8    |        4                  4
      128    |       64                  8
      256    |      128                  9
     1000    |      500                 11
   100,000   |     50,000               18

Time complexity

The preceding numbers are so startling that they demand closer scrutiny. The equation that defines the number of elements examined by linear search is of course:

L(N) = N / 2
Read this equation as saying, ``for an array of length N, the number of array lookups needed, L(N), equals N / 2.'' On average, the desired element will be found near the middle of the array, so about N/2 elements will be examined.

Binary search has a markedly different behaviour: In the worst case, the number of lookups, B(N), for an array of length N is defined as

B(N) = 1 + B( N/2 )
because searching an array sized N requires one examination of the element in the middle, followed by a binary search of either the left half or the right half of the array, each of which has N/2 elements.

Finally, we note that

B(1) = 1
because a one-element array requires just one element to be examined.
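The two equations for B can be transcribed almost directly into a recursive Java method, which gives a quick way to check the figures in the comparison table. The names below are hypothetical, and one detail is our own assumption: when N is odd, the sketch rounds N/2 up, a choice made only because it reproduces the binary search column of the table.

```java
// Illustrative sketch only; names and the rounding rule are assumptions.
final class LookupCountSketch {
    /** L(N) = N / 2: average number of lookups made by linear search. */
    static int linearLookups(int n) {
        return n / 2;
    }

    /** B(1) = 1 and B(N) = 1 + B(N/2): worst-case lookups made by binary search. */
    static int binaryLookups(int n) {
        if (n <= 1) {
            return 1;  // a one-element array needs one examination
        }
        int half = (n + 1) / 2;  // N/2, rounded up when N is odd (our assumption)
        return 1 + binaryLookups(half);
    }
}
```

With this rounding rule, binaryLookups returns 4, 8, 9, 11, and 18 for arrays of size 8, 128, 256, 1000, and 100,000.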

The calculations in Section 8.7.4 of the earlier lecture show what quantity these equations define. Pretend that the array's length is a power of 2, that is, N = 2^M for some M >= 0. Then we have that

B(2^M) = 1 + B(2^(M-1)),  for  M > 0
B(2^0) = 1
Using a proof by mathematical induction, we can show that these equations simplify to merely
B(N) = log_2 N + 1
(Recall that log_2 N is the base-2 logarithm of N. For example, log_2 8 is 3, because 2^3 equals 8.)
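For readers who want to see the induction spelled out, here is one way it might go, written in LaTeX notation. It assumes, as above, that the array's length is an exact power of 2.

```latex
\begin{align*}
\text{Claim: } & B(2^M) = M + 1 \quad \text{for all } M \ge 0. \\
\text{Base case } (M = 0)\text{: } & B(2^0) = 1 = 0 + 1. \\
\text{Inductive step } (M > 0)\text{: } & B(2^M) = 1 + B(2^{M-1})
    = 1 + \bigl((M - 1) + 1\bigr) = M + 1. \\
\text{Conclusion: } & N = 2^M \text{ gives } M = \log_2 N,
    \text{ so } B(N) = \log_2 N + 1.
\end{align*}
```

The inductive step uses the claim for M - 1, which is exactly what the induction hypothesis supplies.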

The above analysis shows that binary search can handle huge sorted arrays remarkably well. This behaviour is typical of divide-and-conquer algorithms that repeatedly halve the search space.

Basic time-complexity classes

Linear search has linear-time complexity; binary search has log-time complexity. There are other classes of complexity, for example, quadratic-time complexity and exponential-time complexity.

Here is a table that provides some intuition about the running speeds of algorithms that belong to these classes:

                Logarithmic:   Linear:   Quadratic:   Exponential:
array size   |    log_2 N         N         N^2           2^N
---------------------------------------------------------------------
        8    |       3            8          64           256
      128    |       7          128      16,384       3.4*10^38
      256    |       8          256      65,536       1.15*10^77
     1000    |      10         1000     1 million     1.07*10^301
   100,000   |      17      100,000    10 billion        ....

Binary search and similar algorithms that halve the search space at each step have log N time complexity; we say that they have ``order log N'' and sometimes write this as O(log N) (the ``O'' means ``order of'').

Linear search and the Java compiler (and most compilers) have linear time complexity: O(N).

Selection and insertion sorts have quadratic time complexity: O(N^2).

Exhaustive searching and naive game-playing programs (e.g., chess) have exponential time complexity: O(2^N). (To get a rough intuition for this time-complexity class, consider a naive sorting algorithm that builds up all possible orderings of the elements in an array and keeps those orderings that it is building that remain sorted. If the elements are distinct, every subset of them has exactly one sorted ordering, so this would generate roughly 2^N combinations of the array's elements!)
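One possible reading of that naive algorithm is sketched below: it extends a partial ordering by every element that keeps it sorted and counts each partial ordering it builds. This is our own interpretation, with hypothetical names, and it assumes the array's elements are distinct.

```java
// Illustrative sketch only; one interpretation of the naive algorithm, with hypothetical names.
final class SortedOrderingsSketch {
    /** Counts every partial ordering the naive search builds and keeps (elements assumed distinct). */
    static long countSortedOrderings(int[] a) {
        // Long.MIN_VALUE is below every int, so any element may start an ordering.
        return extend(a, Long.MIN_VALUE);
    }

    private static long extend(int[] a, long last) {
        long count = 1;  // the ordering built so far (possibly the empty one)
        for (int element : a) {
            // Keep only the extensions that remain sorted.
            if (element > last) {
                count += extend(a, element);
            }
        }
        return count;
    }
}
```

For an array of N distinct elements the count comes out to 2^N, for instance 8 for a three-element array, because each sorted partial ordering corresponds to one subset of the elements.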

In practice, an end user becomes impatient waiting on a result from an algorithm that has quadratic time complexity or slower. This poses a problem for sorting algorithms. Fortunately, there are clever variants on sorting that employ divide-and-conquer. These algorithms run in O(N * log_2 N) time (``order N log N''). Merge sort and quicksort are two elegant examples (quicksort attains this bound on average; its worst case is quadratic).

See Section 8.7.6 for the details of merge sort and quicksort. The basic idea, in the form that merge sort uses, is this:

  1. Divide the unsorted array, of size N, into two halves.
  2. Sort each subarray, each of size N/2.
  3. Merge the two sorted subarrays into the sorted, full array.
Incredibly, the division of labor speeds the sorting: Look at the above table to see that sorting two arrays of size N/2 goes much faster than sorting one array of size N! With a quadratic-time method, the two halves cost about 2 * (N/2)^2 = N^2/2 steps, half the cost of sorting the whole array at once, and dividing each half again compounds the savings. And the merging of two sorted arrays takes only linear time. This goes faster than quadratic time.
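The three steps can be sketched in Java as follows. This is a minimal illustration with hypothetical names that copies subarrays for clarity rather than efficiency; the version in Section 8.7.6 may well differ.

```java
import java.util.Arrays;

// Illustrative sketch only; names are hypothetical, not taken from Section 8.7.6.
final class MergeSortSketch {
    /** Returns a new array holding the elements of `a` in ascending order. */
    static int[] mergeSort(int[] a) {
        if (a.length <= 1) {
            return Arrays.copyOf(a, a.length);  // nothing to sort
        }
        // Step 1: divide the array into two halves.
        int mid = a.length / 2;
        // Step 2: sort each subarray.
        int[] left = mergeSort(Arrays.copyOfRange(a, 0, mid));
        int[] right = mergeSort(Arrays.copyOfRange(a, mid, a.length));
        // Step 3: merge the two sorted subarrays.
        return merge(left, right);
    }

    /** Merges two sorted arrays in linear time. */
    static int[] merge(int[] left, int[] right) {
        int[] result = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        while (i < left.length && j < right.length) {
            // Copy the smaller front element; ties favor the left array.
            result[k++] = (left[i] <= right[j]) ? left[i++] : right[j++];
        }
        while (i < left.length) {
            result[k++] = left[i++];
        }
        while (j < right.length) {
            result[k++] = right[j++];
        }
        return result;
    }
}
```

Each call to merge touches every element once, and the halving produces about log_2 N levels of merging, which is where the ``order N log N'' behaviour comes from.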

The complexity class, O(N * log_2 N), of merge sort and quicksort is slower than linear time but faster than quadratic time. In practice, quicksort is the algorithm of choice for sorting arrays whose elements are completely mixed up.