CIS 301. Minimal-sum section problem.

Minimal-sum section.

We consider an integer array a[1..n] of length n. A section of a[1..n] is a sub-array a[i..j] of a[1..n] that has "no gaps". For example, the array [-2,3,-6,7,-8,5,-6] has length 7, and some of its sections are

[-2,3,-6]
[-6,7,-8,5] and
[3].

Note that [-2,-6] is not a section of that array, for 3 is missing. Also, [-6,9,7] is not a section, since 9 is not in the original array. For any section of an array, we can form its sum. For example, the sum of [-6,7,-8,5] is -6 + 7 - 8 + 5 = -2, and the sum of [3] is 3.

The minimal-sum section problem is this:

Given an array a[1..n] of length n, compute the sum of one of its sections with minimal sum.

In the example above, there is only one section that has a minimal sum, the section [-8,5,-6], with sum -9. An array can have multiple sections with minimal sum. For example, the array [-9,12,-2,3,-6,7,-8,5,-6] has two sections, [-9] and [-8,5,-6] with minimal sum -9.

There is a naive computational solution to this problem:

Generate all sections of a given array,
compute the sum of each of these sections, and
record the minimal sum found (so far).

This approach, if carried out properly, results in a correct solution to the problem. However, it won't be of practical use for reasonably big arrays. For example, if you had to look through an array of one million integer entries, this algorithm would require non-trivial computation steps of the order 1,000,000,000,000,000,000. Fortunately, one can come up with a really slick problem decomposition and its corresponding invariants such that we can compute this minimal-sum section in linear time (i.e. the program only has to read the entire array once). We present the code for this approach before discussing its idea. This should illustrate that understanding other people's code without any documentation of the involved ideas may be quite difficult (implying that you should always take time to include insightful comments into your source code).

  k = 2;
  t = a[1];
  s = a[1];
  while (k != n + 1) {
     t = min(t + a[k], a[k]);
     s = min(s,t);
     k = k + 1;
  }

Ignoring the construct for array-field access above, this program is written in our core programming language and we mean to show its partial (and later on its total) correctness. If we write this program, say, in Java, we add the usual language-specific jargon, but the essence of the algorithm, and our correctness argument prevail. A possible Java code for this algorithm is:

public class Sum_Section {

  private static KeyboardReader keyboard = new KeyboardReader();

  public static void main(String[] args) {

    int n;

    do { n = keyboard.readInt("number of entries in the array ( > 1): "); }
    while (n <= 1);

    int[] a = new int[n+1];              // we won't use a[0]

      for (int i = 1; i < n+1; ++i) {
        a[i] = keyboard.readInt("array entry number " + i + ": ");
      }

    int t = a[1];
    int s = a[1];
    System.out.println("initial t = " + t + " and initial s = " + s);

    for (int k = 2; k != n+1; ++k) {
      
      if (t+a[k] < a[k]) { t = t+a[k]; }
      else { t = a[k]; }
    
      if (t < s) { s = t; }
      System.out.println("t = " + t + " and s = " + s);
   }

    System.out.println("The sum of a minimal sum-section is " + s);
  }
}

We consider a run of this program:

nunk% java Sum_Section
number of entries in the array ( > 1): 8
array entry number 1: 4
array entry number 2: -8
array entry number 3: 3
array entry number 4: -4
array entry number 5: 8
array entry number 6: -6
array entry number 7: -3
array entry number 8: 5
initial t = 4 and initial s = 4
t = -8 and s = -8
t = -5 and s = -8
t = -9 and s = -9
t = -1 and s = -9
t = -7 and s = -9
t = -10 and s = -10
t = -5 and s = -10
The sum of a minimal sum-section is -10

This run demonstrates that determining a minimal-sum section "by hand" may to tricky. Indeed, our textbook (Example 4.14, page 253) claims that the array above has two minimal-sum sections with value -9. As our program shows, this is incorrect!

We now reason informally why our core program is correct:

  k = 2;
  t = a[1];
  s = a[1];
  while (k != n + 1) {
     t = min(t + a[k], a[k]);
     s = min(s,t);
     k = k + 1;
     // invariant1: t is the minimal-sum of all sections that end with a[k-1]
     // invariant2: s is the minimal-sum of all sections in the array a[1..k-1]
  }

Base case for invariant1: Right after the initial assignment to s, invariant1 is true since t, being a[1], is the minimal-sum of all sections that end in a[k-1]. This is so since k equals 2, and this there is only one section, namely a[1..1], that ends in a[k-1].
Base case for invariant2: Right after the initial assignment to s, invariant2 is true since s, being a[1], is the minimal-sum of all sections in the array a[1..k-1]. This is so since k equals 2, and this there is only one section, namely a[1..1], in the array a[1..k-1].
Inductive step for invariant1: Let us assume that invariant1 holds after l many iterations of the while-body. We need to show that it remains to be true after l+1 many iterations. Note that only the first and last assignment of the while-body are relevant for this invariant. Applying the proof rule for assignment to the last statement of the while-body, invariant1 reads as "t is the minimal-sum of all sections that end with a[k]". Pushing this trough the assignment to s leaves the statement unchanged. But the assignment to t changes this statement to "min(t + a[k], a[k]) is the minimal-sum of all sections that end with a[k]". The question is now whether this follows from our hypothesis! This is indeed the case, but requires some genuine thought. Try it!
Inductive step for invariant2: Let us assume that invariant2 holds after l many iterations of the while-body. We need to show that it remains to be true after l+1 many iterations. Note that all assignments of the while-body are relevant for this invariant. Applying the proof rule for assignment to the last statement of the while-body, invariant1 reads as "s is the minimal-sum of all sections in the array a[1..k]". Pushing this true the assignment to s results in "min(s,t) is the minimal-sum of all sections in the array a[1..k]" Pushing this through the assignment to t, gives us "min(s,min(t + a[k], a[k])) is the minimal-sum of all sections in the array a[1..k]". The question is now whether this follows from our hypothesis that "s is the minimal-sum of all sections in the array a[1..k]"! The subtle thing about showing this implication is that we also need the other invariant, the assertion about t, to make this work. (Why may we assume this invariant to hold?) Try it!

Michael Huth (huth@cis.ksu.edu)