CIS 301. Minimal-sum section problem.
Minimal-sum section.
We consider an integer array a[1..n] of length n. A
section of a[1..n] is a sub-array a[i..j] of a[1..n]
that has "no gaps". For example, the array [-2,3,-6,7,-8,5,-6] has
length 7, and some of its sections are
- [-2,3,-6]
- [-6,7,-8,5] and
- [3].
Note that [-2,-6] is not a section of that array, for 3 is
missing. Also, [-6,9,7] is not a section, since 9 is not in the
original array. For any section of an array, we can form its
sum. For example, the sum of [-6,7,-8,5] is -6 + 7 -
8 + 5 = -2, and the sum of [3] is 3.
The minimal-sum section problem is this:
Given an array a[1..n] of length n, compute the sum of one of its sections
with minimal sum.
In the example above, there is only one section that has a minimal sum,
the section [-8,5,-6], with sum -9. An array can have multiple sections
with minimal sum. For example, the array [-9,12,-2,3,-6,7,-8,5,-6]
has two sections, [-9] and [-8,5,-6] with minimal sum -9.
There is a naive computational solution to this problem:
- Generate all sections of a given array,
- compute the sum of each of these sections, and
- record the minimal sum found (so far).
This approach, if carried out properly, results in a correct solution to
the problem. However, it won't be of practical use for
reasonably big arrays. For example, if you had to look through an array
of one million integer entries, this algorithm would require non-trivial
computation steps of the order 1,000,000,000,000,000,000.
Fortunately, one can come up with a really slick problem decomposition
and its corresponding invariants such that we can compute this minimal-sum
section in linear time (i.e. the program only has to
read the entire array once). We present the code for this approach before
discussing its idea. This should illustrate that understanding other people's
code without any documentation of the involved ideas may be quite difficult
(implying that you should always take time to include insightful comments into
your source code).
k = 2;
t = a[1];
s = a[1];
while (k != n + 1) {
t = min(t + a[k], a[k]);
s = min(s,t);
k = k + 1;
}
Ignoring the construct for array-field access above, this program is
written in our core programming language and we mean to show its
partial (and later on its total) correctness. If we write this program,
say, in Java, we add the usual language-specific jargon, but the essence of
the algorithm, and our correctness argument prevail.
A possible Java code for this algorithm is:
public class Sum_Section {
private static KeyboardReader keyboard = new KeyboardReader();
public static void main(String[] args) {
int n;
do { n = keyboard.readInt("number of entries in the array ( > 1): "); }
while (n <= 1);
int[] a = new int[n+1]; // we won't use a[0]
for (int i = 1; i < n+1; ++i) {
a[i] = keyboard.readInt("array entry number " + i + ": ");
}
int t = a[1];
int s = a[1];
System.out.println("initial t = " + t + " and initial s = " + s);
for (int k = 2; k != n+1; ++k) {
if (t+a[k] < a[k]) { t = t+a[k]; }
else { t = a[k]; }
if (t < s) { s = t; }
System.out.println("t = " + t + " and s = " + s);
}
System.out.println("The sum of a minimal sum-section is " + s);
}
}
We consider a run of this program:
nunk% java Sum_Section
number of entries in the array ( > 1): 8
array entry number 1: 4
array entry number 2: -8
array entry number 3: 3
array entry number 4: -4
array entry number 5: 8
array entry number 6: -6
array entry number 7: -3
array entry number 8: 5
initial t = 4 and initial s = 4
t = -8 and s = -8
t = -5 and s = -8
t = -9 and s = -9
t = -1 and s = -9
t = -7 and s = -9
t = -10 and s = -10
t = -5 and s = -10
The sum of a minimal sum-section is -10
This run demonstrates that determining a minimal-sum section "by hand"
may to tricky. Indeed, our textbook (Example 4.14, page 253) claims
that the array above has two minimal-sum sections with value -9. As our
program shows, this is incorrect!
We now reason informally why our core program is correct:
k = 2;
t = a[1];
s = a[1];
while (k != n + 1) {
t = min(t + a[k], a[k]);
s = min(s,t);
k = k + 1;
// invariant1: t is the minimal-sum of all sections that end with a[k-1]
// invariant2: s is the minimal-sum of all sections in the array a[1..k-1]
}
- Base case for invariant1:
Right after the initial assignment to s, invariant1 is true since t, being
a[1], is the minimal-sum of all sections that end in a[k-1]. This is so
since k equals 2, and this there is only one section, namely a[1..1], that ends
in a[k-1].
- Base case for invariant2:
Right after the initial assignment to s, invariant2 is true since s, being
a[1], is the minimal-sum of all sections in the array a[1..k-1]. This is so
since k equals 2, and this there is only one section, namely a[1..1], in the
array a[1..k-1].
- Inductive step for invariant1:
Let us assume that invariant1 holds after l many iterations of the while-body.
We need to show that it remains to be true after l+1 many iterations.
Note that only the first and last assignment of the while-body are relevant
for this invariant. Applying the proof rule for assignment to the last
statement of the while-body, invariant1 reads as
"t is the minimal-sum of all sections that end with a[k]". Pushing this
trough the assignment to s leaves the statement unchanged. But the
assignment to t changes this statement to
"min(t + a[k], a[k]) is the minimal-sum of all sections that end with a[k]".
The question is now whether this follows from our hypothesis!
This is indeed the case, but requires some genuine thought. Try it!
- Inductive step for invariant2:
Let us assume that invariant2 holds after l many iterations of the while-body.
We need to show that it remains to be true after l+1 many iterations.
Note that all assignments of the while-body are relevant
for this invariant. Applying the proof rule for assignment to the last
statement of the while-body, invariant1 reads as
"s is the minimal-sum of all sections in the array a[1..k]".
Pushing this true the assignment to s results in
"min(s,t) is the minimal-sum of all sections in the array a[1..k]"
Pushing this through the assignment to t, gives us
"min(s,min(t + a[k], a[k])) is the minimal-sum of all sections in the array
a[1..k]".
The question is now whether this follows from our hypothesis
that "s is the minimal-sum of all sections in the array a[1..k]"!
The subtle thing about showing this implication is that we also need
the other invariant, the assertion about t, to make this
work. (Why may we assume this invariant to hold?) Try it!
Michael Huth
(huth@cis.ksu.edu)