Divide and Conquer Algorithms
Student Outcomes
- Define divide and conquer approach to algorithm design
- Describe and answer questions about example divide and conquer algorithms
- Binary Search
- Quick Sort
- Merge Sort
- Integer Multiplication
- Matrix Multiplication (Strassen's algorithm)
- Maximal Subsequence
- Apply the divide and conquer approach to algorithm design
- Analyze performance of a divide and conquer algorithm
- Compare a divide and conquer algorithm to another algorithm
Essence of Divide and Conquer
- Divide problem into several smaller subproblems
- Normally, the subproblems are similar to the original
- Conquer the subproblems by solving them recursively
- Base case: solve small enough problems by brute force
- Combine the solutions to the subproblems
- And finally obtain a solution to the original problem
- Divide and Conquer algorithms are normally recursive
Binary Search
Recursive Binary Search
- A Divide and Conquer Algorithm to find a key in an array:
-- Precondition: S is a sorted list
index binsearch(number n, index low, index high,
const keytype S[], keytype x)
if low ≤ high then
mid = (low + high) / 2
if x = S[mid] then
return mid
elsif x < S[mid] then
return binsearch(n, low, mid-1, S, x)
else
return binsearch(n, mid+1, high, S, x)
else
return 0
end binsearch
- Divide: select lower or upper half
- Conquer: search selected half
- Combine: None
Performance:
- $T(n) = T(n/2) + \Theta(1) $
- $T(n) = \Theta(\lg n)$ (proved earlier)
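As a concrete sketch, the pseudocode above maps to Python almost line for line; here 0-based indexing is used, and -1 replaces the pseudocode's 0 sentinel for "not found":

```python
def binsearch(S, x, low, high):
    """Recursive binary search over sorted list S[low..high] (inclusive).

    Returns the index of x, or -1 if x is not present.
    """
    if low > high:
        return -1
    mid = (low + high) // 2
    if x == S[mid]:
        return mid
    elif x < S[mid]:
        return binsearch(S, x, low, mid - 1)   # Divide: lower half
    else:
        return binsearch(S, x, mid + 1, high)  # Divide: upper half

data = [2, 3, 5, 7, 11, 13]
print(binsearch(data, 11, 0, len(data) - 1))  # 4
print(binsearch(data, 4, 0, len(data) - 1))   # -1
```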
Merge Sort
Recursive Merge Sort
- A Divide and Conquer Algorithm to sort an array:
void mergesort(S: array of keytype)
len = S'length
if len > 1 then
-- Divide: Copy the arrays
mid: constant int := len / 2
rest: constant int := len - mid
U: array(1..mid) of keytype := S(1..mid)
V: array(1..rest) of keytype := S(mid+1..len)
-- Conquer: Recursively sort
mergesort(U)
mergesort(V)
-- Combine: merge sorted arrays
merge(U, V, S)
end mergesort
void merge(L, R, S: array(<>) of keytype)
i, j, k: int := 1
lenL: constant int := L'length
lenR: constant int := R'length
while i <= lenL and j <= lenR
if L(i) < R(j) then
S(k) := L(i); i := i + 1;
else -- R(j) ≤ L(i)
S(k) := R(j); j := j + 1;
k := k + 1
if i > lenL then S(k..lenL+lenR) := R(j..lenR)
else S(k..lenL+lenR) := L(i..lenL)
Time Performance:
- $T(n) = 2T(n/2) + \Theta(n) $
- $T(n) = \Theta(n\lg n)$ (by induction or master method)
Space Performance:
- Requires $\Theta(k)$ new space (two halves of size $k/2$) for each recursive call on a size-$k$ array
- Number of recursive calls until size 1: $\lg n$
- Space used for call with size $n$ array:
$\displaystyle n\sum_{i=0}^{\lg n} \frac{1}{2^i} = n(1+1/2 + 1/4+\dots) \le 2n = \Theta(2n)$
- This extra space can be reused for next recursive call (think about the
diagram)
- Total space: $\Theta(3n)$
- Can we do better?
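Version 1 can be sketched in Python; list slices play the role of the U and V copies (names kept from the pseudocode):

```python
def mergesort(S):
    """Sort list S in place by copying each half (version 1 above)."""
    if len(S) > 1:
        mid = len(S) // 2
        U = S[:mid]          # Divide: copy lower half
        V = S[mid:]          # Divide: copy upper half
        mergesort(U)         # Conquer: sort each half recursively
        mergesort(V)
        merge(U, V, S)       # Combine: merge sorted halves back into S

def merge(L, R, S):
    """Merge sorted lists L and R into S, where len(S) == len(L) + len(R)."""
    i = j = k = 0
    while i < len(L) and j < len(R):
        if L[i] < R[j]:
            S[k] = L[i]; i += 1
        else:                # R[j] <= L[i]
            S[k] = R[j]; j += 1
        k += 1
    # Copy whichever side still has elements left.
    S[k:] = L[i:] if i < len(L) else R[j:]

a = [27, 10, 12, 20, 25, 13, 15, 22]
mergesort(a)
print(a)  # [10, 12, 13, 15, 20, 22, 25, 27]
```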
Recursive Merge Sort - Version 2
void mergesort2(low, high: int)
if low < high then
mid: constant int := (low + high) / 2
mergesort2(low, mid)
mergesort2(mid+1, high)
merge2(low, mid, high)
end mergesort2
-- Merges from S(LOW..MID) and S(MID+1..HIGH) to U(LOW..HIGH)
-- Then copies U(LOW..HIGH) back to S(LOW..HIGH)
void merge2(LOW, MID, HIGH: int)
U: array(LOW..HIGH) of keytype
i: int := LOW
j: int := MID + 1
k: int := LOW
while i <= MID and j <= HIGH
if S(i) < S(j) then
U(k) := S(i); i := i + 1;
else -- S(j) ≤ S(i)
U(k) := S(j); j := j + 1;
k := k + 1
if i > MID then U(k..HIGH) := S(j..HIGH)
else U(k..HIGH) := S(i..MID)
S(LOW..HIGH) := U(LOW..HIGH)
Time Performance:
- $T(n) = 2T(n/2) + \Theta(n) $
- $T(n) = \Theta(n\lg n)$ (by induction or master method)
Space Performance:
- Never uses more than $\Theta(n)$ extra space for call to merge
- Extra space is reused for next call
- Total space: $\Theta(2n)$
- But, how much space is required for the recursion?
- Another improvement: make S size 2n, and merge from one side to the other
Quick Sort
Quick Sort - Algorithm
- Quicksort is another example of divide and conquer
- Recursive Algorithm:
-- Assume global array S
void quicksort(low, high: int)
if low < high then
pivot: int
partition(low, high, pivot)
quicksort(low, pivot-1)
quicksort(pivot+1, high)
end quicksort
void partition(LOW, HIGH: int; PIVOTPOINT: out int)
j: int := LOW
pivot_item: keytype := S(j)
for i in LOW+1 .. HIGH loop
if S(i) < pivot_item then
j := j + 1
swap(S(i), S(j))
end if
end loop
PIVOTPOINT := j
swap(S(LOW), S(PIVOTPOINT))
end partition
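A Python sketch of the algorithm, keeping the first-element pivot and the partition scheme from the pseudocode:

```python
def quicksort(S, low, high):
    """Sort S[low..high] (inclusive) in place."""
    if low < high:
        p = partition(S, low, high)
        quicksort(S, low, p - 1)   # Conquer: left of pivot
        quicksort(S, p + 1, high)  # Conquer: right of pivot
        # Combine: nothing to do; the pivot is already in place.

def partition(S, low, high):
    """Partition around S[low]; return the pivot's final index.
    After the loop, S[low+1..j] < pivot and S[j+1..high] >= pivot."""
    pivot_item = S[low]
    j = low
    for i in range(low + 1, high + 1):
        if S[i] < pivot_item:
            j += 1
            S[i], S[j] = S[j], S[i]
    S[low], S[j] = S[j], S[low]   # move pivot into its final position
    return j

a = [15, 22, 13, 27, 12, 10, 20, 25]
quicksort(a, 0, len(a) - 1)
print(a)  # [10, 12, 13, 15, 20, 22, 25, 27]
```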
Quick Sort - Worst Case Time Performance
- Time Performance - Worst Case:
- $T(n) = T(n-1) + T(0) + \Theta(n) $
- $T(n) = \Theta(n^2)$ (by induction)
- Assume $T(k) = k^2 + \Theta(k)$ for $k < n$
$\begin{align*}
T(n) & = T(n-1) + \Theta(n) \\
& = (n-1)^2 + \Theta(n) \\
& = n^2 - 2n + 1 + \Theta(n) \\
& = n^2 + \Theta(n) + \Theta(n) \\
& = n^2 + \Theta(n) \\
& = \Theta(n^2)
\end{align*}
$
- Actually: how do we know that this is worst case?
- It could be proved, but we won't do it
Quick Sort - Expected Case Time Performance
- Time Performance - Expected Case:
- Assume all problem instances are equally likely
- This means that any element is equally likely to be the pivot (i.e., equally
likely to be returned by partition)
- Thus the expected value of $T(n)$ is
$\displaystyle
\begin{align*}
T(n) & = \frac{1}{n} \sum_{p=1}^n [ T(p-1) + T(n-p)] + n-1 \\
& = \sum_{p=1}^n \frac{1}{n} [ T(p-1) + T(n-p)] + n-1 \\
& = \Theta(n \lg n)
\end{align*}
$
- See text for solution
Quick Sort - Best Case Time Performance
- Time Performance - Best Case:
- $T(n) = 2T(n/2) + \Theta(n) $
- $T(n) = \Theta(n\lg n)$ (by induction or master method)
- Why is this the best case: think about a recursion tree
Quick Sort - Constant Proportion Time Performance
- Time Performance: Constant proportion
- What if pivot always divides by a constant proportion: say 10:90
- $T(n) = T(9n/10) + T(n/10) + \Theta(n) = O(n\lg n)$
- Think about the tree:
- The shallow side of the tree terminates in $\log_{10}n$ levels
- Each of these levels has weight $cn$
- The deep side of the tree terminates in $\log_{10/9}n$ levels
- Each of these levels has weight $≤ cn$
- The entire tree has weight $cn\log_{10}n + cn\log_{10/9}n = O(n\lg n)$
Quick Sort - Space Performance
- Space Performance:
- No extra space needed
- Total space: $\Theta(n)$
- But, ... what about the space for ...?
Iterative Quick Sort
- Can we write quicksort iteratively?
- Push and pop the bounds of the half not currently sorted
- Requires an explicit stack (or array to simulate a stack)
- Explicit stack replaces recursion
- Should you stack smaller or larger?
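One way to sketch this in Python (the `partition` here repeats the first-element scheme from the earlier slides). It also answers the closing question: always stack the *larger* half and loop on the smaller one, which bounds the stack depth by $O(\lg n)$:

```python
def partition(S, low, high):
    """First-element pivot partition, as in the quicksort slides."""
    pivot_item = S[low]
    j = low
    for i in range(low + 1, high + 1):
        if S[i] < pivot_item:
            j += 1
            S[i], S[j] = S[j], S[i]
    S[low], S[j] = S[j], S[low]
    return j

def quicksort_iter(S):
    """Iterative quicksort: an explicit stack of (low, high) bounds
    replaces the recursion."""
    stack = [(0, len(S) - 1)]
    while stack:
        low, high = stack.pop()
        while low < high:
            p = partition(S, low, high)
            if p - low < high - p:          # left side is smaller
                stack.append((p + 1, high))  # defer the larger right side
                high = p - 1                 # continue with the smaller side
            else:
                stack.append((low, p - 1))   # defer the larger left side
                low = p + 1

a = [15, 22, 13, 27, 12, 10, 20, 25]
quicksort_iter(a)
print(a)  # [10, 12, 13, 15, 20, 22, 25, 27]
```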
Quick Sort - Improving Time Performance
- Partition from both ends
- How can we avoid the worst case?
- Better choice of pivot
- Randomized: choose a random element as the pivot
- Swap random element with first and use same algorithm
- Each element has equally likely chance of being pivot
- Particular characteristics of the data won't cause worst case
- Median of 3: choose 3 elements at random and take their median
- Or median of some other k
- Use insertion sort when list gets small
- Or: leave small sublists unsorted, then run insertion sort once over the nearly sorted array
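A sketch of the randomized-pivot idea: swap a random element to the front, then partition exactly as before (the function name is illustrative):

```python
import random

def randomized_partition(S, low, high):
    """Swap a random element into position low, then partition as usual.
    Every element is equally likely to be the pivot, so no particular
    arrangement of the input data forces the worst case."""
    r = random.randint(low, high)     # uniform over low..high inclusive
    S[low], S[r] = S[r], S[low]
    pivot_item = S[low]
    j = low
    for i in range(low + 1, high + 1):
        if S[i] < pivot_item:
            j += 1
            S[i], S[j] = S[j], S[i]
    S[low], S[j] = S[j], S[low]
    return j
```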
One-Way, Stackless Quicksort!
- Repeat pivoting until positions 1 .. L are filled with pivots
- Store pivots as negative
- Works for positive only
Matrix Multiplication
Divide and Conquer Example 2 - Matrix Multiplication
- Input: A, B: Array(1 .. $n$, 1 .. $n$) of number
- Output: C: Array(1 .. $n$, 1 .. $n$) := A x B
- Remember how to multiply matrices?
- Obvious algorithm: C has $n^2$ entries, each of which sums $n$ products
- $c_{ij} = \displaystyle \sum_{k=1}^n a_{ik}b_{kj}$
- Nested loop algorithm
- Performance: $\Theta(n^3)$
Simple Divide and Conquer Method
- Break A into A11, A12, A21, A22
- Break B into B11, B12, B21, B22
- Break C into C11, C12, C21, C22
- Each block of C is a sum of products of $n/2$ by $n/2$ matrices:
$C_{11} = A_{11}\times B_{11} + A_{12}\times B_{21}$
$C_{12} = A_{11}\times B_{12} + A_{12}\times B_{22}$
$C_{21} = A_{21}\times B_{11} + A_{22}\times B_{21}$
$C_{22} = A_{21}\times B_{12} + A_{22}\times B_{22}$
- $T(n) = \Theta(1) + 8T(n/2) + \Theta(n^2)$
- $\Theta(1)$ time to partition matrices
- 8 Multiplications of arrays of size $n/2$
- $\Theta(n^2)$ time to add $n\times n$ matrices
- $T(n) = \Theta(n^3)$ by master method
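The eight-multiplication scheme can be sketched in Python, assuming $n$ is a power of 2; `split` and `add` are small helpers introduced here, not part of the slides:

```python
def matmul_dc(A, B):
    """Recursive block multiply of two n x n matrices (n a power of 2):
    8 products of n/2 x n/2 matrices plus Theta(n^2) additions."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Divide: partition each matrix into quadrants.
    A11, A12, A21, A22 = split(A, h)
    B11, B12, B21, B22 = split(B, h)
    # Conquer (8 recursive products) and Combine (4 matrix additions).
    C11 = add(matmul_dc(A11, B11), matmul_dc(A12, B21))
    C12 = add(matmul_dc(A11, B12), matmul_dc(A12, B22))
    C21 = add(matmul_dc(A21, B11), matmul_dc(A22, B21))
    C22 = add(matmul_dc(A21, B12), matmul_dc(A22, B22))
    # Reassemble the quadrants into one n x n matrix.
    top = [C11[i] + C12[i] for i in range(h)]
    bot = [C21[i] + C22[i] for i in range(h)]
    return top + bot

def split(M, h):
    """Return the four h x h quadrants of M: M11, M12, M21, M22."""
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def add(X, Y):
    """Entrywise sum of two equal-size matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

print(matmul_dc([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```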
Strassen's Algorithm
- Uses 7 rather than 8 multiplications of n/2 by n/2 arrays
- Has more additions and subtractions of n/2 by n/2 arrays
- Algorithm:
- Create 10 $n/2$ by $n/2$ sum arrays: S1 .. S10
- Recursively compute 7 product arrays: $m_1$ .. $m_7$
- Compute $c_{11}, c_{12}, c_{21}, c_{22}$
by adding and subtracting combinations of $m_i$
$
T(n) =
\begin{cases}
\Theta(1), & \text{if } n = 1 \\
7T(n/2) + \Theta(n^2), & \text{if } n > 1
\end{cases}
$
- Solution (by Master Method): $T(n) = \Theta(n^{\lg 7}) = \Theta(n^{2.81})$
- Notice: Has higher constants
- Faster solutions are known: $T(n) = O(n^{2.38})$
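The slides leave the seven products unstated; one standard formulation (Strassen's original, whose operand sums the ten $S_i$ arrays precompute) is:

$
\begin{align*}
m_1 &= (A_{11}+A_{22})(B_{11}+B_{22}) & m_5 &= (A_{11}+A_{12})B_{22}\\
m_2 &= (A_{21}+A_{22})B_{11} & m_6 &= (A_{21}-A_{11})(B_{11}+B_{12})\\
m_3 &= A_{11}(B_{12}-B_{22}) & m_7 &= (A_{12}-A_{22})(B_{21}+B_{22})\\
m_4 &= A_{22}(B_{21}-B_{11})
\end{align*}
$

$
C_{11}=m_1+m_4-m_5+m_7,\quad
C_{12}=m_3+m_5,\quad
C_{21}=m_2+m_4,\quad
C_{22}=m_1-m_2+m_3+m_6
$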
Large Integer Multiplication
Large Integer Multiplication
- Assume integers with $n$ digits
- Implement with array with one digit per element [Ada bignumpkg]
- Grade school algorithm: $\Theta(n^2)$
- Naive divide and conquer:
- $1234 \times 5678 = (12\times 56)\times 10^4 + [(12\times 78) +
(34\times 56)]\times 10^2 + (34 \times 78)\times 10^0$
- Performance: $T(n) = 4T(n/2) + \Theta(n) = \Theta(n^2) $
- More generally: $xy \times wz = xw\times 10^{2k} + (xz+yw)\times 10^k + yz$
- Where $xy$ denotes a $2k$-digit number with high half $x$ and low half $y$
- This requires 4 multiplications
- Improved divide and conquer - calculate three products to find result:
- $r = (x+y) \times (w+z) = xw + (xz + yw) + yz$
- $p = xw$
- $q = yz$
- $xy\times wz = p\times 10^{2k} + (r-p-q) \times 10^k + q$
- Performance: $T(n) = 3T(n/2) + \Theta(n) = \Theta(n^{\lg 3})$ by master
method
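This three-product scheme is Karatsuba's algorithm; a sketch in Python, splitting on decimal digits as in the slides:

```python
def karatsuba(x, y):
    """Multiply nonnegative integers with three recursive products.
    Split x = a*10^k + b and y = c*10^k + d; then
    x*y = p*10^(2k) + (r - p - q)*10^k + q
    where p = a*c, q = b*d, and r = (a+b)*(c+d)."""
    if x < 10 or y < 10:
        return x * y                      # base case: single digits
    k = max(len(str(x)), len(str(y))) // 2
    a, b = divmod(x, 10 ** k)             # high and low halves of x
    c, d = divmod(y, 10 ** k)             # high and low halves of y
    p = karatsuba(a, c)
    q = karatsuba(b, d)
    r = karatsuba(a + b, c + d)           # r - p - q == a*d + b*c
    return p * 10 ** (2 * k) + (r - p - q) * 10 ** k + q

print(karatsuba(1234, 5678))  # 7006652
```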
Maximal Subarray
Example Divide and Conquer: Maximal-subarray Problem
- Problem:
- Input: A: Array(1 .. n) of numbers
- Output: Indices i and j such that sum(A(i .. j)) has the maximum value
- Assume some are negative
- Otherwise problem is trivial
- Example: A := (1, -100, 10, 20, -1, -5, 16, -23, 5)
Maximal Subarray: Example Scenario
- Stock prices over n days (in the past)
- Maximize profits
- Will the optimal solution buy at the global minimum or sell at the global maximum? Not necessarily
Maximal Subarray: Brute Force Solution
- How to solve by brute force?
- Performance?
Maximal Subarray: Divide and Conquer Solution
- Try splitting in half and solve each half
- How to use partial solution in entire solution
- What complications occur?
- How to solve complications
Maximal Subarray Algorithm
- Divide and Conquer Algorithm
maxsub(int[] S; low, high: int) return (lowIndex, highIndex, sum)
if low = high then
return (low, high, S(low))
else
mid = (low + high) / 2
(llow, lhigh, lsum) = maxsub(S, low, mid)
(rlow, rhigh, rsum) = maxsub(S, mid+1, high)
(mlow, mhigh, msum) = middlemaxsub(S, low, mid, high)
end if;
return triple with highest sum
end maxsub
middlemaxsub(int[] S; low, mid, high: int) return (lowIndex, highIndex, sum)
start at mid and scan left to find bestleft and leftsum
start at mid+1 and scan right to find bestright and rightsum
return (bestleft, bestright, leftsum+rightsum)
end middlemaxsub
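The pseudocode can be fleshed out in Python as follows (inclusive bounds; `max` with a key picks the triple with the highest sum):

```python
def maxsub(S, low, high):
    """Return (i, j, sum) where S[i..j] (inclusive) has maximum sum."""
    if low == high:
        return (low, high, S[low])
    mid = (low + high) // 2
    left = maxsub(S, low, mid)                  # best entirely in left half
    right = maxsub(S, mid + 1, high)            # best entirely in right half
    middle = middlemaxsub(S, low, mid, high)    # best crossing the midpoint
    return max(left, right, middle, key=lambda t: t[2])

def middlemaxsub(S, low, mid, high):
    """Best subarray crossing the midpoint: scan left from mid and
    right from mid+1, keeping the best running sum on each side."""
    leftsum, total, bestleft = float('-inf'), 0, mid
    for i in range(mid, low - 1, -1):
        total += S[i]
        if total > leftsum:
            leftsum, bestleft = total, i
    rightsum, total, bestright = float('-inf'), 0, mid + 1
    for j in range(mid + 1, high + 1):
        total += S[j]
        if total > rightsum:
            rightsum, bestright = total, j
    return (bestleft, bestright, leftsum + rightsum)

A = [1, -100, 10, 20, -1, -5, 16, -23, 5]
print(maxsub(A, 0, len(A) - 1))  # (2, 6, 40)
```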
Performance of Divide and Conquer Solution
- Performance:
- Base Case: $T(1) = \Theta(1)$
- Divide: $ \Theta(1)$
- Conquer: $2T(n/2)$ and $\Theta(n)$
- Combine: $\Theta(1)$
- $T(n) = 2T(n/2) + \Theta(n)$
- Closed form: $T(n) = \dots$
- Can we do better?
- Yes! A $\Theta(n)$ algorithm exists! (Dynamic programming!)
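For reference, the $\Theta(n)$ algorithm alluded to is Kadane's: track the best sum of a subarray *ending* at each position, either extending the previous one or starting fresh. A sketch:

```python
def kadane(A):
    """Theta(n) maximal-subarray: cur_sum is the best sum of a subarray
    ending at the current index; a negative running prefix never helps,
    so we restart whenever cur_sum drops below zero."""
    best_sum, best_i, best_j = A[0], 0, 0
    cur_sum, cur_i = A[0], 0
    for j in range(1, len(A)):
        if cur_sum < 0:
            cur_sum, cur_i = A[j], j      # start a fresh subarray at j
        else:
            cur_sum += A[j]               # extend the current subarray
        if cur_sum > best_sum:
            best_sum, best_i, best_j = cur_sum, cur_i, j
    return (best_i, best_j, best_sum)

print(kadane([1, -100, 10, 20, -1, -5, 16, -23, 5]))  # (2, 6, 40)
```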