Divide and Conquer Algorithms
Student Outcomes
- Define divide and conquer approach to algorithm design
- Describe and answer questions about example divide and conquer algorithms
- Binary Search
- Quick Sort
- Merge Sort
- Integer Multiplication
- Matrix Multiplication (Strassen's algorithm)
- Maximal Subsequence
- Apply the divide and conquer approach to algorithm design
- Analyze performance of a divide and conquer algorithm
- Compare a divide and conquer algorithm to another algorithm
Essence of Divide and Conquer
- Divide problem into several smaller subproblems
- Normally, the subproblems are similar to the original
- Conquer the subproblems by solving them recursively
- Base case: solve small enough problems by brute force
- Combine the solutions to the subproblems
- And finally obtain a solution to the original problem
- Divide and Conquer algorithms are normally recursive
Binary Search
Recursive Binary Search
- A Divide and Conquer Algorithm to find a key in an array:
-- Precondition: S is a sorted list
index binsearch(number n, index low, index high,
const keytype S[], keytype x)
if low ≤ high then
mid = (low + high) / 2
if x = S[mid] then
return mid
elsif x < S[mid] then
return binsearch(n, low, mid-1, S, x)
else
return binsearch(n, mid+1, high, S, x)
else
return 0
end binsearch
- Divide: select lower or upper half
- Conquer: search selected half
- Combine: None
Performance:
- $T(n) = T(n/2) + \Theta(1) $
- $T(n) = \Theta(\lg n)$ (proved earlier)
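As a concrete sketch, the pseudocode above maps to Python almost line for line; here 0-based indexing is used, and -1 replaces the pseudocode's 0 sentinel for "not found":

```python
def binsearch(S, x, low, high):
    """Recursive binary search over sorted list S[low..high] (inclusive).

    Returns the index of x, or -1 if x is not present.
    """
    if low > high:
        return -1
    mid = (low + high) // 2
    if x == S[mid]:
        return mid
    elif x < S[mid]:
        return binsearch(S, x, low, mid - 1)   # Divide: lower half
    else:
        return binsearch(S, x, mid + 1, high)  # Divide: upper half

data = [2, 3, 5, 7, 11, 13]
print(binsearch(data, 11, 0, len(data) - 1))  # 4
print(binsearch(data, 4, 0, len(data) - 1))   # -1
```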
Merge Sort
Recursive Merge Sort
- A Divide and Conquer Algorithm to sort an array:
void mergesort(S: array of keytype)
len = S'length
if len > 1 then
-- Divide: Copy the arrays
mid: constant int := len / 2
rest: constant int := len - mid
U: array(1..mid) of keytype := S(1..mid)
V: array(1..rest) of keytype := S(mid+1..len)
-- Conquer: Recursively sort
mergesort(U)
mergesort(V)
-- Combine: merge sorted arrays
merge(U, V, S)
end mergesort
void merge(L, R, S: array(<>) of keytype)
i, j, k: int := 1
lenL: constant int := L'length
lenR: constant int := R'length
while i <= lenL and j <= lenR
if L(i) < R(j) then
S(k) := L(i); i := i + 1;
else -- R(j) ≤ L(i)
S(k) := R(j); j := j + 1;
k := k + 1
if i > lenL then S(k..lenL+lenR) := R(j..lenR)
else S(k..lenL+lenR) := L(i..lenL)
Time Performance:
- $T(n) = 2T(n/2) + \Theta(n) $
- $T(n) = \Theta(n\lg n)$ (by induction or master method)
Space Performance:
- Requires $\Theta(k)$ new space (two halves of size $k/2$) for each recursive call on a size-$k$ array
- Number of recursive calls until size 1: $\lg n$
- Space used for call with size $n$ array:
$\displaystyle n\sum_{i=0}^{\lg n} \frac{1}{2^i} = n(1+1/2 + 1/4+\dots) \le 2n = \Theta(2n)$
- This extra space can be reused for next recursive call (think about the
diagram)
- Total space: $\Theta(3n)$
- Can we do better?
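Version 1 can be sketched in Python; list slices play the role of the U and V copies (names kept from the pseudocode):

```python
def mergesort(S):
    """Sort list S in place by copying each half (version 1 above)."""
    if len(S) > 1:
        mid = len(S) // 2
        U = S[:mid]          # Divide: copy lower half
        V = S[mid:]          # Divide: copy upper half
        mergesort(U)         # Conquer: sort each half recursively
        mergesort(V)
        merge(U, V, S)       # Combine: merge sorted halves back into S

def merge(L, R, S):
    """Merge sorted lists L and R into S, where len(S) == len(L) + len(R)."""
    i = j = k = 0
    while i < len(L) and j < len(R):
        if L[i] < R[j]:
            S[k] = L[i]; i += 1
        else:                # R[j] <= L[i]
            S[k] = R[j]; j += 1
        k += 1
    # Copy whichever side still has elements left.
    S[k:] = L[i:] if i < len(L) else R[j:]

a = [27, 10, 12, 20, 25, 13, 15, 22]
mergesort(a)
print(a)  # [10, 12, 13, 15, 20, 22, 25, 27]
```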
Recursive Merge Sort - Version 2
void mergesort2(low, high: int)
if low < high then
mid: constant int := (low + high) / 2
mergesort2(low, mid)
mergesort2(mid+1, high)
merge2(low, mid, high)
end mergesort2
-- Merges from S(LOW..MID) and S(MID+1..HIGH) to U(LOW..HIGH)
-- Then copies U(LOW..HIGH) back to S(LOW..HIGH)
void merge2(LOW, MID, HIGH: int)
U: array(LOW..HIGH) of keytype
i: int := LOW
j: int := MID + 1
k: int := LOW
while i <= MID and j <= HIGH
if S(i) < S(j) then
U(k) := S(i); i := i + 1;
else -- S(j) ≤ S(i)
U(k) := S(j); j := j + 1;
k := k + 1
if i > MID then U(k..HIGH) := S(j..HIGH)
else U(k..HIGH) := S(i..MID)
S(LOW..HIGH) := U(LOW..HIGH)
Time Performance:
- $T(n) = 2T(n/2) + \Theta(n) $
- $T(n) = \Theta(n\lg n)$ (by induction or master method)
Space Performance:
- Never uses more than $\Theta(n)$ extra space for call to merge
- Extra space is reused for next call
- Total space: $\Theta(2n)$
- But, how much space is required for the recursion?
- Another improvement: make S size 2n, and merge from one side to the other
Quick Sort
Quick Sort - Algorithm
- Quicksort is another example of divide and conquer
- Recursive Algorithm:
-- Assume global array S
void quicksort(low, high: int)
if low < high then
pivot: int
partition(low, high, pivot)
quicksort(low, pivot-1)
quicksort(pivot+1, high)
end quicksort
void partition(LOW, HIGH: int; PIVOTPOINT: out int)
j: int := LOW
pivot_item: keytype := S(j)
for i in LOW+1 .. HIGH loop
if S(i) < pivot_item then
j := j + 1
swap(S(i), S(j))
end if
end loop
PIVOTPOINT := j
swap(S(LOW), S(PIVOTPOINT))
end partition
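A Python sketch of the algorithm, keeping the first-element pivot and the partition scheme from the pseudocode:

```python
def quicksort(S, low, high):
    """Sort S[low..high] (inclusive) in place."""
    if low < high:
        p = partition(S, low, high)
        quicksort(S, low, p - 1)   # Conquer: left of pivot
        quicksort(S, p + 1, high)  # Conquer: right of pivot
        # Combine: nothing to do; the pivot is already in place.

def partition(S, low, high):
    """Partition around S[low]; return the pivot's final index.
    After the loop, S[low+1..j] < pivot and S[j+1..high] >= pivot."""
    pivot_item = S[low]
    j = low
    for i in range(low + 1, high + 1):
        if S[i] < pivot_item:
            j += 1
            S[i], S[j] = S[j], S[i]
    S[low], S[j] = S[j], S[low]   # move pivot into its final position
    return j

a = [15, 22, 13, 27, 12, 10, 20, 25]
quicksort(a, 0, len(a) - 1)
print(a)  # [10, 12, 13, 15, 20, 22, 25, 27]
```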
Quick Sort - Worst Case Time Performance
- Time Performance - Worst Case:
- $T(n) = T(n-1) + T(0) + \Theta(n) $
- $T(n) = \Theta(n^2)$ (by induction)
- Assume $T(k) = k^2 + \Theta(k)$ for $k < n$
$\begin{align*}
T(n) & = T(n-1) + \Theta(n) \\
& = (n-1)^2 + \Theta(n) \\
& = n^2 - 2n + 1 + \Theta(n) \\
& = n^2 + \Theta(n) + \Theta(n) \\
& = n^2 + \Theta(n) \\
& = \Theta(n^2)
\end{align*}
$
- Actually: how do we know that this is worst case?
- It could be proved, but we won't do it
Quick Sort - Expected Case Time Performance
- Time Performance - Expected Case:
- Assume all problem instances are equally likely
- This means that any element is equally likely to be the pivot (i.e., equally
likely to be returned by partition)
- Thus the expected value of $T(n)$ is
$\displaystyle
\begin{align*}
T(n) & = \frac{1}{n} \sum_{p=1}^n [ T(p-1) + T(n-p)] + n-1 \\
& = \sum_{p=1}^n \frac{1}{n} [ T(p-1) + T(n-p)] + n-1 \\
& = \Theta(n \lg n)
\end{align*}
$
- See text for solution
Quick Sort - Best Case Time Performance
- Time Performance - Best Case:
- $T(n) = 2T(n/2) + \Theta(n) $
- $T(n) = \Theta(n\lg n)$ (by induction or master method)
- Why is this the best case: think about a recursion tree
Quick Sort - Constant Proportion Time Performance
- Time Performance: Constant proportion
- What if pivot always divides by a constant proportion: say 10:90
- $T(n) = T(9n/10) + T(n/10) + \Theta(n) = O(n\lg n)$
- Think about the tree:
- The shallow side of the tree terminates in $\log_{10}n$ levels
- Each of these levels has weight $cn$
- The deep side of the tree terminates in $\log_{10/9}n$ levels
- Each of these levels has weight $≤ cn$
- The entire tree has weight $cn\log_{10}n + cn\log_{10/9}n = O(n\lg n)$
Quick Sort - Space Performance
- Space Performance:
- No extra space needed
- Total space: $\Theta(n)$
- But, ... what about the space for ...?
Iterative Quick Sort
- Can we write quicksort iteratively?
- Push and pop the bounds of the half not currently sorted
- Requires an explicit stack (or array to simulate a stack)
- Explicit stack replaces recursion
- Should you stack smaller or larger?
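One way to sketch this in Python (the `partition` here repeats the first-element scheme from the earlier slides). It also answers the closing question: always stack the *larger* half and loop on the smaller one, which bounds the stack depth by $O(\lg n)$:

```python
def partition(S, low, high):
    """First-element pivot partition, as in the quicksort slides."""
    pivot_item = S[low]
    j = low
    for i in range(low + 1, high + 1):
        if S[i] < pivot_item:
            j += 1
            S[i], S[j] = S[j], S[i]
    S[low], S[j] = S[j], S[low]
    return j

def quicksort_iter(S):
    """Iterative quicksort: an explicit stack of (low, high) bounds
    replaces the recursion."""
    stack = [(0, len(S) - 1)]
    while stack:
        low, high = stack.pop()
        while low < high:
            p = partition(S, low, high)
            if p - low < high - p:          # left side is smaller
                stack.append((p + 1, high))  # defer the larger right side
                high = p - 1                 # continue with the smaller side
            else:
                stack.append((low, p - 1))   # defer the larger left side
                low = p + 1

a = [15, 22, 13, 27, 12, 10, 20, 25]
quicksort_iter(a)
print(a)  # [10, 12, 13, 15, 20, 22, 25, 27]
```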
Quick Sort - Improving Time Performance
- Partition from both ends
- How can we avoid the worst case?
- Better choice of pivot
- Randomized: choose a random element as the pivot
- Swap random element with first and use same algorithm
- Each element has equally likely chance of being pivot
- Particular characteristics of the data won't cause worst case
- Median of 3: choose 3 elements at random and take their median
- Or median of some other k
- Use insertion sort when list gets small
- Or: leave small sublists unsorted, then run insertion sort once over the nearly sorted array
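A sketch of the randomized-pivot idea: swap a random element to the front, then partition exactly as before (the function name is illustrative):

```python
import random

def randomized_partition(S, low, high):
    """Swap a random element into position low, then partition as usual.
    Every element is equally likely to be the pivot, so no particular
    arrangement of the input data forces the worst case."""
    r = random.randint(low, high)     # uniform over low..high inclusive
    S[low], S[r] = S[r], S[low]
    pivot_item = S[low]
    j = low
    for i in range(low + 1, high + 1):
        if S[i] < pivot_item:
            j += 1
            S[i], S[j] = S[j], S[i]
    S[low], S[j] = S[j], S[low]
    return j
```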
One-Way, Stackless Quicksort!
- Repeat pivoting until positions 1 .. L are filled with pivots
- Store pivots as negative
- Works for positive only
Matrix Multiplication
Divide and Conquer Example 2 - Matrix Multiplication
- Input: A, B: Array(1 .. $n$, 1 .. $n$) of number
- Output: C: Array(1 .. $n$, 1 .. $n$) := A x B
- Remember how to multiply matrices?
- Obvious algorithm: C has $n^2$ entries, each of which sums $n$ products
- $c_{ij} = \displaystyle \sum_{k=1}^n a_{ik}b_{kj}$
- Nested loop algorithm
- Performance: $\Theta(n^3)$
Simple Divide and Conquer Method
- Break A into A11, A12, A21, A22
- Break B into B11, B12, B21, B22
- Break C into C11, C12, C21, C22
- Each block of C is a sum of products of $n/2$ by $n/2$ matrices:
$C_{11} = A_{11}\times B_{11} + A_{12}\times B_{21}$
$C_{12} = A_{11}\times B_{12} + A_{12}\times B_{22}$
$C_{21} = A_{21}\times B_{11} + A_{22}\times B_{21}$
$C_{22} = A_{21}\times B_{12} + A_{22}\times B_{22}$
- $T(n) = \Theta(1) + 8T(n/2) + \Theta(n^2)$
- $\Theta(1)$ time to partition matrices
- 8 Multiplications of arrays of size $n/2$
- $\Theta(n^2)$ time to add $n\times n$ matrices
- $T(n) = \Theta(n^3)$ by master method
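The eight-multiplication scheme can be sketched in Python, assuming $n$ is a power of 2; `split` and `add` are small helpers introduced here, not part of the slides:

```python
def matmul_dc(A, B):
    """Recursive block multiply of two n x n matrices (n a power of 2):
    8 products of n/2 x n/2 matrices plus Theta(n^2) additions."""
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Divide: partition each matrix into quadrants.
    A11, A12, A21, A22 = split(A, h)
    B11, B12, B21, B22 = split(B, h)
    # Conquer (8 recursive products) and Combine (4 matrix additions).
    C11 = add(matmul_dc(A11, B11), matmul_dc(A12, B21))
    C12 = add(matmul_dc(A11, B12), matmul_dc(A12, B22))
    C21 = add(matmul_dc(A21, B11), matmul_dc(A22, B21))
    C22 = add(matmul_dc(A21, B12), matmul_dc(A22, B22))
    # Reassemble the quadrants into one n x n matrix.
    top = [C11[i] + C12[i] for i in range(h)]
    bot = [C21[i] + C22[i] for i in range(h)]
    return top + bot

def split(M, h):
    """Return the four h x h quadrants of M: M11, M12, M21, M22."""
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def add(X, Y):
    """Entrywise sum of two equal-size matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

print(matmul_dc([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # [[19, 22], [43, 50]]
```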
Strassen's Algorithm
- Uses 7 rather than 8 multiplications of n/2 by n/2 arrays
- Has more additions and subtractions of n/2 by n/2 arrays
- Algorithm:
- Create 10 $n/2$ by $n/2$ sum arrays: S1 .. S10
- Recursively compute 7 product arrays: $m_1$ .. $m_7$
- Compute $c_{11}, c_{12}, c_{21}, c_{22}$
by adding and subtracting combinations of $m_i$
$
T(n) =
\begin{cases}
\Theta(1), & \text{if } n = 1 \\
7T(n/2) + \Theta(n^2), & \text{if } n > 1
\end{cases}
$
- Solution (by Master Method): $T(n) = \Theta(n^{\lg 7}) = \Theta(n^{2.81})$
- Notice: Has higher constants
- Faster solutions are known: $T(n) = O(n^{2.38})$
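The slides leave the seven products unstated; one standard formulation (Strassen's original, whose operand sums the ten $S_i$ arrays precompute) is:

$
\begin{align*}
m_1 &= (A_{11}+A_{22})(B_{11}+B_{22}) & m_5 &= (A_{11}+A_{12})B_{22}\\
m_2 &= (A_{21}+A_{22})B_{11} & m_6 &= (A_{21}-A_{11})(B_{11}+B_{12})\\
m_3 &= A_{11}(B_{12}-B_{22}) & m_7 &= (A_{12}-A_{22})(B_{21}+B_{22})\\
m_4 &= A_{22}(B_{21}-B_{11})
\end{align*}
$

$
C_{11}=m_1+m_4-m_5+m_7,\quad
C_{12}=m_3+m_5,\quad
C_{21}=m_2+m_4,\quad
C_{22}=m_1-m_2+m_3+m_6
$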
Large Integer Multiplication
Large Integer Multiplication
- Assume integers with $n$ digits
- Implement with array with one digit per element [Ada bignumpkg]
- Grade school algorithm: $\Theta(n^2)$
- Naive divide and conquer:
- $1234 \times 5678 = (12\times 56)\times 10^4 + [(12\times 78) +
(34\times 56)]\times 10^2 + (34 \times 78)\times 10^0$
- Performance: $T(n) = 4T(n/2) + \Theta(n) = \Theta(n^2) $
- More generally: $xy \times wz = xw\times 10^{2k} + (xz+yw)\times 10^k + yz$
- Where $xy$ denotes a $2k$-digit number with high half $x$ and low half $y$
- This requires 4 multiplications
- Improved divide and conquer - calculate three products to find result:
- $r = (x+y) \times (w+z) = xw + (xz + yw) + yz$
- $p = xw$
- $q = yz$
- $xy\times wz = p\times 10^{2k} + (r-p-q) \times 10^k + q$
- Performance: $T(n) = 3T(n/2) + \Theta(n) = \Theta(n^{\lg 3})$ by master
method
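This three-product scheme is Karatsuba's algorithm; a sketch in Python, splitting on decimal digits as in the slides:

```python
def karatsuba(x, y):
    """Multiply nonnegative integers with three recursive products.
    Split x = a*10^k + b and y = c*10^k + d; then
    x*y = p*10^(2k) + (r - p - q)*10^k + q
    where p = a*c, q = b*d, and r = (a+b)*(c+d)."""
    if x < 10 or y < 10:
        return x * y                      # base case: single digits
    k = max(len(str(x)), len(str(y))) // 2
    a, b = divmod(x, 10 ** k)             # high and low halves of x
    c, d = divmod(y, 10 ** k)             # high and low halves of y
    p = karatsuba(a, c)
    q = karatsuba(b, d)
    r = karatsuba(a + b, c + d)           # r - p - q == a*d + b*c
    return p * 10 ** (2 * k) + (r - p - q) * 10 ** k + q

print(karatsuba(1234, 5678))  # 7006652
```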
Maximal Subarray
Example Divide and Conquer: Maximal-subarray Problem
- Problem:
- Input: A: Array(1 .. n) of numbers
- Output: Indices i and j such that sum(A(i .. j)) has the maximum value
- Assume some are negative
- Otherwise problem is trivial
- Example: A := (1, -100, 10, 20, -1, -5, 16, -23, 5)
Maximal Subarray: Example Scenario
- Stock prices over n days (in the past)
- Maximize profits
- Will the optimal solution buy at the global minimum or sell at the global maximum? Not necessarily
Maximal Subarray: Brute Force Solution
- How to solve by brute force?
- Performance?
Maximal Subarray: Divide and Conquer Solution
- Try splitting in half and solve each half
- How to use partial solution in entire solution
- What complications occur?
- How to solve complications
Maximal Subarray Algorithm
- Divide and Conquer Algorithm
maxsub(int[] S; low, high: int) return (lowIndex, highIndex, sum)
if low = high then
return (low, high, S(low))
else
mid = (low + high) / 2
(llow, lhigh, lsum) = maxsub(S, low, mid)
(rlow, rhigh, rsum) = maxsub(S, mid+1, high)
(mlow, mhigh, msum) = middlemaxsub(S, low, mid, high)
end if;
return triple with highest sum
end maxsub
middlemaxsub(int[] S; low, mid, high: int) return (lowIndex, highIndex, sum)
start at mid and scan left to find bestleft and leftsum
start at mid+1 and scan right to find bestright and rightsum
return (bestleft, bestright, leftsum+rightsum)
end middlemaxsub
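The pseudocode can be fleshed out in Python as follows (inclusive bounds; `max` with a key picks the triple with the highest sum):

```python
def maxsub(S, low, high):
    """Return (i, j, sum) where S[i..j] (inclusive) has maximum sum."""
    if low == high:
        return (low, high, S[low])
    mid = (low + high) // 2
    left = maxsub(S, low, mid)                  # best entirely in left half
    right = maxsub(S, mid + 1, high)            # best entirely in right half
    middle = middlemaxsub(S, low, mid, high)    # best crossing the midpoint
    return max(left, right, middle, key=lambda t: t[2])

def middlemaxsub(S, low, mid, high):
    """Best subarray crossing the midpoint: scan left from mid and
    right from mid+1, keeping the best running sum on each side."""
    leftsum, total, bestleft = float('-inf'), 0, mid
    for i in range(mid, low - 1, -1):
        total += S[i]
        if total > leftsum:
            leftsum, bestleft = total, i
    rightsum, total, bestright = float('-inf'), 0, mid + 1
    for j in range(mid + 1, high + 1):
        total += S[j]
        if total > rightsum:
            rightsum, bestright = total, j
    return (bestleft, bestright, leftsum + rightsum)

A = [1, -100, 10, 20, -1, -5, 16, -23, 5]
print(maxsub(A, 0, len(A) - 1))  # (2, 6, 40)
```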
Performance of Divide and Conquer Solution
- Performance:
- Base Case: $T(1) = \Theta(1)$
- Divide: $ \Theta(1)$
- Conquer: $2T(n/2)$ and $\Theta(n)$
- Combine: $\Theta(1)$
- $T(n) = 2T(n/2) + \Theta(n)$
- Closed form: $T(n) = \dots$
- Can we do better?
- Yes! A $\Theta(n)$ algorithm exists! (Dynamic programming!)
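For reference, the $\Theta(n)$ algorithm alluded to is Kadane's: track the best sum of a subarray *ending* at each position, either extending the previous one or starting fresh. A sketch:

```python
def kadane(A):
    """Theta(n) maximal-subarray: cur_sum is the best sum of a subarray
    ending at the current index; a negative running prefix never helps,
    so we restart whenever cur_sum drops below zero."""
    best_sum, best_i, best_j = A[0], 0, 0
    cur_sum, cur_i = A[0], 0
    for j in range(1, len(A)):
        if cur_sum < 0:
            cur_sum, cur_i = A[j], j      # start a fresh subarray at j
        else:
            cur_sum += A[j]               # extend the current subarray
        if cur_sum > best_sum:
            best_sum, best_i, best_j = cur_sum, cur_i, j
    return (best_i, best_j, best_sum)

print(kadane([1, -100, 10, 20, -1, -5, 16, -23, 5]))  # (2, 6, 40)
```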