
- Problem:
- Sequence: $X = [ x_1, x_2, ..., x_m ]$
- Sequence: $Y = [ y_1, y_2, ..., y_n ]$
- Find longest subsequence that is common to both
- Subsequence elements do not need to be adjacent
- Examples:
- breath / conservative = eat
- springtime / pioneer = pine
- horseback / snowflake = oak
- maelstrom / becalm = elm
- heroically / scholarly = holly

- Brute force algorithm:
For every subsequence of $X$, see if it is a subsequence of $Y$

- How many subsequences of $X$?
$2^m$. Why?

- How long to check each subsequence of $X$?
$\Theta(n)$. Why?

- Scan $Y$ for first letter of subsequence, then second, ...
- Total time?
$\Theta(n2^m)$.
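
The brute-force idea above can be sketched in Python. This is only an illustration, not part of the original notes: the names `is_subsequence` and `brute_force_lcs` are made up here, and the sketch tries longer candidates first so it can stop at the first hit.

```python
from itertools import combinations


def is_subsequence(s, y):
    """Scan y left to right for each letter of s in turn (linear in len(y))."""
    it = iter(y)
    # `ch in it` consumes the iterator up to and including the first match,
    # so later letters must be found further to the right.
    return all(ch in it for ch in s)


def brute_force_lcs(x, y):
    """Try every subsequence of x, longest first; return the first one found in y."""
    for k in range(len(x), -1, -1):
        # Each choice of k index positions of x is one subsequence of length k.
        for idx in combinations(range(len(x)), k):
            candidate = "".join(x[i] for i in idx)
            if is_subsequence(candidate, y):
                return candidate
    return ""


if __name__ == "__main__":
    print(brute_force_lcs("breath", "conservative"))
```

When several common subsequences share the maximum length, this sketch returns whichever it meets first, so it may print a different three-letter answer than the one listed in the examples.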


- Example table(s) for BREATHER and CONSERVATIVES:
- Stare at the table a while - what do you notice?
- Make up your mind: Is it "table" or "tables"?
This one table shows two arrays

- Inconceivable!
No, easily conceived! One array stores the optimum values, the other stores information needed to reconstruct a solution that gives that optimum value. Both are shown below in one grid.

- That's beginning to make sense - tell me more ...
The numbers are the length of the best subsequence at that point, and the arrows show which neighbor was best

- Hmmm, what do you mean "at that point" ...?
Ponder the next question ...

- What did you say was the most important thing to know about a table?
To be crystal clear on what the table elements represent

- Is that all?
To know how to find a cell's value from other cells

- Is that really all?
To know the order in which to build the table

- Okay, I'll bite; what are those answers for this example?
Cell(i,j) is the length of the LCS of the prefix (1..i) of the word on the left and the prefix (1..j) of the word on the top.

- By row, by column, either? Diagonal?
Either by row or by column.

- Okay I believe you, but how was I ever supposed to know that?
What cells are needed? West, North, Northwest. You can find those three whether you do by row or by column.

- But not by diagonal?
Well, one of the diagonals would work, but by row or column is much simpler.

- Should I think about what it means to fill in by row?
You bet! Great idea.

- Is this too many questions?
You bet!

- Example: c(5, 3) := 0, from the North:
- c(5, 3) represents the length of the LCS of X(1..5) and Y(1..3)
- X(1..5) = BREAT
- Y(1..3) = CON
- Since X(5) = T ≠ Y(3) = N and c(4,3) = c(5,2) = 0:
- Then c(5, 3) := c(4,3) = 0
- This is the value from the North
- In other words: LCS(BREAT, CON) := LCS(BREA, CON) = 0
- Example: c(5, 9) := 3, from the NorthWest:
- c(5, 9) represents the length of the LCS of X(1..5) and Y(1..9)
- X(1..5) = BREAT
- Y(1..9) = CONSERVAT
- Since X(5) = T = Y(9):
- Then c(5, 9) := c(4,8) + 1 = 2 + 1 = 3
- This is the value from the NorthWest
- In other words: LCS(BREAT, CONSERVAT) := LCS(BREA, CONSERVA) + 1
- Example: c(5, 12) := 3, from the West:
- c(5, 12) represents the length of the LCS of X(1..5) and Y(1..12)
- X(1..5) = BREAT
- Y(1..12) = CONSERVATIVE
- Since X(5) = T ≠ Y(12) = E and c(4,12) = 2 and c(5,11) = 3:
- Then c(5, 12) := c(5,11) = 3
- This is the value from the West
- In other words: LCS(BREAT, CONSERVATIVE) := LCS(BREAT, CONSERVATIV)
- What happens to ties?
In the table, go North

- Could they be handled differently?
Sure, Go West, young CS Student

- Try it!

$ \begin{array}{cc|cccccccccccccc}
 & & & C & O & N & S & E & \color{red}{R} & V & \color{green}{A} & \color{blue}{T} & I & V & \color{orange}{E} & S\\
\hline
 & i, j & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13\\
\hline
 &0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\
B &1 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \color{red}{\uparrow 0} & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 \\
\color{red}{ R}&2 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \color{red}{\nwarrow 1} & \color{red}{\leftarrow 1} & \leftarrow 1 & \leftarrow 1 & \leftarrow 1 & \leftarrow 1 & \leftarrow 1 & \leftarrow 1 \\
E &3 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \nwarrow 1 & \uparrow 1 & \color{red}{\uparrow 1} & \uparrow 1 & \uparrow 1 & \uparrow 1 & \uparrow 1 & \nwarrow 2 & \leftarrow 2 \\
\color{green}{ A}&4 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 1 & \uparrow 1 & \uparrow 1 & \color{red}{\nwarrow 2} & \leftarrow 2 & \leftarrow 2 & \leftarrow 2 & \uparrow 2 & \uparrow 2 \\
\color{blue}{ T}&5 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 1 & \uparrow 1 & \uparrow 1 & \uparrow 2 & \color{red}{\nwarrow 3} & \color{red}{\leftarrow 3} & \color{red}{\leftarrow 3} & \leftarrow 3 & \leftarrow 3\\
H &6 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 1 & \uparrow 1 & \uparrow 1 & \uparrow 2 & \uparrow 3 & \uparrow 3 & \color{red}{\uparrow 3} & \uparrow 3 & \uparrow 3\\
\color{orange}{ E}&7 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \nwarrow 1 & \uparrow 1 & \uparrow 1 & \uparrow 2 & \uparrow 3 & \uparrow 3 & \uparrow 3 & \color{red}{\nwarrow 4} & \color{red}{\leftarrow 4} \\
R &8 & 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 0 & \uparrow 1 & \nwarrow 2 & \leftarrow 2 & \uparrow 2 & \uparrow 3 & \uparrow 3 & \uparrow 3 & \uparrow 4 & \color{red}{\uparrow 4}
\end{array} $

- This solution fills two tables:
- c(i, j) = length of longest common subsequence of X(1..i) and Y(1..j)
- b(i, j) = direction (either N, W, or NW) from which value of c(i,j) was obtained
- Length of LCS for X(1..m) and Y(1..n) is in c(m, n)
- LCS-Length(X, Y)

      m, n := X.length, Y.length
      b(1..m, 1..n)
      c(0..m, 0..n) := (others => (others => 0))
      for i in 1 .. m loop
        for j in 1 .. n loop
          if x_i = y_j then
            c(i, j) := c(i-1, j-1) + 1
            b(i, j) := "NW"
          else
            if c(i-1, j) ≥ c(i, j-1) then
              c(i, j) := c(i-1, j)
              b(i, j) := "N"
            else
              c(i, j) := c(i, j-1)
              b(i, j) := "W"
            end if
          end if
        end loop
      end loop

You know the answer ...

It's $\Theta(mn)$!
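
For anyone who wants to run it, here is one possible Python rendering of the LCS-Length pseudocode. It is a sketch under a few assumptions: the function name `lcs_length` is made up, Python strings are 0-indexed so `x[i - 1]` plays the role of $x_i$, and row 0 and column 0 of `b` are simply left as empty strings.

```python
def lcs_length(x, y):
    """Fill the c (lengths) and b (directions) tables row by row."""
    m, n = len(x), len(y)
    # Row 0 and column 0 stay 0: an empty prefix has an empty LCS.
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Letters match: extend the LCS of the two shorter prefixes.
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "NW"
            elif c[i - 1][j] >= c[i][j - 1]:
                # Ties go North, as in the pseudocode.
                c[i][j] = c[i - 1][j]
                b[i][j] = "N"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "W"
    return c, b


if __name__ == "__main__":
    c, b = lcs_length("BREATHER", "CONSERVATIVES")
    # The three worked examples: c(5,3), c(5,9), c(5,12).
    for i, j in [(5, 3), (5, 9), (5, 12)]:
        print(f"c({i},{j}) = {c[i][j]} from {b[i][j]}")
    print("LCS length:", c[8][13])
```

The two nested loops each do constant work per cell, which is where the $\Theta(mn)$ comes from.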

- Print-LCS(b, X, i, j)

      if i > 0 and j > 0 then
        if b(i, j) = "NW" then
          print-LCS(b, X, i-1, j-1)
          print x_i
        elsif b(i, j) = "N" then
          print-LCS(b, X, i-1, j)
        else
          print-LCS(b, X, i, j-1)
        end if
      end if

You know the answer ...

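It's the same $\Theta(mn)$ once you count building the tables; the traceback itself makes at most $m + n$ recursive calls, since each call decreases $i$ or $j$.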
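
A Python sketch of the Print-LCS traceback follows. To keep the block self-contained it rebuilds the direction table with a small helper; both names, `fill_b` and `lcs_string`, are made up for this illustration, and the sketch returns the string instead of printing letter by letter.

```python
def fill_b(x, y):
    """Build the direction table b (ties go North), keeping c only internally."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "NW"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], "N"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "W"
    return b


def lcs_string(b, x, i, j):
    """Follow the arrows back from cell (i, j); collect a letter on each NW step."""
    if i == 0 or j == 0:
        return ""
    if b[i][j] == "NW":
        # x[i - 1] is x_i in the notes' 1-based notation.
        return lcs_string(b, x, i - 1, j - 1) + x[i - 1]
    if b[i][j] == "N":
        return lcs_string(b, x, i - 1, j)
    return lcs_string(b, x, i, j - 1)


if __name__ == "__main__":
    X, Y = "BREATHER", "CONSERVATIVES"
    print(lcs_string(fill_b(X, Y), X, len(X), len(Y)))
```

Every call moves at least one step North or West, so the recursion depth is at most $m + n$.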

- Notation:
- $X_i = [ x_1, x_2, \dots, x_i ]$
- $Y_j = [ y_1, y_2, \dots, y_j ]$
- Need to find a subproblem whose solution can be used to find solution to given problem
- Define: $c(i,j) = \textrm{length of LCS of } X_i \text{ and } Y_j $
- Recursive definition: $ c(i,j) = \begin{cases} 0, & \text{ if } i=0 \text{ or } j=0 \\ c(i-1, j-1) + 1, & \text{ if } i, j > 0 \text{ and } x_i = y_j \\ \max(c(i-1, j), c(i, j-1)), & \text{ if } i, j > 0 \text{ and } x_i \neq y_j \end{cases} $
- We must find $c(m, n)$
- Could we write a routine to find it?
- What's the time complexity?
Do you know the answer ...?

- Hmmm ... ?!?
Close enough - whatever it is, it's slow!

- What does a recursion tree look like for m=4, n=3?
- Notice the repeated subproblems
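
One way to see the repeated subproblems without drawing the tree is to code the recursive definition directly and count the calls. The sketch below is an illustration only: the name `c_naive` and the `calls` counter are assumptions, and the inputs `"ABCD"` and `"XYZ"` were picked because they share no letters, so the recursion always branches both ways.

```python
from collections import Counter


def c_naive(x, y, i, j, calls):
    """Direct transcription of the recursive definition of c(i, j)."""
    calls[(i, j)] += 1  # record every visit to subproblem (i, j)
    if i == 0 or j == 0:
        return 0
    if x[i - 1] == y[j - 1]:
        return c_naive(x, y, i - 1, j - 1, calls) + 1
    return max(c_naive(x, y, i - 1, j, calls),
               c_naive(x, y, i, j - 1, calls))


if __name__ == "__main__":
    calls = Counter()
    value = c_naive("ABCD", "XYZ", 4, 3, calls)
    print("c(4,3) =", value)
    print("total calls:", sum(calls.values()))
    print("distinct subproblems:", len(calls))
    print("calls to c(1,1):", calls[(1, 1)])
```

For this input the total number of calls is far larger than the number of distinct $(i, j)$ pairs, which is exactly the waste the table removes.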
