Longest Common Subsequence

Given a set of strings (in our case two) find the longest common subsequence between them and return the length of the longest. If there is no common subsequence then return 0.

A subsequence of a string is new string (which can be empty) that is generated from the original string by deleting some characters of it while maintaining the relative order of it.

For example "ace" is subsequence of "abcde".

A common subsequence between two or more strings is the subsequence which is present in all the strings.

Input:
"ABCD"
"ACBAD"

Output:
3
"ABD"
"ACD"

Longest Common Subsequence

Applications of LCS

  • Compression of genome resequencing data
  • Authenticate users within their mobile phone through in-air signatures.

A simple way to solve this problem (for two strings) is by verifying that each subsequence of str1 with length M is also subsequence of str2 with length N. As there are 2 ^ M possible subsequence for str1 which has length M, the time complexity of this problem would be N * (2 ^ M).

Thus the problem is considered as NP-hard which can be optimized to be solved in polynomial time using dynamic programming.

Naïve recursive solution: Longest common subsequence

We can solve the problem using the following recurrence relation where m and n are the size of both the strings

if (m === 0 || n === 0) 
    return 0; 
if(str1[m-1] === str2[n-1]) 
    return 1 + lcs(str1, str2, m-1, n-1); 
else
    return max(lcs(str1, str2, m, n-1), lcs(str1, str2, m-1, n)); 

Conceptually this is how it works

  • Start from the last character of both the strings.
  • If the index of the both the strings have reached to its limit (or end) then return 0.
  • When the characters of both the strings at the given indexes are same then recur for the remaining indexes and return it by adding one, as it is subsequence.
  • Else if the characters at the given index does not match then return the max among them by recurring with decreased index of alternate string.
 const LCSLength = (str1, str2, m, n) => {
  // return if there no more characters to examine in either strings
  if (m == 0 || n == 0) {
    return 0;
  }

  // if the character at the given index of str1 and str2 matches
  if (str1[m - 1] === str2[n - 1]) {
    return LCSLength(str1, str2, m - 1, n - 1) + 1;
  }

  // else, if the character of str1 and str2 don't match
  return Math.max(LCSLength(str1, str2, m, n - 1),
                     LCSLength(str1, str2, m - 1, n));
}
Input:
const str1 = "ABCD";
const str2 = "ACBAD";

console.log(LCSLength(str1, str2, str1.length, str2.length));

Output:
3

Time complexity: O(2 * (m + n)) where m and n are the size of the both the strings.
Space complexity: O(2 * (m + n)) considering the call stack.


Using dynamic programming.

If you notice in the above solution, we are repeatedly solving the same subproblems again and again which indicates that the problem is showing the overlapping subproblems property.

And also it can be seen that problem can be broken down into smaller subproblems which can be further broken down into subproblems and so on. So it also has optimal substructure property.

Thus it can be solved efficiently by storing the solutions of already solved subproblems and reusing them with dynamic programming.

Top down

 const LCSLength = (str1, str2, m, n, lookup) => {
  // return if there no more characters to examine in either strings
  if (m == 0 || n == 0) {
    return 0;
  }

  //unique key 
  const key = `${m}-${n}`;
    
  // if not cached
  if(!lookup[key]){ 
  // if the character at the given index of str1 and str2 matches
  if (str1[m - 1] === str2[n - 1]) {
    lookup[key] = LCSLength(str1, str2, m - 1, n - 1, lookup) + 1;
  }else{
    // else, if the last character of str1 and str2 don't match
    lookup[key] = Math.max(LCSLength(str1, str2, m, n - 1, lookup),
                     LCSLength(str1, str2, m - 1, n, lookup));
   }
  }
   
  return lookup[key];
}
Input:
const str1 = "ABCD";
const str2 = "ACBAD";

const lookup = new Map();
console.log(LCSLength(str1, str2, str1.length, str2.length, lookup));

Output:
3

Bottom up

In the bottom up approach using the following steps and a table to track, we can find the longest common subsequence between two strings.

  1. Create a matrix of the size of the both strings (n+1) * (m+1) and fill the first column and row with 0.
  2. Now we have to keep filling this matrix by using following logic.
  3. Traverse both the strings and if the character at the current column and row matches then fill that cell by adding 1 to the value of its left top diagonal element.
  4. If it does not matches then fill the cell by taking max of its previous row and column’s value.
  5. Repeat this for all the rows and columns of the matrix.
  6. The last cell in the matrix (last row and last column) will have the length of the longest common subsequence.

Longest Common Subsequence dp table

const LCSLength = (str1, str2, m = str1.length, n = str2.length) => {
 
  // matrix caches already computed subproblems,
  const T = new Array(m + 1);
  for(let i = 0; i < T.length; i++){
    T[i] = new Array(n + 1).fill(0);
  }

  
  for (let i = 1; i <= m; i++) {
    for (let j = 1; j <= n; j++) {
      
      // if the current character of str1 and str2 are same
      if (str1[i - 1] === str2[j - 1]) {
        T[i][j] = T[i - 1][j - 1] + 1;
      }
      // else, if the current character of str1 and str2 don't match
      else {
        T[i][j] = Math.max(T[i - 1][j], T[i][j - 1]);
      }
    }
  }

  // the last cell in the matrix has the LCS
  return T[m][n];
}
Input:
const str1 = "ABCD";
const str2 = "ACBAD";
console.log(LCSLength(str1, str2));

Output:
3

Time complexity: O(m * n) where m and n are the size of both the strings.
Space complexity: O(m * n).

Also see,
Longest common subsequence | Print all LCS