Longest repeated subsequence

Given a string, find the longest repeated subsequence in it.

A subsequence of a string is new string (which can be empty) that is generated from the original string by deleting some characters of it while maintaining the relative order of it.

For example "ace" is subsequence of "abcde".

The longest repeated subsequence is a subsequence that appears at least twice in the string such that the two subsequence don’t have same string character at same position.

Input:
"AABEBCDD"

Output:
3
//"ABD"

Longest Repeated Subsequence

This LRS problem is a variation of longest common subsequence problem.

Recursive solution: Longest repeated subsequence.

We can solve this problem using the following recurrence relation where m and n both are the size of the same string.

if (m === 0 || n === 0) 
    return 0; 
if(str[m - 1] == str[n - 1] && m !== n) 
    return LRS(str, m - 1, n - 1) + 1; 
else
    return max(LRS(str, m, n - 1), LRS(str, m - 1, n)); 

Conceptually this is how it works

  • Begin evaluation from the last character of the string.
  • If the character at the index m and n matches, but the index are not same (m !== n), then return the result by recurring for remaining characters and adding one to it (as a subsequence character is found).
  • Else, return the max among them by recurring with decreased index on alternate places.
const LRS = (str, m, n) => {
  // If there are no more characters in the string to evaluate
  if (m == 0 || n == 0) {
    return 0;
  }

  // incase characters at index m and n are same
  // but the index of the character are different
  if (str[m - 1] == str[n - 1] && m !== n) {
    return LRS(str, m - 1, n - 1) + 1;
  }

  // else, return the max of the lrs at different indexes.
  return Math.max(LRS(str, m, n - 1), LRS(str, m - 1, n));
}
Input:
const str = "AABEBCDD";
console.log(LRS(str, str.length, str.length));

Output:
3

Time complexity: O(2 ^ n) where n is the size of the string.
Space complexity: O(2 ^ n) considering the call stack.


Using dynamic programming.

This problem is exhibiting both the properties of dynamic programming

  • Optimal Substructure: The problem can be broken down into subproblems which can be further broken down into subproblems and so on.
  • Overlapping subproblems: Same subproblems are getting re-computed again and again.

Thus we can cache the solution of already solved subproblems and reuse it rather than re-computing it again.

Top down

const LRS = (str, m, n, lookup) => {
  // If there are no more characters in the string to evaluate
  if (m == 0 || n == 0) {
    return 0;
  }
  
  // unique key to store the cache
  const key = `${m}-${n}`;
  
  // if the solution is not present in the cache,
  // compute it and store it
  if (!lookup.has(key)){
    // incase characters at index m and n are same
    // but the index of the character are different
    if (str[m - 1] == str[n - 1] && m !== n) {
        const computed = LRS(str, m - 1, n - 1, lookup) + 1;
        lookup.set(key, computed);
    }else{
        const computed = Math.max(LRS(str, m, n - 1, lookup), LRS(str, m - 1, n, lookup));
        lookup.set(key, computed);
    }
  }
  
  // return from cache
  return lookup.get(key);
}
Input:
const str = "AABEBCDD";
const lookup = new Map();
console.log(LRS(str, str.length, str.length, lookup));

Output:
3

Bottom up

const LRS = (str, n = str.length) => {
 
  // cahce matrix stores already computed subproblem's solution,
  const T = new Array(n + 1)
  for(let i = 0; i < T.length; i++){
    T[i] = new Array(n + 1).fill(0);
  }

  // cache in matrix in a bottom-up way
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= n; j++) {
      // incase characters at index m and n are same
      // but the index of the character are different
      if (str[i - 1] === str[j - 1] && i != j) {
        T[i][j] = T[i - 1][j - 1] + 1;
      }
      // else
      else {
        T[i][j] = Math.max(T[i - 1][j], T[i][j - 1]);
      }
    }
  }

  // LRS will be the in the last cell of the matrix
  return T[n][n];
}
Input:
console.log(LRS("ATACTCGGA"));

Output:
4

Time complexity: O(n ^ 2).
Space complexity: O(n ^ 2).


Print the longest repeated subsequence.

We can use the similar technique which we had used while printing the longest common subsequence for LRS.

const LRS = (str, m, n, T) => {
  // If there are no more characters in the string,
  // return an empty string
  if (m == 0 || n == 0) {
    return "";
  }

  // incase characters at index m and n are same
  // but the index of the character are different
  if (str[m - 1] === str[n - 1] && m != n) {
    return LRS(str, m - 1, n - 1, T) + str[m - 1];
  }
  // else
  else {
    if (T[m - 1][n] > T[m][n - 1]) {
      return LRS(str, m - 1, n, T);
    }
    else {
      return LRS(str, m, n - 1, T);
    }
  }
}

const LRSLength = (str, T, n = str.length) => {
 
  // cache in matrix in a bottom-up way
  for (let i = 1; i <= n; i++) {
    for (let j = 1; j <= n; j++) {
       // incase characters at index m and n are same
       // but the index of the character are different
      if (str[i - 1] === str[j - 1] && i != j) {
        T[i][j] = T[i - 1][j - 1] + 1;
      }
      // else
      else {
        T[i][j] = Math.max(T[i - 1][j], T[i][j - 1]);
      }
    }
  }
}
Input:
const str = "AABEBCDD";
const n = str.length;

  // lookup table stores already solved subproblems solution,
const T = new Array(n + 1)
for(let i = 0; i < T.length; i++){
  T[i] = new Array(n + 1).fill(0);
}

// fill lookup table
LRSLength(str, T);
 
// find the longest repeating subsequence
console.log(LRS(str, n, n, T));

Output:
"ABD"