Find all anagrams substring in a string.

Learn how to create an algorithm to find all anagrams substring in a string in javascript.

Anagram: An anagram of a string is another string that contains same characters, only the order of characters can be different.

Given a string S and another string P we have to find all the anagrams of P in S and return their start index in S.

Example

Input:
'cbaebabacd'
'abc'

Output:
[0, 6]   
// 'cbaebabacd' is the anagram of 'abc' which starts at index 0
// 'cbaebabacd' is the anagram of 'abc' which starts at index 6

We cannot solve this with the pattern search algorithms because here we have to find the anagram of them.

Another way of solving this is by generating the HASH value of P and then check if the hash exists at the given window in S.

There are two problem with this approach.

First, we will have to create an efficient HASH function which will generate unique which has very less probability.

Second, it will run in O(S * P) time as we will be generating the HASH value of P length for all the substrings of S.

We can achieve O(S) time by optimizing the algorithm with few assumptions.

Implementation

  • Assuming the ASCII value of all the alphabets are 256, we create two arrays of length 256.
  • In the first array we store the count of all characters of string P.
  • In the second array we store all the characters of substring of S of length P.
  • Now for each character in the stirng S we remove the count of previous character from the array and we add the count of new character. This way we will be able to compare all substrings at the window of length P.
  • At last check if both the arrays are equal or not. If they are equal then anagram substring found.

Now as the size of our array is constant (256). Comparing both the array will take O(256) time and for all the characters of string S it will take O(S * 256) which is basically O(S).

Finding all anagram substring in a string.

//Function to compare two arrays
    const compare = (arr1, arr2) => {
        for (let i = 0; i < arr1.length; i++) {
          if (arr1[i] !== arr2[i]) {
            return false;
          }
        }

        return true;
    };

    const findAnagrams = (str, ptr) => {
    //Get the length of string and anagarm 
      const S = str.length;
      const P = ptr.length;
      if(S < P){
          return [];
      }
      
      //Create two empty arrays
      const sCount = new Array(256).fill(0);
      const pCount = new Array(256).fill(0);
      
      //Array to store found indexes
      const arr = [];
        
      //Initally count the characters of pattern and substring of length P in str
      for (let i = 0; i < P; i++) {
        sCount[s[i].charCodeAt()]++;
        pCount[p[i].charCodeAt()]++;
      }
        
      //Get substring of length P by sliding remove first character and adding new character at the window of P.
      for (let i = P; i < S; i++) {
        if (compare(sCount, pCount)) {
          arr.push(i - P);
        }

        //Add the next character count;
        sCount[s[i].charCodeAt()]++;

        //Remove the first charater count;
        sCount[s[i - P].charCodeAt()]--;
      }
    
      //Compare the last substring
      if (compare(sCount, pCount)) {
        arr.push(S - P);
      }
      
      //Return the array
      return arr;
  };
Input:
console.log(findAnagrams('cbaebabacd', 'abc'));

Output:
[0, 6]

Time complexity: O(S) where S is the size of the string.
Space complexity: O(256).