Script to fetch top 3 results from Google search

I was approached by a fresher on LinkedIn who was looking for a solution to a question he had been asked in a recent interview.

The question was to “Write a script to fetch top 3 results from Google search” in JavaScript.


I didn’t have the answer right away, but as a curious engineer I decided to solve it.

Google is the most popular search engine, and I knew it returns different result layouts for different search queries, which may also include ads, featured posts, Q&A, news, etc.

For example, when I queried “Titanic”, I got two relevant links and then the remaining search results.

Titanic Result

For another query, “last element of string javascript”, I got one featured post and then the relevant results.

Featured Post

And then there is a normal result for the query “last element of string java”, which also contained a suggestion section.

Normal Post

I am sure there are many more types of search results, but I decided not to consider every edge case, as that would be redundant. I went with the above three only and return the links of just the relevant results (the ones with a heading).

The first thing I did after this was to inspect the DOM to find out how the HTML elements are generated, so that I could write a script around them to get the links.

What I found out was that Google search results were generated inside a div with the id “search” that contained another div with the id “rso”, and inside that the results were differentiated by classes based on their type. For example, a result with a heading (a normal result) was placed inside an element with the class “MjjYud”.

Google search DOM

And inside this, the links were placed inside a div with the attribute [data-header-feature='0'].

Google Search DOM links

That gave me a hook, and based on it I could write a script to fetch the links.

const getTop3Links = () => {
  const result = [];

  // Anchors inside the heading containers of the normal results
  const multipleLinksResults = document.querySelectorAll("#rso [data-header-feature='0'] a");

  for (const link of multipleLinksResults) {
    const href = link.getAttribute('href');
    if (href) result.push(href);
  }

  // Keep only the first three links
  return result.slice(0, 3);
};
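
One edge case that is easy to cover here is the same link being collected more than once, or a relative link such as an internal Google URL slipping into the list. Whether either actually happens on a given results page is an assumption, not something the inspection above confirmed. A minimal sketch of a filter for it follows; topUniqueLinks is a hypothetical helper name, not part of the original script.

```javascript
// Hypothetical helper: keep the first `limit` distinct absolute links.
const topUniqueLinks = (hrefs, limit = 3) => {
  const seen = new Set();
  const result = [];

  for (const href of hrefs) {
    if (result.length >= limit) break;

    // Skip empty values and anything that is not an absolute http(s) URL
    // (assumption: relative links are not real search results).
    if (!href || !/^https?:\/\//i.test(href)) continue;

    // Skip links that were already collected.
    if (seen.has(href)) continue;

    seen.add(href);
    result.push(href);
  }

  return result;
};
```

With this in place, the last line of the script could return topUniqueLinks(result) instead of result.slice(0, 3).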

This div with the attribute was common to the normal post as well as to multiple-link results like the one for Titanic. The third case, the featured post, was a little different: it has a hidden h2 with the text “Featured snippet from the web”. Based on this, the script has to find that h2, walk up to the nearest ancestor with the class “MjjYud”, and get the link from there.

Google Search DOM Featured Post

Thus I had to write a different script to cover this case.

const getFeaturedLinks = () => {
  const h2s = document.getElementsByTagName('h2');
  for (const h2 of h2s) {
    if (h2.innerText === 'Featured snippet from the web') {
      // Either lookup can come back empty, so guard before reading href
      const parent = h2.closest('.MjjYud');
      const anchor = parent ? parent.querySelector('.yuRUbf a') : null;
      return anchor ? anchor.getAttribute('href') : undefined;
    }
  }

  return undefined;
};

Combining both scripts, we can get the top 3 links from a Google search: the featured post first if it exists, followed by the normal heading links.

const getTop3Links = () => {
  const result = [];

  // The featured post, if present, goes first
  const featured = getFeaturedLinks();
  if (featured) result.push(featured);

  // Then the anchors inside the heading containers of the normal results
  const multipleLinksResults = document.querySelectorAll("#rso [data-header-feature='0'] a");

  for (const link of multipleLinksResults) {
    const href = link.getAttribute('href');
    if (href) result.push(href);
  }

  return result.slice(0, 3);
};

const getFeaturedLinks = () => {
  const h2s = document.getElementsByTagName('h2');
  for (const h2 of h2s) {
    if (h2.innerText === 'Featured snippet from the web') {
      // Either lookup can come back empty, so guard before reading href
      const parent = h2.closest('.MjjYud');
      const anchor = parent ? parent.querySelector('.yuRUbf a') : null;
      return anchor ? anchor.getAttribute('href') : undefined;
    }
  }

  return undefined;
};
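
To try out edge cases without reloading a Google results page each time, one option is to pass the document in as a parameter and feed the function a small stand-in object. The sketch below keeps the same selectors and the same two lookups as above; getTop3LinksFrom, fakeAnchor and fakeRoot are hypothetical names, and the fake object only mimics the handful of DOM methods the script calls.

```javascript
// Same lookups as above, but `root` is passed in instead of using the global document.
const getTop3LinksFrom = (root) => {
  const result = [];

  // Featured snippet first, if the hidden h2 is present.
  for (const h2 of root.getElementsByTagName('h2')) {
    if (h2.innerText === 'Featured snippet from the web') {
      const parent = h2.closest('.MjjYud');
      const anchor = parent ? parent.querySelector('.yuRUbf a') : null;
      const href = anchor ? anchor.getAttribute('href') : null;
      if (href) result.push(href);
      break;
    }
  }

  // Then the anchors of the normal results.
  for (const link of root.querySelectorAll("#rso [data-header-feature='0'] a")) {
    const href = link.getAttribute('href');
    if (href) result.push(href);
  }

  return result.slice(0, 3);
};

// Hypothetical stand-in for a results page with no featured snippet.
const fakeAnchor = (href) => ({ getAttribute: () => href });
const fakeRoot = {
  getElementsByTagName: () => [],
  querySelectorAll: () => [
    fakeAnchor('https://a.example'),
    fakeAnchor(null), // an anchor without an href is skipped
    fakeAnchor('https://b.example'),
  ],
};

console.log(getTop3LinksFrom(fakeRoot));
```

On a real results page the call would be getTop3LinksFrom(document) from the browser console.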

Try it out yourself and cover the remaining edge cases.