Script to fetch top 3 results from Google search

A fresher approached me on LinkedIn, looking for a solution to a question he had been asked in a recent interview.

The task was: "Write a script to fetch top 3 results from Google search" in JavaScript.

I didn't have the answer right away, but as a curious engineer I decided to solve it.

Google is the most popular search engine, and I knew it returns differently structured results for different search queries, which may also include ads, featured posts, Q&A, news, etc.

For example, when I queried "Titanic", I got two relevant links and then the remaining search results.

Titanic Result

For another query, "last element of string javascript", I got one featured post and then the relevant results.

Featured Post

And then there was a normal result for the query "last element of string java", but it contained a suggestion section.

Normal Post

I am sure there are many more types of search results, so I decided not to cover every edge case, as that would be redundant. I went with the three above only and return the links of just the relevant results (the ones with a heading).

The first thing I did after this was inspect the DOM to find out how the HTML elements are generated, so that I could write a script around them to get the links.

What I found was that Google search results are generated inside a div with the id "search", which contains another div with the id "rso". Inside that, the results are differentiated by classes based on their type. For example, a result with a heading (a normal result) is placed inside an element with the class "MjjYud".

Google search DOM

And inside this, the links are placed inside a div that matches the attribute selector [data-header-feature='0'].

Google Search DOM links
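To see this structure for a given query, a small exploration helper can list every result block and report whether it contains a heading link. This is only a sketch: the function name describeResultBlocks and the injectable root parameter are my own additions, and the selectors are the ones observed above, which Google may change at any time.

```javascript
// Exploration helper (hypothetical name, not part of the original script).
// `root` defaults to the page's document when run in the browser console;
// passing it explicitly makes the function usable outside a browser.
const describeResultBlocks = (root = globalThis.document) => {
  // Every result block observed under #search > #rso
  const blocks = root.querySelectorAll('#search #rso .MjjYud');

  return Array.from(blocks, (block, index) => {
    // First anchor inside the heading-link container, if there is one
    const anchor = block.querySelector("[data-header-feature='0'] a");
    return {
      index,
      hasHeadingLink: anchor !== null,
      href: anchor ? anchor.getAttribute('href') : null,
    };
  });
};
```

On a results page, running console.table(describeResultBlocks()) in the DevTools console prints one row per block, which makes it easy to spot the blocks that have no heading link.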

That gave me a hook, and based on it I could write a script to fetch the links.

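const getTop3Links = () => {
  const result = [];

  // Anchors inside the heading-link containers of the results under #rso
  const multipleLinksResults = document.querySelectorAll("#rso [data-header-feature='0'] a");

  for (const link of multipleLinksResults) {
    const href = link.getAttribute('href');
    // Skip anchors that have no href
    if (href) result.push(href);
  }

  // Keep only the first three links
  return result.slice(0, 3);
};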
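One thing the script above does not check is what kind of href each anchor carries. As a minimal sketch, assuming some matched anchors could hold relative paths or javascript: values (I have not confirmed this on a live results page), a small filter can keep only absolute http(s) links. The helper name keepWebLinks is hypothetical.

```javascript
// Hypothetical helper: keeps the first `limit` hrefs that are absolute
// http(s) URLs and silently skips everything else.
const keepWebLinks = (hrefs, limit = 3) => {
  const result = [];

  for (const href of hrefs) {
    if (result.length >= limit) break;

    let url;
    try {
      // Without a base URL, the URL constructor throws for relative
      // or malformed values such as "/search?q=x" or "#"
      url = new URL(href);
    } catch {
      continue;
    }

    if (url.protocol === 'http:' || url.protocol === 'https:') {
      result.push(href);
    }
  }

  return result;
};

// In the browser console it could be fed from the same selector:
// keepWebLinks(Array.from(
//   document.querySelectorAll("#rso [data-header-feature='0'] a"),
//   (a) => a.getAttribute('href')
// ));
```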

This div with the attribute was common to the normal post as well as to multiple results like the ones for Titanic. The third case, the featured post, was a little different: it has a hidden h2 with the text "Featured snippet from the web". Based on this, I had to write the script to find that h2, go up to the nearest ancestor with the class "MjjYud", and get the link from there.

Google Search DOM Featured Post

Thus I had to write a different script to cover this case.

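const getFeaturedLinks = () => {
  const h2s = document.getElementsByTagName('h2');
  for (const h2 of h2s) {
    if (h2.innerText === 'Featured snippet from the web') {
      // closest() and querySelector() return null when nothing matches,
      // so use optional chaining instead of assuming the elements exist
      const parent = h2.closest('.MjjYud');
      return parent?.querySelector('.yuRUbf a')?.getAttribute('href') ?? undefined;
    }
  }

  return undefined;
};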

Combining both scripts, we can get the top 3 links from a Google search: the featured post first, if it exists, followed by the normal heading links.

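const getTop3Links = () => {
  const result = [];

  // The featured post, when present, goes first
  const featured = getFeaturedLinks();
  if (featured) result.push(featured);

  // Anchors inside the heading-link containers of the results under #rso
  const multipleLinksResults = document.querySelectorAll("#rso [data-header-feature='0'] a");

  for (const link of multipleLinksResults) {
    const href = link.getAttribute('href');
    if (href) result.push(href);
  }

  // Keep only the first three links
  return result.slice(0, 3);
};

const getFeaturedLinks = () => {
  const h2s = document.getElementsByTagName('h2');
  for (const h2 of h2s) {
    if (h2.innerText === 'Featured snippet from the web') {
      // closest() and querySelector() return null when nothing matches,
      // so use optional chaining instead of assuming the elements exist
      const parent = h2.closest('.MjjYud');
      return parent?.querySelector('.yuRUbf a')?.getAttribute('href') ?? undefined;
    }
  }

  return undefined;
};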
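One edge case worth a sketch is duplicate links, for example if the featured post's link is also matched by the general selector (an assumption on my part, not something I verified). The variant below uses the same selectors but collects links in a Set and accepts the document as a parameter, so it can be exercised with a stub outside the browser. The names findFeaturedLink and collectTopLinks are hypothetical.

```javascript
// Sketch only: same selectors as the scripts above, with an injectable
// `root` and duplicate links dropped.
const FEATURED_TEXT = 'Featured snippet from the web';

const findFeaturedLink = (root) => {
  for (const h2 of root.querySelectorAll('h2')) {
    // Fall back to textContent where innerText is not implemented
    const text = (h2.innerText ?? h2.textContent ?? '').trim();
    if (text !== FEATURED_TEXT) continue;

    // Either lookup may return null, hence the optional chaining
    const anchor = h2.closest('.MjjYud')?.querySelector('.yuRUbf a');
    return anchor?.getAttribute('href') ?? undefined;
  }
  return undefined;
};

const collectTopLinks = (root = globalThis.document, limit = 3) => {
  // A Set keeps insertion order and ignores repeated values
  const links = new Set();

  const featured = findFeaturedLink(root);
  if (featured) links.add(featured);

  for (const anchor of root.querySelectorAll("#rso [data-header-feature='0'] a")) {
    const href = anchor.getAttribute('href');
    if (href) links.add(href);
  }

  return [...links].slice(0, limit);
};
```

In the browser console, collectTopLinks() with no arguments reads from the current page.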

Try it out yourself and cover the remaining edge cases.