JavaScript SEO – Challenges and Best Practices

Posted on Dec 13, 2019

The popularity of JavaScript frameworks and libraries keeps growing every year. The majority of websites today use a JavaScript framework or library such as Angular, Vue.js, or React. While this is great for developers and end-users from a user-experience perspective, it creates a real challenge for SEOs and website admins: websites built with these frameworks often fail to rank in the search results even when they deserve a good position otherwise. This was a serious concern in the early years of JavaScript-heavy websites, and things have improved a lot since then. Still, the confusion is not fully over, as many such websites continue to underperform in SEO rankings. Let's discuss JavaScript SEO in more detail in this article.

Crawling and Rendering JavaScript by Google

According to Google, it can crawl and render websites with JavaScript pretty well nowadays. Still, there are some complications, and Google advises admins to be cautious in this matter. As Google puts it, "Sometimes things don't go perfectly during rendering", which may negatively impact the website's search results.

Crawling and rendering depend upon three factors:

  1. Crawlability: whether Google can crawl the website in a structured way
  2. Renderability: whether Google can render the website without struggling
  3. Crawl budget: how much time Google spends crawling and rendering the website

Rendering

There are two types of rendering: client-side and server-side.

Server-side rendering is the traditional approach. It is simple and largely free of complications. The browser, or in some cases Googlebot, receives an HTML file that already contains the page structure and the content; all the browser needs to do is download the CSS and display the content as defined. Search engines do not have much of a problem with this approach.

Client-side rendering is different. It has more complications, and search engines like Google and Bing often struggle with it. You may have noticed a blank page during the initial load, with the content suddenly appearing afterwards. This happens because JavaScript downloads the content asynchronously from the server and then displays it on the screen. If your website is client-side rendered, you should always make sure that Google (or any other major search engine) is able to crawl and render its pages properly.
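
To make this concrete, here is a minimal, hypothetical sketch of client-side rendering; the /api/content endpoint and the "app" element ID are made up for the example and are not from any particular framework.

```javascript
// Minimal client-side rendering sketch (endpoint and element ID are illustrative).
// The initial HTML contains only an empty <div id="app"></div>; the visible
// content appears only after this script has fetched and injected it.
document.addEventListener('DOMContentLoaded', function () {
  fetch('/api/content')                       // content arrives asynchronously
    .then(function (response) { return response.json(); })
    .then(function (data) {
      var app = document.getElementById('app');
      app.innerHTML = '<h1>' + data.title + '</h1><p>' + data.body + '</p>';
    })
    .catch(function () {
      // If this request fails or a script error occurs, the page stays blank,
      // which is exactly what a crawler may end up seeing.
    });
});
```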

Remember, JavaScript is very error-sensitive. One simple error and Google won't be able to render the page; HTML and JavaScript are very different when it comes to error handling.

JavaScript Crawling vs. HTML Crawling

As discussed earlier, traditional HTML is very straightforward. Let's see how Google crawls a traditional HTML page:

  1. Googlebot downloads the HTML file.
  2. Googlebot extracts the links from the source and can visit them simultaneously.
  3. It then downloads the CSS files.
  4. All the downloaded files are sent to Google's Caffeine indexer.
  5. Finally, the indexer indexes the page.

This is a fast process without many complications. But it is different for JavaScript-based websites:

  1. Googlebot downloads the HTML file.
  2. Googlebot downloads the CSS and JavaScript files.
  3. Unlike the traditional HTML approach, the downloaded files are not sent straight to the indexer; instead, Googlebot uses the Web Rendering Service, part of the Caffeine indexer, to parse, compile, and execute the JavaScript code.
  4. Only then can the indexer index the content.
  5. At last, Google can discover new links and add them to Googlebot's crawling queue.

You can see there are many differences between the two processes. While the HTML one is simple and straightforward, the JavaScript one is long and complicated. Everything is fine until the CSS and JavaScript files are downloaded, but the next part, parsing, compiling, and executing the JavaScript code, is a very protracted task. Google cannot simply index the content; it has to wait until every step is done. Discovering new links is also slow, since Google usually cannot find them until the page is fully rendered.

Crawl Budget

Now, you need to understand what crawl budget is. Crawl budget is the number of pages on a website that Googlebot crawls and indexes in a given timeframe. Why is crawl budget important? Well, if Google does not index a page, it's not going to rank for anything. In other words, if the number of pages on your website exceeds its crawl budget, some pages will not be indexed.

Google is very good at finding and indexing pages, and the vast majority of websites do not need to worry about crawl budget. But there are a few cases where it needs to be considered:

  1. If the website is big, say 10k+ pages.
  2. If you added a new section to your website, say a section containing 100 pages.
  3. If there are lots of redirects on your website.

If your website falls under any of the above cases, Google may struggle to crawl and index it.

Google’s Technical Limitations

Google uses Chrome 41 to render websites, and this is an old browser, released in March 2015. Yes, a four-year-old browser, at a time when JavaScript has evolved a lot. This is the source of several technical limitations.

Newly added JavaScript features cannot be used by Googlebot. Let's have a look at the major drawbacks:

  1. ES6 is not completely supported by Chrome 41; features such as let are not properly supported.
  2. Interfaces such as WebSQL and IndexedDB are disabled.
  3. Cookies, local storage, session storage, etc. are cleared across page loads.

Chrome 77 is the latest release (at the time of writing), while Google, as mentioned earlier, uses Chrome 41 for rendering. You can see the gap. For more clarity, download Chrome 41 and try to render your website in it; you will see the difference first-hand.

But this does not mean you cannot use the modern features of JavaScript.

JavaScript is growing at a rapid pace, but as mentioned earlier, Google uses a four-year-old browser for rendering. So how can we use modern features? One technique is graceful degradation: modern browsers that support these features get the full experience, while you make sure the website still works properly in older browsers. You can perform feature detection to verify whether the browser supports a given feature; if it does not, you provide an equivalent implementation that it can run. Such a replacement is called a polyfill.
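
As a small, hedged illustration of feature detection plus a polyfill, the snippet below fills in Array.prototype.includes, one example of a feature that an older browser such as Chrome 41 lacks; it is a minimal sketch, not a spec-complete polyfill.

```javascript
// Feature detection with a fallback (polyfill) – minimal illustrative sketch.
// Array.prototype.includes is one example of a feature Chrome 41 does not have.
if (!Array.prototype.includes) {
  // Provide a simple replacement so the rest of the code can rely on it.
  Array.prototype.includes = function (searchElement) {
    return this.indexOf(searchElement) !== -1;
  };
}

// From here on, modern browsers, old browsers, and Googlebot behave the same.
console.log(['seo', 'javascript'].includes('seo')); // true
```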

Googlebot

We have been talking about Google using a browser, or Googlebot, for crawling and rendering. But Googlebot does not actually behave like a real browser. A real browser such as Chrome or Firefox downloads every file (scripts, images, and stylesheets) to render the view, whereas Googlebot only downloads the resources it considers necessary for crawling.

As the internet is so huge, Google optimizes its crawler for performance; obviously, downloading every resource would slow it down. Another reason Googlebot does not fetch every resource is the algorithm it uses: it checks whether a particular resource is required for rendering, and if not, Googlebot ignores it.

So if something is not crawled or rendered, it may be because Googlebot’s algorithm decided it was not necessary, or simply there was a performance issue.

5-Second Rule

Actually, there is no official time limit, but it is often believed that Google will not wait more than 5 seconds for a script to load. It is hard to say for sure, but a few factors come into play:

  1. The importance of the page
  2. The current load on Google's servers
  3. The number of URLs waiting in the rendering queue

If the website is slow, it can hurt you in several ways:

  1. No user wants to visit a slow website; they may get irritated and leave.
  2. A slow website leads to slow rendering.
  3. Crawling is affected badly: if the website is slow, the crawler gets slower too.

It is always better to build a lightweight website and keep server responses fast. Don't make Googlebot's task more complex; as we have seen, crawling and rendering are already difficult enough.

Prerendering and Isomorphic JavaScript

Prerendering: Sometimes search engine crawlers simply cannot render the website. In such situations, you do the rendering on your side and feed an HTML snapshot to the crawler when it visits your website. This HTML snapshot does not contain any JavaScript, while users still receive the regular JavaScript version; the snapshot is served to bots only. You can use external services such as Prerender.io, or tools such as PhantomJS or Chrome Headless on the server side.
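
As an illustration only, the sketch below shows how such a setup is often wired together: a small Express middleware checks the user agent and serves a prerendered snapshot to known bots, while everyone else gets the normal JavaScript app. The getSnapshotFor helper and the bot pattern are assumptions for the example, not the API of Prerender.io or any other service.

```javascript
// A minimal prerendering sketch with a hypothetical Express middleware.
const express = require('express');
const app = express();

const BOT_PATTERN = /googlebot|bingbot|yandex|baiduspider/i;

// Placeholder snapshot lookup – in reality this would return HTML generated
// ahead of time with Chrome Headless, PhantomJS, or a service like Prerender.io.
function getSnapshotFor(path) {
  return '<html><body><h1>Prerendered content for ' + path + '</h1></body></html>';
}

app.use((req, res, next) => {
  if (BOT_PATTERN.test(req.headers['user-agent'] || '')) {
    // Bots receive the JavaScript-free HTML snapshot.
    return res.send(getSnapshotFor(req.path));
  }
  // Regular visitors continue to the normal JavaScript version of the site.
  next();
});

app.use(express.static('public')); // serves the client-side app to real users

app.listen(3000);
```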

Isomorphic JavaScript: This is a popular approach and is generally recommended over prerendering. Here, both search engines and users receive a page containing the full content on the initial load, and the JavaScript then takes over in the browser.
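
For a rough idea of what this looks like in practice, here is a minimal isomorphic sketch using React's renderToString on an Express server; the component, file names, and bundle path are illustrative, and a real setup would share the same component with the client bundle for hydration.

```javascript
// A minimal isomorphic rendering sketch (names and paths are illustrative).
const express = require('express');
const React = require('react');
const { renderToString } = require('react-dom/server');

// Tiny example component; in a real project this is shared with the client bundle.
function App() {
  return React.createElement('h1', null, 'Full content, rendered on the server');
}

const app = express();

app.get('*', (req, res) => {
  // Both users and search engines get the complete markup on the first request.
  const html = renderToString(React.createElement(App));
  res.send('<!doctype html><html><body><div id="root">' + html +
           '</div><script src="/bundle.js"></script></body></html>');
});

app.listen(3000);
```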

Overcoming Common Problems with JavaScript Websites

Make Sure JavaScript and CSS Files Are Not Blocked from Googlebot

As we discussed, Googlebot crawls the JavaScript and renders the content on the screen. Make sure that no internal or external resources required for rendering are blocked from Googlebot.
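
A quick place to check is robots.txt; the snippet below uses placeholder directory names and simply shows the kind of rule that keeps rendering resources crawlable, and the kind that would break rendering.

```
# robots.txt – keep rendering resources crawlable (paths are placeholders)
User-agent: Googlebot
Allow: /assets/js/
Allow: /assets/css/

# A rule like the following would stop Googlebot from rendering the pages:
# Disallow: /assets/js/
```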

Use Google Search Console

If you are not sure whether Google can render your website properly, use the Fetch and Render feature in Google Search Console to check.

Avoid Hashes in URLs

You may have noticed "hashes" (#) in URLs. They are common and they can cause problems, as Googlebot may not crawl such URLs. For example:

xyz.com/#/abc/

xyz.com#URL

Avoid such URLs. The following is an example of a good URL.

xyz.com/abc/xxx
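
If your router supports it, you can get clean paths like the one above by using the History API instead of hash-based routing; the sketch below is generic and not tied to any particular framework or router.

```javascript
// Generic sketch: clean URLs with the History API instead of "#" routes.
// Placeholder render function for the example.
function renderRoute(path) {
  document.getElementById('app').textContent = 'Current route: ' + path;
}

// Navigate to e.g. /abc/ without a full page reload and without a hash.
function navigate(path) {
  window.history.pushState({}, '', path);
  renderRoute(path);
}

// Keep the browser's back/forward buttons working as well.
window.addEventListener('popstate', function () {
  renderRoute(window.location.pathname);
});
```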

Avoid Slow Scripts

As we discussed earlier, for a JavaScript-based website Google has to download, parse, and execute the scripts. This is already time-consuming, and slow scripts make it worse, burning through your crawl budget. Make sure the scripts are fast and Google does not have to wait long to fetch them.

Use Canonical Tags

If you want to use canonical tags, make sure you place them in plain HTML. If you inject them via JavaScript, there is a chance that Google will ignore them.
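
For example, the canonical tag belongs in the HTML the server delivers, inside the head, rather than being injected by a script later (the URL below is just a placeholder):

```html
<!-- Served in the initial HTML <head>, not injected via JavaScript -->
<link rel="canonical" href="https://xyz.com/abc/xxx" />
```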

Conclusion

So there are real issues when it comes to JavaScript-based websites, and developers are only beginning to work hand in hand with SEO. Google still struggles to crawl JavaScript-based websites, and improving this will take time. No two websites are the same, so if you are planning a JavaScript-rich website, make sure the developer has enough knowledge of JavaScript SEO.

Acowebs are developers of WooCommerce plugins that will help you personalize your stores, including feature-rich add-ons such as WooCommerce Product Addons, which are lightweight and fast. Update your store with these add-ons and enjoy a hassle-free experience.

WRITTEN BY
Rithesh Raghavan

Rithesh Raghavan is a seasoned Digital Marketer with more than 17 years in Digital Marketing & IT Sales. He loves to write up his thoughts on the latest trends and developments in the digital world, especially related to WordPress, WooCommerce, and Digital Marketing.