This post is for technical SEOs that want to understand how Googlebot renders a website.
I will help you understand what is rendering and show you in detail how chrome renders a web page as Googlebot uses evergreen Chromium renderer.
If you want to learn more, a lot of this content comes from Martin’s Splitt conference on Web Rendering Services at TechSEOBoost 2019.
You can also look at this presentation by Erik Hendriks, Software Engineer at Google to get some of the most common pitfalls of rendering.
What is Rendering?
According to Martin Splitt at Google, rendering is the process of turning (hyper)text into pixels.
How Does a Browser Display a Website to a User?
For a browser to display a web page to a user, it follows a few steps.
- Sends an HTTP Request to the server
- Parse the documents received by the server
- Create a Layout from the elements of the DOM tree.
- Paint the HTML
The browser makes an HTTP Request to the server and the server sends data back to the browser.
In the case of a web page, the data sent back to the browser is HTML.
Then, the browser parses the HTML file.
and show the page to the user.
Parsing of the HTML
When a browser renders an HTML file to the user, takes each element one by one and parse it into DOM tree.
What is Parsing
Parsing means to analyze a sentence or a document into its parts. Parsing of an HTML means to analyze one by one each element of the HTML to understand their syntactic roles.
Simply put, parsing the HTML is to understand the structure of the HTML file.
How Chrome Parse the HTML File?
When Chrome Parse the HTML, it sends events to load each element into a DOM tree.
The Document Object Model, or DOM, represents the structure of the content.
As you can see in the picture above, Chrome uses events to parse the HTML. It sends an event when the DOM starts to load, another one to load the DOM, and another one when it ends.
In the performance report, the Onload event is shown by the red line and the DOMContentLoaded Event is show by the blue line.
Note: Make sure that you load all critical content before the DCL event.
Creating the HTML Layout
After the Parsing of the HTML, Chrome knows the structure of your page (what is the H1, H2, where are your p elements, etc.)
Now, Chrome will need to get the layout of the page. It maps each element to a specific position in a browser.
During the layout-ing, the browser is busy with the task and can’t do anything else. This is where the user has no choice but to wait.
These are the red bars that you can see in the graph.
If you continuously do layouting by moving things around in the page inefficiently, that’s a bad idea.Martin Splitt (Google) – TechSEO Boost 2019
Painting of the HTML
Now, Chrome knows where each box goes and what content goes in each box.
Chrome will then paint the HTML tree with pixels. This is where Chrome adds the content to your page. It writes text, loads your images, adds colors, etc.
This is the step where (in Lighthouse) you start to see stuff like “First Paint”, “First Contentful Paint”, “First Meaningful Paint” and “Largest Contentful Paint” (shown by the acronyms in the image below).
The thing is, painting requires Graphical Processing Units (GPUs), which is expensive. Also, Googlebot doesn’t really need to show pixels of the page to understand our content.
If the changes are too big to fit in the initial layout, it can force Googlebot to re-layout, which as we have seen is bad for the user.
ctrl+U) versus what you can find in the DOM (
Martin Splitt explains that the indexing process is made of many microservices that talk to each other and provide this screenshot.
- The dup eliminator processes canonicalization of pages;
- The robots parser analyses the robots.txt file;
- The crawler crawls the web by following links;
- The web rendering service
- The link parser is used by the crawler to list and “click” on all links found on the page
- The Content Parser serves to interpret the syntax of the content.
Web Rendering Service (WRS)
An amazing part of the Googler’s presentation is when he talked about the Web Rendering Service.
In this part, he talks about some cryptic developer stuff like Puppeteer, Service Wrapper and Shadow DOM… oooooouuuuuuhh!!!
I will not go into all the details of it.
Instead, just note these interesting points about WRS:
- The indexing pipeline calls WRS, which renders;
- WRS skips the pixels;
- WRS makes connection requests to read HTML, CSS and JS files using single connections. Each connection is what we call the “crawl budget”;
- WRS relies heavily on the cache and has certain rules to timeout if a page takes too much time to load.
- WRS is used by all testing tools (mobile-friendly test, page speed, etc.). In exception, it will have an aggressive timeout and will not cache the page.
- WRS is stateless (cookies are overridden after each page load).
Other interesting features of Web Rendering Service are explained by Jamie Alberico.
- WRS is the name that represents the collective elements used in Google’s rendering service.
- WRS reads HTTP response and header;
- Googlebot’s first priority is not to degrade the user experience so it uses a “crawl rate limit”.
The end result would look something like this.
How Can You Improve Rendering for SEO?
A lot of stuff can help your crawl budget and the rendering of the page. Remember this, what is good for the user, Googlebot will like it.
Here are a few advises that Martin Splitt gave:
Tip #2: Use Tree shaking to remove unused code
Tip #3: Use simple and clean caching by adding content caches hashes. You can do this by using Webpack.
Tip #4: Bundle your code, but split it into reasonable bundles.
This is it!
Other Useful SEO Content on Rendering
A very special thanks to Martin Splitt for sharing this amount of useful information on what is rendering and how Google renders a website.