I’m going to oversimplify massively here and explain how Google works first. Anyone with experience, please don’t shout at me for keeping it this simple, but we need to start from the beginning:
Googlebot is a robot that goes from page to page on the internet. Its job is to look at each page, find the content (text, images, video, etc.), catalogue it and then move on to the next page in its list.
When you come to a page, you see something like this:
Traditionally, when Googlebot came to the site it wouldn’t see this, because it wouldn’t render the page. Instead, the page would appear to Googlebot much as it does when you view the source code:
If you view the source of a page you’ll see something similar to this. Effectively, it tells your browser what the page should look like: it contains the words and the links, says where the images are, and points to any other files the browser needs.
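To make that concrete, here is a minimal sketch of the kind of source Googlebot would see; the title, file names and URLs are made up purely for illustration:

```html
<!DOCTYPE html>
<html>
<head>
  <title>Example page title</title>
  <!-- Points the browser at a separate stylesheet file -->
  <link rel="stylesheet" href="/css/main.css">
</head>
<body>
  <h1>The words on the page</h1>
  <p>Plain text that Googlebot can read directly,
     plus <a href="/another-page">a link to another page</a>.</p>
  <!-- The image itself lives in a separate file; only its location appears here -->
  <img src="/images/example.jpg" alt="A description of the image">
  <!-- Scripts are also pulled in from other files -->
  <script src="/js/app.js"></script>
</body>
</html>
```

Notice that the image and the stylesheet aren’t in the source at all; the page just points at them, which is exactly why a crawler that doesn’t render the page sees only the text and the pointers.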
- Cascading Style Sheets (CSS) are usually one main file, but sometimes several, that tell your browser how to organise the words and images on the page: which font to use, what size and colour it should be, what colour the background should be, and so on. If I want everything I mark up as Title 3 to appear the same way on every page, I can dictate that from the stylesheet by including a rule that says everything tagged as Title 3 should look a certain way, as in the sketch below.
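As a rough illustration, a rule like this in that main stylesheet (the values here are made up) would make every third-level heading — the “Title 3” / h3 tag — look the same wherever it appears:

```css
/* Every element tagged as a third-level heading (h3) gets the same look */
h3 {
  font-family: Georgia, serif;  /* which font to use */
  font-size: 20px;              /* which size font */
  color: #333333;               /* what colour font */
  background-color: #f5f5f5;    /* what colour background */
}
```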
Having collected all that information about each page, Google then runs its magical ‘algorithm’.
The magical algorithm is a set of hundreds of different factors that Google runs through each time you search, to work out which pages are the most valuable to return as results. These factors include things like the words appearing on the page, the words appearing in links pointing to the page, how quickly the page loads, the words appearing in the page’s title tag, and so on. If I knew what they all were and how each was weighted, I’d be making a fortune selling my services.
Fortunately, a lot of it is known, because clever people have done a load of testing to see which things make a difference. Google gets around this reverse engineering by changing its algorithm frequently (Panda, Penguin and Hummingbird): adding new factors, removing old ones that no longer show a page is valuable, or simply changing the weighting of certain factors. What hadn’t really changed before now was the way it crawled the page.
Of course, it isn’t quite that simple. Google isn’t only crawling the new way; it is doing it both the new way and the old way:
What we don’t know is what’s in that magical algorithm and how much weight Google puts on the stuff it now finds. I suspect that, this early in the process, Google doesn’t know what it is going to do with it either, so the old advice should still stand:
Create unique, interesting and relevant content, written in the way your audience searches for it, and it won’t matter how you present it to the search engine.