Google Caffeine - SEO implications of Google's next-generation search engine
Google is testing a new version of their search engine - nicknamed Google Caffeine.
On their Webmaster Central blog, they say:
The new infrastructure sits "under the hood" of Google's search engine, which means that most users won't notice a difference in search results. But web developers and power searchers might notice a few differences, so we're opening up a web developer preview to collect feedback.
They are inviting webmasters to offer feedback.
Where to find the test search engine
To test the new search engine, go to Google Caffeine [edit: Caffeine has now gone live, so the test version has disappeared] and type in queries as normal.
To spot the differences it's helpful to open a second tab with the existing search engine and perform the same queries.
The differences in the results mean that webmasters will need to optimize for Google Caffeine.
What are the differences between Google Caffeine and Google?
[Note: some of this has changed. I have kept the original findings and put an update below, so it's obvious just what has changed in the last month.]
1. Google Caffeine is way faster to load.
2. The new index is much bigger (to check this, type in your query and see the number returned on the top right of the screen and compare to the old search engine).
However, on certain subjects, the new engine has a smaller index. For example, at the time I tested, "make money online" had 98.7 million results in the new engine but 169 million results in the old engine. "Weight loss" had 68.4 million results in the new engine but 101 million results in the old engine. "Credit cards" had 86.7 million results in the new engine but 131 million results in the old engine. This tells me that the new engine has deindexed a lot of the sites that scammers have put up - the fake copy/paste jobbies designed to lure some desperate, unsuspecting visitor into parting with money.
What about certain domains?
Well, Hubpages comes out better - putting the operator site:hubpages.com into the new engine returns 1,830,000 results, compared to 1,800,000 results in the old engine. Squidoo also gets more pages indexed: 2,200,000 results compared to 2,180,000 in the old engine, though notably they didn't gain as many pages as Hubpages. Ezine articles does significantly better - 4,110,000 in the new engine compared to 4,000,000 in the old one - a 2.75% improvement. Infobarrel, the new article site that everyone is getting excited about, loses pages: 12,000 in the new engine compared to 12,400 in the old. EHow also loses: 4,290,000 in the new engine compared to 4,330,000 in the old engine.
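As a rough sketch, the percentage changes behind these site: comparisons can be worked out in a few lines of Python. The function name `percent_change` and the dictionary labels are illustrative assumptions; the page counts are the ones quoted above.

```python
# Page counts quoted above, as (old engine, new Caffeine engine).
# The labels are illustrative; the figures are the ones reported in the text.
SITE_COUNTS = {
    "Hubpages": (1_800_000, 1_830_000),
    "Squidoo": (2_180_000, 2_200_000),
    "Ezine articles": (4_000_000, 4_110_000),
    "Infobarrel": (12_400, 12_000),
    "EHow": (4_330_000, 4_290_000),
}


def percent_change(old: int, new: int) -> float:
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


if __name__ == "__main__":
    for site, (old, new) in SITE_COUNTS.items():
        # A leading "+" marks sites that gained pages in the new engine.
        print(f"{site}: {percent_change(old, new):+.2f}%")
```

On these figures, Ezine articles comes out at +2.75%, while Infobarrel and EHow come out negative.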
3. The algorithm is slightly different and thus the order of the results is different. From my initial tests, they seem to give more weight to having the entire keyword string in the URL. In my test, I saw pages in the old engine where the URL had numbers after the domain name disappear in the new engine to be replaced with pages that had the keyword string or at least part of it in the URL.
4. Pages with the keyword in the title, in the snippet of text and in the URL seem to do best.
5. Twitter pages seem to be showing up higher in the new engine, as are Facebook pages.
6. They seem to be focusing on real time: for breaking news subjects, pages are being popped into the results even before they've been fully indexed (i.e. you will see them there without a cached copy, indicating that it's the first time the bot found them).
7. More weight seems to be given to on-page SEO. For instance I've spotted pages with keywords bolded on the page in the new engine which weren't anywhere to be found in the first three pages of the old engine.
I will be testing a ton more over the next few days to find out the search engine optimization implications of Google Caffeine, and will update this page. If people have any observations, please leave your thoughts in the comments.
Update 12th Sept 2009 & 12 Oct 2009
It's been interesting to revisit point number 2 above.
There has been much churning of pages in both the old and the new Caffeine index. Google seems to be making its PageRank changes in the existing index first, and then amending the new Caffeine index. And the way they seem to make the changes is to remove pages from the index, and then gradually add back pages that fit their new ranking criteria.
Rather than subject you to a lot of text about which changes have occurred, I've summarized them in the tables below. They provide a fascinating snapshot of how Google changes the composition of its index on an almost continual basis.
Keyword: Make money online
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 169,000,000 | 98,700,000
12 Oct 2009 | 174,000,000 | 146,000,000
Keyword: Weight loss
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 101,000,000 | 68,400,000
12 Sept 2009 | 101,000,000 | 66,400,000
12 Oct 2009 | 101,000,000 | 89,000,000
Keyword: Credit Cards
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 131,000,000 | 86,700,000
12 Sept 2009 | 129,000,000 | 80,100,000
12 Oct 2009 | 42,400,000 | 107,000,000
Site: Hubpages.com
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 1,800,000 | 1,830,000
12 Sept 2009 | 1,100,000 | 1,140,000
12 Oct 2009 | 1,080,000 | 1,110,000
Site: Squidoo.com
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 2,180,000 | 2,200,000
12 Sept 2009 | 2,200,000 | 2,220,000
12 Oct 2009 | 4,780,000 | 2,320,000
Site: Ezine articles
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 4,000,000 | 4,110,000
12 Sept 2009 | 3,750,000 | 3,740,000
12 Oct 2009 | 3,690,000 | 3,640,000
Site: EHow.com
Date | Old (existing) index | New Caffeine index
---|---|---
12 Aug 2009 | 4,330,000 | 4,290,000
12 Sept 2009 | 4,170,000 | 4,170,000
12 Oct 2009 | 4,310,000 | 4,300,000
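One way to read the keyword tables above is to express the Caffeine count as a fraction of the old index's count for each date. The Python below is only a sketch of that comparison: the name `caffeine_share` and the data layout are assumptions made for illustration, and the numbers are copied from the three keyword tables.

```python
# Result counts from the keyword tables, as (date, old index, Caffeine index).
KEYWORD_COUNTS = {
    "make money online": [
        ("12 Aug 2009", 169_000_000, 98_700_000),
        ("12 Oct 2009", 174_000_000, 146_000_000),
    ],
    "weight loss": [
        ("12 Aug 2009", 101_000_000, 68_400_000),
        ("12 Sept 2009", 101_000_000, 66_400_000),
        ("12 Oct 2009", 101_000_000, 89_000_000),
    ],
    "credit cards": [
        ("12 Aug 2009", 131_000_000, 86_700_000),
        ("12 Sept 2009", 129_000_000, 80_100_000),
        ("12 Oct 2009", 42_400_000, 107_000_000),
    ],
}


def caffeine_share(old: int, caffeine: int) -> float:
    """Return the Caffeine result count as a fraction of the old index's count."""
    return caffeine / old


if __name__ == "__main__":
    for keyword, rows in KEYWORD_COUNTS.items():
        for date, old, caffeine in rows:
            print(f"{keyword:18} {date:13} {caffeine_share(old, caffeine):.2f}")
```

A value below 1 means Caffeine reported fewer results than the old index for that query. The 12 Oct 2009 "credit cards" row is the only one in these tables where the ratio goes above 1.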
Why does the composition of the index matter so much?
It's clear from everything Google has said (and the results above) that this update to their search engine is more about the composition of the index than about algorithm changes (which appear to be slight).
However, changes in the composition of the index can have a bigger impact than algorithm changes. For instance, if they suddenly find 200 pages that have links to you, you will rise in the rankings. If they suddenly deindex 200 pages linking to your site, you lose the value of those links. This is particularly important if you are using Ezine articles to get backlinks.
I'm surprised that there is not more discussion about the impact of Caffeine to be honest. Apart from the initial flurry, when everyone talked about how much new social media was included, everyone has gone quiet. In particular people seem to have missed the massive deindexing that is taking place in some topics.
Update Feb 2010: It appears Google Caffeine has started to be rolled out in most data centres.
Update 2012: It seems that the Caffeine infrastructure has helped Google roll out the Panda and Penguin updates, as it has provided them with the computing power needed for their new machine learning capability.