
Google Caffeine - SEO implications of Google's next-generation search engine

Updated on January 17, 2013

Google is testing a new version of their search engine - nicknamed Google Caffeine.

On their Webmaster Central blog, they say that

The new infrastructure sits "under the hood" of Google's search engine, which means that most users won't notice a difference in search results. But web developers and power searchers might notice a few differences, so we're opening up a web developer preview to collect feedback.

and they are inviting webmasters to offer feedback.

Where to find the test search engine

To test out the new search engine go to Google Caffeine [edit: Caffeine has now gone live, so the test version has disappeared], and type in queries as normal.

To spot the differences it's helpful to open a second tab with the existing search engine and perform the same queries.

The differences in the results mean that webmasters will need to optimize for Google Caffeine.
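
If you'd rather not eyeball two tabs, the comparison can be roughed out in a few lines of Python. This is only a minimal sketch: it assumes you have already copied the result URLs for the same query from each engine into two lists by hand. The function name compare_rankings and the example URLs are made up for illustration and are not real search results.

```python
def compare_rankings(old_results, new_results):
    """Compare two ranked lists of URLs returned for the same query.

    Returns the URLs that dropped out, the URLs that are new, and the
    position change for URLs that appear in both lists.
    """
    old_pos = {url: i + 1 for i, url in enumerate(old_results)}
    new_pos = {url: i + 1 for i, url in enumerate(new_results)}

    dropped = [url for url in old_results if url not in new_pos]
    added = [url for url in new_results if url not in old_pos]
    # A positive number means the page moved up in the new engine.
    moved = {
        url: old_pos[url] - new_pos[url]
        for url in old_results
        if url in new_pos and old_pos[url] != new_pos[url]
    }
    return {"dropped": dropped, "added": added, "moved": moved}


# Hypothetical example data (not real search results).
old = ["example.com/a", "example.com/123", "example.com/b"]
new = ["example.com/b", "example.com/a", "example.com/keyword-page"]

report = compare_rankings(old, new)
print(report)
```

Running the same comparison over a handful of queries makes it easier to see whether a change is a one-off or a pattern.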

What are the differences between Google Caffeine and Google?

[Note: some of this has changed. I have kept the original findings and put an update below, so it's obvious just what has changed in the last month.]

1. Google Caffeine is way faster to load.

2. The new index is much bigger (to check this, type in your query and see the number returned on the top right of the screen and compare to the old search engine).

However, on certain subjects, the new engine has a smaller index. For example, at the time I tested, "make money online" had 98.7 million results in the new engine but 169 million results in the old engine. "Weight loss" had 68.4 million results in the new engine but 101 million results in the old engine. "Credit cards" had 86.7 million results in the new engine but 131 million results in the old engine. This tells me that the new engine has deindexed a lot of the sites that scammers have put up - the fake copy/paste jobbies designed to lure some desperate, unsuspecting visitor into parting with money.

What about certain domains?

Well, Hubpages comes out better - putting the operator site:hubpages.com into the new engine returns 1,830,000 results, compared to 1,800,000 results in the old engine. Squidoo also gets more pages indexed: 2,200,000 results compared to 2,180,000 in the old engine, though notably they didn't gain as many pages as Hubpages. Ezine articles does significantly better - 4,110,000 in the new engine compared to 4,000,000 in the old one - a 2.75% improvement. Infobarrel, the new article site that everyone is getting excited about, loses pages: 12,000 in the new engine compared to 12,400 in the old. EHow also loses: 4,290,000 in the new engine compared to 4,330,000 in the old engine.
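
To put the domain numbers on the same footing, here is a small sketch that works out the percentage change for each site from the counts quoted above. The figures are the ones I recorded; the helper name percent_change is just my own label. Bear in mind that the counts Google shows for site: queries are estimates, so treat the output as a rough guide.

```python
# (old engine, new engine) result counts for each site: query quoted above.
site_counts = {
    "hubpages.com": (1_800_000, 1_830_000),
    "squidoo.com": (2_180_000, 2_200_000),
    "Ezine articles": (4_000_000, 4_110_000),
    "Infobarrel": (12_400, 12_000),
    "eHow": (4_330_000, 4_290_000),
}


def percent_change(old, new):
    """Change from old to new, as a percentage of the old count."""
    return (new - old) / old * 100


changes = {
    site: round(percent_change(old, new), 2)
    for site, (old, new) in site_counts.items()
}

for site, pct in changes.items():
    print(f"{site}: {pct:+.2f}%")
```

The Ezine articles line comes out at the same 2.75% given above, and the sign of each figure shows at a glance which sites gained and which lost pages.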

3. The algorithm is slightly different and thus the order of the results is different. From my initial tests, they seem to give more weight to having the entire keyword string in the URL. In my test, I saw pages in the old engine where the URL had numbers after the domain name disappear in the new engine to be replaced with pages that had the keyword string or at least part of it in the URL.

4. Pages with the keyword in the title, in the snippet of text and in the URL seem to do best.

5. Twitter pages seem to be showing up higher in the new engine, as are Facebook pages.

6. They seem to be focusing on real time, so that for breaking news subjects pages are being popped into the results even before they've been fully indexed (i.e. you will see them there without a cache, indicating that it's the first time the bot found them).

7. More weight seems to be given to on-page SEO. For instance, I've spotted pages with keywords bolded on the page in the new engine which weren't anywhere to be found in the first three pages of the old engine.
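
Points 3 and 4 are easy to check for your own pages. The sketch below reports whether a keyword phrase appears in a result's URL, title and snippet. It is a rough check under my own assumptions: the function name keyword_placement is made up, hyphens and other punctuation are treated as word separators, and it says nothing about how Google actually weighs these signals.

```python
import re


def keyword_placement(keyword, url, title, snippet):
    """Report where a keyword phrase appears for one search result."""

    def normalise(text):
        # Lower-case and treat hyphens, slashes, dots etc. as word breaks.
        return " ".join(re.findall(r"[a-z0-9]+", text.lower()))

    phrase = f" {normalise(keyword)} "

    def contains(text):
        # Pad with spaces so only whole words match.
        return phrase in f" {normalise(text)} "

    return {
        "url": contains(url),
        "title": contains(title),
        "snippet": contains(snippet),
    }


# Hypothetical result, for illustration only.
result = keyword_placement(
    "weight loss",
    "http://example.com/weight-loss-tips",
    "Weight Loss Tips That Work",
    "Ten ideas for losing weight.",
)
print(result)
```

In this made-up example the phrase is found in the URL and the title but not in the snippet, which only contains the words separately.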

I will be testing a ton more over the next few days to find out the search engine optimization implications of Google Caffeine, and will update this page. If people have any observations, please leave your thoughts in the comments.

Update 12th Sept 2009 & 12 Oct 2009

It's been interesting to revisit point number 2 above.

There has been much churning of pages in both the old and the new Caffeine index. Google seems to be making its PageRank changes in the existing index first, and then amending the new Caffeine index. And the way they seem to make the changes is to remove pages from the index, and then gradually add pages back that fit their new ranking criteria.

Rather than subject you to a lot of text to read about which changes have occurred, I've summarized them in the tables below. They provide a fascinating snapshot of how Google changes the composition of its index on an almost continual basis.

Keyword: Make money online

Date            Old (existing) index    New Caffeine index
12 Aug 2009     169,000,000             98,700,000
12 Oct 2009     174,000,000             146,000,000

Keyword: Weight loss

Date            Old (existing) index    New Caffeine index
12 Aug 2009     101,000,000             68,400,000
12 Sept 2009    101,000,000             66,400,000
12 Oct 2009     101,000,000             89,000,000

Keyword: Credit Cards

Date            Old (existing) index    New Caffeine index
12 Aug 2009     131,000,000             86,700,000
12 Sept 2009    129,000,000             80,100,000
12 Oct 2009     42,400,000              107,000,000

Site: Hubpages.com

Date            Old (existing) index    New Caffeine index
12 Aug 2009     1,800,000               1,830,000
12 Sept 2009    1,100,000               1,140,000
12 Oct 2009     1,080,000               1,110,000

Site: Squidoo.com

Date            Old (existing) index    New Caffeine index
12 Aug 2009     2,180,000               2,200,000
12 Sept 2009    2,200,000               2,220,000
12 Oct 2009     4,780,000               2,320,000

Site: Ezine articles

Date            Old (existing) index    New Caffeine index
12 Aug 2009     4,000,000               4,110,000
12 Sept 2009    3,750,000               3,740,000
12 Oct 2009     3,690,000               3,640,000

Site: EHow.com

Date            Old (existing) index    New Caffeine index
12 Aug 2009     4,330,000               4,290,000
12 Sept 2009    4,170,000               4,170,000
12 Oct 2009     4,310,000               4,300,000
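
If you keep your own snapshots like these, the month-to-month movement is easier to see as differences than as raw totals. Here is a minimal sketch using the "Weight loss" figures recorded in this update. The function name deltas is my own, and the same approach should work for any of the other keywords or sites.

```python
# (date, old index count, Caffeine index count) for the keyword "Weight loss",
# copied from the figures recorded in this update.
snapshots = [
    ("12 Aug 2009", 101_000_000, 68_400_000),
    ("12 Sept 2009", 101_000_000, 66_400_000),
    ("12 Oct 2009", 101_000_000, 89_000_000),
]


def deltas(rows):
    """Change in each index between consecutive snapshots."""
    out = []
    for (_, old_a, caf_a), (date, old_b, caf_b) in zip(rows, rows[1:]):
        out.append((date, old_b - old_a, caf_b - caf_a))
    return out


for date, old_change, caffeine_change in deltas(snapshots):
    print(f"{date}: old index {old_change:+,}, Caffeine index {caffeine_change:+,}")
```

For this keyword the old index stays flat, while the Caffeine index first loses 2 million results and then gains 22.6 million, which fits the remove-then-add-back pattern described above.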

Why does the composition of the index matter so much?

It's clear from everything Google has said (and the results above) that this update to their search engine is more about the composition of the index than about algorithm changes (which appear to be slight).

However, changes in the composition of the index can have a bigger impact than algorithm changes. For instance, if they suddenly find 200 pages that have links to you, you will rise in the rankings. If they suddenly deindex 200 pages linking to your site, you lose the value of those links. This is particularly important if you are using Ezine articles to get backlinks.

I'm surprised that there is not more discussion about the impact of Caffeine to be honest. Apart from the initial flurry, when everyone talked about how much new social media was included, everyone has gone quiet. In particular people seem to have missed the massive deindexing that is taking place in some topics.

Update Feb 2010: It appears Google Caffeine has started to be rolled out in most data centres.

Update 2012: It seems that the Caffeine infrastructure has helped Google roll out the Panda and Penguin updates, as it has provided them with the computing power needed for their new machine learning capability.
