An Introduction To Search Engines

Updated on July 17, 2010

In the last unit, we explained why search engine visibility is important. In this unit we will take a closer look at search engines. Because SEO is about improving the visibility of your web pages in search engine results, we have to understand a bit about how search engines work. By the end of this unit you should be able to:

  • understand what search engines do
  • understand which search engines to concentrate on when optimising your site
  • understand how search engines rank results
  • measure the PageRank of individual web pages
  • understand how to perform advanced searches

This unit assumes that you have read and understood the last part of the course and that you are comfortable with the terms: keyword, keyphrase and search engine optimisation.

2.1 What is a search engine?

Wikipedia defines a search engine as: ‘a program designed to help find information stored on a computer system such as the World Wide Web, or a personal computer. The search engine allows one to ask for content meeting specific criteria (typically those containing a given word or phrase) and retrieving a list of references that match those criteria. Search engines use regularly updated indexes to operate quickly and efficiently.’

In other words, a search engine is a sophisticated piece of software, accessed through a page on a website, that allows you to search the web by entering search queries into a search box. The search engine then attempts to match your search query with the content of web pages that it has stored, or cached, and indexed on its powerful servers in advance of your search.

Note: many search engines allow you to search for things other than text: for example, images. However, for the purpose of this course, we will focus on text-based searches. As we pointed out in the last unit, SEO methods are largely (but not exclusively) centred upon text as they involve matching key parts of the text in your web pages with the keywords or keyphrases that people actually type into search engines when looking for something on the internet.

There are two main types of search indexes we access when searching the web:

  • directories
  • crawler-based search engines

Directories

Unlike search engines, which use special software to locate and index sites, directories are compiled and maintained by humans. Directories often consist of a categorised list of links to other sites to which you can add your own site. Editors sometimes review your site to see if it is fit for inclusion in the directory.

Crawler-based search engines

Crawler-based search engines differ from directories in that they are not compiled and maintained by humans. Instead, crawler-based search engines use sophisticated pieces of software called spiders or robots to search and index web pages.

These spiders are constantly at work, crawling around the web, locating pages, and taking snapshots of those pages to be cached or stored on the search engine’s servers. They are so sophisticated that they can follow links from one page to another and from one site to another.
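The way a spider follows links and caches what it finds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PAGES` dictionary is a hypothetical in-memory stand-in for the web (a real spider would fetch pages over HTTP), and the names `LinkExtractor` and `crawl` are invented for this example rather than taken from any search engine.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical miniature "web": URL -> HTML (assumption for illustration only).
PAGES = {
    "http://a.example/": '<a href="http://a.example/about">About</a> '
                         '<a href="http://b.example/">B</a>',
    "http://a.example/about": '<a href="http://a.example/">Home</a>',
    "http://b.example/": '<a href="http://a.example/about">About A</a>',
    "http://c.example/": "<p>Nothing links to this page.</p>",
}


def crawl(start_url, pages):
    """Breadth-first crawl: visit a page, cache it, then queue its links."""
    cache = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in cache or url not in pages:
            continue  # already cached, or not reachable in our toy web
        html = pages[url]
        cache[url] = html  # the snapshot kept on the search engine's servers
        extractor = LinkExtractor()
        extractor.feed(html)
        queue.extend(extractor.links)
    return cache


cached = crawl("http://a.example/", PAGES)
print(sorted(cached))
```

Note that `http://c.example/` never makes it into the cache: no other page links to it, so the spider has no way of discovering it.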

Google is a prominent example of a crawler-based search engine.

Note: Some search systems are ‘hybrid’ systems as they combine both forms of index. Yahoo, for example, features both directories and search engines.

As we will see later in this course, the SEO process often involves optimising your site in such a way that it allows search engine spiders to locate every page on your site quickly and easily.

Spidering vs submitting your site manually

If you browse the web, you will notice that many companies will offer to submit your site to search engines for inclusion in their listings. The services these companies offer are largely unnecessary and can prove to be a waste of time and money.

It is important to remember that search engine spiders are constantly crawling the web, following links and indexing pages. Because spiders automatically index your pages when they find them, there is absolutely no need to submit your site manually to the major search engines.

Note, however, that the process of being found can take some time, and it can be weeks before the major search engines index your site. SEO is a cost-effective way of making your site visible, but it can take time, especially for new sites. There are, however, ways to accelerate the indexing process, including XML sitemaps and RSS. Both of these topics will be covered in the next tutorial.

2.2 Which search engines to target?

In the last unit, we suggested that the vast majority of Internet users use search engines to locate products or services. This free system of listings is a more popular method of locating sites than paid-for advertising such as PPC and is thus a better way of improving the visibility of your website. But which search engines do you want to be found by and which search engines should you target?

Although the majority of Internet users rely on search engines to find what they are looking for, they do not all use the same search engines. There are, in fact, numerous search engines out there, all vying for a share in the lucrative search engine market. Here are just a few of the search engines that we use when looking for something on the Internet:

As you can see, then, there are numerous companies we can turn to when searching the Internet. Note, however, that not all of these search engines use truly distinct search technology. AOL, for example, bases part of its search results on Google. Teoma uses Ask Jeeves technology. Dogpile is a meta-crawler, which means that it searches all the major search engines for you and compiles results from places like Google, Yahoo, and Ask Jeeves.

This may seem like a bewildering array of search options and a formidable number of search engines to optimise your site for. However, we only have to concentrate on the largest players in the search engine market, because they have the most people using their search technology and because they also act as search providers, leasing out their search technology to other search engines.

Let’s look at who the leading players are in the search engine market. The following chart, compiled from data provided by Hitwise, shows the search engine market share for December 2007, November 2007 and December 2006.

As we can see, Google, Yahoo, and MSN are the big players in the search engine market, accounting for just over 90% of the total market. This means that more people use their search technology to search for products or services on the web than any other search engine. For this reason, these are the search engines you should primarily focus on when optimising your site. Consequently, these are the search engines we will focus on throughout this course:

  1. Google – www.google.com
  2. Yahoo – www.yahoo.com
  3. MSN – search.msn.com

There are some important things to note about these search engines.

  1. each uses a different system to rank pages
  2. because different systems are used, a high ranking for specific keywords in one search engine does not automatically mean that your page will rank highly for the same keywords in another search engine
  3. nevertheless, each uses similar principles to determine the relevancy and importance of web pages in relation to search queries

2.3 Anatomy of a search

In the last unit of the course we began to show you how search engines work. For the sake of simplicity, we can consider the search process to work something like the following:

  1. Search engine spiders the web
  2. Search engine caches the pages that its spiders find, storing them on its servers
  3. User enters a search query
  4. Search engine checks the search query against its index
  5. Search engine returns what it believes to be the most relevant results for that query

Although the process is actually more complex than this, the steps above are useful in helping us to visualise how searches work and, more importantly, in reminding us that when we enter a search term, the search engine does not actually rush off and check every page on the web. This would take far too long. Instead it checks your search term against an index that is stored on its servers. Spiders working their way around the web constantly update this index.

Note: because pages are indexed in advance of searches, the results returned might be out of date. When you click on the link for one of the results, for example, you may find that the page has been updated since the search engine last spidered it, or even that the page you want has moved.

If I carry out a search for cheap web-hosting, the search engine checks its index to see which pages carry the terms ‘cheap’, ‘web’ and ‘hosting’. It then returns a results page containing what it believes are the most relevant pages for these particular keywords.
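To make the idea of checking a query against a prebuilt index concrete, here is a minimal sketch of an inverted index in Python. It assumes a hypothetical set of cached pages (`CACHED_PAGES`) and deliberately ignores ranking; the function names are invented for this illustration and real search engines are far more elaborate.

```python
import re
from collections import defaultdict

# Hypothetical cached pages: URL -> page text (assumption for illustration).
CACHED_PAGES = {
    "http://host-one.example/": "Cheap web hosting for small business sites",
    "http://host-two.example/": "Reliable web hosting with free support",
    "http://recipes.example/": "Cheap and easy dinner recipes",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# The index is built in advance, before any user types a query.
INDEX = build_index(CACHED_PAGES)
print(search(INDEX, "cheap web-hosting"))
```

The query is split into ‘cheap’, ‘web’ and ‘hosting’, and only the page that carries all three terms is returned. No page is fetched at search time; the lookup touches only the index.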

Let’s look at a typical search result page. This page shows the results for the above search in Google (Illustration 1). The results page is set out as follows:

  1. Search box with our search query
  2. The number of results Google returned for our search query (plus the time the search took)
  3. Sponsored links. This is paid-for advertising. For this results page, Google has selected adverts that are relevant to our search query.
  4. Search results. This section shows the pages that Google thinks are most relevant to our particular search terms. These listings are free.
  5. Link/Page title. The text is the exact text that appears between the title tags (<title></title>) on the page that the search result links to. Notice how keywords from our search query have been highlighted.
  6. Page description. This text is commonly the actual text that appears in the meta description of the page that the search result links to. This is the text between the quotation marks in the HTML tag <META NAME="description" content="YOUR TEXT HERE">. Again, Google has matched this text with our search query.
  7. Domain. This is the address of the page linked to.
  8. Cached page link. Unlike the above link, which links to the domain that the page is on, this link takes us to the cached version of the page that Google has stored on its server.
  9. More results. Links to further pages of results
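The title and description items above both come straight from a page's HTML. As a rough sketch, the following Python uses the standard library's `html.parser` to pull out the two pieces of text a results page might display. The sample HTML and the class name `SnippetParser` are hypothetical, and this is not a claim about how Google itself extracts them.

```python
from html.parser import HTMLParser


class SnippetParser(HTMLParser):
    """Extracts the <title> text and the meta description from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page source (assumption for illustration only).
SAMPLE_HTML = """
<html><head>
<title>Cheap Web Hosting Plans</title>
<META NAME="description" content="Compare cheap web hosting plans.">
</head><body><p>Welcome.</p></body></html>
"""

parser = SnippetParser()
parser.feed(SAMPLE_HTML)
print(parser.title)
print(parser.description)
```

Running the same check against your own pages is a quick way to see what text a search engine has available for the link title and description of your listing.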

We will now look at some of the ways in which search engines rank pages when determining search results.

2.4 Ranking

Algorithms

Search providers use complex mathematical equations called algorithms to rank web pages. These algorithms make calculations about the relevance of words on web pages in relation to search queries or the perceived importance and link popularity of websites. They may also take other factors into account when ranking results, such as the age of the domain your site is on, or whether the terms used in a search query appear in the URLs of sites in the search engine’s index.

You may be surprised to learn that SEO professionals are not entirely sure how these algorithms work. In fact, search algorithms are a closely guarded trade secret. If they were made available to the public, we would see a lot more websites trying to find ways to exploit them in order to gain better search engine rankings.

Algorithms tend to be patented, and these patents can sometimes give SEO professionals a clue as to how search engines rank the relevance and importance of web pages. Otherwise, SEO involves a fair degree of trial and error, and most of the SEO process falls back upon tried and tested methods that circulate amongst the SEO community and that have been shown to be effective in improving search engine visibility (SEO websites and forums can be a good place to visit to see SEO professionals discussing these methods and exchanging ideas).

2.4.1 Page Importance

There are two main factors that search engines use to determine the position that pages will gain in search results:

  • Keyword relevancy
  • Page importance or link popularity

As we noted above, when you carry out a search query, the search engine tries to return relevant pages for that query by returning pages that contain the keywords in your search query.

However, search engines also take the importance of the page into account when ranking pages. This importance is based on the number of external links pointing to a page. The more links pointing to your pages, the more important they are deemed to be by the search engine.
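A minimal sketch of this kind of link counting, in Python, might look like the following. The `LINKS` data is hypothetical, and treating a link as ‘external’ when the two URLs have different host names is an assumption made for this example.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical link data: (page containing the link, page linked to).
LINKS = [
    ("http://a.example/", "http://shop.example/widgets"),
    ("http://b.example/news", "http://shop.example/widgets"),
    ("http://shop.example/", "http://shop.example/widgets"),  # internal link
    ("http://a.example/", "http://b.example/news"),
]


def external_link_counts(links):
    """Count, for each page, the links pointing to it from other sites."""
    counts = Counter()
    for source, target in links:
        # Assumption: a link is external when the host names differ.
        if urlparse(source).netloc != urlparse(target).netloc:
            counts[target] += 1
    return counts


counts = external_link_counts(LINKS)
for url, total in counts.most_common():
    print(url, total)
```

Here the widgets page has three links pointing to it, but only the two that come from other sites count towards its importance.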

The best example of this system of ranking pages is Google’s patented PageRank.

2.5 Google PageRank

Google’s PageRank is a system that rates the importance of pages in direct proportion to the number of external links pointing to that page.

PageRank exploits the network of links on the web in order to determine the relative value of individual web pages. It does this by counting the number of links pointing to one page from other sites. As Google puts it, a link to one of your pages from another site is considered a ‘vote’ in favour of that page. The more votes a page receives, the greater its value or perceived importance.

However, Google also takes the importance of the page that links to your page into account when determining the value of your page. If the page that links to you is already seen to have a high importance (in other words, if it already has a high PageRank), then the link it provides is ‘weighted’ higher than a link coming from a page with a lower PageRank or lesser importance.

Google then combines PageRank with page relevance to ensure that the pages returned in results are not only important in themselves but are also relevant to your search.
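The weighting of votes described above can be illustrated with a simplified version of the iterative calculation from the originally published PageRank idea. This is a sketch only: Google's production algorithm is not public, the damping value of 0.85 is an assumption borrowed from the published description rather than from this course, and the link graph below is hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    count = len(pages)
    rank = {page: 1.0 / count for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / count for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly between the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / count
        rank = new_rank
    return rank


# Hypothetical link graph: three fan pages link to "hub", nothing links to "lonely".
LINK_GRAPH = {
    "fan1": ["hub"],
    "fan2": ["hub"],
    "fan3": ["hub"],
    "hub": ["page_a"],
    "lonely": ["page_b"],
}

ranks = pagerank(LINK_GRAPH)
print(ranks["hub"] > ranks["lonely"])
print(ranks["page_a"] > ranks["page_b"])
```

Both comparisons print `True`. `page_a` and `page_b` each receive exactly one link, but the link to `page_a` comes from the well-linked `hub` page, so it carries more weight than the link `page_b` receives from `lonely`.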

You can find Google’s own explanation of PageRank here:

http://www.google.com/technology/

Pages Vs Websites

Google PageRank applies to individual pages and not websites as a whole. Pages on the same site will often have a different PageRank.

It is important to note this emphasis on individual pages rather than sites as a whole. Similarly, when we carry out a search in a search engine, the results returned refer to individual pages rather than whole sites. This makes absolute sense from the point of view of both the search engine and the user. Some pages within a website will usually be more important than others, e.g. the homepage. Also, individual pages within websites are not always relevant to the same things and may cover topics that are unrelated to the user’s search query.

From an SEO point of view, you will be looking to optimise individual pages so that they rank for different keywords. We will show you effective methods of achieving this later in the course.

Although PageRank is specific to Google, most of the major search engines now use a similar system to determine the position of pages in search results.

2.5.1 How to check PageRank

It is particularly important that you learn how to understand and measure PageRank, as it will play a significant part in your future SEO efforts. The ability to measure PageRank will help you to analyse competitors’ web pages and to keep track of how well your own web pages are faring when they are optimised and online.

To measure Google PageRank you must first install the free Google Toolbar into your browser.

2.5.2 Installing the Google Toolbar

To get the toolbar, navigate your browser to the following URL:

http://toolbar.google.com/

Different versions of the toolbar are available for different browsers like Internet Explorer and Firefox. Google should automatically detect which browser you are using and offer a download for the appropriate version.

2.5.3 Measuring PageRank

With the toolbar installed, try browsing the web. As you navigate from page to page, the little bar next to ‘PageRank’ will fill up green as the PageRank for the current page increases and go down as the PageRank for the current page decreases.

To get a more accurate numeric measure of PageRank, hover your mouse over the part of the Toolbar that reads PageRank. A small dialogue box should appear with the following text:

‘PageRank is Google’s measure of the importance of this page (x/10)’

where x is the actual value of the page out of a total of 10. The higher the number, the higher the PageRank for that page.

You can now measure the PageRank of web pages. This has many SEO applications, including:

  • The ability to measure the PageRank of your own pages
  • The ability to measure the PageRank of competitors’ web pages
  • The ability to measure the PageRank of potential link partners. Remember that the more important the site that links to you is, the more weight is given to that link, hence the greater your perceived importance.

TASK 1: MEASURING PAGE RANK

Let’s try measuring the PageRank of some web pages:

  • Install the Google Toolbar into your browser, making sure that you enable advanced options.
  • Once installed, try carrying out a search for the kind of products or services that your website offers.
  • Visit all the pages returned on the first page of search engine results, and note down their PageRank.
  • Compare the PageRank of these pages. Which pages have the highest measure of importance and which the lowest?

SUMMARY:

  • Search engines allow us to search the web by entering search queries that the search engine compares against its index of web pages.
  • The leading search engines are currently Google, Yahoo, and MSN.
  • Crawler-based search engines use software called spiders to crawl the web and index web pages.
  • Search engines use complex mathematical algorithms to rank web pages.
  • Search engine ranking is based on a combination of page relevance and page importance.
  • Page importance (or PageRank) is based on the link popularity of a web page and the quantity and quality of external links pointing to that page.
  • PageRank is calculated on a per-page basis and does not apply to websites as a whole.

2.6 Conclusion

Search engines are sophisticated pieces of software that allow users to quickly locate products and services on the Internet. Since SEO is aimed at improving your visibility in search engine results, it is essential that you understand the criteria they use to rank web pages. In the next units of this course we will show how to use search engines to help locate the right keywords for your products and help analyse the competition you will face in search engine listings.

REFLECT:

What do you understand by the following terms?

  • Search Engine
  • Search query
  • Spider
  • Page Importance
  • Link Popularity
  • PageRank

Once you feel that you can satisfactorily explain these terms, move on to the next unit of the course.

Course Index

01: A Free SEO Training Course For Hubbers

02: SEO Course Outline

03: An Introduction to SEO

04: An Introduction to Search Engines (You Are Here)

05: Search Engines and Latent Semantic Indexing

06: Search Engine Users

07: Keyword Research

08: Competitor Research

09: A Guide to PageRank

10: On Page SEO Part 1

11: On Page SEO Part 2 - Introduction To Quality Signals

