
Basic SEO for HTML - Creating a Robots.txt Script for a Search Engine Crawler

Updated on August 31, 2013

Every Website Should Use The Robots.txt

Bleep Bleep, the robots are coming! The robots.txt tells crawlers which parts of a site they may visit and which they should leave alone. This is basic SEO.

How Does Using the robots.txt Help a Website's SEO?

Most search-engine-friendly websites use a robots.txt file.

The robots.txt tells a crawler which URLs it may crawl for indexing and which URLs it should leave alone. It can also contain the exact URL of the sitemap.

The robots.txt is the tool for stating what you want indexed and what you don't. The sitemap is an XML file that lists the locations of the URLs on your website and, optionally, how frequently they are updated.

It's best to break the functions and attributes of these scripts up into sections so you are not confronted with a tonne of information, so in this article we will discuss only the robots.txt.

Let's create a scenario to express how beneficial the script is and what it is used for.

You're at a newly built shopping mall and your child needs to use the bathroom... Urgently!

You have taken what you think must be a wrong turn, as you have absolutely no idea where you are.

There are no signs showing where the toilets are, you don't have a map of the mall, and you can't find a familiar area where toilets might be located.

What do you do?

You can ask a shop for directions (robots.txt)

The shopkeeper may not know the intricate workings and layout of the entire mall, but he has a general enough idea to tell you which toilets are in good condition and where you can find a map of the mall (sitemap).

The map of the mall (sitemap)

The map of the mall tells you pretty much everything. It tells you where the toilets are, as well as every other shop's location. By looking for a date on the map, you can also see how up to date the map is, and the layout shows which shops take priority.

To sum everything up, the robots.txt script helps a crawler know what you want indexed and the sitemap shows where everything is.

Both of these scripts are used on any website wanting to increase its search engine potential.

These scripts (along with others) are the fundamentals of any website's SEO plan.

Creating The Script

Tools Needed
Notepad - Any plain text editor will do; this is where the script is typed.
Allow: - Lists the URLs and files that crawlers are permitted to access.
Disallow: - Lists the URLs and files that crawlers are asked to stay away from.
Filename - The file has to be saved as robots.txt, all lowercase.

What do I need to create a robots.txt script?

Creating the robots.txt script is a straightforward process that requires no extra programs or paid service.

All you need, if you're running Windows, is Notepad.

To open Notepad, click on Start and select the Notepad icon.

If it's not in the list, type notepad in the search bar and it will locate it for you. For a video tutorial see DIY SEO - Robots.txt

When you are finished writing your robots.txt, it has to be saved with this exact filename: robots.txt.

Once the script has been saved, it has to be uploaded to the top-level (root) folder of that specific website, so a search engine crawler can access it at an address like http://www.yoursite.com/robots.txt.
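As a rough sketch of the save-and-upload step, the Python below writes a file named exactly robots.txt and works out the address where a crawler would expect to find it. The helper names write_robots_file and robots_url are made up for this illustration, yoursite.com is the article's placeholder address, and the sketch assumes the file is served from the top level of the site.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit


def write_robots_file(folder, rules_text):
    """Save the rules under the exact lowercase filename robots.txt."""
    path = Path(folder) / "robots.txt"
    path.write_text(rules_text, encoding="utf-8")
    return path


def robots_url(site_url):
    """Hypothetical helper: the top-level robots.txt address for a site."""
    parts = urlsplit(site_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


rules = "User-agent: *\nAllow: /\n"

# A temporary folder stands in for the folder you would upload from.
with tempfile.TemporaryDirectory() as tmp:
    saved = write_robots_file(tmp, rules)
    saved_name = saved.name
    saved_text = saved.read_text(encoding="utf-8")

expected_location = robots_url("http://www.yoursite.com/apps/games.html")
print(saved_name, expected_location)
```

Whichever page address is passed in, robots_url always points back to the top level of the site, since that is the one place a crawler looks for the file.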

Basic Functions and Attributes

A basic example makes the script easier to understand.

User-agent: *
Allow: /
Sitemap: http://www.yoursite.com/sitemap.xml

What is written here tells every search engine's crawler that everything on the website it is attached to is OK to be indexed.
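One way to check what a crawler would make of this basic example is Python's built-in urllib.robotparser module. The sketch below feeds it the three lines directly instead of downloading them; the crawler name "AnyCrawler" is invented for the illustration, and site_maps() assumes Python 3.8 or newer.

```python
from urllib.robotparser import RobotFileParser

# The basic example, with the rule lines kept together and no blank
# lines between them.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Sitemap: http://www.yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "AnyCrawler" is a made-up crawler name; the * group covers it.
home_ok = parser.can_fetch("AnyCrawler", "http://www.yoursite.com/")
page_ok = parser.can_fetch("AnyCrawler", "http://www.yoursite.com/apps/games.html")

# site_maps() returns the Sitemap lines found in the file.
sitemaps = parser.site_maps()

print(home_ok, page_ok, sitemaps)
```

Both checks should come back as allowed, and the sitemap address should be reported back unchanged.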

The User-agent:

This section is where you name the search engine crawler that the rules apply to, e.g. Googlebot, Bingbot, etc.

Placing the * means every search engine will be affected.
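To see the difference between naming one crawler and using the *, here is a small sketch using Python's built-in urllib.robotparser. The rules are invented for the illustration (they are not from the examples in this article): one group applies only to Googlebot, and a second group applies to everyone else.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: Googlebot is kept out of /apps/, while every
# other crawler falls through to the * group and may go anywhere.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /apps/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "http://www.yoursite.com/apps/games.html"
googlebot_ok = parser.can_fetch("Googlebot", url)
bingbot_ok = parser.can_fetch("Bingbot", url)

print(googlebot_ok, bingbot_ok)
```

Under these assumed rules the named crawler is turned away from the page, while a crawler that is not named anywhere is handled by the * group and let through.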

The Allow:

This section lists what is allowed to be indexed. Placing the / means everything may be indexed.

Alternatively, this can be Disallow:, which blocks a specific URL or item.

Robots.txt Guidelines

Read all About it! Robots.txt Functions and Attributes

How to Add or Remove Specific Images and URLs

It may sound tricky to add or remove URLs and images in your robots.txt, but it truly isn't.

Once you have a grasp of what the specific attributes refer to, it becomes a lot easier to understand.

The following example shows how to disallow a specific page. An explanation follows the example.

User-agent: *
Disallow: /apps/games.html
Sitemap: http://www.yoursite.com/sitemap.xml

As described in a previous example, the asterisk (*) indicates that all search engines should adhere to the next lines of code.

As you will have noticed, the Disallow line (or Allow, for that matter) does not need the homepage URL.

The first forward slash (/) stands in for your homepage address, and everything after it is the rest of the URL. All that you need to type is the part of the URL from that / onward. For example, the URL in the Disallow line wasn't written as http://yoursite.com/apps/games.html. All that was written was /apps/games.html.
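The idea of dropping the homepage address and keeping only what follows can be sketched in Python. The helper name rule_path is made up for this illustration; it uses the standard urllib.parse module to cut a full URL down to the part that goes in a rule, and urllib.robotparser to check the result.

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser


def rule_path(full_url):
    """Hypothetical helper: keep only the part of a URL that goes in a rule."""
    return urlsplit(full_url).path


page = "http://yoursite.com/apps/games.html"
path = rule_path(page)  # the leading / and everything after it

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    f"Disallow: {path}",
    "Sitemap: http://www.yoursite.com/sitemap.xml",
])

games_allowed = parser.can_fetch("*", page)
index_allowed = parser.can_fetch("*", "http://yoursite.com/apps/index.html")

print(path, games_allowed, index_allowed)
```

Only the disallowed page should be reported as blocked; a neighbouring page in the same folder (index.html is an assumed name here) stays open to crawlers.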

Now that you know how to modify a URL, adding or removing pictures follows very much the same concept.

All you do is give the folder and the image name.

The following example shows how a picture is removed:

User-agent: *
Disallow: /images/golf-cart.jpg
Sitemap: http://www.yoursite.com/sitemap.xml
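If you have several pages and pictures to remove, the same lines can be generated rather than typed by hand. The sketch below is one possible way to do that in Python; build_robots_txt is a hypothetical helper written for this article, not part of any library, and the second image name is invented for the check at the end.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(disallowed_paths, sitemap_url):
    """Hypothetical helper: one Disallow line per path, for all crawlers."""
    lines = ["User-agent: *"]
    lines += [f"Disallow: {path}" for path in disallowed_paths]
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"


robots_text = build_robots_txt(
    ["/apps/games.html", "/images/golf-cart.jpg"],
    "http://www.yoursite.com/sitemap.xml",
)
print(robots_text)

# Check the generated rules the same way a crawler would read them.
parser = RobotFileParser()
parser.parse(robots_text.splitlines())

cart_allowed = parser.can_fetch("*", "http://www.yoursite.com/images/golf-cart.jpg")
other_allowed = parser.can_fetch("*", "http://www.yoursite.com/images/golf-club.jpg")
```

The picture named in a Disallow line should come back as blocked, while other pictures in the same folder remain available.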

Now that you have the basics sorted, you're ready to create your own robots.txt.

