
Data Backup Sites, Strategies, and Mechanisms

Updated on May 28, 2013

Introduction

Herein we discuss scenarios and strategies for computer data backup. We consider sites, techniques, and mechanisms that might be applied to preserve information in a business environment or a home-user environment.

Assumptions

We assume the following:

1. Data has inherent value; therefore, it should be backed up. A company (or an individual user) produces information in the form of files stored on a computer. That information has value to the folks who generated it. A permanent data loss would impact the profitability of the company. Time and effort would be required to recover or re-create the data, taking resources away from the core operations of the business.

2. Mechanical devices will fail. No motor lasts forever. A device with moving parts does not have an infinite life span. Eventually, all hard drives will experience partial or catastrophic failure, causing some data loss.

3. Automatic data backup is necessary. Any process that requires human intervention will fail, and sooner rather than later. In other words, don't depend on your users to back up their own stuff.

Based on these assumptions, data backup strategies are necessary for every computer with a hard drive.

Basic Data Availability Architecture

The traditional computer network consists of one or more server computers and zero or more individual workstations (or clients) throughout the workspace. The simplest case, a one-person office, may have exactly one computer, which provides both server and workstation functionality. Ideally, all data is stored on the server.

Ideally, no networked user stores data files on his or her workstation. A centralized storage strategy increases network security, streamlines network administration, and drastically increases the likelihood that backup and recovery efforts will be successful. Permitting data files to scatter throughout a network courts disaster.

Types of Data Loss and Data Failure

We consider two types of data loss and data failure.

The catastrophic failure occurs when a mechanical storage device fails completely. The network 'goes down'. Computer users are unable to log in to the network and unable to access any data whatsoever. Individual workstations may be functional. However, without access to data files, employee productivity falls to near zero. Morale also suffers.

The partial failure occurs when a file or subset of files is corrupted or accidentally deleted. User error, hardware failure, or both may be the culprit. Regardless of the cause of the data loss, some users will be severely impacted while others may continue working normally. Often this type of loss is not detected immediately.

Perfect-World and Worst-Case Scenarios

Given unlimited resources, we construct a 'perfect world' scenario in which data loss is immediately remedied with virtually no impact on end users. We then work backward from that scenario to illustrate solutions that are less resource-intensive but more attainable.

Consider the catastrophic failure, or worst-case, scenario. The network server suffers a complete failure. This could be caused by a motherboard failure, a disk controller failure, or even a network hardware failure. The server room may ignite or thieves may walk off with the entire computer. Regardless of the cause, server functionality is unavailable to all users. All data files are offline. To recover as quickly as possible, the business configures a mirror site in advance. Referred to as a hot backup, this site consists of an identical server connected directly to the office network and storing files in tandem with the primary office server. The backup server is off-site; it may even be in a different city. When the primary server becomes unavailable, the hot backup slides in seamlessly with minimal impact on end users.

Of course, an off-site server configured as a hot backup is relatively expensive. Small businesses rarely make such an investment.

Data Preservation Requiring Fewer Resources

Assuming a hot-site backup strategy is cost prohibitive, we consider other options. Instead of implementing a solution that duplicates both the hardware and the data, we focus only on backing up our data. Initial costs and ongoing expenses are significantly lower, but recovery time increases. The trade-off must be understood by everyone involved in the business.

Even the smallest office (or home user) can implement a disk mirroring strategy on network servers. Requiring only a second hard drive and a mirroring controller, this scenario provides a virtually real-time solution in the event of a catastrophic failure of the primary hard drive. When the failure is detected, the mirroring controller automagically switches over to the mirrored drive and business continues normally. Some operating systems are designed to support mirrored drives, but mirroring at the disk controller level reduces the load on the processor and is not dependent on the operating system software. Compared to the cost of a hot-site implementation, a mirrored drive system is a trivial expense that can pay huge dividends. However, adding a mirrored drive should not be considered a complete backup strategy; the second drive is a mechanical device which will fail eventually.
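The switch-over behavior described above can be illustrated with a toy model. This is only a sketch under stated assumptions: the Drive and MirrorController classes are hypothetical names invented for illustration, the "drives" are in-memory dictionaries, and a real mirroring controller does this work in hardware or firmware.

```python
class Drive:
    """Toy stand-in for a hard drive: a dictionary of numbered blocks."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block_no, data):
        if self.failed:
            raise OSError(f"{self.name} has failed")
        self.blocks[block_no] = data

    def read(self, block_no):
        if self.failed:
            raise OSError(f"{self.name} has failed")
        return self.blocks[block_no]


class MirrorController:
    """Hypothetical controller that keeps two drives in step."""

    def __init__(self, primary, mirror):
        self.drives = [primary, mirror]

    def write(self, block_no, data):
        # Every write goes to all drives that are still healthy.
        healthy = [d for d in self.drives if not d.failed]
        if not healthy:
            raise OSError("all drives have failed")
        for drive in healthy:
            drive.write(block_no, data)

    def read(self, block_no):
        # Reads come from the first healthy drive, so a failed
        # primary is skipped without the caller noticing.
        for drive in self.drives:
            if not drive.failed:
                return drive.read(block_no)
        raise OSError("all drives have failed")
```

Note that the model also shows the limitation mentioned above: once both drives are marked as failed, nothing can be read, so the mirror does not replace a separate backup.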

Types of Backups

In general, two types of data backups are popular: complete and incremental. A complete backup makes a copy of the entire hard drive. Theoretically, this disk image can be restored to a new hard drive. Aside from the time required to copy from the backup device to the drive, end users should experience little inconvenience. The age of the backup becomes an issue, but most end users are willing to accept a data restoration that is only a few hours old. They may grumble somewhat about re-creating information that was lost between the last backup and the system failure; network administrators should carefully consider how much data they can afford to lose.

How is the complete backup stored? Given that a typical hard drive is 500GB or larger, storing the image on DVDs is problematic. A standard single-layer DVD has a capacity of about 4.7GB, so far too much disc swapping is required. The amount of human intervention necessary to implement the solution is prohibitive. Instead, a tape drive should be used to store the complete backup. A single tape cartridge can typically hold a complete image of the drive. Tapes are relatively inexpensive, reliable, portable, and easily swapped. We suggest assigning the swap to an office clerk or administrative assistant.
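To put a number on the disc swapping, here is a quick calculation. It is a rough sketch that assumes the nominal 4.7GB capacity of a single-layer DVD and ignores compression and file system overhead; the function name discs_needed is our own.

```python
import math


def discs_needed(image_gb, disc_gb=4.7):
    """Number of discs required to hold a backup image of image_gb gigabytes."""
    return math.ceil(image_gb / disc_gb)


# A complete image of a full 500GB drive:
print(discs_needed(500))  # prints 107
```

More than one hundred disc swaps for a single complete backup is clearly not something to ask of an office clerk every evening.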

An incremental backup copies only files that have changed since they were last backed up. The file system records when each file was last modified (and, on some systems, an archive flag indicating that the file has changed since the last backup); backup software consumes this information. The primary advantage over complete backups is the storage requirement. Less tape is required to store an incremental backup than a complete backup. Typically a single tape will hold many incremental backups.
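The selection logic behind an incremental backup can be sketched in a few lines. This is a minimal illustration, not a substitute for real backup software: it assumes that the file's modification time is a reliable change indicator, that the destination folder lies outside the source folder, and that the time of the previous backup is supplied by the caller. The function name incremental_backup is hypothetical.

```python
import shutil
from pathlib import Path
from typing import List


def incremental_backup(source: Path, dest: Path, last_backup_time: float) -> List[Path]:
    """Copy files under source that were modified after last_backup_time.

    last_backup_time is in seconds since the epoch, as returned by time.time().
    Returns the relative paths of the files that were copied.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        # Skip directories and files untouched since the previous backup.
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            relative = path.relative_to(source)
            target = dest / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves timestamps
            copied.append(relative)
    return copied
```

A complete backup is the same loop without the modification-time test, which is why it needs so much more storage.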

One downside of the incremental backup becomes evident when a user requests a copy of a file that has not been changed recently. The user may not provide enough information to locate the most current copy of the file in the proper incremental backup. Proactively address this situation by investing in tape backup software that maintains tape inventories in a database on the server. Rather than enduring a seemingly endless tape-swapping exercise to locate the file, search the database instead.
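To show what such a tape inventory might look like, here is a small sketch using SQLite. The table layout, the function names, and the tape labels in the usage note are all assumptions made for illustration; commercial tape backup software keeps its own catalog in its own format.

```python
import sqlite3


def open_catalog(db_path=":memory:"):
    """Open (or create) a catalog of which tape holds which file."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS catalog (
               file_path    TEXT NOT NULL,
               tape_label   TEXT NOT NULL,
               backed_up_at TEXT NOT NULL  -- ISO 8601, so text order is time order
           )"""
    )
    return conn


def record_backup(conn, file_path, tape_label, backed_up_at):
    """Note that file_path was written to tape_label at backed_up_at."""
    conn.execute(
        "INSERT INTO catalog VALUES (?, ?, ?)",
        (file_path, tape_label, backed_up_at),
    )
    conn.commit()


def latest_copy(conn, file_path):
    """Return (tape_label, backed_up_at) for the newest copy, or None."""
    return conn.execute(
        "SELECT tape_label, backed_up_at FROM catalog "
        "WHERE file_path = ? ORDER BY backed_up_at DESC LIMIT 1",
        (file_path,),
    ).fetchone()
```

With a catalog like this, a call such as latest_copy(conn, "reports/budget.xls") names the one tape to fetch (for example, a tape labeled "TUE-2"), and nobody has to load tapes one after another to find the file.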

Lessons from Experience

Implementing a tape-based solution presents problems but also provides data security. Be prepared for a relatively significant investment in a tape drive and a set of data tapes. If possible, include the cost of the hardware in the budgeted cost of the server. Adding the backup hardware at a later date will seem more painful to the business.

Assign an office member the responsibility of swapping the tapes on a daily basis. Select a person who is always in the office and is skilled at following directions. An administrative assistant or clerk usually handles this job very well. Don't depend on the office manager or the business owner. Clearly label the tapes. Make the job as rote as possible. One business owner placed a small log book next to the tape drive; the tape swapper was required to note the time and weather conditions at each tape swap.

Replace the tapes. Tapes will wear out. Observe the manufacturer's recommendations for replacing your storage media. Be sure to clean the tape drive as well.

Get the tapes off-site. A backup is only useful if it's not melted. Expect the building to burn down every evening and plan accordingly. Periodically take a tape out of the rotation and store it semi-permanently in a safe-deposit box or fireproof safe in a different zip code.

Practice data recovery. Most businesses implement some type of backup strategy but very few actually attempt a recovery until data loss occurs. Schedule a time, preferably outside of business hours, and recover a few individual files simply to ensure that the system functions properly. Isolate and correct any problems before you're under pressure to recover something in real life.
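One simple way to confirm that a practice recovery worked is to compare a checksum of the restored file with a checksum of the original. The sketch below assumes the original file has not been edited since the backup was taken and that the file was restored to a separate location; the function names are our own.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original: Path, restored: Path) -> bool:
    """True if the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

If restore_matches returns False for a file that has not changed since the backup, treat it as a problem to isolate before a real recovery is needed.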

Addendum: Online Data Backups

A viable solution for home users and home offices may be online data backups. Several companies offer a subscription-based service that automagically copies files in real time as they are changed. Files are copied to a remote server by a small program running in the background on the local computer. Costs are very low for these services.

The main advantage of contracting with an online backup service is the ease of use. Virtually no manual intervention is required.

The downside?

  1. Allowing the subscription to lapse will cut off access to the backups. The backups will be deleted after a brief period of non-payment.
  2. Down-level versions of files may not be stored. In other words, you may have access to the most recent version of your resume, but not last year's version.
  3. Recovery may be slow. In the event of a catastrophic failure, recovering the files will be limited to the speed of the Internet connection.
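The third point is easy to quantify. The sketch below estimates a best-case download time; it assumes decimal units (1GB = 1,000,000,000 bytes), a link that runs at its full rated speed for the entire transfer, and no protocol overhead. The 10 megabit-per-second link is a hypothetical figure chosen for illustration.

```python
def restore_hours(data_gb, link_mbps):
    """Best-case hours needed to download data_gb gigabytes at link_mbps megabits per second."""
    bits = data_gb * 8 * 1000**3            # gigabytes to bits
    seconds = bits / (link_mbps * 1000**2)  # megabits per second to bits per second
    return seconds / 3600


# Restoring a full 500GB drive over a 10 Mbit/s connection:
print(round(restore_hours(500, 10), 1))  # prints 111.1
```

Under these assumptions, recovering a full 500GB drive takes more than four and a half days, and a real connection will usually be slower than its rated speed.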

Some images may be courtesy of http://www.sxc.hu/