Screen Scraper Software: Value Add or Digital Piracy?

By dmccarty


First, A Very Generic Overview

Screen scraping generally takes data from one format and converts the data into a different format for use in a different application. That's the basic idea anyway. This can get quite complicated - very quickly.

RSS and even your search engine spiders rely on highly targeted, algorithmic screen scraping technologies. The comparison is simplified, but the basics hold true. Here are a few examples:

  • Search Engines - the spider bots are basically scraping the HTML and text from a website and parsing that data back to a database. This information is then catalogued and provided back to consumers on search queries. Search Engines like Google, Yahoo, Microsoft Live and others use variants on this basic technology.

  • RSS Feeds - the feed script 'reads' the information from an HTML page and parses this information into an XML document, which is then delivered to your RSS reader, which displays the information for a consumer. RSS stands for Really Simple Syndication. While there are a number of ways you can set up an RSS feed of your site, the basic process is the same: the RSS application strips the information from your site and places it into an XML file, which is then sent to an RSS reader application. The RSS reader aggregates the information and makes it readable, with links back to your website.

  • Most Web 2.0 applications depend on some level of screen scraping. Many Web 2.0 applications use an API (Application Programming Interface - this subject is for another post), which in turn can be used for a variety of mashup applications. This technology has been adopted very keenly by most developing software companies that encourage decentralization. This type of application is common in Facebook, MySpace and almost any other social network, social content sharing or Web 2.0 property. This "scraping" allows the content of one page to be placed simultaneously on other pages within the network and across the web.
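For the more technical readers, the RSS example above can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular feed tool works: the sample page, the "headline" class name and the html_to_rss function are all assumptions made up for this example. It reads links out of an HTML page and writes them into a small RSS-style XML document.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class HeadlineScraper(HTMLParser):
    """Collects (title, link) pairs from <a class="headline"> tags.

    The "headline" class is a hypothetical convention for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None  # not None only while inside a headline link
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "headline":
            self._href = attrs.get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("".join(self._text).strip(), self._href))
            self._href = None


def html_to_rss(html, site_title, site_url):
    """Scrape headline links from html and return them as an RSS 2.0 string."""
    scraper = HeadlineScraper()
    scraper.feed(html)

    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Scraped headlines"
    for title, href in scraper.items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = href
    return ET.tostring(rss, encoding="unicode")


# A made-up page: two headline links and one ordinary link.
SAMPLE_HTML = """
<html><body>
  <a class="headline" href="http://example.com/a">First story</a>
  <a href="http://example.com/about">About us</a>
  <a class="headline" href="http://example.com/b">Second story</a>
</body></html>
"""

feed_xml = html_to_rss(SAMPLE_HTML, "Example Site", "http://example.com/")
print(feed_xml)
```

The "About us" link is skipped because it is not marked as a headline. That selectivity is the "highly targeted" part of scraping: the script only pulls the pieces it was told to look for.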

RSS and Search have been the most recognizable beneficiaries of the screen scraping concepts. Web 2.0 properties are reaping the rewards as well, making mashups (the combination of two or more applications into a better, more usable interface), hacks (the modification of a single application, possibly using content from a separate source like a mashup) and other tools freely available that scrape data from one site and parse this onto the screens of other applications.

Video For EDI Screen Scraper: An Example for Screen Scraping in the Freight Carrier Industry





Screen Scraping for Profits, Value-Add & User Convenience

So, how exactly does screen scraping enter into the business realm?

A search engine spider crawls sites on the Internet, scraping information from all the websites it comes in contact with. The search engine company (Google, for example) then saves the information and serves it back so that we, the users, can quickly and accurately locate what we need when we enter specific search terms. Typically the search engines monetize their services with advertising surrounding the data they serve - this isn't new. This is the purpose of the screen scraping spider bots, and a core reason that Google is the 800 lb. gorilla - they have very, very good screen scraping technologies in place.
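To make the "scrape, save, serve back" cycle a little more concrete, here is a toy sketch in Python. It is nowhere near what a real search engine does: the pages are hard-coded strings rather than fetched from the web, and the names page_words, build_index and search are invented for this example. It strips the text out of each page, records which words appear on which page, and then answers a query by finding the pages that contain every search term.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keeps the visible text of a page and throws the tags away."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_words(html):
    """Return the set of lowercase words found in the page text."""
    extractor = TextExtractor()
    extractor.feed(html)
    text = " ".join(extractor.chunks).lower()
    return set(re.findall(r"[a-z0-9]+", text))


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, html in pages.items():
        for word in page_words(html):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query, sorted."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Stand-ins for pages a spider would have downloaded (made-up content).
PAGES = {
    "http://example.com/trucks": "<h1>Freight trucks</h1><p>Loads for carriers</p>",
    "http://example.com/rss": "<h1>RSS feeds</h1><p>Feeds for readers</p>",
}

index = build_index(PAGES)
print(search(index, "freight loads"))
print(search(index, "for"))
```

The first query matches only the trucking page, while the second matches both pages because the word "for" appears on each. A real spider would also follow the links it finds and rank the results, which this sketch leaves out.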

RSS feeds help reduce information overload and add convenience for the user by scouring selected sites for pre-determined information. Again, this is the purpose and effect of RSS feeds and readers. Time is money, so by saving your time, the RSS feed saves money too, right?

Other screen scraper products seek to parse information from a manufacturer website to an e-commerce site - reducing costly human intervention and reducing prices. This has an obvious dual effect - reduction of time/money and increase in efficiency that delivers better services and lowers cost to the consumer - everybody wins.

We are beginning to see spin-offs and variants, using screen scraping technology in conjunction with other applications. These are often referred to as "mashups". However, many of these new applications are highly specialized and focused on specific niche markets.

Recently, a company called Intelek Technologies developed new screen scraping software to assist trucking companies in locating and securing freight loads from online trading partners. This new take on screen scraping automatically "scrapes" data from a website and converts the data to EDI (Electronic Data Interchange) format so the freight carriers can accept and process the loads to be shipped by the website owner. This saves both companies money in manpower, time, and effort while continuing to utilize legacy software in a new way. Companies operate at lower cost, helping reduce shipping costs, which may translate into lower retail pricing for everyone.
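I have not seen how the Intelek product works internally, so the following Python sketch is purely my own illustration of the general idea, not their method. It scrapes a made-up table of freight loads from an HTML page and turns each row into a delimited text record. The "LOAD" record layout is hypothetical and is not a real EDI transaction; it only borrows the asterisk and tilde separators that are commonly seen in EDI files.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text pieces of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows use <th>, so they end up empty
                self.rows.append(self._row)
            self._row = None


def loads_to_segments(html):
    """Convert scraped load rows into simplified, EDI-like text records.

    Assumes every data row has exactly four cells; a row of any other
    length raises ValueError.
    """
    scraper = TableScraper()
    scraper.feed(html)
    segments = []
    for load_id, origin, destination, weight in scraper.rows:
        fields = ["LOAD", load_id, origin, destination, weight]
        segments.append("*".join(fields) + "~")
    return segments


# A made-up load board page.
SAMPLE_PAGE = """
<table>
  <tr><th>Load</th><th>Origin</th><th>Destination</th><th>Weight (lb)</th></tr>
  <tr><td>1001</td><td>Dallas TX</td><td>Memphis TN</td><td>42000</td></tr>
  <tr><td>1002</td><td>Reno NV</td><td>Boise ID</td><td>18500</td></tr>
</table>
"""

for segment in loads_to_segments(SAMPLE_PAGE):
    print(segment)
```

The point is simply that once the data has been lifted off the web page, it can be rewritten in whatever format the carrier's existing software already understands, with no one retyping it by hand.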

There are other companies around the globe that are using this basic technology to increase efficiencies, reduce human errors and bring value to users and customers.


So Which Is It - Godsend or Scourge?

Currently, there is some debate about screen scrapers. Some have cried foul, claiming that this amounts to piracy, while others have embraced the practice as a cost efficient way of aggregating content or information.

With all the press coverage over digital piracy, there has been some outcry over screen scraping in general - essentially equating this practice with copyright violation. For the most part, this perspective has been limited in scope and hasn't gained much traction.

The Moral and Ethical Quandary

The morals and ethics of this general practice will, no doubt, continue to be debated. Generally, most people agree that the tool itself is not inherently good or evil; what matters is the purpose it is used for.

At the time of this writing, Google lists approximately 269,000 entries for the term "screen scraper". Obviously, from search results like that, there is a vast and healthy selection of screen scraper products being produced by software developers around the world.

From the current vantage point, screen scraping technologies will continue to be developed, helping general users and niche businesses gain an edge in this ever-increasing digital universe. Screen scraping looks to have a significant foothold in the new world of Web 2.0. Ethics and morals aside, it would certainly appear that screen scraping, and its virtual cousins, are here to stay.

How Can You Employ Screen Scrapers as a Benefit?

As business owners, entrepreneurs, programmers and general web users - is there a way to utilize screen scraping technology for the benefit of the public, users and companies - while reducing the possibility of negative side effects or abuse?

With the vast quantities of providers available, I am curious why this basic technology has somewhat lagged behind other technologies that have less direct impact on our daily lives. Perhaps, screen scraping has simply become the drive train of the vehicle - the part that few people pay attention to compared to the shiny new exteriors, but that actually makes or breaks an application.

So, allow me to pose a basic question in this two-part inquiry.

  1. Have you ever actively searched for a screen scraper product, application or mash-up?
  2. What purpose did you have in mind - what problem were you trying to solve with a screen scraper solution?

I suppose a follow-up question here would be:

  • Were you successful in locating a Screen Scraper out of the 269,000 entries and in achieving your goals with or without a screen scrape program?


Comments

khartley  says:
3 months ago

Thanks for posting this- I never understood what people were talking about when they said people were "scraping our MLS" (Multiple Listing Service- I am a REALTOR) and reselling the info. Now I do! Thanks! Karl Hartley

wombats  says:
3 months ago

What an eye opener this article is. A whole new dimension for me to think about. Thanks for this information. Any ideas on how I use this in my Bed & Breakfast.
Regards Laurie Leask www.wombatbed.com

dmccarty  says:
3 months ago

Hi Wombats,

thanks for your comment, glad you found the article informative. to be perfectly honest, I have no idea how you could use screen scraping in your B&B business... I suppose you could offer RSS feeds from your site (which use similar technology) and you may want to search for a tool to scrape sites and parse the information onto your B&B site - maybe looking for other availabilities or properties?

KHartley,

Thanks for your post as well. Though I haven't searched for one - I believe that there would be some tool readily available for the MLS. I would look for one that is very flexible, one that allows you to store your login info (so you don't have to constantly login) and one that has the "granularity" to choose a variety of parameters to scrape from.... Sorry, used some techie words there.. Make sure the product has the features that you need.

Susan Geary  says:
3 months ago

This is definitely a new area for me. I never stopped to think about how Google does all those searches so fast, but your Hub page helps explain it. Thanks for the education for us non-techies!

Susan

dmccarty  says:
2 months ago

Thanks for the post Susan. I know that this is a very specific hub - one that doesn't necessarily hit everybody, though the concept and basic backbone is one that we all rely on on a daily basis.. we just never really consider it...like gravity..

thanks again

craigg13  says:
2 months ago

Thanks for this insight into how the engines work...very understandable and a very good read...appreciate it!

Caregiver-007  says:
2 months ago

I never understood what a "screen scraper" was before this. You have an excellent facility in translating the technical to terms the non-techie can understand... and in such a way that, as business people, we are spurred to think creatively about practical applications for our businesses. I will look forward to your future, very helpful articles. We need you, friend!

gjcody  says:
2 months ago

Great lens. The video was very informative too. This is way different than anything I have read before. Thanks for sharing

I do not like giving up my privacy ....but I can see what the benefits can be. I just wonder how far this can be taken. It sort of frightens me to some extent. Most of our privacy is gone as it is right now.

Best to you!
