Stolen Hubs, Plagiarism and Spammers

There is obvious concern amongst fellow Hubbers about stolen hubs, if recent forum posts are anything to go by. So what can you do if you discover your work has been plagiarised and posted elsewhere?

Well, you could write to the webmaster of the site that is hosting your content and ask them to remove it. You could add a copyright notice or a Creative Commons Licence to your work. You could even go as far as filing a DMCA complaint against the offending site. However, believing any of this will make any real difference is incredibly naive.

To understand why, you first have to understand the spammers, their techniques and tools of the trade and how they reposition and profit from your content.

Who Are The Spammers?

Spammers have no concern for the law and for the most part are totally anonymous and beyond the reach of the legal system. They falsify WHOIS data, open hosting accounts in fictitious names and, more often than not, there is a total lack of contact information on any of the sites they publish. If that wasn’t enough to throw you off the scent and you actually managed to track a spammer down, you would probably find they were located in Russia, Romania or some other country where any legal recourse would be completely impractical.

The Digital Millennium Copyright Act (DMCA) only applies within the US, so if the spammer is operating outside the US the act won’t apply.

The Tools of the Trade

Splogging: Spammers and black hat SEOs have their own version of blogs, often referred to as splogs or spam blogs. Splogs are created to get other sites indexed and increase their search engine ranking in order to:

  1. Promote affiliated websites.
  2. Artificially increase Adsense earnings.
  3. Create backlinks and increase the search engine rankings of low quality or disreputable websites that find it difficult to obtain links naturally.

Spam blogs generate content by scraping other sites or by automatically generating content from RSS feeds; some simply copy and paste. Scraped and copy/pasted content is simple to detect, and computer generated content is usually unreadable, so many splog owners now use Article Spinners to rewrite stolen content. Although the quality may be lower than the original document, it is far more readable than the output of automated content generators and more difficult to detect than a simple copy and paste job.

Scrapers: A Scraper is a very simple software application that can grab an entire website, blog or RSS feed. There are various reasons why content might be scraped and, depending on who you are, scraping might not be considered a bad thing. For example, many SEO tools are in effect content scrapers, some companies use scrapers to track competitor prices, and there are many applications for scraping within data mining. However, when it is used to steal and profit from the hard work of another, there is no justification.
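To give a feel for how little code it takes to read an RSS feed, here is a minimal sketch in Python. The feed content, URLs and the function name read_feed_items are made up for illustration, and the feed is inlined so nothing is fetched from the network. It is not the code of any particular scraping tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, inlined so the sketch runs without network access.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Hubs</title>
    <item>
      <title>Mortgage Basics</title>
      <link>http://example.com/mortgage-basics</link>
      <description>An introduction to mortgages.</description>
    </item>
    <item>
      <title>Insurance Tips</title>
      <link>http://example.com/insurance-tips</link>
      <description>How to compare insurance quotes.</description>
    </item>
  </channel>
</rss>"""


def read_feed_items(feed_xml):
    """Return (title, link, description) tuples for each <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append((
            item.findtext("title", default=""),
            item.findtext("link", default=""),
            item.findtext("description", default=""),
        ))
    return items


items = read_feed_items(SAMPLE_FEED)
for title, link, _ in items:
    print(title, "->", link)
```

Because a feed already presents each post as structured data, anything you publish in full through RSS is the easiest content of all to republish automatically.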

Article Stitching: Article stitching is the process of mixing (or stitching together) paragraphs and sentences from multiple related documents to create multiple hybrid documents. This process lessens the likelihood of these documents being detected by tools like Copyscape or tripping any duplicate content filters.
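Why stitching makes detection harder can be illustrated with a simple overlap measure. The sketch below compares documents by their sets of three-word sequences (shingles) and computes the Jaccard similarity. This is a common textbook technique, used here as an assumption; it is not a description of how Copyscape or any search engine actually works, and the sample texts are invented.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


original = (
    "Fixed rate mortgages keep the same interest rate for the whole term. "
    "Variable rate mortgages can rise or fall with the market."
)
# A stitched document: the first sentence of the original followed by
# a sentence taken from a different, unrelated article.
stitched = (
    "Fixed rate mortgages keep the same interest rate for the whole term. "
    "Always compare several quotes before you buy a policy."
)

copy_score = jaccard(shingles(original), shingles(original))
stitched_score = jaccard(shingles(original), shingles(stitched))
print(f"exact copy: {copy_score:.2f}, stitched: {stitched_score:.2f}")
```

A word-for-word copy scores 1.0 against the original, while the stitched version scores much lower even though half of it is stolen. A filter with a high similarity threshold would let the stitched document through.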

Article Spinners: An Article Spinner is a semi-automated software application or web-based service that rewrites original or PLR (private label rights) articles and generates a number of variations. Most article spinners produce junk, which will be rejected by any quality-conscious site exercising editorial control and could damage the credibility of the author if they are brazen enough to put their real name to it.

At this point the original article or articles should be undetectable. So when you discover one of your articles reproduced word for word on a website, it is probably the work of a complete novice.

What Do You Write About?

Hubs about finance, mortgages or insurance that attract high price Adsense ads are more likely to be stolen than topics that make little profit for the spammer. Spammers make serious money from the hard work of others, much more than the original authors. If they get caught they just move on and do the same again. Closing one site will have little effect if they have set up a network of hundreds of similar sites. If you are writing content around high paying keywords don’t be surprised when it gets stolen by a spammer.

So What Can You Do?

If, out of principle or self satisfaction, you want to chase each offender then do so, but it can be a time consuming business, and for each page you find there are probably tens of others that remain undetected. Rather than fight a battle that you can never win, there are a few steps you can take that will lessen the benefit to the splogger.

  1. Ensure that each webpage, hub, blog post or article you post contains a link back to you. Scraping and republishing are highly automated activities and links often go undetected.
  2. Each time you post a new webpage, hub, etc. make sure you bookmark it and create links to it from any blog you have under your control.
  3. Visit blogs that are in keeping with your article, enter the discussion and post a link if it would be helpful to the readers of those blogs. Never spam.

Unfortunately, this will not curtail the activities of the professional spammer but it will help identify your document as the original and perhaps get a little link juice.
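If you do come across a copy of your work, you can quickly check whether your link back survived. The following is a rough sketch in Python using only the standard library. The domain example.com, the sample page and the helper name links_back_to are placeholders for illustration; in practice you would supply the HTML of the copied page yourself.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def links_back_to(html, my_domain):
    """Return the links in html whose host is my_domain or a subdomain of it."""
    collector = LinkCollector()
    collector.feed(html)
    matches = []
    for href in collector.links:
        host = (urlparse(href).hostname or "").lower()
        if host == my_domain or host.endswith("." + my_domain):
            matches.append(href)
    return matches


# Hypothetical republished page that kept an in-body link to the author.
copied_page = """
<div class="post">
  <p>Fixed rate mortgages keep the same rate for the whole term.</p>
  <p>Read more at <a href="http://www.example.com/my-profile">my profile</a>.</p>
  <p><a href="http://spam.example.net/offers">Cheap loans</a></p>
</div>
"""

surviving = links_back_to(copied_page, "example.com")
print(surviving)
```

An empty result means the scraper stripped your links or you never placed one inside the article body.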

Small Business SEO Services Scotland: Promoting Scottish business with professionalism and integrity.

Comments (17)

eovery 7 years ago from MIddle of the Boondocks of Iowa

Good info.

I thought you were creating a new word for good practices called spain (is-pain), until I looked at the site and saw the seo company was from Spain.

Thanks.


Peter Hoggan 7 years ago from Scotland Author

eovery,

That would have been clever, but alas no!


Kosmo 7 years ago from California

One of the biggest problems with the Internet is its anonymous nature, and this will probably never be changed. Moreover, if you don't want something stolen, don't put it on the Internet!...


Peter Hoggan 7 years ago from Scotland Author

Sad but true Kosmo.


VioletSun 7 years ago from Oregon/ Name: Marie

There is much to learn about the internet, and I am not a newbie, thank you again for sharing your knowledge.


whoelsecouldib 7 years ago from Alabama

Wow. Learn something new everyday. I did not know that this type of thing even went on. I am referring to article splogging or scraping. Very informative.


Research Analyst 7 years ago

Hey Peter, do you think that the semantic web will cut out this type of activity? I heard that our search results will be tailored to our pattern of search and our geographic location (called behavior based search).

I would think this would make it harder for scrapers to use content because they will not be able to measure it with SEO. What do you think?


Peter Hoggan 7 years ago from Scotland Author

Tim Berners-Lee's vision of the semantic web, or GGG (Giant Global Graph) rather than WWW, is, I feel, still some way off. Some advances have been made with the acceptance of standards like XHTML, XML and RSS, which have increased the general accessibility of the web. However, until search engines and developers embrace microformats, RDF and SPARQL standards, the semantic web will remain a vision.

Scrapers already make heavy use of RSS and could be easily adapted to make use of microformats etc. The idea of the semantic web is to make information more accessible to both human and machine, and it might actually aid the spammer. I guess we will just have to wait and see.

As an SEO, my immediate wish is that web developers and designers would spend a little more time creating semantic HTML rather than the tag soup we often have to work with.

Thanks for the great question Research Analyst!


Research Analyst 7 years ago

Thanks for the explanation. What got me thinking about this was a video I saw on the Next Web Conference 2008. Have you seen it?

Here is the link: http://vimeo.com/1062481?pg=embed&sec=1062481


Peter Hoggan 7 years ago from Scotland Author

Just watched the video, very interesting indeed.


Debby Bruck 7 years ago

Dear Peter ~ Can you go into further detail about . . .

Ensure that each webpage, hub, blog post or article you post contains a link back to you. Scraping and republishing are highly automated activities and links often go undetected. [Where on hubpage would you do this? Don't we have our info up at the top right corner of each page?]

Each time you post a new webpage, hub, etc. make sure you bookmark it and create links to it from any blog you have under your control. [Bookmark in our browser? What does this do?]

Visit blogs that are in keeping with your article enter the discussion and post a link if it would be helpful to the readers of those blogs. Never spam. [Seems like it would take an inordinate amount of time to surf around for similar articles.]

Thank you so very much for making us aware of this type of plagiarism. When we create a hubpage with links to the original article, is that sufficient to credit the owner?

Debby


Peter Hoggan 7 years ago from Scotland Author

Scrapers can identify which area of a page contains the information they require. This area could be a table cell or a div. So, by identifying where the content or information begins and ends, they can steal everything between those two points. The code used on this page is the same as the code used on every other hub, so once a profile has been created that works on one hub it will work on all, and it can be used again in the future.

Although your info is visible on the page, it would most likely be outwith the area being scraped. Each website would have its own profile, and there may be multiple profiles for each website to cover different page layouts. A link back to your profile page or any other related site you are promoting may survive the scraping process if it’s within the area being scraped.
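To illustrate what a "profile" amounts to, here is a small Python sketch. It assumes, purely for illustration, that the article body sits in a div with the id "content" and the author info in a separate sidebar div. The ids, URLs and the class name RegionExtractor are hypothetical and are not taken from the real markup of this site.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class RegionExtractor(HTMLParser):
    """Collect text and links found inside the element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0  # 0 means we are outside the target region
        self.text = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            if tag == "a" and attrs.get("href"):
                self.links.append(attrs["href"])
        elif attrs.get("id") == self.target_id:
            self.depth = 1  # entered the region the profile points at

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.text.append(data.strip())


page = """
<div id="sidebar"><a href="http://example.com/profile">Author profile</a></div>
<div id="content">
  <p>Article text.</p>
  <p>More at <a href="http://example.com/related">my related site</a>.</p>
</div>
<div id="footer">Copyright notice</div>
"""

extractor = RegionExtractor("content")
extractor.feed(page)
print(extractor.links)
print(extractor.text)
```

In this sketch the sidebar link and the footer notice are dropped because they sit outside the targeted div, while the link placed inside the article body is carried along with the stolen text.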

When I mentioned bookmarking I was referring to social bookmarking sites like Digg. The object of linking to your own documents is to encourage quick spidering of the page. You want your page to be the first one discovered by the search engines, as this will help identify your page as the original, which will help rankings and give you a slight edge over the spammers.

If you do not have a blog of your own, try http://www.inlineseo.com/dofollowdiver/ where you will be able to search for related blogs to post to and gain links. You could also type 'nofollow search' into Google and you will find many similar search engines.


Peter Hoggan 7 years ago from Scotland Author

Ouch, there is a rather serious typo in the comment above. What you should search for is 'dofollow search'.

Sorry!


Debby Bruck 7 years ago

That was extremely helpful. You're talking to novice level in terms of blogs, security, etc. muchos gracias...


Peggy W 7 years ago from Houston, Texas

As a novice, you keep opening the door to new and different words and ideas. Thanks for all of your hubs, instructions, etc.


The Lost Dutchman 7 years ago from Flanders (Belgium)

I got plagiarised and so I found this Hub... with very useful information! Thank you!


blbhhdcn 6 years ago

You know a lot about SEO and related topics, you could really write a lot of informative articles with high ranking SEO. God bless you and thank you again.
