How to Prevent Plagiarism of Your Content on the Internet
So, you've worked up the courage and decided to search the internet for your copyrighted content to see whether it has been copied elsewhere without your consent. Let's look at how you would go about that, and what you can do if you find that something has indeed been stolen.
Possible signs your content has been stolen
- Traffic to your article, blog or website decreases noticeably over time.
- Your search ranking goes down.
- Your affiliate program earnings take a hit.
- You get banned from an affiliate program, mainly because along with the content, your Google AdSense ID, for example, could have been scraped from the page, too.
- You receive copied content warnings (this applies on HubPages).
- The hubscore on the potentially stolen article goes down (this also applies on HubPages).
- Traffic comes to your page, directly or indirectly, from little-known sources (perhaps because of names, links, and profile information left in the stolen content). Checking the search terms that people use to arrive on your page can help identify this.
- When you search for your own content, other sites have the same or similar URL, same or similar summary text, and same or similar content.
Steps to take with a case of content theft
First, you need ways to find your content online. Text is likely the easiest. There are several plagiarism detectors out there, and one of the most commonly recommended ones is CopyScape. These detectors can work in one of two ways: you either type in a URL where the text to be searched for can be found, or you copy and paste the text itself and search for it. You can even use Google for the latter method. Just make sure to surround your content with quotes so that the search looks for an exact match, and I would recommend using encrypted search if you can.
This manual method is good practice, as services like CopyScape that check automatically for you don't always find everything, or even anything at all. Using both methods is probably best and increases your chances of finding what you're looking for. Besides, services like CopyScape only allow you so many free searches per month. If you want unlimited searches and other benefits, like alerts when your content is found, then you need to register an account and pay for it.
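As a rough sketch of the manual Google method, the snippet below builds an exact-phrase search link from a sentence of your own text. The function name is made up for illustration, and the only assumption about Google is its ordinary `https://www.google.com/search?q=` address.

```python
from urllib.parse import quote_plus


def exact_phrase_search_url(phrase: str) -> str:
    """Build an HTTPS Google search URL that looks for the phrase as an exact match."""
    # Strip any quotes already in the text and collapse runs of whitespace,
    # so the whole phrase can be wrapped in a single pair of quotes.
    cleaned = " ".join(phrase.replace('"', "").split())
    return "https://www.google.com/search?q=" + quote_plus(f'"{cleaned}"')


# Pick a sentence that is unlikely to appear anywhere but in your own article.
print(exact_phrase_search_url("a distinctive sentence from my article"))
```

Open the printed link in your browser. A shorter, more unusual sentence tends to work better than a whole paragraph, since search engines limit how long a query can be.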
For photos or pictures of some sort, you can use TinEye. You upload a copy of the photo you’re looking for, and the service will try to pinpoint which sites are using that same image. Doing a reverse image search with Google can yield the same results.
Once you’ve found content online that belongs to you, it's your turn to copy and paste the text or, even better, take a screenshot which clearly shows the website's URL and your content, which may have been modified (with inserted URLs, keywords, etc.). You can use MS Paint and the print screen key for this, or you can use a screenshot-taking program. A simple-to-use one is Screenshoter. Browsers like Firefox even have screenshot-taking add-ons.
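If the copy has been modified, it can help to put a rough number on how much of your text it still contains. The sketch below is one simple way to do that, not how CopyScape or any other service works: it splits both texts into overlapping runs of five words and reports what fraction of your runs also turn up in the suspect text. The function names and the run length of five are my own assumptions.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of the given size in the text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one run: treat the whole thing as a single sequence.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def share_of_original_reused(original: str, suspect: str, size: int = 5) -> float:
    """Fraction of the original's word sequences that also appear in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)


mine = "one two three four five six seven"
theirs = "BUY NOW one two three four five click here"
print(share_of_original_reused(mine, theirs, size=3))
```

A score near 1.0 suggests most of your text was lifted as is, even with links or keywords inserted around it. Heavily reworded copies will score low, so treat a low number as inconclusive and not as proof that nothing was taken.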
You can then contact the person who owns the forum, blog, or website, and politely ask that they take it down. You can claim that you are the content owner, and perhaps point them in the direction of where the content was originally published. If they’re the ones who took it, this would be obvious to them.
If there is no reply, or they refuse, then warn them again, saying that you will begin the process of filing a DMCA complaint and/or informing their advertisers/affiliate programs, and possibly even getting their page de-indexed from search engine results and lodging a report with a consumer complaints website or forum. Also consider leaving a comment on their blog, saying that the content is stolen and pointing visitors in the direction of the original article – yours.
This will probably get them to take down the content, if the first warning wasn’t enough.
If they still refuse, or you can’t get through to them, then you should find out where their site is hosted. You can do this by using whoishostingthis.com. Contact the hosting company and you will most likely need to send them a DMCA complaint. They will then do their thing, or not, when it comes to taking the content down. Just be aware that the contents of that DMCA complaint will likely be sent to a third party clearing house and certain personal information may be displayed online. If you are comfortable with this, then go ahead. If not, consider some of the other methods below.
If the webhosting company doesn't help you out, or you don’t want to bother with them, there are alternatives. You can perhaps report the offending site that stole your content for spam, because it is more than likely a spam blog (which I will explain later on). You would report this to the webhost or perhaps try the domain name registrar. Then you don’t have to send a DMCA complaint, at least in theory. If the site is malicious, and some of them are, you can report that site for malicious activity. But as has been pointed out before: if the site is malicious, chances are most people won’t visit it and risk infecting their devices or systems. So the site won’t get much traffic in the first place, and won't steal much traffic or money from you.
Another way is to go to their advertisers. Google has a form where you can report abuse and actions that go against the Google AdSense terms of service, for instance. You can also go through Google Webmaster Tools and file a DMCA complaint. This complaint will be reviewed by a team who will then verify that content theft has indeed taken place. They will then hopefully de-index the pages where the content has been copied and pasted illegally. This won’t get the pages taken down, but it will stop them from getting any traffic from that search engine. You can also do the same with Yahoo and Bing, the other major search engines.
You can then go to websites and forums dealing with consumer complaints or copyright or content theft, and effectively blacklist that site, or even the webhost if it failed to do anything about the content theft.
The last resort would be legal action, as in actually contacting your lawyer, or one that specialises in this area, and then going through the motions of taking the matter to court, which I won't cover here in this article.
Things to do to help prevent or curb content theft
- Always have copyright notices on your content, whether it be websites, blogs, articles, etc. You can even do this on photos, using watermarks – although HubPages, in particular, is rather strict about writers using watermarks on images. Even if they are your own, the practice isn’t allowed. You can use a service like Digimarc for digitally watermarking images.
A copyright notice can look like the following:
Copyright © 2015 by Anti-Valentine. All rights reserved.
"All rights reserved" means the author literally reserves all rights to their content. Authors who opt for a creative commons licence that allows, say, the use of their content with specific stipulations, such as providing attribution, might say "some rights reserved".
- Also have warnings in addition to your copyright notice, about the potential repercussions of content theft – this should appear on the page somewhere where people can see it. Content theft can end up with the transgressor receiving cease & desist notices, DMCA complaints, having lawyers breathing down their neck, and even legal action, as in being taken to court. Scare them so they don’t do it in the first place.
- Use a service like MyFreeCopyright to register your hubs and upload thumbnails of them online. You can also get badges to put on your pages to inform visitors that your content is protected by the service.
- Always record the date the creation was published online. The date should be available somewhere in your account. HubPages has the original publish date under your stats page as well as on the hub itself under stats at the top. You can take this further by posting links to your articles on Twitter, Facebook and Google Plus – where the date will also show up. In the event of a DMCA dispute, where someone files a counter-claim against you, you can then prove that you published the content first, giving you the advantage.
- Program your page or website in such a fashion that it becomes harder to highlight and copy text or download or embed images. This will help cut down on the number of manual scraping attempts, although this method could be bypassed and content could still be harvested from RSS feeds.
- In your RSS feeds, make sure to leave links to your website and perhaps other places like social networking presences. You can also have notices about copyright and content theft. Spam blogs that automatically harvest content from RSS feeds will probably end up displaying all this content on their sites, unedited. Then visitors will know the content is stolen.
- Also, preferably set your RSS feeds to show only partial content, such as the first 100 to 200 words (the first few sentences), and not the full text.
- You could try to password protect your RSS feeds so only legitimate readers can subscribe and access them. This might help cut down on content theft via harvested RSS feeds.
- Set up alerts for content theft. You can use Google alerts. Then make sure to have unique keywords set up so that content copied from your website, article, or blog stands a chance of ending up in your email as an alert, giving you a link to the stolen content too. This really depends on how well indexed or ranked the site doing the copying is though. Smaller, more unknown sites might get away with it and you won’t be alerted.
- Protecting your content can be a little easier with videos. In YouTube, you can specify with a particular video that you don't want it to be added to playlists, embedded in other websites, etc. YouTube doesn't have a native option to download videos, and for good reason. People can still download a YouTube video with a browser extension or flash-grabbing program of sorts, and then upload it elsewhere, like their own YouTube channel. If this happens, then you would contact that person and go through the process described earlier, asking them to take it down, or file a DMCA complaint. You can also flag the video or the channel. YouTube relies on crowd sourcing to mark videos as inappropriate. So maybe other people will do the same if they know the content is stolen. But YouTube is known to be a bit slack in actually reviewing these reports properly.
- You can also make the video private so that only certain people can view it, and the public, people who aren't given the link, can't. Private videos also can't be added to playlists, particularly public ones.
- Flickr also has options in place to stop people from stealing photos, by disabling the downloading of photos. You can view the photos, often in several different resolutions, but not download them, if the owner of the account has disabled this feature. People can often still bypass this though, by using a screenshot taking program or by using the print screen key and pasting in a program like Microsoft Paint, saving the file in a supported format (usually JPEG) and uploading it almost anywhere.
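The two RSS suggestions in the list above, partial content plus a notice and a link back to you, can be sketched together in a few lines. This is only an illustration under my own assumptions: the function name, the 150-word default and the example address are all made up, and most blogging platforms offer a setting for partial feeds, so you would rarely need to write this yourself.

```python
def feed_excerpt(full_text: str, source_url: str, notice: str, max_words: int = 150) -> str:
    """Return a shortened feed entry that ends with a copyright notice and a link back."""
    words = full_text.split()
    excerpt = " ".join(words[:max_words])
    if len(words) > max_words:
        # Mark that the entry was cut short.
        excerpt += " [...]"
    # A scraper that republishes the feed unedited will republish this line too.
    return f"{excerpt}\n\n{notice} Originally published at {source_url}"


entry = feed_excerpt(
    "The full text of the article goes here and runs on for many paragraphs.",
    "https://example.com/my-article",  # hypothetical address
    "Copyright © 2015 by Anti-Valentine. All rights reserved.",
    max_words=8,
)
print(entry)
```

The point of ending every entry with the notice and the link is that automated spam blogs tend to display whatever is in the feed, so their visitors get told where the content really came from.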
Common culprits to watch out for
- Spam blogs
Spam blogs most commonly operate by harvesting content online from a website or blog’s RSS feed(s) and then displaying that content on their website. They usually go after popular, high ranking articles or blog posts because that's what people are searching for, and that will result in them getting visitors. A lot of the time the text will have links or words added to it that don’t belong there, and these will lead off-site to affiliate landing pages, phishing websites, sites that serve malware, adult websites or other disreputable places on the web. The overall aim is to make money, but these people won’t go the honest route like most of us. Most spam blogs, much like most spammers nowadays, are automated – that’s why automated spammers are often referred to as spam bots. They aren’t human, but are used by spammers to further their own ends.
You might have some people who will manually copy and paste content too, often referred to as content scraping, and it’s not uncommon for them to alter copyright notices or erase them completely, otherwise people would know it’s stolen. This is one thing automated scrapers get wrong – they aren’t wise enough (yet) to take out copyright notices and other distinguishing tells that would inform visitors that the content is stolen. They can be outsmarted, and it's possible to outsmart manual scrapers too.
- Content mills
Content mills or content farms are websites that may well look and function like legitimate websites, and may for all intents and purposes even be legitimate, perhaps even similar to HubPages in some ways. That’s where the similarities end, however. The owners are likely aware that content theft goes on, and recruit writers to add content to the site, which is often copied from elsewhere and of low quality at that. Sites like HubPages take content theft very seriously and will likely cooperate with DMCA complaints, or with complaints from hubbers that other hubbers have stolen their content. HubPages is different because it promotes high quality, original content. Content farms don't.
- Social networking and social bookmarking websites
The one great thing about Twitter is the 140 character limit – love it or hate it. I like it because this pretty much deters people from trying to plagiarise others' work as you can only type a sentence or two at most. There’s also a feature on Twitter for the purpose of reporting people who spam. Facebook also happens to have a character limit of 255. This mainly seems to extend to wall posts or messages on your own profile in my experience. Within groups, Facebook messages, and the like, you are able to go over that character limit. Content can be copied and pasted here, but whether this would actually crop up in search results is a different matter – especially if the group is a closed one.
Tumblr is a different story. I’ve encountered a lot of content theft there – text, photos, etc., and to boot there’s a reblog feature, similar to Twitter's retweet, where people can essentially spread that content elsewhere. A lot of it is manual content theft, but I have many a time witnessed evidence of spam blogs here, too. They’re easy to recognise as they mostly don’t have avatars or profile pics. They'll also use the appropriate tags so their posts end up in searches. They often have links that lead to phishing sites and the like.
Blogger and Wordpress.com as well as other free blogging services are probably just as likely to have it. Blogger is quick to get rid of spam blogs though, seeing as it's owned by Google after all, and if a blog is hosted on blogspot (Blogger's hosting service), you can count on DMCA complaints reaching someone.
YouTube has its share of content theft too. As I mentioned earlier, people will download YouTube videos, which is quite easy to do if one has the right tools. Then they will upload those videos as their own. Other ways include recording or downloading content and then uploading it. There is a lot of copyright infringement here.
Pinterest is another one that I’ve seen mentioned by others. Of particular concern is the pinning of content that is copyrighted, which you shouldn't do. If you pin something, make sure you have permission to do so, or at least credit the author if it is a creative commons piece.
Users on social bookmarking sites like Reddit often commit infractions such as copyright infringement too. You may link to a source, but don't go and copy the entire article or even part of it. You might get away with quoting a line or two.
There are plenty of forums out there where someone will copy and paste content into a post on a thread. Somehow, because the website isn't theirs, there's less accountability on their part. Even if the person responsible for it is banned, it usually means nothing for that person to create a new account. It really is the job of the moderators and site admins to curb not only spam, but copyright infringement - and many forums don't bother with the latter, sadly. It's often the author's responsibility to seek out theft of their content, and nobody else is really going to give a damn. They don't really have the legal right to intervene in any case.
Images often end up on places like Instagram. Here they can usually be embedded in forums, blogs and so on with HTML and BB code. They might be made available for download, too, depending on the service.
Reasons why content theft happens
People are lazy. They can’t be bothered to write something, because it takes too much effort and too much time. So they copy and paste an entire article or bits of it, perhaps even from multiple sources. You can even come across something referred to as article spinning, where people will alter words using synonyms and many other techniques to try and conceal content theft, and even fool Google.
Money is most commonly the main goal. Spammers' intents include getting a site up, filling it with stolen content, even adding advertising and links to places all over the show, where they can potentially make more money.
Sometimes people do it intentionally. They know the author personally, and perhaps they don't like them. It might not be enough to go about flagging their content and reporting them for abuse. If that doesn't work, they might decide to steal your content and this can affect your search ranking among other things. I've seen this happen on HubPages, where the content thief clearly knew the victim, or at least knew of his or her work.
A lot of the time it's due to ignorance or apathy. People don't realise that the content is copyrighted, when in most cases it probably is, even if it doesn't have a copyright symbol. They might assume that because it doesn't have a copyright notice or a watermark of sorts that it isn't copyrighted, but this usually isn't the case. They might also see other people online using images without providing attribution and think that it's okay if they do the same. But you shouldn't worry about them and what they're doing. You should do the right thing. Always look for copyright notices, creative commons permissions, or ask the author first if in doubt.
Some people flat out don't care. That's the reality.
Resources for writers
So you want to use other people’s content, but you want to do so in a legal fashion? You can try and ask the author for permission. If they grant you permission to use the content, then you can go ahead and do so. But this works best with videos, photos or pictures. With text, Google still picks up on copied content, and copied content is often penalised with lower search rankings. It doesn’t matter if there’s an agreement between you and the author of the content.
Try and use as little unoriginal content as possible when it comes to text. Quoting a line or two is probably fine. A few paragraphs is pushing it. In cases like these, perhaps contact the author and talk to them about it so they know beforehand. They might do a search for content theft themselves and come across parts of their work on your site or blog and not like it very much. Copying an article word for word is out, even if you translate it into a different language. Have some respect. Rather link to an article as a reference instead of doing that. Links are good. Writers will thank you for linking to their work, as they get valuable backlinks, which increase their search rankings and overall visibility.
Original content, written in your own words, is rewarded highly. A topic can be covered in many different ways; not just one. Don't be lazy. Work a bit.
Places to go for pictures would include Wikimedia Commons, and Flickr’s creative commons section. There are plenty of others if you look around, even Google’s Picasa or Google Search, but always filter the results to get an image that has a licence that will allow you to use it elsewhere, and preferably for commercial use as well in the case of HubPages. If in doubt, contact the author.
Then when you use the image on your blog or article, be sure to provide attribution. This will usually include the author’s name, a link to the image or their website, and a copy of the licence. For example: By John Doe, [CC-BY-SA-3.0-2.5-2.0-1.0], via Wikimedia Commons. And then put a link in the text somewhere.
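If you credit a lot of images, a tiny helper keeps the attribution lines consistent. This is just a sketch that follows the example format given above. The function name and the optional link argument are my own, and the licence itself may ask for more than this, so check its terms.

```python
from typing import Optional


def attribution_line(author: str, licence: str, source: str, link: Optional[str] = None) -> str:
    """Format an image credit as: By <author>, [<licence>], via <source>."""
    line = f"By {author}, [{licence}], via {source}"
    if link:
        # Optionally append the address of the image or the author's site.
        line += f" ({link})"
    return line


print(attribution_line("John Doe", "CC-BY-SA-3.0", "Wikimedia Commons"))
```

Passing a link as the fourth argument appends it in brackets, which is handy where you can't turn the credit itself into a hyperlink.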
Here on HubPages it's accepted that videos can be used in a hub if they are able to be embedded. As far as YouTube is concerned, a user must specify in the settings for that video that it is not allowed to be embedded in other sites. It is up to them. People argue that seeing as you're embedding the video it counts as redistribution, and whether or not you're allowed to use said video in a commercial article or blog isn't defined. It's a bit of a grey area to say the least. It may, however, be nice to ask them for permission to use the video, regardless.
So there we are. Hopefully you'll walk away from this article with a better understanding of what content theft is, why it happens, how to search for stolen content and how to potentially prevent it from happening in the first place.
© 2012 Anti-Valentine