We all appreciate that it's our own responsibility to protect our own work, but it's a big job, and it's in HubPages' interest to help us, because plagiarism affects the whole site's income.
On another thread, there have been several suggestions of features HubPages could add. So far we have:
- negotiate a bulk rate with Copyscape for their API usage. HP could then resell that to Hubbers or even provide it free to people who meet some specific goal.
- improve the existing checker (it is only finding a fraction of copies)
- install anti-scraping software (can any Hubber give more details on what's available?)
- install software that appends "data" such as a link back to anything that is copied and pasted (like Wordpress), so at least we get a backlink if anything is copied.
- disable right-click like Wordpress (though this won't stop the professionals)
- monitor traffic logs for IP addresses that visit/download large numbers of hubs in a short period, so they can be blocked/checked further.
We would appreciate HubPages' response as to
(a) which of these measures have been considered,
(b) why they weren't taken up and
(c) which ones they are willing to look at for the future.
P.S. Let's try to keep this thread for constructive suggestions only.
- I like the Copyscape idea
- Improving the checker would also be a good idea, though its extra toll on HP's servers might be too much and it might be cheaper to use Copyscape.
- Antiscraping software is useless. Scrapers might be noobs, but those making scraping software know how to script. They will adapt their scripts to antiscraping software.
- "install software that appends "data" such as a link back to anything that is copied and pasted" - scraper software can be set to remove links. Since even invisible data is stored on a page, they can find and remove it (unless you do cookies, which won't help in this case).
- "disable right-click like Wordpress". Only absolute beginners would fall for that. Within a few days, they can learn about Ctrl+U.
- "monitor traffic logs". Might work for those scraping in large quantities who can't rotate IP addresses fast enough. Won't prevent those that scrape in smaller quantities (100 a day across 10 IPs is 10 pages per IP, perfectly reasonable if timed right).
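To make the traffic-log idea concrete, here is a minimal sketch in Python of flagging IP addresses that request many pages within a short window. The function name, the `(timestamp, ip)` input format, and the threshold values are all assumptions for illustration; nothing here reflects how HubPages actually processes its logs.

```python
from collections import defaultdict, deque


def flag_heavy_ips(requests, max_hits=50, window_seconds=600):
    """Return the set of IPs with more than max_hits requests inside any window.

    `requests` is an iterable of (timestamp_in_seconds, ip) pairs sorted by
    time. Both the input format and the default thresholds are hypothetical.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in requests:
        hits = recent[ip]
        hits.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while hits and ts - hits[0] > window_seconds:
            hits.popleft()
        if len(hits) > max_hits:
            flagged.add(ip)
    return flagged
```

A burst of 60 requests from one address in a minute would be flagged with these defaults, while 100 requests spread across 10 addresses (10 each) would not, which is exactly the weakness described in the last bullet.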
I think the best option is to locate scraped content after it's been posted, either by HubPages improving its checker or by using a 3rd party checker such as Copyscape. Prevention can be circumvented, detection not so much (since they want their sites to be seen by Google, they can't really stop detection). Perhaps every active hubber could have his/her hubs fully checked every 3-4 months. This gives HubPages time to balance how many hubbers to check a day.
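As a rough illustration of what after-the-fact detection involves, the sketch below compares a hub's text with a suspect page by counting shared five-word sequences (often called shingles). This is only one simple technique, assumed here for illustration; it is not a description of how Copyscape or the HubPages checker works, and the function names are made up.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(original, candidate, size=5):
    """Fraction of the original's shingles that also appear in the candidate."""
    a = shingles(original, size)
    b = shingles(candidate, size)
    if not a:
        return 0.0
    return len(a & b) / len(a)
```

A word-for-word copy wrapped in extra text scores close to 1.0, while an unrelated page scores close to 0.0. A manual rewrite would score low, so a check like this only catches straight copies.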
Provide DMCA tools to help streamline the process. Fill-in-the-blank stuff like Edweirdo did.
I think that since it is likely that no prevention will be fool-proof, it might be a good idea to include detailed information on how to monitor our content for duplication and how/when to fill out DMCA complaints in the learning center.
Some basic information on copyright law wouldn't be a bad idea either.
I know that having dates in our hubs makes them appear "dated" to searchers who are likely to avoid such search results, but perhaps a date stamp could be put in our hubs in such a way that it doesn't appear in the search results. Having a time stamp in our hubs would help with filing a DMCA, especially when contacting hosts. It would help assure them that our hub was written on x date, before it was copied, thus kind of helping us prove that it's ours.
Sounds like a good suggestion. I'm a novice when it comes to copy protection, and it gets too technical for me. I do appreciate it when the HubPages team knows how to cover this aspect of the system.
My question relates to when we want to link or use parts of our own work as a jumping off point on other blog sites to link back to Hubpages. Will we be able to do this?
Melbel, I agree. The date could be stated under the hub beside our author name where our name is underlined. It would be placed there automatically by the system.
So far, only we can see the date of publication on our statistics page.
Yeah, and it's a shame that it only shows there. On older hubs, the publish date (or date Google first saw the content, I'm not 100% sure which it is) DOES show up in Google search results. It's kind of tacky for it to show up on the older hubs but that's the way HP had it set up back in the day and Google has not forgotten on the older hubs.
However, it WOULD be nice to have a date on the hubs that does NOT show up to the search engines as a date stamp. Like an image or something. Just a little something to show hosting companies and other services that our hubs were there first.
Well, Google knows when it first saw it, but I agree - I have never liked HP's redating.
Allow users to install an "invisible" link from a honeypot (see projecthoneypot.org) in order to allow them to take note of any suspicious activity themselves.
Note: Project Honey Pot was designed to catch email address scrapers, I'm not sure if it'll work with content scrapers.
I never knew that duplicate content was making it through the filters on HubPages. Seems like the community should be playing a larger role in this though.
Duplicate content that happens here on HubPages is generally taken care of. However, this is about our hubs being copied by other websites. It's really frustrating when people steal our hard work, so we're coming up with ideas to prevent that from happening.
The right-click thing may not be the best option because sometimes when I see a hub that I think a user might have taken from elsewhere, I'll do a Google search on sample text from their hub. I also do this with my own hubs to see if they've been stolen (although technically I could copy and paste text from the hub in draft).
I support all these suggestions. I just realized today, how it felt to have my work copied word for word, even the title.
Oh yeah, they're voracious animals aren't they? They'll copy your images, your captions, your products (with the links changed to their ref), even your screen name. These content thieves have no dignity. Some are just ignorant of the law and don't realize this hurts people, so I'm more forgiving of them... after they remove my content, of course. Most of them are complete jerks. One of whom seriously told me that I can't do anything about it because he lives in China.
I like your new avatar by the way.
Thanks, I like the gift bow up there too.
I tried sending them a DMCA email and it bounced. I tried the Google approach and the system keeps asking for a valid URL from them. I have no idea what to do now. I'll try Google again.
They chose my SEO hubs, and they are showing up in search before me even though they are using my titles.
I honestly believe that some of these people are members of HP. They join to scour HP in order to make their attack!
That is sickening and I fully sympathise having had it happen to me on more than one occasion.
It's got to the stage when I cringe any time I see someone from Asia looking at my hubs. (analytics real time)
I know it's not right to label a whole continent, but that is where the majority of the thieves come from.
I believe you, and they are so darn sneaky. Somehow they block their info from WHOIS too by private registration or something.
I am so upset. My hubs have not made me much money, yet I am losing already. I cannot imagine losing over 1000 hubs the way you did. It would drive me crazy.
These people need to be taught a lesson...I am yet to figure out what that lesson is!
Don't start me on China!!
It was the China Daily who initially stole one of my articles - followed by about 1500 others, but they all thought they'd got their copy from the China Daily.
I have since found out that China Daily is one of the biggest websites in China.
I got about 20 copies removed. Others filed counter-claims, and Google washed their hands unless I brought in lawyers, which I can't afford.
In the end, I changed my original article - made it longer and changed the title.
But they won.
I am sorry about that Izzy. That's rough having to lose so much when you have worked so hard.
This really pisses me off. When it's not one thing it's the other.
There is an HP author that manually rewrites articles and she is so good you can't flag her even if you recognize your article, but I have learned to live with that.
If they had changed the title and maybe a few words I might live with that too. But they even copy the voting and feedback options at the base of the hub, not to mention the Amazon and eBay items!
I am ashamed of that HP writer - wish I knew who she was!!!!
Have you reported this to HP? I would, even if it's not duped word for word. Something like this is likely to upset the community and if HP were to look into the hub and agree that it matches too much, they may disable the account.
That is so sad. I have had my own experiences with theft. On my other account on HP, someone in Denmark copied one of my hubs word for word and put it on their website. Hang in there, it will get better.
Just think: although they are our words, someone is not as good as us, and therefore they have to steal in order to shine. It will come back to bite them in the end. Things like that always do.
A list of sites known to be mass copying hubs. I just found another one.
The definitive way of defending yourself is whitelisting - basically you whitelist certain bots (eg the googlebot and msnbot) and then block all other bots from accessing your site.
You automatically block screen scrapers, comment spammers, sql injectors and other nasties.
The other way people can scrape without visiting a site is to use the feed - and all hubpages needs to do to allow us to block them is to give us the option of turning feeds off or setting them to summary.
If you block the bots and close down the feeds, the only way to scrape is to do it manually - and most of these guys won't bother, it's too much work.
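For anyone wondering what whitelisting looks like in practice, here is a minimal sketch in Python. It assumes the decision is made on the server from the User-Agent header, and the list of "bot hints" is invented for illustration. User-agent strings are easy to forge, so a real setup would also need to verify that a visitor claiming to be Googlebot really is one (for example by a reverse DNS lookup); that step is left out here.

```python
# Crawlers named in the post above; everything else that looks automated is blocked.
ALLOWED_BOTS = ("googlebot", "msnbot")

# Hypothetical substrings that suggest an automated client (an assumption, not a standard list).
BOT_HINTS = ("bot", "crawler", "spider", "scraper", "curl", "wget", "python-requests")


def allow_request(user_agent):
    """Return True if the request should be served, based only on its User-Agent."""
    ua = (user_agent or "").lower()
    if not ua:
        return False  # treat a missing user agent as an unidentified bot
    if any(name in ua for name in ALLOWED_BOTS):
        return True  # whitelisted crawler
    # Anything else that identifies itself as automated is refused.
    return not any(hint in ua for hint in BOT_HINTS)
```

Ordinary browsers pass because their user agents contain none of the hint words, whitelisted crawlers pass by name, and self-identified scripts are refused. A scraper that pretends to be a browser would still get through, so this is a filter for the lazy majority rather than a guarantee.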
Anyway, if you are trying to defend your own website, I recommend reading the following blog:
In particular check out the bit about scrapers in the sidebar.
I don't think it would stop bots that don't follow the rules set forth in robots.txt
We discussed this today. The ongoing, constant detection of Hub content showing up on other sites is, apparently, very, very expensive. It would make it infeasible to roll out to everyone, unfortunately.
Copying of content is not something you can completely deter (this is the Internet). However, it might provide some solace that copied content appearing on a brand-new, crappy site with nothing but copied content is unlikely to outrank you. (If there are exceptions to this, please do share them publicly here so we can see what's going on.) It might be grating, but it's unlikely to impact your traffic and/or earnings.
If you do detect sites that are copying your Hubs, we encourage you to follow the procedure here:
http://hubpages.com/learningcenter/how- … -complaint
Jason, I did try that, and all of the contacts I sent complaints to through the sites were bogus. I tried using Google but it keeps telling me to use a valid URL, whatever that means.
But as you said, the sites are crappy so maybe Google will realize what's happening and do something about it.
However, one of the sites with my nutrition hubs is showing up before me in Google. I don't know why or how, but it really upsets me.
You filed a DMCA request through Google to have it deindexed, and that didn't work?
Yes, I keep getting an error message about the URL not being valid. Maybe I am daft but I followed all the instructions.
Can you share the link here? Maybe we can help you troubleshoot it.
Are you talking about the Google link or the fake website links?
I will search again because I was so frustrated I just didn't save it.
The fake Website link (i.e. the page that's copying your nutrition Hub)
Thanks Jason, will post them. It's more than one.
Here they are Jason! Copied word for word including titles.
http://www.beyourverybest.org/the-healt … g-pumpkin/
http://www.newspanel.co.uk/health/ten-b … diet/2192/
http://www.newspanel.co.uk/health/the-h … pkin/2139/
This is sick... 2 sites copying your work
Karma Institute! What irony!
I would go to Google's DMCA page and file separate DMCA complaints about all 3 of them:
http://support.google.com/bin/static.py … page=ts.cs
(choose "Web search" from the first radio-button option)
BTW, do you know anything about DMCA.com? I just saw it and was curious.
That's strange, I've never seen that before. I did one the other day, but it wasn't that radio button form. Odd.
This was my entire day:
https://www.google.com/search?client=ub … p;oe=utf-8
7 pages deep.
What do I do when this happens? This is the second hub of mine that I've caught all over the Internet.
Actually, if you go to the last page it tells you there are more that Google omitted. In total, you're looking at 9 pages.
This really sucks. People are so selfish.
Ugh, for some reason I ignored that little bit, but yep.. there's more. One of the webmasters was kind enough to tell me:
"You don't need to threaten us at [redacted] to remove an article or author if they have stolen it. Since our articles come to our website from mass article submission sites like SubmitYourArticles.com, it is likely re-printed on 100's of other article directories as well."
Ironically, it was just a generic DMCA I filled out and not a threat, but yeah, so I guess people are still keen on using PLR articles as content on their sites. It's a shame that some random guy has decided to submit my hub somewhere as PLR and it's essentially all over the Internet.
I once downloaded a small free PLR pack to see what they looked like (just to see. I wasn't planning on using them. I was just wondering what kind of quality they were.) A lot of them sounded awkward, as if they were spun and then slightly corrected.
So I wouldn't be surprised if those PLR packs were stolen from other places. The sad thing is a lot of the massive PLR packs cost money. I know because I still get mail from that PLR provider about all the sales on the thousands of new articles that are available in the new packs.
Personally, I would pick your battles. I would not bother with the sploggy sites. I would go after ArticleBank with a DMCA complaint. If they don't respond within 48 hours, I would file complaints with Google.
Just a quick question here, but why ArticleBank over the others? They all look kind of the same and AB doesn't seem like it's the source of the other articles. Nevertheless, they did take it down. Some of the sites even just outright deleted the user's account (and all his articles), which may have helped other hubbers.
I do need to pick my battles, it's not even an example of my best work to be honest, and I actually only checked while I was starting a major overhaul of it. I stopped the 'overhaul' in order to get those articles removed (if they see that my article is different then hosts and websites aren't as likely to remove it.)
Melbel, I followed your link and was aghast at the amount of copies that article has. At least yours is at the top.
Cardisa warned me and sent links to me about my work being stolen. I must say the whole process is tiring and depressing.
That might be why I write so many depressed articles, apparently. (Anyone wanting to know what I mean can check out my comment capsule on the rant I wrote about Dad being in a car accident. No, I am not trying to generate traffic. I'm just ticked.)
Copyright © 2018 HubPages Inc. and respective owners.