When searching for unscrupulously borrowed copies of my work, I sometimes come across results that don't really seem to be there. Even though I search for a sentence in quotes, and the SERPs show that quote in the results, when I click on the link either the text isn't there or I get a 404.
I assume this means the infringing content has been taken down, but G-Bot hasn't been back to the page in a while. I see this especially with that huge scraper site that stole, and is still stealing, HP Hubs.
So, I'm mostly concerned about these 404s from a "moving my content from one subdomain to another" point of view. I'm going through the process of actually removing copied content from Google results, but I don't know how the 404s and false positives will affect any article I move. I don't want to trip any HP filters, and more importantly I don't want my article to appear as the second duplicate to the search engines.
How can I get rid of these 404s? I have a feeling if I submit a takedown with WMT they'll just tell me they can't find the content.
You will get 404 error messages for the URLs from the scraper's original sites (the ones closed down by his host). He has now, of course, found a nice new host, so the sites are back up (not sure about the indexing of them though).
The moving content issue is the one that concerns me - although Digby Adams suggested that I could do enough of a rewrite to get around the duplication issue - I suspect that's what I might have to do.
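For anyone who wants a rough idea of how far a rewrite has moved from the original text, here is a minimal sketch in Python. It compares two texts by their overlapping three-word sequences ("shingles") and reports a Jaccard overlap score. This is only an illustration of one common near-duplicate measure; nothing in this thread says Google or HubPages use it, and the function names and the shingle size of 3 are my own assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences of length `size` in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_score(original, rewrite, size=3):
    """Jaccard similarity of the two shingle sets: 1.0 = identical, 0.0 = no shared shingles."""
    a, b = shingles(original, size), shingles(rewrite, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    old = "the quick brown fox jumps"
    new = "the quick brown fox sleeps"
    print(overlap_score(old, new))
```

A lower score means the rewrite shares fewer exact phrases with the original. What score counts as "enough of a rewrite" for any search engine or filter is unknown, so treat the number as a relative guide only.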
Right, but my concern is how Google interprets content that seems to be a duplicate according to their cache, but in reality has been taken down and is no longer present on the web. Will they devalue our content because they think it is a copy of something that is already out there?
Google usually tries to work out which is the original and then doesn't index the later copies, or if it does, doesn't give them the same credit as the original content.
To remove the cache of your Hubs (or the plagiarist's articles) on Google:
http://support.google.com/bin/static.py … page=ts.cs
(Click inside the circle which says 'Web Search', then scroll down the page and click inside the circle which says 'A piece of content I am concerned about has already been removed by the webmaster but still appears among the search results.' Then click on the link 'this tool.' You'll be asked to sign in, then to enter the page URL and, if the page is still live, a word from the outdated cache page that you want to remove.)
To remove the cache on Bing/Yahoo:
http://www.bing.com/webmaster/help/bing … l-cb6c294d
It can take 24 – 48 hours for the removal process.
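Since the Google tool above asks for different things depending on whether the page is gone or still live, it may help to sort your list of suspect URLs first. Below is a minimal Python sketch that does that using only the standard library. The function names and the three labels are my own; the example URL is a placeholder, not a real scraper address. It assumes the server answers HEAD requests honestly, which some sites do not, so double-check anything labelled "other" by hand.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code to a rough label for a removal request."""
    if status in (404, 410):
        return "gone"   # page removed: use the outdated-content removal route
    if 200 <= status < 300:
        return "live"   # page still up: the copied text may still be there
    return "other"      # server errors, blocks, etc.: check manually


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, or None if the request fails outright."""
    request = urllib.request.Request(
        url, method="HEAD", headers={"User-Agent": "Mozilla/5.0"}
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses arrive as exceptions
    except (urllib.error.URLError, OSError):
        return None      # DNS failure, timeout, refused connection


def sort_urls(urls, get_status=fetch_status):
    """Group urls under 'gone', 'live' or 'other' according to their status."""
    results = {"gone": [], "live": [], "other": []}
    for url in urls:
        status = get_status(url)
        label = "other" if status is None else classify_status(status)
        results[label].append(url)
    return results


if __name__ == "__main__":
    # Placeholder URL: replace with the addresses from your own search results.
    print(sort_urls(["http://example.com/some-copied-article"]))
```

The URLs that land under "gone" are the ones where the search result is only a stale cache entry; the ones under "live" still need checking for the copied text itself.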
Google doesn't give it the same credit? I'm not quite sure what you mean by this. If content is copied (stolen) and posted, how does the crawler distinguish between original and duplicate? All content is indexed, but what criteria are used to differentiate?
Chef: If your Hub is stolen it will generally rank well above the copy, as long as your account is healthy. The usual kind of site that takes our stuff can't compete with HP in terms of PageRank and authority. Plus, Google knows which content was there first, or at least they should, but they sometimes screw up.
However, if you decide to move your Hub from one account to another, it becomes de-indexed from the search results, and now the stolen version is the only one left on the web. So, when you re-post your Hub, even though it's yours by copyright, it looks to search engines like you are posting duplicate content. It would probably be flagged by HP and/or banished into the depths of the SERPs.
Search engines don't immediately know when a content thief changes a page or takes it down. It takes hours, days or even weeks (in this case apparently months) for the crawlers to come back and see the changes. So until G-Bot figures it out, copies of your stolen work stay in the cache, and Google thinks they still exist even though anyone who tries to visit them gets a 404.
So that's the meat of my question: If Googlebot hasn't come back and recognized that stolen content is no longer on a site, will they penalize my Hub if I delete from one account and repost it on another?
That's a neat reply. I've recently experienced a first dose of plagiarism and it's not very pleasant! As a complete novice and a naive onliner I ask the basic question: Why can't a design guru/expert come up with a foolproof copyright button/icon/system whereby the original author's article is 'certified' original with date, time etc etc. Any subsequent copies will be stopped/automatically suspended/deleted?
Would this be possible? For all the prevention and care we take as authors, it seems the bottom line still is... any article can be stolen at will, by whoever, whenever, wherever.
I don't think it would be possible, though I'm no expert. What would be possible, though, is for HubPages to put in a function similar to what a popular WordPress plugin does: an automatic backlink to the original article for content scrapers. It doesn't stop people copying the content, but in some circumstances it does credit you with creating the article through a backlink bonus.
Search engines aren't in the business of copyright enforcement, except to the extent that they must comply with the laws when a complaint is made. Remember also that there are still legitimate syndicated articles out there, particularly when it comes to news, so search engines don't always want or need to penalize a site with copied content. They just decide which is original and most relevant to the search query to the best of their ability, and push the rest down in the SERPs . . . at least that's what they're supposed to do!
Enforcing your own copyright is part of being an online content creator, or any kind of creative person. Unfortunately, it goes with the gig. Any article can indeed be stolen. Don't let it get you down.
I am new but I think the search results help with the type of ads in regards to what you are writing about. Hope this helps. MamaChellie.
Copyright © 2017 HubPages Inc. and respective owners.