Duplicate content crisis spooked by Google Panda

  1. LHwritings posted 12 years ago

    Recently (based partly on my own experience plus research) I published an article on how the "duplicate content" hysteria triggered by Google's Panda algorithm is adversely affecting good quality writing - see:

    Quality Articles Fall Victim to Google Panda and Anti-Duplication Hysteria
    <link snipped>

    I'm finding that fear of triggering more "duplicate content" so-called "violations" has made me hesitant - gun-shy, in effect - about contributing ANYTHING, ANYWHERE. This includes comments to others' articles, early versions of what may be longer articles, and even postings like this to forums. The reason: If Google indexes it, then if you try to use those same written words elsewhere - even PARAPHRASED - the Panda algorithm (and whatever sites like HubPages are using) may well flag it as "duplicated content" and prohibit its re-use (even just a portion of it)!

    As far as I can tell, the Panda algorithm not only flags specific words and word phrases, but also SYNONYMS of words and the basic SEQUENCE of the words and phrases in the article, even if those words and phrases differ from the supposedly "duplicated" piece. In effect, this seems to suggest that once you have written and posted something, you can never use it, or any part of it, or even the basic organization, again in any context where you want wide public exposure of your writing that will be indexed and ranked by Google. HubPages and other sites appear to be demanding absolutely pristine, first-time-ever writing ... or forget it.

    So, back to my main concern - this would seem to suggest that one should think carefully before commenting on another article, or other website posting, or even in a forum such as this. (Earlier I responded to comments on one of my Hub articles; now I'm wondering if that was a mistake, since I might want to turn those comments into another article.)

    The words you write may cause Panda to bite you.

    LH

  2. psycheskinner posted 12 years ago

    I find it hard to believe that an intelligent person would stop making new content out of a fear of making duplicate content.  That is kind of upside down?

    1. LHwritings posted 12 years ago, in reply to this

      Your comment seems kinda snarky but I'll respond anyway. Let me see if I can explain another way.

      (1) Let's say you write a short piece, then later, with more research, more thoughts, etc., you try to update it. Basically, any use you make of any part of the original - even paraphrasing, and even the logic of your argument - may be flagged now as "duplicate content". Perhaps you never do this - but lots of writers do.

      (2) Do you never write a long comment to someone else's article or forum post? I'm raising the possibility that that itself, if indexed by Google, could trigger a Panda flag if you ever try to incorporate those same comments (you wrote) in a new article or longer piece you subsequently write.

      (3) Likewise, Google indexes many forum posts (I get these in SERPs all the time). I'm raising the possibility that if you use any part of your posting to a forum (such as this one) in a subsequent article, it could be flagged as "duplicate content".

      What this all means is that we writers may need to consider that anything we post on the Web, virtually anywhere (even, say, in a personal blog) may later trigger a "duplicate content" violation when we try to post it formally on another site such as HubPages or other article banks. Therefore, you would want to think very carefully about what you're writing, and try not to post anything you will ever consider using again. Basically, that should make anybody who thinks - genius-level or not - think twice before idly posting comments, or any written material, anywhere.

      LH

  3. psycheskinner posted 12 years ago

    I wasn't being snarky.  I really am having trouble with your assertions.

    Updating the same piece at the same location is not duplicate, and never in my life have I had duplicate issues when free-writing without copying--no matter how many dozens of articles I have written on the same topic.

    This comes not only from my own experience but from years monitoring plagiarism at the university level.  It is very hard to write something duplicate unless you are copying word for word from a source.

    If you put the sources away and write a piece of decent length off the cuff, I would be astounded if it ended up a duplicate unless you have a photographic memory.

    1. LHwritings posted 12 years ago, in reply to this

      First of all, this is a very recent development (Google implemented the first Panda early this year, and subsequent versions seem to have tightened it). So what we have all known for the past couple of decades or so is not necessarily valid any longer.

      Updating the same piece at the same location is probably not a problem - if the original site allows subsequent editing by the author.

      However, I and other writers on occasion will take one of our original articles, published elsewhere, and "update" it with new information, perhaps changes in arguments, etc., and publish it elsewhere as a different, perhaps extended, article (maybe we feel the new site is more appropriate). Panda might well flag the new content as "duplicated" and land you a violation with the new publishing site.

      Furthermore, "duplicate content" does not mean copying exactly the original material, word for word, and posting it elsewhere. It can mean PARAPHRASING it, using only some of the original wording (e.g., phrases), and even using the original sequence of ideas (i.e., the argumentation), which Google would identify because SYNONYMS are now identified as indicators of "duplication".

      Even if you write something new, without reference at all to the previous material, but dealing with the same topic, you'll probably find yourself using some of your original words and perhaps following the same trail as your older writing. That could still land you in trouble.

      My main concern at the moment is whether, and to what extent, less formal Web postings (such as comments to others' articles or blogs, forum posts, etc.) could trigger a "duplicate content" violation if you try to use those words elsewhere - e.g., you incorporate them in a more formal Hub article. This occurred to me after I posted a fairly long response to comments on one of my own articles; I had started thinking about fleshing them out into an entire article of its own, but then wondered whether HubPages would allow that ... and that prompted this thread.

      LH

      1. Lisa HW posted 12 years ago, in reply to this

        I haven't had any problems when it comes to comments or posting in the forums.  One point is, though, that I most often don't write about the same things in the forums that I do in Hubs.  If I sense a forum comment would make a good Hub I don't post it and instead make it a Hub (often linking to the forum that "inspired" it).

        With comments, I either leave my own long replies on my own Hubs as "more discussion" for the Hub, or else I don't post them and instead turn them into a separate Hub (and say it where I was originally going to post the response).

        One thing I'd never do is post a long comment on someone else's Hub and then write about the same exact subject from the same angle.   I'd think that could potentially cause problems for the other person, but even then the person could delete the comment and (maybe) clear up the problem.  I'd think, too, that the Hubber who did that kind of thing with comments on other people's Hubs would eventually run into problems on this site (maybe I'm wrong).

        Either way, I've been here since before the first Panda and all through the following runs and have had no problems with duplicate content from "informal" posting.

        I don't know how long Google takes to index something like forum posts or comments, but with comment-responses on your own Hubs once you get rid of them (and maybe wait until they're de-indexed) they're no longer duplicate content.

        (I keep copies of all my long posts and comments anywhere just in case I need them for anything or might be able to do something else with them somewhere else at some point, or else just for my own date-keeping, reference, and ideas for writing.  The more you keep track of, and keep records for, anything you write, the less likely it is that you'll run into problems that can't be fixed.  Also, if you think someone else may be copying your forum posts or comments, you can more easily verify that.)

  4. Lisa HW posted 12 years ago

    The person who comes up with his own, original, words any time he writes anything generally doesn't have to worry about those human words coming across like they're either spun or copied.  In the unlikely event something turns into an issue, all the person has to do is either fix what showed up as duplicated (or else defend himself against false accusations) or else accept that that particular article (if it's among many, many, more that are without similar "issues") isn't very likely to get traffic. From what I've heard, a lot about Panda is based on percentages in a person's material.  The rare occasions where a person appears to have messed up aren't generally likely to do a whole lot of damage.  If someone has a whole "iffy deal going" it's usually pretty obvious to both algorithms and human eyes.  If you do what you're supposed to do on whatever sites you're on it's not likely you'll run into any serious problems (and, again, you fix the minor ones one way or another).

    LH (same initials as mine, by the way), it looks to me like you're pretty much brand new at writing here and on your blog (unless, of course, you write elsewhere too, but I'd think if you did you wouldn't be so worried about the duplicate-content thing).  In any case, if you don't write and don't post stuff places you don't get readers or traffic; and you don't, needless to say, make money either.  Too many people have been left shaking in their boots by "Panda", and most of the time legitimate writers don't have to worry a whole lot about it.

    As someone who has written thousands of things, I'd agree with psycheskinner.  I think you can relax and just feel free to enjoy HubPages.  You may also want to think about establishing your own authorship and original content with Google by creating a Google+ profile and reciprocal links between your HP profile, your writing (here and anywhere else), and the Google+ profile.  As far as Panda goes, I'd be more concerned with not establishing credibility and authorship with Google than with the whole duplicate-content thing.  (It might even help prevent some potential duplicate-content issues from arising in the future if someone else copies your content.)

    All that said, there's a point where a "legitimate" writer can't always be worried about Panda.  All we can do is do what we think is right and OK and let the Panda chips fall where they may.

    1. LHwritings posted 12 years ago, in reply to this

      Thanks for the observations. Obviously this is a "non-issue" for some, but I think it's a legitimate threat for others, depending on your mode of writing and producing new material (in my case, I often use forum posts and similar material as a basis for some of the content of new articles).

      I do want to underscore that my concerns aren't based on speculation, they're based on actual "duplicate content" problems that I and others have been experiencing. Examples that come to mind:

      * One person encountered a violation because she'd used some of her own blog writing as content in an article...

      * One person found that some of his own material was copied (plagiarized), and posted elsewhere, where it was indexed by Google - triggering a "duplicated content" violation against his own original article (through some crazy dysfunction in Google's indexing process)...

      * OK, my own case, which triggered my concern, involves my excerpting a section of a much longer article on grammar problems that I'd published elsewhere. I wanted to take each specific problem discussed in the first article, focus on it, and turn each into a separate article in a series. So I took a section (about 1/5 of the original) and totally re-wrote it, sentence by sentence (i.e., changed wording, created new sentences, inverted previous sentences, etc.). I added several paragraphs of new introduction, plus a new ending. I completely changed the examples used. I did, from necessity, retain the wording of some phrases (e.g., it's hard to find a substitute for "objective case"). I made the same basic argument, but in a completely different article. The first iteration of this got published on HubPages, but was unpublished about 2 days later for "duplicate content". So I re-wrote it some more, but still retained the same basic argumentation. Again, it was rejected. So I've decided to abandon the "series" project and the aim of contributing articles on grammar issues (despite decades of experience in this). I now regret that I ever discussed those grammar issues in the original article, because now I would really prefer to deal with them individually, in a series.

      My conclusion: You have one, and only one, chance to write on a given topic and publish it anywhere on the Web. Be careful - use it wisely. Don't expect that you can come back to that topic, change wording and sentences, expand on it, add more, and then publish a new article. If you do, under the new regimen, you'll be quite lucky. 

      Re: the reassurance about comments and forum posts ... thanks. I'll try to use some material like this, and see if I can get it past the Panda censor.

      LH

  5. psycheskinner posted 12 years ago

    Well, reposting a somewhat changed/spun article somewhere else is duplicate--so that seems fair.

    The vast majority of people have brains that just don't create verbatim duplicate material from free-writing a new statement (even of an old idea many, many times over) beyond standard cliches and idiom (which Google knows how to account for).

    IMHO, this is a non-issue.
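
For readers who want to see the kind of comparison being argued about in this thread, here is a minimal sketch of one common textbook approach to near-duplicate detection: split each text into overlapping runs of words ("shingles") and measure how many runs the two texts share (Jaccard similarity). This is an illustration only. It is not Google's Panda algorithm and not whatever filter HubPages uses, neither of which is described in this thread. The function names, the shingle size `k=3`, and the sample sentences are all assumptions made up for the example, and the method does not look at synonyms or at the order of ideas.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b, k=3):
    """Jaccard similarity of the two texts' shingle sets (0.0 to 1.0)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical sample sentences, invented for this illustration.
original = "The objective case is used when a pronoun is the object of a verb."
light_edit = "The objective case is used when a pronoun is the object of a preposition."
fresh = "Use me, him, or them after a verb, because those pronouns receive the action."

print(jaccard(original, light_edit))  # high: only the last shingle differs
print(jaccard(original, fresh))       # no three-word run in common
```

Under this simple measure, a lightly edited sentence still shares most of its three-word runs with the original, while a freshly written sentence on the same point shares none. That is in line with the point made above about free-writing, but whether any real filter behaves this way is a separate question the sketch cannot answer.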
