
Duplicate content crisis spooked by Google Panda

  1. LHwritings posted 6 years ago

    Recently (based partly on my own experience plus research) I published an article on how the "duplicate content" hysteria triggered by Google's Panda algorithm is adversely affecting good quality writing - see:

    Quality Articles Fall Victim to Google Panda and Anti-Duplication Hysteria
    <link snipped>

    I'm finding that fear of triggering more "duplicate content" so-called "violations" has made me hesitant - gun-shy, in effect - about contributing ANYTHING, ANYWHERE. This includes comments to others' articles, early versions of what may be longer articles, and even postings like this to forums. The reason: If Google indexes it, then if you try to use those same written words elsewhere - even PARAPHRASED - the Panda algorithm (and whatever sites like HubPages are using) may well flag it as "duplicated content" and prohibit its re-use (even just a portion of it)!

    As far as I can tell, the Panda algorithm not only flags specific words and word phrases, but also SYNONYMS of words and the basic SEQUENCE of the words and phrases in the article, even if those words and phrases differ from the supposedly "duplicated" piece. In effect, this seems to suggest that once you have written and posted something, you can never use it, or any part of it, or even the basic organization, again in any context where you want wide public exposure of your writing that will be indexed and ranked by Google. HubPages and other sites appear to be demanding absolutely pristine, first-time-ever writing ... or forget it.

    So, back to my main concern - this would seem to suggest that one should think carefully before commenting on another article, or other website posting, or even in a forum such as this. (Earlier I responded to comments on one of my Hub articles; now I'm wondering if that was a mistake, since I might want to turn those comments into another article.)

    The words you write may cause Panda to bite you.


  2. psycheskinner posted 6 years ago

    I find it hard to believe that an intelligent person would stop making new content out of a fear of making duplicate content.  That is kind of upside down?

    1. LHwritings posted 6 years ago, in reply to this

      Your comment seems kinda snarky but I'll respond anyway. Let me see if I can explain another way.

      (1) Let's say you write a short piece, then later, with more research, more thoughts, etc., you try to update it. Basically, any use you make of any part of the original - even paraphrasing, and even the logic of your argument - may be flagged now as "duplicate content". Perhaps you never do this - but lots of writers do.

      (2) Do you never write a long comment to someone else's article or forum post? I'm raising the possibility that that itself, if indexed by Google, could trigger a Panda flag if you ever try to incorporate those same comments (you wrote) in a new article or longer piece you subsequently write.

      (3) Likewise, Google indexes many forum posts (I get these in SERPs all the time). I'm raising the possibility that if you use any part of your posting to a forum (such as this one) in a subsequent article, it could be flagged as "duplicate content".

      What this all means is that we writers may need to consider that anything we post on the Web, virtually anywhere (even, say, in a personal blog) may later trigger a "duplicate content" violation when you try to post it formally on another site such as HubPages or other article banks. Therefore, you would want to think very carefully about what you're writing, and try not to post anything you will ever consider using again. Basically, that should make anybody who thinks - genius-level or not - think twice before idly posting comments, or any written material, anywhere.


  3. psycheskinner posted 6 years ago

    I wasn't being snarky.  I really am having trouble with your assertions.

    Updating the same piece at the same location is not duplicate, and never in my life have I had duplicate issues when free-writing without copying--no matter how many dozens of articles I have written on the same topic.

    This comes not only from my own experience but from years monitoring plagiarism at the university level.  It is very hard to write something duplicate unless you are copying word for word from a source.

    If you put the sources away and write a piece of decent length off the cuff, I would be astounded if it ended up a duplicate unless you have a photographic memory.

    1. LHwritings posted 6 years ago, in reply to this

      First of all, this is a very recent development (Google implemented the first Panda early this year, and subsequent versions seem to have tightened it). So what we have all known for the past couple of decades or so is not necessarily valid any longer.

      Updating the same piece at the same location is probably not a problem - if the original site allows subsequent editing by the author.

      However, I and other writers on occasion will take one of our original articles, published elsewhere, and "update" it with new information, perhaps changes in arguments, etc., and publish it elsewhere as a different, perhaps extended, article (maybe we feel the new site is more appropriate). Panda might well flag the new content as "duplicated" and land you a violation with the new publishing site.

      Furthermore, "duplicate content" does not mean copying exactly the original material, word for word, and posting it elsewhere. It can mean PARAPHRASING it, using only some of the original wording (e.g., phrases), and even using the original sequence of ideas (i.e., the argumentation), which Google would identify because SYNONYMS are now identified as indicators of "duplication".

      Even if you write something new, without reference at all to the previous material, but dealing with the same topic, you'll probably find yourself using some of your original words and perhaps following the same trail as your older writing. That could still land you in trouble.

      My main concern at the moment is whether, and to what extent, less formal Web postings (such as comments to others' articles or blogs, forum posts, etc.) could trigger a "duplicate content" violation if you try to use those words elsewhere - e.g., you incorporate them in a more formal Hub article. This occurred to me after I posted a fairly long response to comments on one of my own articles; I had started thinking about fleshing them out into an entire article of its own, but then wondered whether HubPages would allow that ... and that prompted this thread.


      1. Lisa HW posted 6 years ago, in reply to this

        I haven't had any problems when it comes to comments or posting in the forums.  One point is, though, that I most often don't write about the same things in the forums that I do in Hubs.  If I sense a forum comment would make a good Hub I don't post it and instead make it a Hub (often linking to the forum that "inspired" it).

        With comments, I either leave my own long replies on my own Hubs as "more discussion" for the Hub, or else I don't post them and instead turn them into a separate Hub (and say so where I was originally going to post the response).

        One thing I'd never do is post a long comment on someone else's Hub and then write about the same exact subject from the same angle.   I'd think that could potentially cause problems for the other person, but even then the person could delete the comment and (maybe) clear up the problem.  I'd think, too, that the Hubber who did that kind of thing with comments on other people's Hubs would eventually run into problems on this site (maybe I'm wrong).

        Either way, I've been here since before the first Panda and all through the following runs and have had no problems with duplicate content from "informal" posting.

        I don't know how long Google takes to index something like forum posts or comments, but with comment-responses on your own Hubs once you get rid of them (and maybe wait until they're de-indexed) they're no longer duplicate content.

        (I keep copies of all my long posts and comments anywhere just in case I need them for anything or might be able to do something else with them somewhere else at some point, or else just for my own date-keeping, reference, and ideas for writing.)  The more you keep track of, and keep records for, anything you write, the less likely you are to run into problems that can't be fixed.  Also, if you think someone else may be copying your forum posts or comments, you can more easily verify that.

  4. Lisa HW posted 6 years ago

    The person who comes up with his own, original, words any time he writes anything generally doesn't have to worry about those human words coming across like they're either spun or copied.  In the unlikely event something turns into an issue all the person has to do is either fix what showed up as duplicated (or else defend himself against false accusations) or else accept that that particular article (if it's among many, many, more that are without similar "issues") isn't very likely to get traffic. From what I've heard, a lot about Panda is based on percentages in a person's material.  The rare situations where a person appears to have messed up on a rare occasion aren't generally likely to do a whole lot.  If someone has a whole "iffy deal going" it's usually pretty obvious to both algorithms and human eyes.  If you do what you're supposed to do on whatever sites you're on it's not likely you'll run into any serious problems (and, again, you fix the minor ones one way or another).

    LH (same initials as mine, by the way), it looks to me like you're pretty much brand new at writing here and on your blog (unless, of course, you write elsewhere too, but I'd think if you did you wouldn't be so worried about the duplicate-content thing).  In any case, if you don't write and don't post stuff places you don't get readers or traffic; and you don't, needless to say, make money either.  Too many people have been left shaking in their boots by "Panda", and most of the time legitimate writers don't have to worry a whole lot about it.

    As someone who has written thousands of things, I'd agree with psycheskinner.  I think you can relax and just feel free to enjoy HubPages.  You may also want to think about establishing your own authorship and original content with Google by creating a Google+ profile and reciprocal links between your HP profile, your writing (here and anywhere else), and the Google+ profile.  As far as Panda goes, I'd be more concerned with not establishing credibility and authorship with Google than with the whole duplicate-content thing.  (It might even help prevent some potential duplicate-content issues from arising in the future if someone else copies your content.)

    All that said, there's a point where a "legitimate" writer can't always be worried about Panda.  All we can do is do what we think is right and OK and let the Panda chips fall where they may.

    1. LHwritings posted 6 years ago, in reply to this

      Thanks for the observations. Obviously this is a "non-issue" for some, but I think it's a legitimate threat for others, depending on your mode of writing and producing new material (in my case, I often use forum posts and similar material as a basis for some of the content of new articles).

      I do want to underscore that my concerns aren't based on speculation; they're based on actual "duplicate content" problems that I and others have been experiencing. Examples that come to mind:

      * One person encountered a violation because she'd used some of her own blog writing as content in an article...

      * One person found that some of his own material was copied (plagiarized), and posted elsewhere, where it was indexed by Google - triggering a "duplicated content" violation against his own original article (through some crazy dysfunction in Google's indexing process)...

      * OK, my own case, which triggered my concern, involves my excerpting a section of a much longer article on grammar problems that I'd published elsewhere. I wanted to take each specific problem discussed in the first article, focus on it, and turn each into a separate article in a series. So I took a section (about 1/5 of the original) and totally re-wrote it, sentence by sentence (i.e., changed wording, created new sentences, inverted previous sentences, etc.). I added several paragraphs of new introduction, plus a new ending. I completely changed the examples used. I did, from necessity, retain the wording of some phrases (e.g., it's hard to find a substitute for "objective case"). I made the same basic argument, but in a completely different article. The first iteration of this got published on HubPages, but unpublished about 2 days later for "duplicate content". So I re-wrote it some more, but still retaining the same basic argumentation. Again, it was rejected. So I've decided to abandon the "series" project and the aim of contributing articles on grammar issues (despite decades of experience in this). I now regret I had ever discussed those grammar issues in the original article, because now I would really prefer to deal with them individually, in a series.

      My conclusion: You have one, and only one, chance to write on a given topic and publish it anywhere on the Web. Be careful - use it wisely. Don't expect that you can come back to that topic, change wording and sentences, expand on it, add more, and then publish a new article. If you do, under the new regimen, you'll be quite lucky. 

      Re: the reassurance about comments and forum posts ... thanks. I'll try to use some material like this, and see if I can get it past the Panda censor.


  5. psycheskinner posted 6 years ago

    Well, reposting a somewhat changed/spun article somewhere else is duplicate--so that seems fair.

    The vast majority of people have brains that just don't create verbatim duplicate material from free-writing a new statement (even of an old idea many, many times over) beyond standard cliches and idiom (which Google knows how to account for).

    IMHO, this is a non-issue.
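
Much of the disagreement in this thread comes down to how much word-for-word overlap a reworded piece still shares with its source. One rough way to get a feel for that is to compare overlapping word sequences ("shingles") between two texts. The sketch below is only an illustration: the function names, the shingle length of 3, and the sample sentences are assumptions made here, and nothing in the thread indicates that Google or HubPages measure duplication this way. It also does not detect synonyms or similar ordering of ideas, which is the behavior LHwritings suspects.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles (overlapping word runs) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def overlap_score(text_a, text_b, k=3):
    """Jaccard similarity of the two shingle sets: 0.0 (no shared runs) to 1.0 (identical)."""
    set_a, set_b = shingles(text_a, k), shingles(text_b, k)
    if not set_a and not set_b:
        return 0.0  # nothing to compare
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical sample sentences, loosely modeled on the grammar example in the thread.
original = "The objective case is used when a pronoun is the object of a verb"
rewritten = "Use the objective case whenever a pronoun receives the action of a verb"

copy_score = overlap_score(original, original)      # verbatim copy
rewrite_score = overlap_score(original, rewritten)  # sentence-level rewrite
print(f"copy: {copy_score:.2f}, rewrite: {rewrite_score:.2f}")
```

On a measure like this, a verbatim copy scores 1.0, while a sentence-by-sentence rewrite keeps only the few runs that are hard to reword (such as "the objective case"). A real duplicate filter could of course be stricter or use entirely different signals.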