Duplicated content policy

  1. LHwritings posted 4 years ago

    Like some other contributors in this forum, I've also just had an article suspended for "duplication". This is an article on grammar rewritten from a small section of my own much longer article on EzineArticles. I modified just about every sentence, changed examples, added content, etc. so it was a different article, but addressing the same sub-topic.

    The Hubpages policy on duplication is not clear. Even if each sentence is modified, will Hubpages still claim duplication? What if I quote something of my own, e.g., "As I wrote in my previous essay, ..."?

    Also, what about quotes taken from other material and attributed? (Often I paraphrase, but I also quote, and I always attribute to the original source.) Is quoting, and particularly block-quoting, prohibited - even if attributed? If so, this substantially limits one's ability to produce quality material.

    Any clarification appreciated...

    Thanks - LH

    1. wilderness posted 4 years ago in reply to this

      Changing one word out of ten in a copied article leaves 90% of it still copied directly from the source.  I don't know how much changing you did to your source article on Ezine, but if it's something like that it would definitely be considered duplicate because 90% of it still is.

      Block quotes are copied directly and obviously duplicated from somewhere else.  They are quite acceptable, but only if there is a good deal of original material in the hub.  A hub of copied, duplicated quotes with a few original sentences is not going to make it.

      Although HP has not, and will not, release hard numbers for the percentage of original work, if your duplicated quotes comprise less than 20% of the hub I don't think you would have any trouble.  The problem is when people just assume that because quotes are properly attributed it somehow means they are not duplicated from somewhere else, and that just doesn't follow.  They are obviously still copied.  Biblical quotes are perhaps the most common violation here; some hubs are little more than a collection of quotes from the bible, obviously duplicated millions of times all over the net, with very little original work.  They are not acceptable.

      1. pennywrites posted 4 years ago in reply to this

        Advertising Space - After your hub is published advertisements may be placed in this space. Please note, it can take some time after you publish for the ads to match the content of your hub.

        These words were the duplicated content that came up in 19 different Internet articles when I searched for duplicates of my new article.  These are obviously all HubPages creations and clearly not content I put on my hub; HubPages put this content there.  I had posted the original article, which I rewrote for HubPages, on EzineArticles and WordPress, so there is duplicated content because I used the same keyword phrases, which are my target for the article.  Also, I quoted my own words from the original article.  The other duplicated content is the words 'for instance', which I use often in my writing.

        So, if I completely rewrite my title and change 'for instance' to 'as an example', all that is left is HubPages' words and my quoted sentence.  I can rework that but didn't because it is a pain to do.  It is doable, though.

        Is this the correct solution?  Also, I am confused about creating hubs.  Should each article submission have its own hub, or should related articles be posted under the same hub title?  I wrote one article giving tips and then a follow-up article with more tips, but created a second hub for it.  Is that wrong?

        1. wilderness posted 4 years ago in reply to this

          First, don't worry about the advertising space.  I'm not sure why you are seeing that - it sounds like the message put onto unpublished hubs, but HP shouldn't care in any case.

          Yes, change your title and preferably the keywords.  Why compete with yourself for top SE results?  Surely there are other keywords that can be used for the same content?  That way, you now have two articles on the same subject, but drawing two different crowds of searchers.

          Every hub should stand on its own, with a unique title (HP won't let you use identical titles anyway).  Interlink hubs with similar subjects, such as those with tips that you mentioned.  I wrote, for example, a set of 7 hubs all about the same general subject but with different specifics.  One is simply a list of links and a short blurb - it acts as a "title page" so to speak.  All are heavily interlinked, with each hub linking to each of the others.

      2. LHwritings posted 4 years ago in reply to this

        Thanks very much for the response and advice. It's difficult to estimate how much of my original published submission was "duplicate" - in terms of entire sentences, less than 5%, in terms of word phrases, perhaps 20-30%. In terms of the basic concept, 100% - this was addressing the same grammatical error as the original article, but with different language, different ideas, different examples, additional content, and a new title.

        For example, when you are discussing "subjective case" and "objective case", it's virtually impossible to do so without using these very phrases. Yet, if they're used frequently, apparently they're registered as "duplication". It occurs to me that perhaps Hubpages is using far too mechanistic an algorithm to assess duplication...

        The message sent to me describes a "duplicate content" problem as consisting of:

        * Text that was previously published on another site, even if you wrote the text or retain the copyright.
        * Text that already appears in whole or large part on HubPages.
        * Text copied from multiple sources.
        * Substantial similarity to another work. This includes close paraphrasing, among other forms of misappropriation or copying of content

        In other words, even if you re-write major sections of your original, it might be assessed as "close paraphrasing".
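
The worry about a "mechanistic" check can be made concrete. HubPages has not published how its filter works, so the following is only a sketch, assuming a simple comparison of overlapping word sequences ("shingles"). The function names `shingles` and `overlap` and the sample sentences are hypothetical. It illustrates how unavoidable terms such as "subjective case" would count toward overlap even when every sentence has been rewritten.

```python
import re


def shingles(text, n=3):
    """Return the set of n-word sequences in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap(new_text, old_text, n=3):
    """Fraction of the new text's n-word sequences that also occur in the old text."""
    new = shingles(new_text, n)
    if not new:
        return 0.0
    return len(new & shingles(old_text, n)) / len(new)


# Hypothetical sentences: both are reworded, but the technical terms recur.
old = "Use the subjective case when the pronoun is the subject of the verb."
new = "Choose the subjective case when the pronoun acts as the subject of a clause."

# Two-word sequences are used here so that short fixed terms register.
print(round(overlap(new, old, n=2), 2))
```

A real filter would presumably weight common phrases differently or use longer sequences; the choice of `n` alone changes the score considerably, which is one reason a single percentage is hard to interpret.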


        1. wilderness posted 4 years ago in reply to this

          If 5% of entire sentences are copied as well as 30% of long phrases, I would suspect that you are borderline in being duplicate.  That could equate to 30 or 40% total or even more.

          Your point on "subjective case" and "objective case" is well taken and very nearly unavoidable.  It's simply something you have to deal with, and could require that even less of other phrases and sentences be copied.

          Your point on "close paraphrasing" is also well taken, and is also something you simply have to deal with.  It can be difficult to "copy without copying" an article.  One suggestion might be to use a different set of keywords, looking for different searchers.  This can help get rid of that duplication, although paraphrasing might still be a problem.  You can also then link the two articles to each other and hopefully drive traffic from each to the other.

          It can be tough, LH, but remember that duplicate material seems to be one of the big reasons HP was slapped so hard by the first Panda.  They aren't going to risk that again, and we can probably expect them to remain tough on duplicate content.  Google is also working on the same thing (or so they claim) - the days of publishing the same article on 100 different sites are likely over.  Google wants to see only one copy, and even paraphrased copies are questionable - HP follows the same path, perhaps even more strictly.

          1. LHwritings posted 4 years ago in reply to this

            Google's Panda content-screening algorithm is clearly playing the role of the Big Gorilla in all this. Figuring this might be the basis of another investigative article, I did a bit of research - here are some fairly good links on this issue:

            http://www.ithenticate.com/plagiarism-p … ogle-Panda

            http://www.webpronews.com/google-panda- … le-2011-03
            (Also check out the comments section - interesting observations there...)

            I wasn't aware of all this until I got snared yesterday by the "duplicated content" robo-police. Basically, since time immemorial, writers have reworked their own content, expanded it, massaged it. And since time immemorial, writers have also quoted liberally (with attribution) from others' work - think of literary reviews, or critical responses to political commentaries, for example.

            Now, I'm concerned that Google's Panda policy may be significantly crimping all this. Like any other writer, I certainly don't want my material filched and used without attribution as someone else's work - i.e., plagiarism. But robotic processes are smart in one way and incredibly stupid in others, and they can tend to sweep with a very broad scythe.

            I remain concerned for the impact on quality non-fiction writing if the ability to recycle one's own observations and phrases, to quote from the work of others (e.g., to corroborate assertions), etc. is truncated in the course of pressure from Web-based publishers to conform to Google's content-screening strictures.

            I've reworked and resubmitted my original article, but I'm still working in the dark and have no idea whether or when it might get re-posted.


            1. wilderness posted 4 years ago in reply to this

              Personally, I think you've hit the nail on the head here.  Google really is the Gorilla in the closet - the one that all sites try to placate and HP is no different.  Yes, the bots that you are running up against are HP's, but they are there because of that gorilla.

              Nor do I think this is all necessarily bad.  The web is full of duplicate and near duplicate material, all trying to get traffic and MONEY.  If, instead of money being the purpose of writing, we all concentrated on giving readers what they want, the web would be a much better place.  And as misguided and fumbling as they are, that big gorilla is trying to do just that.

              Although I haven't "spun" my own material and reposted it out there somewhere else, I probably will one day.  Why?  Because readers need a second copy of all that I know and can offer them?  No - because I want that MONEY and a reiteration of my knowledge somewhere else may get it for me.  I like a fat pocketbook as well as the next person, but also recognize that it really does the web itself no good.

            2. LHwritings posted 4 years ago in reply to this

              I agree that the emerging "business model" (pay for affiliated content, pay per click, pay per 1000 page views, etc.) is creating a kind of Wild, Wild West Gold Rush environment on the web, particularly with the exploding population of "content providers" ... but the over-reaction can be equally deleterious, and that's what I'm now concerned about.

              Money is certainly not always the reason for some duplication of material and literary effort. For example, take professional papers (of which I've written quite a heap). Many researchers and other professionals try to get their basic reports and studies published in various venues, mainly to get them exposed to different segments of this or that industry or professional community. I have seen papers written and posted in one transportation journal, then another, then posted, say, to a planning journal, then another of these ... each time with minor alterations to re-target a slightly different audience, but basically the same material. I don't see a problem with that.

              The other aspect is the creation of new material, whereby you might take an older paper, say for the introductory sections, but then elaborate on it and update it with new findings, and so on. I've done that. Why would I throw out old formulations of issues when I've worked on them, fine-tuned them so they express precisely what I need to say?

              But it seems to me that this new rather draconian prohibition against duplicative material could very well constrain and inhibit that - making me and other professional writers face extra hours and hours of writing, delaying me from dealing with my new material, etc. If I would keep facing this in the form of pressure from the journals or other publishers to expunge most duplicative material, I think there would be a lot of "Oh, forget it" kinds of reactions, and less material and less quality being produced.

              I don't have a clear-cut solution, but I've been a programmer and systems analyst myself in the mid-distant past, and that's left me with a healthy disdain, suspicion, and cynicism with respect to robotic approaches to achieving goodness.

              Now, I wonder ... if I take all my comments in this thread, and edit and tweak them into a new Hub article, will it be regurgitated by the system because of excess duplication?


  2. Richieb799 posted 4 years ago

    http://dupecop.com/ Here is a useful site

    I think the duplicate content filters are a good form of enforcement, since they stop people stealing. I recently had to file copyright on my work, which was stolen.

  3. Greekgeek posted 4 years ago

    One thing to remember about duplicate content and the big Gorilla.

    Google is trying to serve up one copy of each article, because its readers/users don't WANT all ten search results on the first page to be spun/rewritten copies of the same article. So Google is doing all it can to have ten different excellent articles on the topic show up first.

    However, you are free to ignore Google on your own website or blog. Go ahead! Post your content in three different places if you feel the need! Google isn't stopping you. You can thumb your nose at big G all you want and break every one of its webmaster guidelines! Google just may not send you any traffic.

    However, you can only do that on your own sites and blogs. If you've got Adsense content, if you've got Amazon content, if you're publishing on a third party platform like Hubpages, you're stuck... you've gotta obey their house rules. And for self-preservation reasons, most sites' "house rules" include "don't do anything that'll draw  a Google penalty." Duplicate and spun content incur Google penalties... as Hubpages learned the hard way earlier this year. So they've gotta be careful about it.

    And yes, automated filters are never perfect -- not Hubpages', and not Google's. You can ask for a hand review if you think the computer's being stupid. But I imagine HP must err on the side of caution, because Google's algorithm might be stupid in the same way.

  4. pennywrites posted 4 years ago

    Thank you one and all for the great comments and replies.  They are most helpful, and the points are well taken.  I have a new perspective now.  Cheers everyone.