
Measures to help Hubbers detect and report copied Hubs

  1. Marisa Wright posted 5 years ago

    We all appreciate that it's our own responsibility to protect our own work, but it's a big job, and it's in HubPages' interest to help us, because plagiarism affects the whole site's income.

    On another thread, there have been several suggestions of features HubPages could add.  So far we have:

    - negotiate a bulk rate with Copyscape for their API usage.  HP could then resell that to Hubbers or even provide it free to people who meet some specific goal.

    - improve the existing checker (it is only finding a fraction of copies)

    - install anti-scraping software (can any Hubber give more details on what's available?)

    - install software that appends "data" such as a link back to anything that is copied and pasted (like Wordpress), so at least we get a backlink if anything is copied.

    - disable right-click like Wordpress (though this won't stop the professionals)

    - monitor traffic logs for IP addresses that visit/download large amounts of hubs in a short period, so they can be blocked/checked further.

    We would appreciate HubPages' response as to

    (a) which of these measures have been considered,
    (b) why they weren't taken up and
    (c) which ones they are willing to look at for the future.

    P.S. Let's try to keep this thread for constructive suggestions only.

    1. Rising Caren posted 5 years ago in reply to this

      - I like the Copyscape idea

      - Improving the checker would also be a good idea, though its extra toll on HP's servers might be too much and it might be cheaper to use Copyscape.

      - Anti-scraping software is useless. Scrapers might be noobs, but those making scraping software know how to script. They will adapt their scripts to anti-scraping software.

      - "install software that appends "data" such as a link back to anything that is copied and pasted" - scraper software can be set to remove links. Since even invisible data is stored on a page, they can find and remove it (unless you do cookies, which won't help in this case).

      - "disable right-click like Wordpress". Only absolute beginners would be stopped by that. Within a few days, they can learn about Ctrl+U.

      - "monitor traffic logs". Might work for those scraping in large quantities who can't rotate IP addresses fast enough. Won't prevent those that scrape in smaller quantities (100 a day across 10 IPs is 10 pages per IP, perfectly reasonable if timed right).

      I think the best option is to locate scraped content after it's been posted, either by HubPages improving its checker or by using a 3rd party checker such as Copyscape. Prevention can be circumvented, detection not so much (since they want their sites to be seen by Google, they can't really stop detection). Perhaps every active hubber could have his/her hubs fully checked every 3-4 months. This gives time for HubPages to balance how many hubbers to check a day.
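
      The "monitor traffic logs" idea, and the 10-pages-per-IP caveat, can be sketched roughly as below. This is only an illustration, not anything HubPages is known to run: the function name `flag_heavy_ips`, the one-hour window and the 50-hit threshold are all assumptions, and the IP addresses are placeholders from the documentation ranges.

```python
from collections import defaultdict, deque


def flag_heavy_ips(requests, window_seconds=3600, max_hits=50):
    """Return the set of IPs that made more than max_hits requests
    inside any sliding window of window_seconds.

    requests: iterable of (timestamp_seconds, ip) tuples, sorted by time.
    The window and threshold defaults are assumed values for illustration.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    flagged = set()
    for ts, ip in requests:
        hits = recent[ip]
        hits.append(ts)
        # Drop timestamps that have fallen out of the window.
        while hits and ts - hits[0] >= window_seconds:
            hits.popleft()
        if len(hits) > max_hits:
            flagged.add(ip)
    return flagged


# One IP fetching 100 pages in 100 seconds exceeds the threshold.
bulk = [(t, "203.0.113.5") for t in range(100)]
# The same 100 pages spread across 10 IPs (10 each) stays under it.
spread = [(t, f"198.51.100.{t % 10}") for t in range(100)]

print(flag_heavy_ips(bulk))
print(flag_heavy_ips(spread))
```

      With these assumed numbers the single bulk IP is flagged and the rotated traffic is not, which is exactly the limitation described above.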

  2. Pcunix posted 5 years ago

    Provide DMCA tools to help streamline the process.  Fill-in-the-blank stuff like Edweirdo did.

  3. Pcunix posted 5 years ago

    Email if scraper activity is noticed on our hubs.

  4. MelissaBarrett posted 5 years ago

    I think that since it is likely that no prevention will be fool-proof, it might be a good idea to include detailed information on how to monitor our content for duplication and how/when to fill out DMCA complaints in the learning center.

    Some basic information on copyright law wouldn't be a bad idea either.

  5. melbel posted 5 years ago

    I know that having dates in our hubs makes them appear "dated" to searchers, who are likely to avoid such search results, but perhaps a date stamp could be put in our hubs in such a way that it doesn't appear in the search results. Having a time stamp in our hubs would help with filing a DMCA, especially when contacting hosts. It would help assure them that our hub was written on x date, before it was copied, thus kind of helping us prove that it's ours.

    1. Debby Bruck posted 5 years ago in reply to this

      Sounds like a good suggestion. I'm a novice when it comes to copy protection and then it gets too technical for me. I do appreciate when the HubPages team knows how to cover this aspect of the system.

      My question relates to when we want to link or use parts of our own work as a jumping off point on other blog sites to link back to Hubpages. Will we be able to do this?

      1. Cardisa posted 5 years ago in reply to this

        Melbel, I agree. The date could be stated under the hub beside our author name where our name is underlined. It would be placed there automatically by the system.

        1. FloraBreenRobison posted 5 years ago in reply to this

          So far, only we can see the date of publication on our statistics page.

          1. melbel posted 5 years ago in reply to this

            Yeah, and it's a shame that it only shows there. On older hubs, the publish date (or date Google first saw the content, I'm not 100% sure which it is) DOES show up in Google search results. It's kind of tacky for it to show up on the older hubs but that's the way HP had it set up back in the day and Google has not forgotten on the older hubs.

            However, it WOULD be nice to have a date on the hubs that does NOT show up to the search engines as a date stamp. Like an image or something. Just a little something to show hosting companies and other services that our hubs were there first.

  6. Pcunix posted 5 years ago

    Well, Google knows when it first saw it, but I agree - I have never liked HP's redating.

  7. melbel posted 5 years ago

    Allow users to install an "invisible" link from a honeypot (see projecthoneypot.org) in order to allow them to take note of any suspicious activity themselves.

    Note: Project Honey Pot was designed to catch email address scrapers, I'm not sure if it'll work with content scrapers.

  8. authentication posted 5 years ago

    I never knew that duplicate content was making it through the filters on HubPages. Seems like the community should be playing a larger role in this though.

    1. melbel posted 5 years ago in reply to this

      Duplicate content that happens here on HubPages is generally taken care of. However, this is about our hubs being copied by other websites. It's really frustrating when people steal our hard work, so we're coming up with ideas to prevent that from happening.

  9. melbel posted 5 years ago

    The right click thing may not be the best option because sometimes when I see a hub that I think a user might have taken from elsewhere, I'll do a Google search on sample text from their hub. I also do this with my own hubs to see if they've been stolen (although technically I could copy and paste text from the hub in draft).
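
    The manual check described above (searching Google for a sample of a hub's text) can be sketched as a small helper that builds an exact-phrase search link. The function name, the 12-word limit and the use of the `q` parameter on google.com/search are assumptions made for illustration only.

```python
from urllib.parse import urlencode


def exact_phrase_search_url(sample_text, max_words=12):
    """Build a Google search URL for an exact-phrase match on the first
    few words of sample_text. The 12-word default is an arbitrary choice."""
    # Strip double quotes so they cannot break the quoted phrase.
    words = sample_text.replace('"', "").split()[:max_words]
    phrase = '"' + " ".join(words) + '"'
    return "https://www.google.com/search?" + urlencode({"q": phrase})


print(exact_phrase_search_url("hello world"))
```

    Pasting a distinctive sentence from a hub into the helper gives a link whose results are pages containing that exact wording.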

  10. Cardisa posted 5 years ago

    I support all these suggestions. I just realized today, how it felt to have my work copied word for word, even the title.

    1. melbel posted 5 years ago in reply to this

      Oh yeah, they're voracious animals aren't they? They'll copy your images, your captions, your products (with the links changed to their ref), even your screen name. These content thieves have no dignity. Some are just ignorant of the law and don't realize this hurts people, so I'm more forgiving of them... after they remove my content, of course. Most of them are complete jerks, one of whom seriously told me that I can't do anything about it because he lives in China.

      I like your new avatar by the way.

      1. Cardisa posted 5 years ago in reply to this

        Thanks, I like the gift bow up there too.

        I tried sending them a DMCA email and it bounced. I tried the Google approach and the system keeps asking for a valid URL from them. I have no idea what to do now. I'll try Google again.

        They chose my SEO hubs, and they are showing up in search before me even though they are using my titles.

        I honestly believe that some of these people are members of HP. They join to scour HP in order to make their attack!

        1. IzzyM posted 5 years ago in reply to this

          That is sickening and I fully sympathise having had it happen to me on more than one occasion.

          It's got to the stage when I cringe any time I see someone from Asia looking at my hubs. (analytics real time)

          I know it's not right to label a whole continent, but that is where the majority of the thieves come from.

          1. Cardisa posted 5 years ago in reply to this

            I believe you and they are so darn sneaky. Somehow they block their info from WHOIS too by private registration or something.

            I am so upset. My hubs have not made me much money, yet I am losing already. I cannot imagine losing over 1000 hubs the way you did. It would drive me crazy.

            These people need to be taught a lesson...I am yet to figure out what that lesson is!

  11. IzzyM posted 5 years ago

    Don't start me on China!!

    It was the China Daily who initially stole one of my articles - followed by about 1500 others, but they all thought they'd got their copy from the China Daily.

    I have since found out that China Daily is one of the biggest websites in China.

    I got about 20 copies removed. Others filed counter-claims, and Google washed their hands unless I brought in lawyers, which I can't afford.

    In the end, I changed my original article - made it longer and changed the title.

    But they won.

    1. Cardisa posted 5 years ago in reply to this

      I am sorry about that Izzy. That's rough having to lose so much when you have worked so hard.

      This really pisses me off. When it's not one thing it's the other.

      There is an HP author that manually rewrites articles and she is so good you can't flag her even if you recognize your article, but I have learned to live with that.

      If they had changed the title and maybe a few words I might live with that too. But they even copy the voting and feedback options at the base of the hub, not to mention the Amazon and eBay items!

      1. Jane@CM posted 5 years ago in reply to this

        I am ashamed of that HP writer - wish I knew who she was!!!!

      2. melbel posted 5 years ago in reply to this

        Have you reported this to HP? I would, even if it's not duped word for word. Something like this is likely to upset the community and if HP were to look into the hub and agree that it matches too much, they may disable the account.

      3. AEvans posted 5 years ago in reply to this

        That is so sad. :( I have had my own experiences with theft. On my other account on HP, someone in Denmark copied one of my hubs word for word and put it on their website. Hang in there, it will get better.

        Just think although they are our words, someone is not as good as us, therefore they have to steal in order to shine. It will come back to bite them in the end. Things like that always do. :)

        1. calpol25 posted 5 years ago in reply to this

          That is so true AE, my gran always used to say about people who do you wrong in life: "If you live long enough, and you wait long enough, you will see more than you want to see happen to them..." Hang in there Izzy, nothing stays bad forever :)

  12. melbel posted 5 years ago

    A list of sites known to be mass copying hubs. I just found another one.

  13. Silver Rose posted 5 years ago

    The definitive way of defending yourself is whitelisting - basically you whitelist certain bots (eg the googlebot and msnbot) and then block all other bots from accessing your site.

    You automatically block screen scrapers, comment spammers, sql injectors and other nasties.

    The other way people can scrape without visiting a site is to use the feed - and all hubpages needs to do to allow us to block them is to give us the option of turning feeds off or setting them to summary.

    If you block the bots and close down the feeds, the only way to scrape is to do it manually - and most of these guys won't bother, it's too much work.

    Anyway, if you are trying to defend your own website, I recommend reading the following blog:

    http://incredibill.blogspot.com/

    In particular check out the bit about scrapers in the sidebar.

    1. melbel posted 5 years ago in reply to this

      I don't think it would stop bots that don't follow the rules set forth in robots.txt

      1. Pcunix posted 5 years ago in reply to this

        I don't think they meant robots.txt.  I'm sure they mean block them with the web server or with the firewall.
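
        The whitelisting logic discussed in this sub-thread might look something like the sketch below when applied at the web server rather than through robots.txt. Everything here is an assumption for illustration: the function name `should_block`, the marker list, and the choice to match on the User-Agent header alone. User-Agent strings are easy to fake, so a scraper that claims to be Googlebot would pass this check; a real setup would need a stronger way to verify the allowed bots.

```python
# Bots named in the thread as worth allowing (matched case-insensitively).
ALLOWED_BOTS = ("googlebot", "msnbot")

# Assumed substrings that suggest an automated client; not an exhaustive list.
BOT_MARKERS = ("bot", "crawler", "spider", "scraper", "curl", "wget")


def should_block(user_agent):
    """Return True if a request with this User-Agent should be refused."""
    ua = (user_agent or "").lower()
    if not ua:
        # A missing User-Agent is treated as automated traffic.
        return True
    if any(name in ua for name in ALLOWED_BOTS):
        return False
    return any(marker in ua for marker in BOT_MARKERS)


print(should_block("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(should_block("ExampleScraperBot/1.0"))
```

        Ordinary browsers carry none of the markers, so they fall through to the final check and are let in.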

  14. Jason Menayan posted 5 years ago

    Hello,

    We discussed this today. The ongoing, constant detection of Hub content showing up on other sites is, apparently, very, very expensive. It would make it infeasible to roll out to everyone, unfortunately.

    Copying of content is not something you can completely deter (this is the Internet). However, it might provide some solace that copied content appearing on a brand-new, crappy site with nothing but copied content is unlikely to outrank you. (If there are exceptions to this, please do share them publicly here so we can see what's going on.) It might be grating, but it's unlikely to impact your traffic and/or earnings.

    If you do detect sites that are copying your Hubs, we encourage you to follow the procedure here:
    http://hubpages.com/learningcenter/how- … -complaint

    1. Cardisa posted 5 years ago in reply to this

      Jason, I did try that, and all of the contacts I sent complaints to through the sites were bogus. I tried using Google, but it keeps telling me to use a valid URL, whatever that means.

      But as you said, the sites are crappy so maybe Google will realize what's happening and do something about it.

      However, one of the sites with my nutrition hubs is showing up before me in Google. I don't know why or how, but it really upsets me.

      1. Jason Menayan posted 5 years ago in reply to this

        You filed a DMCA request through Google to have it deindexed, and that didn't work?

        1. Cardisa posted 5 years ago in reply to this

          Yes, I keep getting an error message about the URL not being valid. Maybe I am daft but I followed all the instructions.

          1. Jason Menayan posted 5 years ago in reply to this

            Can you share the link here? Maybe we can help you troubleshoot it.

            1. Cardisa posted 5 years ago in reply to this

              Are you talking about the Google link or the fake website links?

              I will search again because I was so frustrated I just didn't save it.

              1. Jason Menayan posted 5 years ago in reply to this

                The fake Website link (i.e. the page that's copying your nutrition Hub)

                1. Cardisa posted 5 years ago in reply to this

                  Thanks Jason, will post them. It's more than one.

                  1. Cardisa posted 5 years ago in reply to this
  15. melbel posted 5 years ago

    This was my entire day:
    https://www.google.com/search?client=ub … p;oe=utf-8

    7 pages deep.

    What do I do when this happens? This is the second hub of mine that I've caught all over the Internet. :(

    1. Rising Caren posted 5 years ago in reply to this

      Actually, if you go to the last page it tells you there are more that google omitted. In total, you're looking at 9 pages.

      This really sucks. People are so selfish.

      1. melbel posted 5 years ago in reply to this

        Ugh, for some reason I ignored that little bit, but yep.. there's more. One of the webmasters was kind enough to tell me:

        "You don't need to threaten us at [redacted] to remove an article or author if they have stolen it. Since our articles come to our website from mass article submission sites like SubmitYourArticles.com, it is likely re-printed on 100's of other article directories as well."

        Ironically, it was just a generic DMCA I filled out and not a threat, but yeah, so I guess people are still keen on using PLR articles as content on their sites. It's a shame that some random guy has decided to submit my hub somewhere as PLR and it's essentially all over the Internet.

        1. Rising Caren posted 5 years ago in reply to this

          I once downloaded a small free PLR pack to see what they looked like (just to see. I wasn't planning on using them. I was just wondering what kind of quality they were.) A lot of them sounded awkward, as if they were spun and then slightly corrected.

          So I wouldn't be surprised if those PLR packs were stolen from other places. The sad thing is a lot of the massive PLR packs cost money. I know because I still get mail from that PLR provider about all the sales on the thousands of new articles that are available in the new packs.

    2. Jason Menayan posted 5 years ago in reply to this

      Personally, I would pick your battles. I would not bother with the sploggy sites. I would go after ArticleBank with a DMCA complaint. If they don't respond within 48 hours, I would file complaints with Google.

      1. melbel posted 5 years ago in reply to this

        Just a quick question here, but why ArticleBank over the others? They all look kind of the same and AB doesn't seem like it's the source of the other articles. Nevertheless, they did take it down. Some of the sites even just outright deleted the user's account (and all his articles), which may have helped other hubbers.

        I do need to pick my battles, it's not even an example of my best work to be honest, and I actually only checked while I was starting a major overhaul of it. I stopped the 'overhaul' in order to get those articles removed (if they see that my article is different then hosts and websites aren't as likely to remove it.)

        1. Jason Menayan posted 5 years ago in reply to this

          Not sure...maybe AB just has higher authority than those other sites, if it wasn't the source of the originating article. I'm glad they took it down. :)

    3. Cardisa posted 5 years ago in reply to this

      Melbel, I followed your link and was aghast at the number of copies that article has. At least yours is at the top.

  16. FloraBreenRobison posted 5 years ago

    Cardisa warned me and sent links to me about my work being stolen. I must say the whole process is tiring and depressing. :(
    That might be why I write so many depressed articles, apparently? :) (Anyone wanting to know what I mean, check out my comment capsule on my rant I wrote about Dad being in a car accident. No, I am not trying to generate traffic. I'm just ticked.)

    1. Cardisa posted 5 years ago in reply to this

      Do the DMCA thing from the link Jason gave me. It's quite easy from there and they respond really quickly. I guess Google has proof of the article that was indexed first.

 