Multiple stolen sentences not whole articles

  1. LeanMan posted 11 years ago

    I want people's opinions about the following.

    I have been checking for stolen content by searching for sentences from further into some of my hubs, and I have started uncovering dozens of pages where random sentences have been stolen from a hub and combined with random (related??) sentences from, I guess, other people's articles to create new articles that are just garbage. For example:
    http ://legalhelp.newsvine. com/_news/2013/06/05/18769981-a-look-at-significant-factors-for-law

    Note the couple of extra spaces after http and before com so that it does not show the link.

    For some hubs I have literally dozens of sites and pages that have stolen several sentences and created complete garbage articles. These sites are clearly not going to respond to a DMCA notice.

    My question is: is it worth expending time and energy finding out who hosts these sites and sending DMCA notices to them? Opinions please, because I am finding too many and I don't want to waste my time doing something that is not going to help me in the slightest.

    1. Judi Bee posted 11 years ago in reply to this

      I have had this happening for well over a year on several sites and I believe others have too. Often they nick sentences from various articles by different authors with a common theme, e.g. hotels, and have a backlink somewhere amongst the gibberish.

      There are also the stolen articles that show up in search, but the page is no longer there.

      So far, I've not found a solution.

    2. tsmog posted 11 years ago in reply to this

      You know much more than I of this in more ways than one. With your six-sigma philosophy what does that share with efficiency? I am not qualified with that either other than being introduced. Is it worth your time?

      Again, I ponder and I only have myself as a direct comparison. The obvious is that, reading your contributions here in the forum, I have learned much on SEO, or at the least it sparked interest in looking at different factors. I used to take the stance that if someone copied my work then it was complimentary, although I am mixed today. Again, my performances as bread winner articles are not a good comparison.

      Personally, I do not have enough time to work long at researching keywords and building relationships with backlink sources like websites, much less check for copied content. I spend more time in the garden today. Next year may be different. I feel the time spent researching an article has a greater value than the same time spent looking up URLs with Copyscape and the like, at least today. If I was researching that methodology of discovery and subsequent proactive actions, maybe it would have a greater value as a contribution toward a niche article or article market. I dun'no . . .

    3. awordlover posted 11 years ago in reply to this

      Yes, they do. I filed 145 notices against WordPress sites and every one of them was taken down by WordPress's host. Not one was a fully copied hub. Every one was parts of sentences from three of awordlover's hubs, even copying her name, which she mixed into the middle of paragraphs in several of her hubs about 3 years ago, speaking of herself by name in quotations. When they take the parts of sentences, they use a scraping tool that takes words with high search value and puts them into paragraphs that make no sense when read out. As long as you can identify the hub it was taken from, and as long as you write out verbatim which parts of sentences in their article belong to you, you will most likely get a response 95% of the time.

      I got 145 of these. lol
      WordPress.com DMCA
      To Me
      Mar 22
      Hi there,

      Thank you very much for your report.

      The site in question has been removed from WordPress.com for violating our Terms of Service

      If you contacted the site and got no response, then go to whois.com and find out who the host is if you haven't done so already.

      Follow Writer Fox's suggestions in the other hub about stolen content. I usually get no response from the host and file a DMCA notice against them if they are not WordPress, the Experience Project, or some other site that has its own DMCA setup.

  2. Helena Ricketts posted 11 years ago

    I had one that stole bits and pieces here and there of one of my hubs. I sent the site a complaint and received no response. I even called them and received no response. I then sent a complaint to Google, and Google deindexed their page that had the content on it.

    It's worth a try.  I think it really depends on how much they stole.  One or two sentences probably won't get anywhere but if they stole quite a bit, you could more than likely get Google to yank it.

    1. LeanMan posted 11 years ago in reply to this

      The thing that really gets me is that the pages are clearly garbage and make zero sense at all. Why are they even indexed by Google? I thought all of these changes to the algorithm were meant to help Google get rid of all this crap.

      1. Helena Ricketts posted 11 years ago in reply to this

        You are right.  That is gibberish garbage.  hmm

        It's on an actual contributor website.  I'd contact the website first and give them a link to your hub that has been partially copied along with the "gem" that is posted on their site.

        The site administrators really should (and hopefully will) just take that piece of literary trash down once it is called to their attention.

        I didn't look at the page until a couple of minutes ago and wow, it gave me a headache just trying to read it.

        1. LeanMan posted 11 years ago in reply to this

          I have sent a DMCA notice to the site, as there are several of my articles there, and I have also flagged this particular user through their own internal reporting system as being a piece of trash that maybe they should think about removing. This contributor has close to 100 articles which all appear to be the same garbage.

      2. tirelesstraveler posted 11 years ago in reply to this

        They are using your sentences for SEO purposes only. I currently have one hub copied in 5 different languages for what, as far as I can tell, is SEO purposes only. They have nothing to do with Google. If they weren't causing my hub score and author score to drop I wouldn't bother with them.

  3. grand old lady posted 11 years ago

    Some phrases are commonly used, so how do you determine whether one was stolen or only sounds like yours? I mean phrases like "at the end of the day" and others like it, although less famously used. Some adjectives commonly go with a specific noun.

    1. LeanMan posted 11 years ago in reply to this

      These are clearly stolen, grand old lady: full sentences, with other sentences from the same hub also repeated elsewhere. In the example above there are several long sentences from one hub, while other pages have just 2 or 3 sentences. I guess the articles are automatically generated by a program that selects a half dozen articles related to the keywords they want and then pulls random sentences from each article.
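
A minimal sketch of how the check described above (whole sentences lifted verbatim, as opposed to common phrases) could be automated. It assumes you have already pasted the text of your hub and of the suspect page into two strings. The function names and the `min_words` threshold of 8 are illustrative assumptions, not part of any tool mentioned in this thread; the threshold exists so that short stock phrases such as "at the end of the day" are not counted as theft.

```python
import re


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p.strip() for p in parts if p.strip()]


def normalize(text):
    """Lowercase the text and keep only words, joined by single spaces."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def find_copied_sentences(my_text, suspect_text, min_words=8):
    """Return sentences from my_text that appear word for word in suspect_text.

    Sentences shorter than min_words are ignored, because short common
    phrases can match by coincidence.
    """
    # Pad with spaces so a match can only start and end on word boundaries.
    suspect = f" {normalize(suspect_text)} "
    copied = []
    for sentence in split_sentences(my_text):
        norm = normalize(sentence)
        if len(norm.split()) >= min_words and f" {norm} " in suspect:
            copied.append(sentence)
    return copied
```

The list it returns is the set of sentences you would quote verbatim in a complaint. It only catches exact copies (ignoring case and punctuation), so reworded or translated sentences would need a different approach.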

      1. RachaelOhalloran posted 11 years ago in reply to this

        I guess you already know the site NEWSVINE is owned by NBC News. I wanted to share with you that I had a similar problem with CNN's ireport site two weeks ago over scraped content belonging to awordlover's account.

        Scraped content is when they take anything from a phrase to full sentences and use them on their websites, sometimes in a post that makes sense but usually in a post of garbage that makes no sense. HP doesn't always catch duplicate content and notify me, so I keep checking by selecting keyword phrases and googling them to see if anyone has copied awordlover's content. It happens at least ten times a week, so it keeps me busy.
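
The routine of selecting phrases and googling them can be sketched as below. This is only an illustration under stated assumptions: the function names, the 8-word phrase length, and the choice to skip the opening sentence (following the earlier suggestion to use sentences from further into a hub) are all hypothetical. The only external detail relied on is that wrapping a phrase in double quotes asks a search engine for an exact match.

```python
import re
from urllib.parse import quote_plus


def pick_check_phrases(text, words_per_phrase=8, max_phrases=3, skip_sentences=1):
    """Pick short runs of words from sentences further into the text."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    phrases = []
    for sentence in sentences[skip_sentences:]:
        # Drop the closing punctuation, then split into words.
        words = re.sub(r"[.!?]+$", "", sentence).split()
        if len(words) >= words_per_phrase:
            phrases.append(" ".join(words[:words_per_phrase]))
        if len(phrases) == max_phrases:
            break
    return phrases


def exact_match_query_url(phrase):
    """Build a Google search URL that looks for the phrase in quotes."""
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


if __name__ == "__main__":
    hub_text = (
        "Intro sentence here. "
        "Kaizen events bring a small team together to fix one problem quickly. "
        "Short one."
    )
    for phrase in pick_check_phrases(hub_text):
        print(exact_match_query_url(phrase))
```

Opening each printed URL in a browser shows which pages contain that exact run of words. The sketch deliberately stops at building the URLs and does not fetch results automatically.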

        The CNN ireport authors had purely garbage posts with links leading to advertisements for weight loss products etc. There were 52 authors who took phrases from awordlover's hubs and placed them over 52 different posts. Some copied her name from the HP copyright logo and mixed it within the lines of their garbage posts as well. When I routinely search for her name on Google and Bing, it usually shows up that way. After I saw her name, I saw select phrases that belonged to her hubs too.

        The ireport part of CNN has little affiliation with CNN, and I learned that ireport and Newsvine are similar in nature in that CNN and NBC have no authority over the content posted on their sites, as it says in their TOS.

        CNN's ireport website is a free-for-all for anyone to post. A lot of HP articles were copied in phrases to this site. These are not reporters or journalists employed by CNN or Newsvine; they are bloggers and content writers who are just on there to make money.

        Each posting account (author) has ads running, so they also have AdSense. Newsvine gives directions in their TOS on how to put AdSense on the author's account. You can either report the author to have their AdSense revoked, or be nice and follow the site's TOS directions first. If that gets no results, then I'd go after their AdSense.

        For Newsvine, NBC News owns the site (and General Electric owns them), but they don't monitor the content at all.

        However, like CNN's ireport, Newsvine says in their FAQs that they will remove any reported spam, though likely only in their own good time. If you haven't done so already, use their directions for reporting spammy stuff.

        this is the link broken up with the words dot and slash

        support dot newsvine dot com slash kb slash other frequently  -asked questions slash how do I report a malicious spammer to staff

        In addition, here is the link for contact us. Write the same thing to both screens.

        dot newsvine .com slash underscore tps/ about/ contact

        Like I did on CNN, you will have to create an account in order to use their reporting method.

        This is what I did to get awordlover's name and phrases removed. The first thing I did was go to every single one of the 52 posts and leave a comment singling out every one of the parts that were copied.

        That way, when my report was checked for the TOS violations I wrote about, they would see the comment saying it was stolen.

        I also did something kind of underhanded but I didn't care, since the site wasn't monitored anyway.

        I included LIVE links back to awordlover's original hubs, so the link would drive traffic back to her hubs and still get the point across at the same time.  I don't think they cared too much for redirecting traffic from their site back to awordlover's site. lol

        Per their TOS, I then used their form to report spam. In my diatribe, I again listed all 52 authors and posts, telling them that I had left comments under each post citing that their site was guilty of "PLAGIARISM" and "100% COPIED CONTENT", typed in all caps.

        I sent the exact same email to their "contact me" and "report spam" links.

        Capitalizing those words makes them stand out to the skimmers who monitor their email, so that squeaky-wheel email usually gets attention; otherwise it gets ignored.

        When you write to them, give them line-by-line citations of the abuse and include your URL to show that the content belongs to you. Tell them you want the post taken down immediately because it is all 100% copied.

        I got no return email response. But I did see action on the website.

        They took down all the posts which had awordlover's phrases, sentences and her name within 3 days. I know because I checked every 8 hours or so for the whole 3 days.

        But CNN didn't write back to me until 4 weeks later to tell me they removed the post and the authors.

        In their response, they said they did not notice my comments on each post for more than 48 hours.  None of the authors acknowledged the comments, of course. However, some readers did comment which is why they acted swiftly (@!*##) to remove posts and authors.

        They apologized and hoped I would keep my account open to consider becoming a contributor.

        I thanked them for doing the right thing and didn't answer if I would keep the account open or not.

        For Newsvine, wait no more than 3 days. If they are true to their TOS, they should take care of the problem. NEWSVINE is a US-owned company in Connecticut, so it shouldn't be hard to communicate with them.

        On Newsvine, they have a sidebar topic, CODE OF HONOR. Click it and then click MORE to see their policy on spammy junk.

        On the same sidebar, click KNOWLEDGE BASE, go to FAQ, and expand all 13 questions. Look for "How Do I Report A malicious user or spammer to staff?"

        You'll probably have to create an account to make the report. Then go to the page where you found the spam post, click the green box and follow the directions.

        In case you don't see where I mean, here is the link with spaces and slashes

        support dot newsvine dot com slash kb slash other frequently  -asked questions how do I report a malicious spammer to staff

        I guess you have their whois info but in case you don't, here it is:
        www dot whois.com slash whois slash newsvine.com

        Their abuse contact is admin@internationaladmin.com

        Their business offices in CT for administrator is +1.2033732962

        Their toll free phone in Connecticut is: +1.8887802723

        After they remove it, to get it out of Google's search results, enter all of the URLs into the URL removal tool. The message will say they can't find the page and ask whether you want to remove it from search results. Click yes, and within 48 hrs it is gone from search results.

        I've read in other forums that hubbers don't think they need to do this because Google will eventually find them and remove them. They will. But not for a while. If you want to remove all mention of your stuff on that URL, it is quicker to use the removal tool than to wait. It removes them from the index, and then there are no more worries, as some hubbers have commented.

        1. grand old lady posted 11 years ago in reply to this

          That is so sad. Actually, in the Philippines we had a senator who stole content online for his senate speech. In the first case, the blogger filed a complaint. Then the second time, the Senator got a quote from one of the Kennedy brothers, but translated it to Tagalog and claimed ownership. Really, really pathetic that a senator does this.

  4. Cardisa posted 11 years ago

    This has been happening to me for a long time but I have no idea what to do about it. Can we report this as well?

    1. LeanMan posted 11 years ago in reply to this

      It is still plagiarism so yes we can report it. My question is if it is really worth it if the pages are clearly garbage and only have a few lines - but they are still being indexed by Google which is what really worries me!

      1. RachaelOhalloran posted 11 years ago in reply to this

        After they remove the content, you can go to the Webmaster Tools URL removal tool and put in the URL. The message will say they can't find the content and ask whether you still want to report the URL for removal. Click yes, and it is gone. If you wait for Google to do it, you will be waiting a long time. Notifying them gets it done quicker.

        Also, as a point of interest, every hub you write and later delete shows up as an error on your Google Analytics when they crawl your site. The more errors you have listed on your GA report, the lower you are indexed on Google. So remove any error pages you are notified of on the GA report, and if you know you aren't going to republish, remove those on your HP stat page too, because they are also counted as crawl errors. Crawl errors change your rank and index.

  5. crazybeanrider posted 11 years ago

    I have had the same thing, but I also found a site that used my sentences, added someone else's pictures to them, and passed it all off as their own on their blog. It's very annoying to see your work all spruced up or garbled into a mess on someone else's site.

  6. Sonja Larsen posted 11 years ago

    I wonder what percentage of a Google search is just repeated articles. I do think there's some kind of automated feeding system happening that is out of Google's control. I once Googled my user ID that I often use for public online activity and found that a health forum comment I made about garlic was on 32 different sites. It seemed like that one comment was automatically uploaded all over the place.

  7. tirelesstraveler posted 11 years ago

    Going searching. My best hub is dying on the vine because it has been hijacked and I can't get anything done about it. I think I will go look at this new site.
    Thanks

 