Content Scrapers – How to Find Out Who is Stealing Your Content & What to Do About It

If you have been blogging for a while, chances are you are familiar with content scrapers. Content scrapers are websites that steal your content for their own blogs without your permission. Some content scrapers will just copy the content off of your blog, but most use automated software that takes the content from your RSS feed and posts your content to their site like it is a new post.

In this post, we are going to look at some potential link building benefits to content scrapers, how to find out what sites are scraping your content, and what you can do if you want to either benefit from the linking standpoint or have them take it down.

Linking Benefits of Content Scrapers

Last week, I was happy to see that I was listed in ProBlogger’s 20 Bloggers to Watch in 2012. Within 24 hours, I received a notification in my WordPress dashboard that a page on my blog had been linked to in the post on ProBlogger’s site.

After receiving the original notification from the ProBlogger post, I also received another 18 trackbacks from sites that had stolen the content in their post verbatim. Trackbacks are WordPress’ way of letting you know that another website has linked to a post on your blog. In this case, these 18 sites had posted the content exactly like the original post – with the links back to my blog still intact.

It was then that I started contemplating the potential link building benefits of content scrapers. These are not by any means quality links – the highest Google PageRank was a PR 2 domain, many were stealing content in a variety of languages, and one even had the nerve to use some kind of redirection script to take away the link juice of outgoing links! So while these links didn’t have the same authority that the original post had, they still count as links.

How to Catch Content Scrapers

Unfortunately, unless you want to continuously search for your post titles in Google, you’ll only be able to easily track down sites that keep your in-content links active. If you want to know what websites are scraping your content, here are a few tips to sniff them out.

Copyscape

Copyscape is a simple search engine that allows you to enter the URL of your content to find out if there are duplicates of it on the Internet. You can get a few results using their free search, or you can pay for a premium account to check up to 10,000 pages on your site and more.
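
If you are curious what a duplicate check looks like in principle, here is a minimal sketch in Python. It is not how Copyscape works internally (I don’t know that); it is just one common way to compare two texts: break each into overlapping five-word sequences and measure how many of your original’s sequences show up on the suspect page. The function names and the five-word window are my own assumptions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_ratio(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    original_shingles = shingles(original, size)
    if not original_shingles:
        return 0.0
    shared = original_shingles & shingles(suspect, size)
    return len(shared) / len(original_shingles)
```

A ratio close to 1.0 suggests the suspect page contains your post more or less verbatim, while a ratio near 0.0 suggests the two texts have little in common. You would still need to fetch the page text yourself and visit the site to be sure.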

Trackbacks

The first way is through your trackbacks in WordPress (as shown in the image above). Many of these will show up in the spam folder if you use Akismet. The key to getting trackbacks to appear from content scrapers is to always include links to other posts in your content. Be sure those links have great anchor text too, if you’re going for a little extra link juice. And even if you are not, internal linking with strong anchor text is good for your on-site optimization too!

Webmaster Tools

The next way to catch them is in Webmaster Tools. Simply go to your site in Webmaster Tools, and look under Your Site on the Web > Links to Your Site. Then sort by the Linked Pages column.

Anyone thinking about link building benefits at this point is probably noting the sheer volume of links from these sites, some of which are content scrapers. Essentially any site that is linking to a lot of your posts that isn’t a social network, social bookmarking site, or a die-hard fan who just loves linking to you is potentially a content scraper. You’ll have to go to their website to be sure. To find your links on their site, click on one of the domains to see the details of what pages on your site they are linking to specifically.

Then, click on one of your links to see which pages on their site are linking to yours.

You can see here that they are just blatantly copying my post titles. When I visited one of the links, sure enough, they were copying my entire posts in their full glory onto their site.
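
If you have a long list of linking pages, the same rule of thumb (a site that links to a lot of your posts and isn’t a social network is worth a closer look) can be sketched in a few lines of Python. This assumes you already have the links as pairs of (linking page URL, your post URL); the allowlist, the threshold of five posts, and the function name are hypothetical choices of mine, not anything Webmaster Tools provides.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical allowlist: domains you expect to link to many of your posts.
KNOWN_GOOD = {"twitter.com", "facebook.com"}


def suspicious_domains(links, min_posts=5, known_good=KNOWN_GOOD):
    """Count distinct posts linked per domain and flag the heavy linkers.

    links: iterable of (linking_page_url, your_post_url) pairs.
    URLs must include the scheme (http:// or https://) to be parsed correctly.
    Returns {domain: number_of_distinct_posts_linked}.
    """
    posts_by_domain = defaultdict(set)
    for source_url, target_url in links:
        domain = urlparse(source_url).netloc.lower()
        if domain.startswith("www."):
            domain = domain[4:]
        posts_by_domain[domain].add(target_url)
    return {
        domain: len(posts)
        for domain, posts in posts_by_domain.items()
        if len(posts) >= min_posts and domain not in known_good
    }
```

Anything this returns is only a candidate. A die-hard fan will show up here too, so you still have to visit the site before deciding it is a scraper.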

Google Alerts

If you don’t post often or want to keep up with any mentions of your top blog posts on other websites, you can create a Google Alert using the exact match for your post’s title by putting the title in quotation marks.

I deliver all of my Google Alerts to an RSS feed so I can manage them in Google Reader, but you can also have them delivered regularly by email. You’ll even get an instant preview of the types of results you will get.
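
If you want to set up alerts for many posts at once, you can generate the exact-match search strings ahead of time and paste them in. Here is a small Python sketch; the function name is my own, and it only builds the query text (it does not talk to Google Alerts in any way).

```python
def exact_match_query(text):
    """Wrap text in double quotes so it is searched as an exact phrase."""
    # Drop any double quotes inside the text and collapse repeated whitespace,
    # so the result is one clean quoted phrase.
    cleaned = " ".join(text.replace('"', " ").split())
    return f'"{cleaned}"'


# Example: build one query per post title.
titles = ["How to Catch Content Scrapers", "Linking Benefits of Content Scrapers"]
queries = [exact_match_query(title) for title in titles]
```

The same function works for any phrase you want matched exactly, not just titles.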

How to Get Credit for Scraped Posts

If you use WordPress, then you definitely want to try out the RSS footer plugin. This plugin allows you to place a custom piece of text at the top or bottom of your RSS feed content.

The result is this simple line on my blog posts when viewed through an RSS feed.

As you can see, even if you aren’t using it for the purpose of getting credit back to your posts when content thieves steal it, you can still use it for a little extra bit of advertising with the possible benefit of people who subscribe to your RSS feed clicking through to your website or social profiles. And when someone does scrape your content from your RSS feed, it shows up there too.

So in the event that someone finds your scraped content, they will hopefully notice the credit before assuming it was created by the blog that stole it. If you don’t have WordPress, you can simply include a note at the top or bottom of your content that includes the same information.
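
If you build your feed yourself rather than through WordPress, the idea is easy to reproduce. Below is a minimal Python sketch of appending a credit line to each feed item’s HTML. It is not the plugin’s code, and the wording, function name, and parameters are assumptions for illustration. It uses full URLs on purpose, since relative links would stop pointing at your site once the content is copied elsewhere.

```python
from html import escape


def add_feed_footer(content_html, post_title, post_url, blog_name, blog_url):
    """Append an attribution line with links to a feed item's HTML content."""
    footer = (
        '<p>You are reading <a href="{post_url}">{post_title}</a> '
        'from <a href="{blog_url}">{blog_name}</a>.</p>'
    ).format(
        # Escape everything so titles with & or quotes don't break the markup.
        post_url=escape(post_url, quote=True),
        post_title=escape(post_title),
        blog_url=escape(blog_url, quote=True),
        blog_name=escape(blog_name),
    )
    return content_html + "\n" + footer
```

You would call this on each post’s content just before writing it into the feed, so the credit travels with the content wherever the feed is republished.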

How to Stop Content Scrapers

If you’re not interested in anyone copying your content, then you have a few options to choose from. You can start by contacting the site that is stealing your content and sending them a notice that you want all of your content removed immediately. You can do this through the site’s contact form, email address, or post it to any social accounts they list.

If there is no contact information on the website stealing your content, you can do a Whois Lookup to (hopefully) find out who owns the domain.

If it is not privately registered, you should find an administrative contact’s email address. If not, you should at least see the domain registrar which, in this case, is GoDaddy and/or the hosting company for the website which, in this case, is HostGator. You can try to contact both companies (HostGator has a DMCA form and GoDaddy has an email) and let them know that the domain in question is stealing copyrighted content in hopes that the website will be suspended or removed.
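
If you need to look up several domains, the Whois step can be scripted. Here is a rough Python sketch: WHOIS is a plain-text protocol over TCP port 43, so the first function sends the domain name and reads the reply, and the second pulls out the registrar and any email addresses. I am assuming the whois.verisign-grs.com server, which covers .com and .net only, and the "Registrar:" label varies between registries, so treat the parsing as a starting point rather than something that works everywhere.

```python
import re
import socket


def whois_query(domain, server="whois.verisign-grs.com", timeout=10):
    """Send a WHOIS query (plain text over TCP port 43) and return the reply."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def extract_contacts(whois_text):
    """Pull the registrar name and any email addresses out of WHOIS text."""
    registrar = None
    match = re.search(
        r"^[ \t]*Registrar:[ \t]*(.+)$", whois_text, re.MULTILINE | re.IGNORECASE
    )
    if match:
        registrar = match.group(1).strip()
    emails = sorted(set(re.findall(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", whois_text)))
    return {"registrar": registrar, "emails": emails}
```

For a privately registered domain, expect the emails to belong to the privacy service or the registrar’s abuse desk rather than the site owner, which is still a useful place to send a complaint.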

You can also visit the DMCA and use their takedown services to remove anyone who is copying your photos, video, audio, blog, or other content. They even offer a WordPress plugin to incorporate a DMCA protected badge on your site to warn potential thieves.

Have you ever dealt with content scrapers and thieves? Do you leave it alone for the link benefits, or do you fight back? What other tools, services, or other preventative tactics do you use to block content scrapers? Please share your thoughts and experiences in the comments!

About the Author: Kristi Hines is a freelance writer, professional blogger, and social media enthusiast. Her blog Kikolani focuses on blog marketing for personal, professional, and business bloggers. You can follow her on Twitter and Facebook.

  1. Great post, Kristi.

    I’m always on the look out for content thieves.

  2. In the past I’ve used Tynt. You add a script to your blog and it reports what has been copied.

  3. I think it’s very important to make a distinction between content scrapers and content curators.

    Content scrapers are thieves — that is, they take your writing and post it, without permission or attribution, as their own work.

    Content curators — especially dedicated ones — create carefully-selected collections of posts for specific audiences, that cross many different sources…and include only the headline and a snippet of text, as well as tags that help guide the audience to a set of sub-topics.

    I’m a content curator, editing a relatively new site called “Rich Content Daily” (http://www.RichContentDaily.com). I ran across your post because I’m always on the lookout for interesting and important articles for practitioners and creators of rich content for content marketing and online learning.

    I believe that content curators help bloggers for several reasons: 1.) they expose their content constantly to relevant new audiences; and 2.) they offer the SEO “juice” that comes with inbound links. Our objective is always to encourage click-through to the original article, so we only publish a “teaser” snippet, always being careful not to “give away the punchline” of the article. (In our case, we also create original content ourselves on the site.)

    Do you agree with this important distinction between scrapers and curators, especially when the curation is carefully hand-done, as we do at Rich Content Daily? Do you have any advice for curators on how we can be even more helpful to authors whose work we so much respect?

    • Scrapers and curators are two entirely different creatures, imo. If you’re not publishing the entire article, you aren’t a scraper.

      I won’t speak for anyone else, but we love curators. :)

    • Hi Michael! I do believe there are good ways to go about content curation, and so long as you are getting permission from the author and crediting them in the repost, then there’s nothing wrong with it at all. I’ve given permission to a few curation sites – I basically look for a site where I can expand my own reach.

      As far as advice, I would say that you need to put up the reasons why someone should want content on your site. Traffic stats, future plans, how it will be displayed, customized byline, etc. Maybe even some control as to which posts will be curated and which won’t, like letting the author submit a category feed instead of their main one. :)

  4. I’ve seen a lot of sites copy my content verbatim, leaving links intact, but they always show that the post was written by me and my name is linked back to my site. Technically, is that scraping?

    • I consider any site that uses your full content without your permission a form of content theft. Scraping is really the process for grabbing the content – specifically the software / plugins that will “scrape” content from RSS feeds. Some of those will do it with full attribution, others will just grab what is there, and some sites specifically strip any links or originating author information.

  5. Excellent post, especially about how to turn content theft to your advantage…

  6. Kristi, I love that you’ve found the “bright side” of the issue. We could complain all day about the scrapers and make ourselves crazy fuming, but they aren’t going away any time soon. I’d rather join you in making lemonade! Thanks for pointing us to the sugar.

  7. Great post Kristi!

    I installed the RSS-footer immediately and it works like a charm!
    Thanks for the tip!

    Cheers,
    Kris
    https://twitter.com/KrisOlin

    PS. I’m now following you on Twitter as well.

  8. Is it worth hunting them down? If you run a site with several new posts a day, it’ll be a lot of work keeping track of them and hunting them down.

    • It depends on how proprietary you want to keep your content. I know some people that do hunt them down and stand up to them. I really don’t have time to do it, but sometimes I will check out the sites to at least make sure they are not posting my content next to something offensive.

  9. Really enjoy the rss footer options

  10. Hi! This is very useful information. Thanks for sharing. Now I can catch content thieves.

  11. Awesome post, Kristi! My content, but especially my photos, are hijacked fairly frequently. I watermark all my photos, and sometimes the thieves have the nerve to crop out the watermark. I have started watermarking them higher up in the photo and more prominently. My content is also frequently plagiarized. People say that copying is the sincerest form of flattery. I really don’t want to be flattered in that way!

    For me, the biggest problem is website designers who steal images off Google Images to make their clients’ sites pretty. Very often, the client is totally unaware where the photos came from or that they are infringing on someone’s copyright. I received a nasty email from a web designer after sending him a copy of the US Copyright law (he said the photos didn’t say “copyright” so he was free to use them – he was wrong). He did remove the photos, but not without a fight. I doubt he will ever use another of my photos, but he made it clear he was not going to change his ways.

    Thank you for this great information (I love all the details) and for getting the word out.

    • You might try contacting the client in whose material the image appears, and send them a bill for the licensing fee for the image, with an explanation that their web designer has illegally used your material.

      I’ve seen it done and I’ve seen it work. Even if the bill doesn’t get paid, you make the designer look horrible. Which he/she should.

    • Hi Michelle! I’d like to think that copying is flattery, but a lot of copying these days comes from automation based on a keyword. Flattery would be a bit better comparatively. :)

      When it comes to photos, that is definitely something you want to fight for. My husband sells his photos as prints, so it’s almost product theft when his are taken. He uses the metadata option in Lightroom to automatically add his name, website, and copyright information into each photo he imports. I’ve noticed that when you upload one of his photos, that info actually pops up in the details in WordPress. I’m not sure what program you use, but you could put your copyright info in that way, so that it is on each individual photo.

    • Michael, don’t be such a fool. Professional web designers don’t steal photos from other websites or from Google Images. How are you so sure he/she was a web designer and not an amateur calling themselves a web designer?

  12. I used Copyscape and found that someone has copied the content of my social media blog. But what would be the next step?

    • Hi Jonathan. The next steps are listed in the last section of this post under “How to Stop Content Scrapers.” Basically you can report the site to their hosting company or work with the DMCA to do a takedown. Good luck!

  13. Kristi, thanks for pointing out what most people don’t seem to realize — if you include lots of internal links in your content (which is a good practice anyway), scrapers don’t do much harm. (I’m talking written content, photographers should IMO go after those who steal their work.)

    If we spent any time going after the hundreds of scrapers who lift Copyblogger posts, we’d have to choose what productive activity to stop doing instead. It’s annoying, but there are worse annoyances on the web.

    I really like that RSS footer plugin.

    • Hi Sonia! I know what you mean – tracking down people who lift content from my site wouldn’t be so bad, but if I started going on the lookout for the other sites I write for, then I’d be in for a time-consuming fight. I figure even if they are lower quality, having the backlinks is nice. In the event that someone did find a scraped post over the original, at least they’ll be pointed back in the right direction. Glad you like the plugin! :)

    • Sonia, I realize this is an old article, but I have to wonder if scrapers (or blatant lazy plagiarists) started targeting Copyblogger after you basically announced Copyblogger doesn’t pay too much attention to scrapers. Hopefully in the time since this post your company has started tracking it.

  14. I am so glad that I found this. I hate thieves. I work hard to create my own content. I pay for my images. Why should people get away with stealing my hard work? I have +1d this, liked it, shared it and now I’m going to bookmark it. After that I’m going on a crusade. If I can get a thief’s blog or site shut down, I will. If everybody took action, they would think twice about their parasitic behaviour.

    • Thanks for sharing this post Steve. As far as the images, if you pay for the image license, and the content is scraped, can you report the thief to the site/artist that licensed the image? Then maybe you can have them fight it too?

  15. Your suggestion about adding in an RSS footer, including links and simple CTAs, is gold! So simple, but one I hadn’t thought of myself. :)

    • I used to have the plugin on my site, then forgot to reinstall it after I had to move the site to a new host. Then I saw it on another of my subscriptions and remembered just how genius it was!

  16. Agree with Simone to a great extent that if you are likely to get scraped make sure there are plenty of full url address links (not relative ones) back to internal content to minimise any possible detrimental impact. Most scrapers will maintain content text links.

    Do they not say ‘is plagiarism or imitation not the sincerest form of flattery’!

    Having said that if the only objective is to attract visitors to a low-quality site mainly for adsense or other advertising purposes it may be that the recent Panda/Farmer update and subsequent algorithm changes may have addressed some of the issues with low-quality scraper sites.

    If I remember rightly, Matt Cutts, Google’s Spam Master, stated that although duplicate content is an issue, their algorithm can and does make every effort to identify the original content, giving it attribution and higher ranking weight than obvious straight copies.

    • Google does a good job of ranking the scrapers beneath the original Rob, but I have still found some cases where I’m searching for an article and come across the duplicate instead of the original. And I have seen some scraper sites actually have comments on the stolen posts too! That’s why I think sticking an attribution in is important, just in case people find the duplicate first.

  17. Great tips on RSS footer options for WordPress. Thanks.

    Glad I discovered you on the new My SEO Community site.

  18. Really thorough post. I definitely learned something today, although thankfully I haven’t had to deal with it firsthand (yet!). I am going to add the RSS footer that you describe today, because you are so right that we can’t prevent scrapers completely, so we might as well do what we can to benefit from their actions as much as possible.

    I did see you listed on the Copyblogger list of bloggers to watch in 2012. After reading this post, I’ll definitely be back.

  19. Heck yes, everyone should fight back to help make this problem shrink! The question we have is: what if it is a press release? We link back, but do we have to, since the release is for journalists?

  20. I only knew about Copyscape, I never realised there were so many ways to find out who is nicking your content. I was even more surprised to learn I could find out via Google Analytics. Thanks Kristi. :)

  21. Thank you for the tip on the plugin! I installed it immediately. I often find my RSS feeds on other scraper sites. Thank you!

  22. Good job Kristi… I never knew Webmaster Tools had such stuff in it. I will explore it tonight, because our website content has appeared on several blogs and I need to track them down. I was looking for an application to find them, and it turns out Webmaster Tools has what I really want.

  23. The only time I happen to notice my content or images that appear elsewhere is because I happen to be checking logs and notice a bunch of incoming traffic from the same site or page. It tends to be other sites using my screen shot images. It is a bit annoying. I don’t have a lot of time to contact the site owner, host, or dmca. It does make me wonder how much is actually copied, scraped, etc. and many people probably don’t even realize it.

  24. I really learnt a lot through this article. The images made it easy.

  25. Very interesting article, and there are a couple of links I should definitely check. As for me, I just publish RSS as a summary and not full posts. It works for now, but one of the tools I have to check is the RSS footer plugin.

    Thanks for the interesting article.

  26. Well, lots of people who have their own blogs have to deal with this problem of content scraping. I have the same problem: I have a blog on umrah packages and I suffer a lot from the theft of my content, but now I will check out the tools you suggested. Your suggestions are worth practicing.

  27. Great post Kristi, and thanks for the link to the WordPress RSS footer plugin. As you say, it’s a brilliant way to get some credit back even if they are low quality sites. Better than nothing at all, isn’t it?

  28. Thanks so much Kristi!!! I’ll have to go check my other blogs but I know for certain that this is happening regularly with one of them… I had been wondering what all of those WordPress Trackbacks were and if it was a good thing or not. Now I know and I know what to do about it.

  29. Most bloggers are facing the same problem, including me. This post gives valuable information. Thanks for this post.

  30. Content thieves really annoy me, so thanks for your suggestions.

    We take the time to create good unique content and some **** steals it and half the time doesn’t even bother to spin it.

    Lazy scumbags.

  31. Thanks for the invaluable info!

    That makes two of your posts I now need to take action on after reading. You are definitely keeping me busy :)

  32. I am glad that you point out that there are some situations where content scrapers benefit the sites from which they retrieve information.

    A good example of “white hat” web scraping is my horses for sale platform at http://horses.fm, Horses Farm & Market. It indexes horse for sale ads from other sites and compiles them into a single, searchable list. It gives a very brief summation of each external ad with a link to the original ad, and it also has Twitter, Facebook, and Google+ share buttons for each external ad as a courtesy to those sites. The intent is to augment the external services by providing a layer of greater functionality and accessibility.

  33. Thanks so much for the WP “trackbacks” clarification. I usually just spam or trash them when the sources are less than reputable (most are). Ironically earlier today I was curious and followed one trackback only to find my content on another site with back-links still intact? I do use anchor text in my href’s often so like you said it’s odd that they “scrapers” leave it all intact? Anyways thanks so much for the clarification.

    Cheers!

  34. Great post and blog! I do not have time to read every post right now, but I have bookmarked it and added your feeds, so when I have time I’ll be back to read more. Please continue the great work.

  35. Finding out such content scrapers of your site is really a tough job. But thanks for your nice post and tips I will act upon these tips mentioned in your post to trace content scrapers of my site.

  36. Hi,
    It’s very difficult to stop content scrapers. They will get the contents checked for copy scape, with the Google Analytics showed some results for copy scapes. And they do necessary steps.

  37. In my opinion you can fight and win, but you still lose in one respect: you will lose your time and health dealing with content scrapers. I have a question: will a content scraper’s post outrank our original post? Will our page be devalued by Google when there is duplicate content?

    I go for taking the benefit from content scrapers.

  38. While I certainly don’t track down every content thief, I do pay attention to scrapers and go after them — especially the ones stealing all of the blog content. I’m a writer. I make my living largely blogging for clients (and for myself). When my blog posts appear for free on sites without permission and those bloggers are monetizing them, it cheapens my work in the eyes of prospects and clients. That’s just as bad as someone stealing images that a photographer might want to sell in product form.

    I’ve had several sites shut down over the years and even more forced to remove specific posts. A DMCA takedown notice to the host is always a good idea. And if you automate the infringement searches and you use templates for the notices, it really doesn’t take much time or effort to go after them.

    I also tell other writers and bloggers to consider approaching two other groups which are better at hitting thieves where it hurts:

    1. Search engines — if the stolen content is appearing in results at all, especially anywhere near (or above) your own

    2. Advertisers — they can have their ad network account access suspended over copyright infringements which can not only knock out their income on that site, but on any site where they’re using the network’s ads (they often have scraper sites in more than one niche)

    Learn to craft a firm cease and desist letter, and you rarely even have to go that far. The vast majority yank the content down within 48 hours when I send them a demand letter to that effect.

  39. Great info in this post and in the comments! I’ve had several of my posts “scraped” and I always comment on the blog as soon as I find it to express my extreme ire and demand the poached post be immediately removed. If it is not, I proceed with further action but have so far gotten cooperation. Some of the scrapers apologize and pretend not to realize that they are stealing content because they included all my credits, but I am careful to educate them about posting content I wrote on their site without my knowledge or permission as being a hybrid of pirating and plagiarism, scraping. I always ask them if they would like it if I copied their posts and put them on my blog without them knowing about it. Another thing that really bugs me about scrapers – they are never good sites that steal my content, they are crappy sites and some are even questionable, that I don’t want to be associated with in any way.

  40. Thank you so much, this is very interesting! I’m a victim of content scrapers!
    Cheers

  41. Thanks for this info. It is still unclear to me: if Google finds duplicated content taken from my site, will Google then consider the original content still on my site as being duplicate content and negatively impact my site, or will Google recognize that my site has the original content and not impact my site?

  42. I just realized that someone is stealing all the content from our website. Our domain name is http://www.mad-mongolia.com and someone has registered madmongolia.com and somehow has got all our content on it. How can I prevent it? Is it illegal in the USA?

  43. That is what I wanted to find on the web. It will definitely help me a lot. We have seen many scrapers of our blog’s content.

    Best Regards.

  44. Thank you for the post!

    Someone (psdto-magento .info) copied our site http://psdtomagento.com/ and is advertising via Google AdWords. They are even displaying my site name as well.

    Please suggest what to do to take down their website.

  45. Wonderful post Kristi Hines, intuitive and clear too.

    I have a big, important question.

    In the case of article directories or blog directories, one can post their article on their website exactly as it appears on their own blog. You can simply copy and paste content from your blog into their post area and publish it there for whatever benefit one is searching for, such as exposure.

    Is that stealing/scraping?

    The content will obviously appear somewhere else on the internet exactly as written, title and body, including the links within.

    Being the one who did do the copy and posting, can I be labelled as a scraper of my own content?

    If that site, lets say the article directory, has advertisements which is their sole income earner, does that mean they are benefiting financially by having content such as the one I did post on them?

  46. OK, scrapers: Grrr!! Snippet scrapers: Hmm (so long as they don’t appear above you in Google)… But what about a scraper that reposts your content in their own language?

    By complete chance we discovered a Russian website had taken all of our post content but translated them completely into Russian! Hard to track down were it not for a freak coincidence and this leaves a language barrier between us when attempting to get them to remove content.

    Thank you so much for the RSS footer plugin suggestion.

  47. It’s a shame that this goes on as much as it does. On one of my other websites, I have a site that does pingbacks just about every day. At first, I thought it was a good thing, but I see now that I am supplying all their content for their blog!

  48. A great help for me. Thank you very much for this informative post. I really appreciate your effort. And now I am using the plugin by Yoast to get backlinks if someone scrapes my content via RSS feeds!!

  49. Thanks for the information! In the spirit of stealing, I hope you don’t mind that I copied your format for the RSS Footer plugin. :) I’ve been using it for a while, but I liked the way you added the links to Facebook and the rest. And as far as making sure you use internal links on each post in order to get notified via trackbacks if the article shows up anywhere else, thanks for the reminder! I had forgotten that. You did note that they can end up in the spam folder if using Akismet. Do you have any ideas how to avoid that? Or do I just have to go in periodically and check?

  50. Very, very, very helpful – thanks so much Kristi. One tweak I made to your formula above was to add thumbnail image of myself via HTML to the RSS Footer line. I’m going to publish my first post tonight with the plugin installed so I’m kind of excited to see how this looks on the Scrapers end. Plus I’m guessing it will help with my branding and people associating my face with the content.

    Here is the code I used:

    You are reading %%POSTLINK%% by Mike Mintz from My Media Labs. If you like this post then follow Mike on Twitter, Facebook and Google+.

    Hope this works!

  51. Great post Kristi,

    I use Copyscape and Google Alerts, but in a slightly different way.
    Many content scrapers change the posts’ title so I don’t think it would be effective to set the Google Alert for the post’s title.
    What I do is I choose a couple of sentences, a sentence in the first paragraph and a sentence somewhere in the middle.
    Or two continuous sentences in the first paragraph and two continuous sentences in the middle.

    This has always given me better results.

    Thank you for the great post.

    -Mohamed

  53. Thanks for the info! Used it and found someone stealing our content before I even finished reading your article! Creating unique content is our goal and I am shocked that someone took it upon themselves to use our articles almost in their entirety as their own.

  54. I discovered that ‘www.traditionalmusic.co.uk’ has copied my entire collection of music hall lyrics which I’ve actually spent many collecting. Some of these songs date as far back as early 1800s so are in public domain anyway. I started to scan, type and upload them about 2 months ago as my retirement pastime. Yesterday, I found that http://www.traditionalmusic.co.uk have copied every entry of my entire collection along with performers, composers, dates and even, I noticed, my typing errors. 1600 items to date. Googling some of the rarer titles, I find that they are far outranking me!!!
    Following your advice, I searched ‘whois’ for the owner but found, strangely, that traditionalmusic.co.uk is actually available for sale. Is this possible?… and can I report this to anybody?

  55. According to Google, content scraping is called content syndication. Here is a snippet from Webmasters support:

    Syndicate carefully: If you syndicate your content on other sites, Google will always show the version we think is most appropriate for users in each given search, which may or may not be the version you’d prefer. However, it is helpful to ensure that each site on which your content is syndicated includes a link back to your original article.

    Source: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=66359

    My question is: if the copied content has a link back to the original author’s blog, is it not considered spam? What I understood from the above text block is that Google will even index the syndicated post and display it in the search results (if relevant to the user).

  56. Thanks for the tips! I just had one of my posts “curated” for the first time (that I know of) and I feel weird about it… I put a lot of work into it, and some guy has a blog that’s just filled with content he takes from other people? It’s only a partial post and there is a link to my site, but I would have liked it if I was asked, or even just a comment to let me know about it… seems like that would be the expected etiquette? I haven’t seen any traffic to my site from it yet, so time will tell… maybe I’ll see the bright side of it. I’ll take the steps you recommend with the RSS feed for any future instances, thanks.

  57. Thanks Kristi. You just helped me to catch someone who had stolen my content. As advised I have contacted them and they pulled it out of their website.

  58. Great post,
    My site is about tech gadgets and web 2.0, so I do many product reviews on it. Some I do hands-on, and testing a single gadget, like the Samsung Tab 2 for example, takes me 2 or 3 days.
    Some sites just blatantly copy the full post, and it really hurts. If it’s a smaller site then maybe I don’t care much, but the site that copied my content is big and has a huge follower base. Stealing my work 100 percent without even giving some credit back to me really hurts…
    I’m just 6 months into blogging but I intend to make it big. Meeting a thief like this really stalls my progress, but I’m going to keep it up anyway ^^

    Once again, thank you very, very much for your post ^

  59. I don’t know who’s worse. You or them.
    You scan the net for your keyword competitors’ content and copy their ideas. Many of the guest posts you write aren’t based on your own experience, and the blog owners know this but use you for your content.

  60. But why on earth do these exist in the first place?! What possible benefits could they get out of scraping for content and who reads these phony websites?

  61. Great article. I am starting to blog now and I do worry about intellectual property theft, etc. So far I haven’t had anyone (to my knowledge anyway) stealing my content (early days I guess, or just poor content I produce!). However, I have had people stealing photographs of projects that I hold the copyright to and that show my work (house extensions, home design, etc.; I am an architect). Any idea how I can find sites that used my pictures?

    Thanks a lot for any feedback.

  62. I received a comment today telling me that I have no right to demand or ask for a link back for ‘fair use’ excerpts copy/pasted from my site and placed somewhere else, whether it be an online forum, blog or newspaper. The commentator said I was entitled to attribution, but nothing more.

    Doing a search to understand if this is true, I landed here. Do you have any expertise or knowledge on this?

  63. You know what, I found one of my best articles a few years ago on another site. I was so mad that I couldn’t think straight and obviously didn’t know what to do about it. Now that I’ve read your article, I will take precautions and hopefully this will never happen again. Thanks! Great article :)

  64. Hi Kristi,
    I liked your post. It was very informative. I have to tell you that I am a total novice in Internet language and hence I always am lost after a while. I need a little help and I hope you can help me solve my problem.
    I have my own blog Bluemoonchef.blogspot.com
    It is a food blog.
    The problem I am encountering is that one particular post of mine (Quick Red Hot Salsa) seems to have suddenly become very popular and I have been receiving a lot of appreciative comments, but all these readers seem to be promoting “cigarettes” and hence these comments are automatically going into the spam box. I have requested often that they do not add their web address, but to no avail.
    I do not want to promote any cigarettes or anything detrimental to good health.
    I would like the comment to be posted but without their web address that promotes ill-health.
    What should I do? ( in simple steps for dummies like me).
    Thanks,
    Cheers!!
    Tulika Chari

  65. Hi Kristi,
    I was just looking through the comments on your post to see if people had similar problems, and I found that one of the comments (the top 5 lines) is exactly the same as one that I have received.
    Does that mean something? Or is it just that the person uses the same language everywhere, including the time they spent on the Internet?
    Thanks,
    Tulika Chari

  66. A company called “turnitin” markets an “Originality Checker” to schools for grading papers, and brags about how it has been scraping content off of millions of websites to accomplish this.

    The problem is, especially if your website is e-commerce, that they are scraping your content and making money from it. This is, of course, illegal, as they are violating copyright law.

  67. Yes, I have seen my original content on other blogs without my permission. They have stolen my content! Because of these spammers my site was penalized by Google!
    I still don’t know how to deal with copied content. Can anyone tell me how to deal with those spammers and content thieves?

  68. There’s been some posting around about feedreader.com, which is doing a lot of frame scraping, i.e., they are stealing your traffic by framing your content. Adding this JavaScript to your header will stop your page from being framed like that.

    if (window != top) top.location.href = location.href;
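
    For anyone who wants to try the check outside a browser, here is a minimal sketch of the same idea written as a function. The name `bustFrame` is hypothetical, and passing the window object in as a parameter is an assumption made purely so the logic can be tested with plain objects. Some framing setups (for example sandboxed frames) may block the redirect, so this is not a guaranteed defence.

    ```javascript
    // Sketch only: the frame-busting check from the comment above as a function.
    // `win` is the page's window object; it is a parameter so that the logic
    // can be exercised with stand-in objects.
    function bustFrame(win) {
      // In a framed page, win.top is the outer page and differs from win itself.
      if (win !== win.top) {
        win.top.location.href = win.location.href;
        return true; // the page was framed; the outer page is sent to our URL
      }
      return false; // not framed, nothing to do
    }

    // In a browser, call it from a script in the page header.
    if (typeof window !== "undefined") {
      bustFrame(window);
    }
    ```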

  69. Hmmmm.. my javascript got edited out! anyway, do a search for the above, and you should find a google post about it.

  70. Thanks for this post… I always wondered what Trackbacks were lol

  71. Thanks a lot for sharing such a wonderful information..

  72. very nice information

  73. Well, very nicely written, but unfortunately there isn’t any way to stop them.

    The best thing you can do is insert your site’s links in your posts, i.e., link blog posts with one another. It’s simply the best way to handle this; then no one will copy from you :)

  74. Do you think Google Authorship is having any impact on the content that is getting scraped? I am curious whether it makes a difference if Google knows where the article came from.

  75. Thanks! I bookmarked this so I can come back and try a few of the suggestions at a later time.

  76. Awesome. I just found this http://internet-search-engine-marketing.com/business/the-shift-to-visual-social-media-6-tips-for-business-infographic?hw=1080-1870-1 and sent them a message to cease and desist (i.e., take it down). Jeez, it’s amazing what some people will do. Whatever happened to old-fashioned referencing! I have had both Entrepreneur and Hubspot feature this post and they sourced it (on Ent.com it was a guest post). It is just super rude, but thank you Kristi – I was glad when I did a search and an article by you came up – I knew it would be one to bookmark! On that note, I did have a fairly reputable site forget to reference one of my articles recently. I contacted them nicely and it turned out the intern was learning, so it was a great way to help someone learn the best way to do things and to make a connection. Everyone wins: I wanted her to share it, and she learned how to do it in a way that sources the original content. Scraping content is just another thing altogether!

  77. Nice and very important article!

    Here is an example of a site called: virtualvcp.in that has clearly and blatantly infringed on one of our sites called: http://www.cloudtweaks.com. They have copied almost everything to a tee and have moved (OUR) content from Germany to Russia and back again whenever we have approached their hosts threatening action. These guys are a notorious group of spammers/scammers as they have been doing this to several technology sites. It’s tricky to stop them as the laws in terms of DMCA take-down notices are not always relevant when the information/data is hosted outside of North America.

    Some of your suggestions are a good starting point: you can even try to work with the .htaccess file and use a CDN service, or threaten DMCA take-downs. However, nothing is foolproof at this time. It’s still a work in progress and we are working hard on future prevention of such shady business practices.

    To be continued….

    John

  78. Thanks. Interesting stuff, but not sure about Copyscape’s efficacy. Google found several blogscrapes of mine (I searched for a unique string of text), while Copyscape found none. –Doug Laney, VP Research, Gartner, @doug_laney

  79. Great article. Now I have to work on it too… I have seen that lots of sites are copying my articles.

  80. I just had a 2,000 word article on preppers that I found on someone else’s site when I was trying to see how it ranked. My site was pushed back behind the duplicate content comment on google so no one could find it. I sent him a bill via paypal and said he had 24 hours to pay it or remove the article or I would contact his advertisers and hosting company. He removed it.

  81. Thanks for sharing, good to know.

  82. Hello, would you mind stating which blog platform you’re using? I’m planning to start my own blog in the near future but I’m having a tough time making a decision between BlogEngine/WordPress/B2evolution and Drupal. The reason I ask is because your design seems different than most blogs and I’m looking for something unique. P.S. My apologies for getting off-topic but I had to ask!

  83. Wonderful blog you have here but I was curious if you knew of any message boards that cover the same topics discussed here?
    I’d really like to be a part of group where I can get suggestions from other experienced people that share the same interest. If you have any suggestions, please let me know. Thanks!

  84. Hola! I’ve been following your web site for some time now and finally got the courage to go ahead and give you a shout out from Kingwood Tx! Just wanted to mention keep up the great work!

  85. My brother suggested I might like this website. He was totally right. This post truly made my day.

    You can’t imagine how much time I had spent looking for this information! Thank you!

  86. We cannot stop content scraping now… Some of the big websites scrape too, but the good thing they do is credit the original author for every site they scrape. Well, that’s the right thing to do. Remember, sharing is caring and Google loves that :)

  87. Thanks for sharing this wonderful, informative article, but I agree with Liza too: we cannot stop content scraping. Let Google punish those scrapers.

  88. It continually amazes me how bloggers such as yourself can find the time and the dedication to keep creating fantastic blog posts. This is wonderful and one of my must-reads on the web. I simply want to say thanks.

  89. Fantastic post! I never knew about the linking part in Webmaster Tools. I had no idea that one could do that stuff in Webmaster Tools. Thanks for this informative article.

  90. Howdy, I read your blog occasionally and I own a similar one, and I was just wondering if you get a lot of spam remarks? If so, how do you reduce it? Any plugin or anything you can recommend? I get so much lately it’s driving me crazy, so any support is very much appreciated.

  91. I really don’t have the time to go after these people, so I guess I will just install the plugin.

    What I’m really concerned about is that I already have enough plugins installed on my blog.

    And I heard somewhere that having too many plugins not only slows down loading speed, it also opens a door for hackers.

    What do you think about this? I’m sorry it’s a little off topic.

  92. You have given a very roundabout way of combating content scraping… I was using the atcontent plugin for some time and it was good, but I think too many plugins just slow you down. Webmaster Tools is a better idea and you can know what’s happening, but you have to know your way around…

    I get some spam comments. Do they steal your traffic?

    Visit my site if you can and point out any suggestions if you want.

  93. You stated: “Unfortunately, unless you want to continuously search for your post titles in Google, you’ll only be able to easily track down sites that keep your in-content links active.” Good news – there is another simple way to find poachers. Perform a Google search on a unique phrase. Google will lead you to the sites that are using your content. I have done this in the reverse direction and that’s how I found you! No, you are not the poacher. I was sent an email that pointed to a file that contained a modified form of your hard-earned content. I suspected that It was ripped-off. I found you with this technique and I informed you about it. I didn’t follow-up on your response – who knows what happened. Good luck on keeping-them-honest. Regards, Sid
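
    A small sketch of the unique-phrase search described above, assuming the search is done through a Google search URL. The function name `buildPhraseSearchUrl` and the optional `ownDomain` parameter are hypothetical. The quotes ask for an exact match, and the `-site:` operator leaves your own domain out of the results.

    ```javascript
    // Hypothetical helper: build a Google search URL for an exact phrase,
    // optionally excluding your own domain from the results.
    function buildPhraseSearchUrl(phrase, ownDomain) {
      let query = `"${phrase.trim()}"`;
      if (ownDomain) {
        query += ` -site:${ownDomain}`;
      }
      return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
    }
    ```

    Opening the resulting URL in a browser lists pages that contain the phrase word for word, which is usually enough to spot a verbatim copy.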

  94. I have the same problem and also an index speed problem. So many sites are copying my articles via RSS. Is there any way to find out which article Google indexed first, and who is the owner of the original content?

  95. Hello! I understand this is kind of off-topic but I needed to ask.
    Does running a well-established blog such as yours require a large amount of work?
    I’m completely new to writing a blog; however, I do write in my diary on a daily basis. I’d like to start a blog so I can share my experience and thoughts online. Please let me know if you have any kind of ideas or tips for new aspiring blog owners.
    Thank you!

  96. I leave a response each time I especially enjoy an article on a website or if I have something to add to the discussion.
    It’s a result of the fire displayed in the post I looked at. And on this post, Content Scrapers – How to Find Out Who is Stealing Your Content & What to Do About It, I was excited enough to leave a comment ;-) I actually do have 2 questions for you if you don’t mind.

    Could it be just me, or does it look like a few of the comments come across as if left by brain-dead folks? :-P And, if you are posting on other online sites, I’d like to follow you. Could you make a list of all your public pages, like your Facebook page, Twitter feed, or LinkedIn profile?

  97. Wow! Within 10 mins of reading your article, I caught one scraper. Thanks!

  98. A very comprehensive article on the tips. I have been facing this issue for a month now. I guess it is the right time to file a DMCA complaint, but do you think requests from new bloggers are paid any heed by the associations?

  99. Hey, are you using WordPress for your blog platform? I’m new to the blog world but I’m trying to get started and create my own.
    Do you require any coding expertise to make your own blog?

    Any help would be greatly appreciated!

  100. Woah! Worth sharing. Filing a DMCA complaint and sending an email to that person can be more effective at stopping them from copying your content. Otherwise, you can paste in code to disable the copy option on your web pages and images.

  101. Hi Kristi,
    Thanks for the great post! I’m new to blogging, and your post will help people like me and veterans alike.

    Peace, Jason

  102. Thanks for this post. I was going to install the RSS plugin you mentioned but from one of the comments we discovered Tynt which is perfect for people copying and pasting. I might see if we can run both so nobody can copy our blog material without us being aware!

  103. You might not come across my comment amid the 200+ comments on this page – but great job done on the research here. The Copyscape link is very useful – I just tried a few links and found some copied content! I arrived on your blog searching for how to report content theft, because I discovered a blog that has been copying my entire feed or something – exact text and pictures – on their blog!

    A great post, Kristi. Thanks!

  104. In my opinion it is best to leave content scrapers alone. Filing DMCA reports and other sorts of legal action is not a good approach because:

    1) You are wasting a hell of a lot of your time fighting rather than concentrating on your future posts.
    2) Even if you take them down, chances are that more content scrapers will come along in the days to come.
    3) Google is clever enough to spot content scrapers and devalue them. When Google does this job, why not just leave it to them?

    It’s not about getting link benefit. As you said, they don’t have any value. But smile cheerfully when somebody is scraping your content because it means that you are getting popular. Search engines would definitely think that you have a great website that people are copying like crazy.

    But importantly, make sure your site gets indexed quicker than the scrapers. This can be done by setting up an XML sitemap and making the navigation clean and easy for users.
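
    For readers who have not seen one, here is a minimal sketch of what generating an XML sitemap could look like. The function names `escapeXml` and `buildSitemap` and the shape of the `posts` array (objects with `url` and `lastmod`) are assumptions for illustration; on WordPress a sitemap plugin would normally produce this file for you.

    ```javascript
    // Escape characters that are not allowed as-is inside XML text.
    function escapeXml(text) {
      return text
        .replace(/&/g, "&amp;")
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&apos;");
    }

    // Hypothetical helper: turn a list of posts into sitemap XML.
    // Each post is assumed to look like { url: "...", lastmod: "YYYY-MM-DD" }.
    function buildSitemap(posts) {
      const entries = posts.map(
        (post) =>
          `  <url>\n    <loc>${escapeXml(post.url)}</loc>\n    <lastmod>${post.lastmod}</lastmod>\n  </url>`
      );
      return [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
        ...entries,
        "</urlset>",
      ].join("\n");
    }
    ```

    The resulting file is typically saved as sitemap.xml at the site root and submitted through Webmaster Tools so that new posts are discovered quickly.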

    Cheers

Comments are closed.
