4 WordPress Plugins to Fight Content Scrapers


Scrapers are the bane of any blogger’s existence. They sweep in, steal your content, claim it as their own, and sometimes there is no way of proving otherwise. Surprisingly, Google often fails to identify the original author of the content. Very often, my Google Alerts notify me of scraped copies of my articles rather than my original (guest) posts, and I’ve seen scrapers outrank the original articles for long-tail searches many times.

Occasionally you hear of a blogger who managed to reclaim the rights to their content, but it’s more like tilting at windmills: you kill one scraping blog and dozens more spring up overnight. It is therefore much better to try to prevent scraping (or at least get recognized as the original author) than to count on being one of those rare successes.

Plugins to Prevent Web Scraping

1. Google Plus Authorship


Google has been trying to fight scrapers for ages, and one of its patents (one of the AuthorRank patents) suggests using authorship to:

 “. . .detect and protect against revision of content after it has been posted by a person or entity.”

Implementing Google Authorship is much easier nowadays (here’s a quick guide), but on many blog set-ups (where there’s no author byline, for example), it can still cause confusion. In these cases, this plugin will help.

It lets you add a G+ profile picture to search results, confirm authorship, and even grant authorship to multiple authors. Setup is a three-step process that is very easy to follow, and there are no bugs to worry about.

2. Feed Delay


Half the risk to a small or medium-sized blog comes from a scraper bot picking up content, publishing it without attribution, and then getting its page indexed first (oddly enough, Google often fails to demote these sites or even identify the original owner of the content).

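Since there are probably at least a couple of bots hiding among your RSS subscribers, your best bet is to delay your feed, so that new posts appear in it only some time after they go live on your site. That gives search engines a chance to index the original before a scraper can repost it. This plugin will do that for you.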
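The idea behind a feed delay can be illustrated with a short sketch. This is not the plugin's code (a WordPress plugin would be written in PHP); it is a minimal Python illustration, and the function name `delayed_feed_items`, the `published` field, and the 60-minute delay are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

def delayed_feed_items(posts, delay_minutes, now=None):
    """Return only the posts old enough to be shown in the feed.

    `posts` is a list of dicts with a timezone-aware `published` datetime
    (a hypothetical structure used only for this sketch).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(minutes=delay_minutes)
    # A post enters the feed only once it is at least `delay_minutes` old.
    return [post for post in posts if post["published"] <= cutoff]

# Example: with a 60-minute delay, a post published 5 minutes ago is
# held back from the feed, while one published 90 minutes ago is included.
now = datetime.now(timezone.utc)
posts = [
    {"title": "Older post", "published": now - timedelta(minutes=90)},
    {"title": "Fresh post", "published": now - timedelta(minutes=5)},
]
for post in delayed_feed_items(posts, delay_minutes=60, now=now):
    print(post["title"])
```

The post is still visible on the site the moment it is published; only its appearance in the feed is postponed.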

3. Anti Feed-Scraper Message


Most scraping is done by bots, without any human oversight, so the scrapers have no control over what content gets published or how. This is a major plus for you: you can add a link to your blog to every feed item, and it will show up wherever the content is reposted.

Anti Feed-Scraper Message does this, showing Google and all readers where the post originally came from. It also keeps any accusations out of the message, so it protects you from slander claims by the scrapers. The message reads: [Post Name] originally appeared on [Site Name] on [Post Date].
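To make the mechanism concrete, here is a minimal sketch of how such a message could be built and appended to a feed item. It is written in Python for illustration only and is not the plugin's actual code; the function names, the HTML wrapper, the link on the post name, and the date format are all assumptions. Only the wording of the message comes from the plugin description above.

```python
from html import escape

def attribution_message(post_name, post_url, site_name, post_date):
    """Build the '[Post Name] originally appeared on [Site Name] on [Post Date].' line.

    Linking the post name back to the original URL is an assumption of this sketch.
    """
    return (
        f'<p><a href="{escape(post_url, quote=True)}">{escape(post_name)}</a> '
        f"originally appeared on {escape(site_name)} on {escape(post_date)}.</p>"
    )

def add_attribution(feed_content, post_name, post_url, site_name, post_date):
    """Append the attribution line to the content of one feed item."""
    message = attribution_message(post_name, post_url, site_name, post_date)
    return feed_content + "\n" + message

# Example with placeholder values.
print(add_attribution(
    "<p>Post body goes here.</p>",
    "My Post",
    "https://example.com/my-post",
    "My Blog",
    "January 1, 2024",
))
```

Because the line is added to the feed rather than to the page itself, regular visitors never see it, but any bot that republishes the feed verbatim carries the link back to the original.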

4. Copyright Proof


This plugin can be used alongside the one above. It lets you digitally certify your ownership at the time of publication, producing a certificate that you can show if someone steals your content. It also adds a copyright, licensing and attribution notice to every post. There is an optional anti-theft feature as well, if you choose to use it.

Do you know of a good plugin for protecting content against scrapers? What about outside of WordPress?



Ann Smarty Ann Smarty is the founder of Viral Content Bee, a social media marketing platform, and the founder of SEO Smarty, an SEO consulting and link building agency.

9 Reactions
  1. Getting scraped is horrible, but having the scraped content rank ahead of the original is a travesty. Great tips here to help mitigate the risk.

    • Yes, Robert, and Google claims to be able to tell the original versus a scraped version, but as Ann’s article attests, it does not always do that.

      – Anita

  2. While we encourage our content to be shared (under a CC license), I do find our articles around without the attribution from time to time.

    We implemented Google authorship some time ago and I like the idea of the message in the feed. I’ll be adding that Anti Feed-Scraper Message plugin.

    Thanks for this!
    David

  3. Thanks for the tips. I will also be adding the Anti Feed-Scraper Message plugin straight away.

  4. Great post, I didn’t know the G+authorship was the best weapon against scraping. Thanks!