You run a website and you are worried about duplicate content. Hey, everyone is. The rules are spelled out pretty clearly in Google’s policy on the matter, yet the way those rules are enforced isn’t. No one is ever quite sure how far the guidelines go or how exactly violations are handled.
On this issue you have to look a bit more at precedent to understand what are myths and what are realities in the world of duplicate content.
Duplicate Content Myths
These are probably the three most common myths. Nearly everyone believes them, but they aren’t what they seem.
Myth 1 – Internal Duplicate Content Penalty May Get Your Site Buried
I have lost track of how many blog posts I have seen talking about duplicate content penalties. Guess what… the penalty doesn’t exist.
Let’s go back to basics: There are two ways to lose Google rankings:
- Get hit by a manual penalty: These are always followed by a “friendly” message in your Google Search Console notifying you that your site has lost rankings and why.
- Get hit by an algorithmic update: These are usually confirmed by Google representatives. Google says it won’t be confirming some future updates (like Penguin or Panda) because they are now part of the algorithm, but so far website owners have always spotted them. Follow people like @rustybrick and @dr_pete: they report whenever something seems to go wrong with many people’s rankings. That’s a good sign that you need to check yours too!
Now, as far as I know, no Google representative has ever confirmed the existence of a “duplicate content penalty”.
The only way duplicate content can harm your site is that Google may be unsure which of several near-identical pages to rank, and may accidentally rank one you didn’t intend.
It’s a common problem for sites that run on WordPress, for example, where category and tag pages contain content from your articles. Because those category sections are linked heavily from around your site, they may outrank your blog posts. An easy way to troubleshoot the problem yourself is to use a tool like SE Ranking, which shows you whether that problem exists in your case.
Here’s a commonsense explanation: Google wants only one instance of original content in their rankings: They don’t want their users to click search results and see the same content again and again. So they have to choose one instance and bring the rest down (or filter them out).
It’s not a penalty (that is, Google isn’t punishing a particular page, and there is nothing to appeal), and in most cases Google is very good at picking the most suitable page. According to an interview with Matt Cutts, more than 25% of all web content is duplicated. Google is aware of the problem and has learned to handle it well.
It’s just that you never want to leave Google guessing, so you may want to look into any possible issues on your site and give Google a clearer map of which pages matter most.
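If you would rather do a rough check yourself, here is a minimal sketch in Python of one way to estimate how much two pages overlap. It is not how Google or SE Ranking measure duplication; it simply compares five-word sequences ("shingles") between two blocks of text. The function names and the shingle size are my own assumptions for illustration.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap(text_a: str, text_b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical page texts; in practice you would paste in the visible
    # text of a blog post and of the category page that lists it.
    post = "Duplicate content is rarely a penalty but it can confuse search engines"
    category = "Latest posts. Duplicate content is rarely a penalty but it can confuse readers"
    print(round(overlap(post, category), 2))
```

A score close to 1.0 means the two pages are nearly the same text, which is a hint that the category or tag page may compete with the post. Where to draw the line is a judgment call.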
Myth 2 – Scrapers Are a Ranking Killer
So many of my posts have been picked up by scrapers. Guess what I never bother doing? Stressing about it. Remember the thing I said about Google having learned to handle the duplicate content problem well? It applies here too!
Google had a problem with scrapers at some point, but after a few algorithm updates they pretty much got it down. The original source isn’t always automatically recognized but in most cases it will be. In others it just takes a quick investigation to clear things up.
Look at it this way: you have a website with a ton of content written by you, for your audience, within a set period of time, on a particular topic and in a particular style. The scraper has a wide array of stolen content that won’t sound the same, be consistent, be posted on a reasonable schedule or follow any logic. Doesn’t take a genius (or more than a basic crawler) to see what is up, right?
That doesn’t mean you shouldn’t monitor your content to spot the thieves. PlagiarismCheck.org is an easy way to do that. It stores all your checks, provides a detailed analysis of any plagiarism it spots, and it’s very affordable too.
I like the fact that they are very transparent about their methodology, which makes me trust their results.
Once you have spotted the biggest offenders stealing your content on a regular basis, go ahead and ask Google to remove those pages from its index.
Myth 3 – There’s No Way to Re-Use Your Content Elsewhere
Now, make sure to read this section thoroughly. Yes, you can use content you wrote wherever you want (provided you abide by the policies of the website where you published it initially). No, you still don’t want to confuse Google and force it to figure out which copy is the original.
Here are a few possible scenarios and how to handle them properly:
1. You want to publish your guest post on your site to let your site subscribers access the content easily
Solution: Check with the editor and/or the official policies of the site where you first published your content. If re-use is allowed, go ahead and publish to your own site, but either use the canonical tag to point Google to the original or noindex your copy.
2. You want to syndicate your own article to a more popular media outlet for increased exposure
Solution: Pick an outlet that lets you add a canonical tag pointing to the page on your site. One great example of such a blog is Social Media Today. They let you publish non-original content and add your URL as the source.
3. You want to re-publish your guest article to Medium or LinkedIn Long-Form Content section
If it’s an article from your own site, you’d better refrain from doing it, simply for fear that the Medium or LinkedIn page may outrank yours. Try using an outline of the original content there instead of copy-pasting the whole article. Another option is to convert your content into a new format (for example, create an infographic using Canva or design a digital brochure).
Some huge publications don’t mind you re-publishing the full content elsewhere, provided you wait a few weeks and add a “source” link at the beginning of the article. Entrepreneur is one example, so, again, check the policies or ask the editor!
4. You want to translate your article and use it on a foreign media outlet
It’s a very old question which was addressed by Matt Cutts back in 2011: In short, you are safe to translate and republish the same content in multiple languages. Don’t use automated translation though (because that will be flagged as spammy). Use authentic human translation. It can be as easy as finding a well-rated Fiverr gig or hiring someone through sites like Preply. Both options are highly affordable.
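The canonical and noindex options from scenarios 1 and 2 both come down to a single tag in the page’s head section. Below is a minimal sketch in Python, assuming you generate or check your pages with a script. The helper names and the example.com URL are hypothetical; the tags themselves are the standard rel="canonical" link element and robots meta tag.

```python
from html import escape
from html.parser import HTMLParser


def canonical_tag(original_url: str) -> str:
    """Tag for the copy's <head>, pointing search engines to the original."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


def noindex_tag() -> str:
    """Alternative for the copy's <head>: ask search engines not to index it."""
    return '<meta name="robots" content="noindex">'


class CanonicalFinder(HTMLParser):
    """Records the href of the first rel="canonical" link element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(page_html: str):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(page_html)
    return finder.canonical


if __name__ == "__main__":
    # Hypothetical URL of the original article.
    original = "https://example.com/blog/duplicate-content-myths"
    republished_head = f"<head><title>Copy</title>{canonical_tag(original)}</head>"
    print(find_canonical(republished_head) == original)
```

Use one tag or the other on the republished copy, as described in scenario 1. The find_canonical helper is handy for scenario 2: once the outlet has published your piece, you can paste in the page source and confirm that the canonical really points back to your URL.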
Duplicate content doesn’t just mean something that appears at more than one URL. That is a reasonable thing to have once in a while, and even quoting another person is technically duplicate content. The kind that gets sites into trouble is about maliciously breaking the rules for your own gain. Google’s crawlers are smart and their human workers are smarter.
Between the two, you can generally feel safe about your content as long as it is well written and valuable. So next time you find yourself fretting over this policy and its sometimes vague and arbitrary-seeming rules, just ask yourself, “Is this valuable?” If the answer is yes, then you have nothing to worry about.
Copying Machine Photo via Shutterstock