What Does Duplicate Page Content Mean?
This issue means that you have two or more URLs that serve exactly the same content. This can happen because of a server configuration problem, a CMS issue, or simply because the same content was added to multiple pages.
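To make the definition concrete, here is a minimal sketch of how exact duplicates can be detected: pages whose bodies are byte-for-byte identical produce the same hash. The URLs and page bodies below are hypothetical placeholders; in practice you would fetch each page and may want to normalize the HTML before hashing.

```python
# Sketch: group URLs by a fingerprint of their content.
# URLs and bodies below are hypothetical examples.
import hashlib
from collections import defaultdict

def content_hash(body: str) -> str:
    """Return a stable fingerprint of a page body."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()

def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose bodies are byte-for-byte identical."""
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[content_hash(body)].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]

# Two URLs serving the same content, plus one unique page.
pages = {
    "https://example.com/shoes": "<html>Red shoes</html>",
    "https://example.com/shoes?ref=nav": "<html>Red shoes</html>",
    "https://example.com/hats": "<html>Blue hats</html>",
}
print(find_duplicates(pages))  # one group containing the two /shoes URLs
```

Note that this catches only exact duplicates; near-duplicates (e.g. the same article with a different sidebar) require fuzzier comparison.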
Why It's Important
Duplicate content on your site is a big deal. At one point it was merely an annoyance for search engines, but in recent years they have made eliminating duplicate content from their indexes a major priority, and have begun devaluing sites that contain it. Avoiding duplicate content has become one of the most important SEO issues for your site.
Dangers of Duplicate Content
- Wasted Crawl Budget - Search engines do not have unlimited resources, so they set limits on how much of your site they are willing to crawl. The number of pages they crawl on your site is referred to as the "crawl budget", and like any limited resource, it must be rationed and prioritized. Ideally, search engines should crawl and index the most important pages on your site before less important or duplicate pages. If there is a lot of duplicate content on your site, however, search engines may spend much of the crawl budget on duplicate versions of the same few pages, leaving little left over for your unique content.
- Lower Ranking or De-Indexation - Search engines dislike duplicate content because it wastes their resources and provides little value to searchers. It can also be an indicator of low-value or thin content. For this reason, search engines have begun treating duplicate content as a ranking factor, looking unfavorably on sites that have problems with it. Not only can the duplicate pages themselves be affected, but your entire site's rankings can suffer, and some non-duplicate pages may even be de-indexed. This was the focus of Google's Panda update in early 2011.
How to Fix
There are several options for fixing URLs that share exactly the same content:
- Choose one URL as the preferred (canonical) page, and do a 301 redirect from the others to this page.
- Choose one URL as the preferred (canonical) page, and add a rel="canonical" link to each of the other pages pointing to this preferred page.
- Exclude the duplicate URLs from crawling or indexing by using the robots.txt file, a meta robots tag, or an X-Robots-Tag HTTP header. (Note that robots.txt only blocks crawling; it does not remove pages that are already in the index.)
- Rewrite the content on each of the duplicate URLs so that every page has substantially unique content.
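The first three options above can be sketched as follows. All URLs and paths here are hypothetical examples, and the redirect snippet assumes an nginx server; Apache, IIS, or your CMS will have equivalent mechanisms.

```nginx
# Option 1: 301 redirect the duplicate URL to the preferred (canonical) URL
location = /duplicate-page {
    return 301 https://example.com/preferred-page;
}
```

```html
<!-- Option 2: on each duplicate page, add a canonical link in the <head>
     pointing to the preferred URL -->
<link rel="canonical" href="https://example.com/preferred-page">
```

```
# Option 3a: robots.txt - block crawling of the duplicate URL
User-agent: *
Disallow: /duplicate-page

# Option 3b: alternatively, keep the page crawlable but exclude it from
# the index with a meta robots tag or an X-Robots-Tag response header:
#   <meta name="robots" content="noindex">
#   X-Robots-Tag: noindex
```

A 301 redirect is generally the strongest signal when the duplicate URL serves no purpose of its own; rel="canonical" is better when both URLs need to keep working (for example, tracking-parameter variants of the same page).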