What Does Duplicate Page Content Mean?
This issue means that two or more URLs on your site serve exactly the same content. It is usually caused by a server configuration problem, a CMS issue, or the same content being added to multiple pages.
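Keep in mind that search engines may interpret duplicate content differently than you do. Search engines expect every URL to contain unique content, so URLs that you think of as the same page are separate pages to a search engine. For example, http://www.example.com/products/blue-product/ and http://www.example.com/products/blue-product differ only by a trailing slash, yet they are two distinct URLs.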
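As a rough illustration of how exact duplicates can be detected, the sketch below groups URLs by a hash of their content. The function name find_exact_duplicates, the sample pages, and the choice of SHA-256 are assumptions made for this example; they do not describe how any particular search engine or crawler works.

```python
import hashlib
from collections import defaultdict


def find_exact_duplicates(pages):
    """Group URLs whose content is byte-for-byte identical.

    `pages` maps each URL to its page content (hypothetical input shape).
    Returns a list of URL groups, one per set of duplicates.
    """
    groups = defaultdict(list)
    for url, body in pages.items():
        # Identical content always produces the same digest.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Keep only digests shared by more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


# Made-up sample data: two URLs differ only by a trailing slash.
pages = {
    "http://www.example.com/products/blue-product/": "<h1>Blue Product</h1>",
    "http://www.example.com/products/blue-product": "<h1>Blue Product</h1>",
    "http://www.example.com/products/red-product": "<h1>Red Product</h1>",
}

duplicates = find_exact_duplicates(pages)
print(duplicates)
```

Hashing only catches content that is exactly the same; pages that are merely similar would need a different comparison.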
Why It's Important
While most search engines don't have a "duplicate content penalty" in their algorithms, duplicate content can still cause major problems for your SEO:
Dangers of Duplicate Content
- Search engines may not know which version of the page to rank for a keyword – If multiple URLs have similar or identical content, search engines may be unable to choose between them, which could cause your URLs to be filtered out of results or to rank poorly.
- Wasted Crawl Budget – Search engines do not have unlimited resources, so they usually limit how much of your site they're willing to crawl. The number of pages they crawl on your site is referred to as the "crawl budget", and like any limited resource, it must be rationed and prioritized. Ideally, search engines should crawl and index the most important pages on your site before crawling or indexing less important or duplicate pages. If your site has a lot of duplicate content, search engines could spend most of the crawl budget on duplicate versions of a small number of pages and never reach the rest of your unique pages. Sites with only a few hundred or a few thousand URLs don't need to worry about this, but for larger sites it can be a big issue.
How to Fix
There are several options for fixing URLs that share exactly the same content:
- Update your server configuration or content management system so that these duplicate pages are not published.
- Be consistent in how you link to URLs on your site (e.g. don't sometimes link to a page with a trailing slash, http://www.example.com/products/blue-product/, and other times without one, http://www.example.com/products/blue-product).
- Choose one URL as the preferred (canonical) page, then add the rel=canonical tag pointing to this preferred page on all the other non-canonical versions.
- Choose one URL as the preferred (canonical) page, then do a 301 redirect from the other non-canonical versions to the canonical one.
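- Eliminate unnecessary URL parameters, such as tracking or session parameters that do not change the page content.
- Exclude the duplicate URLs from being crawled by using the robots.txt file, or from being indexed by using the meta robots tag or the X-Robots-Tag HTTP header.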
- Rewrite the content on each of the duplicate URLs so that every page has substantially unique content.
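Several of the options above (consistent linking, the canonical tag, 301 redirects, and removing unnecessary parameters) depend on mapping every URL variant to one preferred form. The following is a minimal sketch of that idea, not a definitive implementation. The function names, the IGNORED_PARAMS list, and the choice of the version without a trailing slash as canonical are all assumptions made for illustration; your site may prefer different rules.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def canonical_url(url):
    """Return the preferred form of `url` under the assumed rules."""
    parts = urlsplit(url)
    # Assumption: the version without a trailing slash is canonical.
    path = parts.path.rstrip("/") or "/"
    # Drop ignored parameters, keep the others in their original order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Lowercase scheme and host; discard any #fragment.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the rel=canonical tag to place on a non-canonical page."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}">'.format(href)


def redirect_for(url):
    """Return (301, target) if `url` should redirect, else None."""
    target = canonical_url(url)
    return None if target == url else (301, target)


variant = "http://www.example.com/products/blue-product/?utm_source=newsletter"
print(canonical_url(variant))
print(canonical_link_tag(variant))
print(redirect_for(variant))
```

Using the same function for internal links, canonical tags, and redirects keeps the three from disagreeing about which URL is preferred.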