What is the Rel=Canonical Tag?
A typical rel=canonical tag:
<link rel="canonical" href="http://www.example.com/">
A page's content can often be reached through multiple URLs (especially when URL parameters are used), or a group of pages can be so similar that only one of them is worth crawling or indexing. The rel=canonical element identifies which of these URLs best represents the group (the "canonical" page) and should be indexed. Search engines treat the tag as a strong signal rather than a strict directive, so the other URLs will generally be left out of the index.
For example, all of the following URLs have exactly the same content on them:
If no canonical tags were used, these URLs may all be flagged as duplicate content. Instead, choose one version of the URL (e.g. https://www.example.com/products/blue-product/) as the canonical. Then add a rel=canonical tag to all of these pages with the target set to the canonical URL. This would signal to search engines to index the canonical URL only and ignore all others, thus eliminating the duplicate content issue.
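As a rough illustration of choosing one canonical among duplicates, the sketch below collapses several URL variants into a single canonical form. The variant URLs and the canonical_url helper are hypothetical, and the rule it applies (force https, drop the query string and fragment, add a trailing slash) is only an assumption that suits pages whose parameters do not change the content. A real site would need its own rules.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_url(url: str) -> str:
    """Hypothetical rule: https, lowercase host, trailing slash, no query or fragment."""
    parts = urlsplit(url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return urlunsplit(("https", parts.netloc.lower(), path, "", ""))

# Made-up duplicate variants of one product page (illustrative only).
variants = [
    "https://www.example.com/products/blue-product/",
    "https://www.example.com/products/blue-product/?color=blue",
    "http://www.example.com/products/blue-product?utm_source=newsletter",
    "https://www.example.com/products/blue-product/#reviews",
]

# Every variant maps to the same canonical URL, so the set has one entry.
canonicals = {canonical_url(u) for u in variants}
print(canonicals)
```

Each variant would then carry a rel=canonical tag whose href is the single URL in that set.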
What Does "Rel=Canonical Empty / Missing" Mean?
If a URL is listed in the table below, this means that there was no rel=canonical content found on the page. This could mean one of two things - either the tag is missing completely, or the "href" attribute was left empty with no content (e.g. <link rel="canonical" href="" />).
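The two cases above can be told apart with a small check. This is a minimal sketch using Python's standard html.parser module. The CanonicalFinder class and the canonical_status function are names invented for this example and do not describe how any particular crawler implements the check.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens; href may be absent (None).
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.hrefs.append((attr.get("href") or "").strip())

def canonical_status(html: str) -> str:
    """Return 'missing', 'empty', or 'present' for a page's canonical tag."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.hrefs:
        return "missing"   # no rel=canonical tag at all
    if not any(finder.hrefs):
        return "empty"     # tag exists, but href has no content
    return "present"

print(canonical_status('<link rel="canonical" href="" />'))
```

Pages reported as "missing" or "empty" are the ones this issue covers.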
Why It's Important
The omission of the rel="canonical" tag on your site does not mean you have an issue with duplicate content. However, if you have multiple pages with duplicate or very similar content, omitting the rel=canonical tag will likely result in duplicate content issues, which can be a major problem for the SEO health of your site.
Dangers of Duplicate Content
Wasted Crawl Budget - Search engines do not have unlimited resources, so they usually set limits on how much of your site they're willing to crawl. The number of pages they crawl on your site is referred to as the "crawl budget", and like any limited resource, it must be rationed and prioritized. Ideally we want search engines to crawl and index the most important pages on your site before crawling or indexing less-important or duplicate pages. If the rel=canonical tag is not used properly, search engines could spend only a small part of their crawl budget on your unique pages and the rest on a large number of duplicate versions of each of them. By using the rel=canonical tag, we can instead tell search engines to focus on crawling and indexing the unique pages on our site.
Lower Ranking or De-Indexation - Search engines dislike duplicate content because it wastes their resources and provides little value for searchers. It can also be an indicator of low-value or thin content. Because of this, many search engines use duplicate content as a ranking factor and look unfavorably on sites that have problems with it. Not only can the duplicate pages be affected, but your entire site's rankings could go down as a result, and some non-duplicate pages could even be de-indexed. This was a focus of Google's Panda update in early 2011.
How To Fix
The best practice is for all URLs on the site to contain a rel=canonical tag. You'll need to decide, "Is this the canonical URL?"
Yes, this is the canonical URL – Add a rel=canonical tag with the target pointing back to this page’s URL (self-referencing rel=canonical tag).
No, this is not the canonical URL – Add a rel=canonical tag with the target pointing to the URL that should be the canonical one.
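The decision above can be sketched as a small helper that builds the tag for any page. The canonical_tag function and the canonical_for mapping are hypothetical names for this example: pages listed in the mapping point at their chosen canonical URL, and every other page gets a self-referencing tag.

```python
from html import escape

def canonical_tag(page_url: str, canonical_for: dict) -> str:
    """Build the rel=canonical tag for page_url.

    canonical_for maps non-canonical URLs to their canonical URL;
    a page that is not listed is treated as canonical (self-referencing).
    """
    target = canonical_for.get(page_url, page_url)
    # Escape characters such as & so the href is a valid HTML attribute value.
    return f'<link rel="canonical" href="{escape(target, quote=True)}">'

# Illustrative mapping; the parameterized URL is a made-up duplicate.
canonical_for = {
    "https://www.example.com/products/blue-product/?color=blue":
        "https://www.example.com/products/blue-product/",
}

print(canonical_tag("https://www.example.com/products/blue-product/", canonical_for))
print(canonical_tag("https://www.example.com/products/blue-product/?color=blue", canonical_for))
```

Both calls produce a tag pointing at https://www.example.com/products/blue-product/, the first as a self-reference and the second as a pointer from the duplicate.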