When Rel=canonical Doesn’t Work: a Tale of Not Similar Content and Noindex
This is not the post I initially intended to write. My idea was to test using rel=canonical and noindex simultaneously, as I still see many questions about how these tags are used. But the experiment gave me unexpected results even before I added the “noindex” part to it. And I thought: Oh wow, that is even more interesting! So I’ll share it with you today!
What is rel=canonical
First things first. What is this tiny rel=canonical, and how should it normally influence the indexing of your pages?
Rel=canonical is set on a page level and found in the <head> section. In the source code it looks like this:
<link rel="canonical" href="https://marketingsyrup.com/" />
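If you want to check which canonical a page actually declares without opening the source code by hand, the tag can be pulled out of the HTML programmatically. Here is a minimal sketch using only Python’s standard library; the names `CanonicalFinder` and `find_canonicals` are made up for this illustration, and the HTML is passed in as a string (fetching the page is left out).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated tokens; compare case-insensitively
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


html = '<head><link rel="canonical" href="https://marketingsyrup.com/" /></head>'
print(find_canonicals(html))
```

The function returns a list on purpose: a page that accidentally declares more than one canonical is an implementation problem worth noticing.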
Types of Rel=canonical
- Self-referencing canonical – the tag points to the page’s own URL. It should be on every page, even one that doesn’t have any full or partial duplicates. #SEObestpractices
- Canonical pointing to another page on the same domain – helps to handle internal duplicate content issues.
- Cross-domain canonical – handles external duplication.
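The three types above differ only in how the canonical URL relates to the URL of the page that declares it, so they can be told apart with a simple comparison. This is a rough sketch: `classify_canonical` is a hypothetical helper, and it compares URLs literally (apart from hostname case and an empty path versus “/”), so it does not account for things like www vs. non-www variants.

```python
from urllib.parse import urlsplit


def classify_canonical(page_url, canonical_url):
    """Label a canonical as self-referencing, same-domain or cross-domain."""
    page = urlsplit(page_url)
    canonical = urlsplit(canonical_url)

    # A different hostname means the canonical points to another website
    if page.netloc.lower() != canonical.netloc.lower():
        return "cross-domain"

    # Treat an empty path and "/" as the same location
    page_key = (page.scheme, page.path or "/", page.query)
    canonical_key = (canonical.scheme, canonical.path or "/", canonical.query)
    if page_key == canonical_key:
        return "self-referencing"
    return "same-domain"


print(classify_canonical("https://marketingsyrup.com/", "https://marketingsyrup.com/"))
print(classify_canonical("https://marketingsyrup.com/test/", "https://marketingsyrup.com/"))
print(classify_canonical("https://example.org/copy/", "https://marketingsyrup.com/"))
```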
Benefits of rel=canonical
- It is a good way to solve duplicate content issues.
- With rel=canonical, all the duplicate URLs will still be available to users (in contrast to a 301 redirect, where only the unique page stays available and all the duplicates redirect to it).
- Rel=canonical passes authority. So if your duplicate page has some links pointing to it, the canonical page it points to will get the link juice.
My experiment looked pretty straightforward. And honestly, I didn’t expect to find anything new (experiment, huh), as the idea of using rel=canonical and noindex together has never seemed right to me. Still, I wanted to test it myself and also needed some screenshots. So the algorithm was the following:
- Create a page with self-referencing canonical and get it indexed by Google.
- Change the canonical so that it would point to another page on the website. Re-index the page.
- Add a noindex tag to the page and see what happens.
Why Rel=Canonical Didn’t Work
The algorithm seems easy, but I got stuck on step 2 for some time, as Google wouldn’t respect my canonical tag.
My test page looked like this:
Here is what the URL Inspection tool said after step 1 was complete:
Everything is sound: the page is indexed, Google chose the right canonical (self-referencing).
Then I changed the canonical tag to point to my post about Dynamic Search Ads. Google re-indexed the test page but decided not to pick up my canonical and kept using the initial one instead. After that I submitted the test page for re-indexing multiple times but still saw this:
Basically, Google just ignored my manually set canonical tag.
But then one thing occurred to me: the Test page content and the Dynamic Search Ads post content are very different:
So I decided to make these pages more similar and see what happens. I copied part of the content from the original page to the test page. Now they looked like this:
Just a few hours later Google picked up the canonical, and the test page became “a shadow” of the original page. Checking the cache also proved that. And here is what the URL Inspection tool showed:
I’m sure there are some exceptions. But as rel=canonical was designed for preventing duplicate content issues, it’s better to use it for this particular purpose.
So if you see that your canonical doesn’t work (and the implementation is sound), make sure that the canonical and canonicalized pages are similar enough.
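How similar is “similar enough”? Google doesn’t publish a number, so the check below is only a rough way to compare the visible text of two pages before relying on a canonical. It is a sketch built on Python’s standard `difflib` module; `text_similarity` is a made-up helper name, and the score it returns is not the measure Google uses.

```python
from difflib import SequenceMatcher


def text_similarity(text_a, text_b):
    """Return a 0.0-1.0 score for how much two texts overlap, word by word."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False keeps frequent words from being ignored on long pages
    return SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()


# Placeholder strings; in practice, pass the extracted body text of each page
test_page = "This is a test page about canonical tags"
original_page = "Dynamic Search Ads let Google build ads from your site"
print(round(text_similarity(test_page, original_page), 2))
```

A score near 0 for two pages, like my first version of the test page and the Dynamic Search Ads post, is a hint that the canonical may be ignored. Any threshold you pick is your own judgment call.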
Rel=Canonical and Noindex
After all these manipulations I still decided to show how Google treats your canonical tag if you also add a noindex tag to the page.
The answer is: don’t even consider using rel=canonical and noindex simultaneously as they don’t work together:
If you want to noindex a page, having a canonical is fine: it’ll be ignored anyway.
But if you want Google to understand and use your canonical tag… why are you even considering noindex? 🤔
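To catch pages that send both signals at once, you can scan the HTML for a robots meta tag containing noindex together with a canonical that points to a different URL. The sketch below uses only the standard library; `IndexingSignals` and `has_conflicting_signals` are hypothetical names, and it only looks at the HTML, so a noindex sent through the X-Robots-Tag HTTP header would not be detected.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Records whether a page declares noindex and which canonical it sets."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "meta" and (attr_map.get("name") or "").lower() == "robots":
            content = (attr_map.get("content") or "").lower()
            directives = [d.strip() for d in content.split(",")]
            if "noindex" in directives:
                self.noindex = True
        elif tag == "link":
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            # Keep the first canonical found
            if "canonical" in rel_tokens and self.canonical is None:
                self.canonical = attr_map.get("href")


def has_conflicting_signals(html, page_url):
    """True if the page is noindexed AND canonicalized to a different URL."""
    signals = IndexingSignals()
    signals.feed(html)
    points_elsewhere = (
        signals.canonical is not None
        and signals.canonical.rstrip("/") != page_url.rstrip("/")
    )
    return signals.noindex and points_elsewhere


html = (
    '<head><meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://marketingsyrup.com/other/"></head>'
)
print(has_conflicting_signals(html, "https://marketingsyrup.com/test/"))
```

A noindexed page with a self-referencing canonical is not flagged, since in that case the canonical is simply ignored.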
Rel=Canonical vs Noindex
Should you use noindex or canonical?
This is a fundamentally different question and the answer depends on the context. Rule of thumb:
- Use rel=canonical when the duplicate page is important and might get external links pointing to it. This way you will make sure that authority from these links is not lost (e.g. one product is available on multiple URLs as it’s accessible through multiple product categories).
- Use noindex if duplicate pages don’t add any value and are needed only for navigation or other purposes (e.g. tags or blog categories, though that doesn’t mean they should always be noindexed; it depends on the website).
A Bonus Canonical Story
As you see, rel=canonical is a tricky guy and there are so many ways you can implement it wrong. So here is an interesting example for you.
A website has 2 identical pages:
a) homepage.com
b) homepage.com/index.html (ugh!)
We really don’t need the second page to be accessible by users, but it still might get links, so we choose the best solution in this case: a 301 redirect from page B to page A.
After more than a month I still see that the page with index.html gets lots of impressions and clicks according to GSC. And here is what I see in the search results:
The page redirects correctly, so Google should have already changed how it looks in SERPs… But maybe not. If you look at the canonical tag on Page A, you’ll see this:
This looks like some kind of “authority loop”. So it’s no wonder that Google still shows this page even though it redirects to another one.
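A loop like this is easy to miss by eye but simple to detect once redirects and canonicals are written down as “this URL points to that URL” pairs. The sketch below assumes the setup from the story: the index.html page redirects to the root URL, and the root URL’s canonical points back at index.html. The function name `find_signal_loop`, the two dictionaries, and the exact URL strings are illustrative assumptions, not data from the real website.

```python
def find_signal_loop(start_url, redirects, canonicals):
    """Follow redirects and canonicals from start_url; return the loop if one exists."""
    seen = []
    url = start_url
    while url not in seen:
        seen.append(url)
        # A redirect takes priority: a redirected page never serves its own canonical
        target = redirects.get(url) or canonicals.get(url)
        if target is None or target == url:
            return None  # the chain ends on a stable URL, no loop
        url = target
    # url was visited before: return the cycle, closed with its starting point
    return seen[seen.index(url):] + [url]


# Assumed setup: page B redirects to page A, page A canonicalizes to page B
redirects = {"homepage.com/index.html": "homepage.com/"}
canonicals = {"homepage.com/": "homepage.com/index.html"}
print(find_signal_loop("homepage.com/index.html", redirects, canonicals))
```

With a self-referencing canonical on the root URL the function returns `None`, which is the state you want after adding a redirect.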
Canonical is a very strong tag. And Google will respect it if you respect its implementation. Have you had any funny or not-so-funny stories about canonicals? Share them in the comments!
I cook digital marketing dishes. Take 3 tablespoons of on-page SEO, add 2 pinches of backlinks and sprinkle it all with paid advertising. Season to taste with actionable data from Analytics and bake until golden brown. Serve hot.