In this week’s Google Webmaster Hangout, a question was raised about using rel canonical across an entire site, with each page pointing to itself, in order to prevent duplicate content from being indexed.
The question asked was “Is it still okay to put rel canonical on every single page pointing to itself, just in order to avoid duplicate parameters and things like that?”
John Mueller said yes, and the questioner followed up to ask whether it matters if this is done site wide across a million pages. Mueller replied:
“It doesn’t matter how many pages. You just need to make sure that it points to the clean URL version, that you’re not pointing to the parameter version accidentally, or that you’re not always pointing to the homepage accidentally, because those are the types of mistakes we try and catch.
“You can do that across millions of pages and we’ll try to take that into account.”
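The two mistakes mentioned here (canonicalizing to the parameter version, or canonicalizing everything to the homepage) are easy to check for on your own site. Below is a minimal Python sketch of such a check. The function name `canonical_issues` is hypothetical, and it assumes that your clean URLs never carry a query string, which will not hold for every site.

```python
from urllib.parse import urlsplit


def canonical_issues(page_url: str, canonical_url: str) -> list[str]:
    """Return a list of likely canonical mistakes for a single page."""
    page = urlsplit(page_url)
    canon = urlsplit(canonical_url)
    issues = []

    # Assumption: clean URLs on this site have no query string, so a
    # canonical with one is probably the parameter version.
    if canon.query:
        issues.append("canonical contains URL parameters")

    # A page that is not the homepage should not canonicalize to the root.
    page_is_home = page.path in ("", "/")
    canon_is_home = canon.path in ("", "/")
    if canon_is_home and not page_is_home:
        issues.append("canonical points to the homepage")

    return issues
```

Running this over a crawl of your own pages and reviewing anything that returns a non-empty list is one way to spot these problems before deploying rel canonical site wide.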
Sometimes URLs with parameters, such as those used for tracking or advertising, end up being indexed. And while duplicate content isn’t a penalty per se, it can cause issues, particularly if a parameterized URL ends up being linked, which often happens when others link to a page they originally reached through a social media site or ad campaign. Other reasons we see rel canonical being used include www versus non-www (although preferred domain in Google Search Console should definitely be used for this), pagination, differences in how your server handles upper and lower case characters within the URL, and PageRank reasons.
That said, it isn’t necessary to use rel canonical across every page on a site. But if you make extensive use of parameters, it might make sense to do this site wide as a preventative measure.
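As a rough illustration of the site-wide approach, here is a Python sketch of how a page template might build a self-referencing canonical tag that points to the clean URL. The names `TRACKING_PARAMS`, `clean_url` and `canonical_tag` are hypothetical, and the parameter list is only an example; you would need to adjust it to the parameters your own campaigns actually use.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed example list of tracking parameters; not exhaustive.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "gclid", "fbclid"}


def clean_url(url: str) -> str:
    """Drop known tracking parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    # Parameters that are not in the list (for example page=2) are kept,
    # since they may change the content of the page.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_tag(requested_url: str) -> str:
    """Build the rel canonical tag for the page at requested_url."""
    href = html.escape(clean_url(requested_url), quote=True)
    return f'<link rel="canonical" href="{href}">'
```

With this sketch, a visit to `https://www.example.com/blog/post?utm_source=twitter` would produce a canonical tag for `https://www.example.com/blog/post`, so the page points to itself rather than to the parameter version.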
Using rel canonical for duplicate content issues isn’t new, but it is nice to have the clarification that it wouldn’t cause issues even if deployed across millions of pages when each page is canonicalized to itself.
Google also released a blog post on best practices for using rel canonical a couple of years ago, detailing how to use it properly as well as some of the issues they see in the implementation.
Jennifer Slegg
Tom B says
Does that mean content such as: https://www.originalsite.com/blog/read-this-now.html should have a canonical tag leading to https://www.originalsite.com/blog/read-this-now.html…
…while a duplicate post at https://www.syndicatedsite.com/blog/blatant-copy-of-read-this-now.html should use https://www.syndicatedsite.com/blog/blatant-copy-of-read-this-now.html as its canonical?
Because that seems to defeat the purpose of the canonical.
-TB
Jennifer Slegg says
No, it would mean using rel canonical to itself on the originalsite example, while the syndicatedsite page would use rel canonical pointing to the originalsite URL.