If you have a lot of duplicate or very similar content on the same site and don't use proper canonicals to indicate which version Google should index, Google will generally favor the shorter URL over the longer one.
I asked Gary Illyes from Google how Google selects the canonical version when dealing with duplicate or similar content on the same site, when there are no canonical tags or other signals to guide it. Normally, Google will choose the originating source to display, but when the duplicate content is on the same site, it gets a bit murkier which URL Google would choose to display.
@methode For a duplicate content scenario on the same site, would Google favor the shortest URL structure page over a longer URL?
— Jennifer Slegg (@jenstar) October 7, 2015
And he confirmed it in his response.
@jenstar Umm, where did that come from? 🙂 Yes, in general the canonicalization algos prefer shorter URLs if you leave it up to them.
— Gary Illyes (@methode) October 7, 2015
I raised the question after noticing Rand Fishkin commenting on Twitter that Google was choosing to display one version of a very similar category page (https://moz.com/ugc/category/whiteboard-friday, part of the YouMoz section) over another (https://moz.com/blog/category/whiteboard-friday, part of the main blog). The latter was the page Fishkin felt was the better option for Google to display.
Both pages had links pointing to them, and each page had the canonical listed as itself.
That said, if you have a similar issue and are not clear why Google is choosing one duplicate or similar page over another, the length of the URL could provide the clue.
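As a rough way to check this on your own site, you can compare the lengths of the competing URLs. The sketch below is only an illustration of the "shorter URL" rule of thumb from the tweet, not Google's canonicalization algorithm, which weighs other signals too. The function name `likely_preferred` is hypothetical.

```python
# Hypothetical helper: applies only the "shorter URL" rule of thumb.
# This is NOT Google's algorithm, which considers many other signals.
def likely_preferred(urls):
    """Return the shortest URL; on a tie, the first one listed wins."""
    return min(urls, key=len)


# The two near-duplicate category pages discussed in this post.
duplicates = [
    "https://moz.com/blog/category/whiteboard-friday",
    "https://moz.com/ugc/category/whiteboard-friday",
]

# Print each URL with its character count, shortest first.
for url in sorted(duplicates, key=len):
    print(len(url), url)

print("Likely preferred:", likely_preferred(duplicates))
```

Here the /ugc/ URL is one character shorter than the /blog/ URL, which lines up with the version Google was displaying.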
The problem is solvable, though, if you find yourself in the same situation. Simply use the correct canonical to select one page over the other (which is what Moz is doing for the individual Whiteboard Friday pages) or employ a redirect.
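For the canonical option, both duplicate pages would carry the same `rel="canonical"` link element in their `<head>`, pointing at the version you want indexed. The sketch below builds that element. The helper name `canonical_link_tag` is hypothetical, and treating the /blog/ URL as the preferred version is an assumption based on Fishkin's stated preference.

```python
from html import escape


# Hypothetical helper: builds the <link> element to place in the <head>
# of every duplicate page, including the preferred page itself.
def canonical_link_tag(preferred_url):
    # Escape the URL so characters like & or " stay valid in the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


# Assumption: the main blog category is the version that should be indexed.
preferred = "https://moz.com/blog/category/whiteboard-friday"

print(canonical_link_tag(preferred))
```

Because each page in the Moz example listed itself as canonical, the pages gave Google no shared preference; pointing both at one URL removes that ambiguity.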
Jennifer Slegg
Dan Shure says
Definitely an interesting comment from Gary. I can see Google doing that.
Moz’s situation was quite different, however. It seemed to be an issue where, due to all the old legacy URLs (seomoz.org, non-https moz.com, as well as old duplicate categories), they are misdirecting Google into not indexing the right /blog category URLs at all. I know it’s something Google should figure out (that Moz really wants those pages indexed), but it doesn’t seem to be a duplicate content issue.