Having multiple URLs host identical or nearly identical content has several disadvantages. First, the search ranking score is divided among the URLs. Second, search engines will treat the situation as content duplication and may penalize the pages if it is widespread. There are other disadvantages not listed here.
The canonical URL tag may be helpful in these situations:
- A website uses slightly different GET variables to display the same content with minor variations. For example, http://ww2db.com/person_bio.php?person_id=1 displays Admiral Isoroku Yamamoto’s biography and http://ww2db.com/person_bio.php?person_id=c1 displays the same biography, presented specifically as that of a Japanese officer. The latter may point to the former as its canonical. This is also helpful when session IDs, login IDs, etc. are carried in the URL as GET variables.
- A website uses a different URL for its mobile content or print-only content. For example, if http://www.example.com/mobile shows the same content as http://www.example.com except for a different page layout, then the mobile page may point to the main page as its canonical.
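For the mobile case in the last item, a minimal sketch of the mobile page might look like the following. The title and body text are illustrative placeholders, not taken from any real site; only the link element carries the canonical hint.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Placeholder title for illustration only -->
  <title>Example (mobile layout)</title>
  <!-- This page is assumed to be served at http://www.example.com/mobile;
       the link element names the main page as its canonical -->
  <link rel="canonical" href="http://www.example.com">
</head>
<body>
  <p>Same content as the main page, with a different layout.</p>
</body>
</html>
```

The same pattern would apply to URLs that differ only by a session ID or login ID: each variant would carry a link element pointing at the URL without that parameter.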
Below is an example of how to establish a canonical for the biography pages. Note that the tag resides in the HEAD section of a web page.
This is the biography for the Japanese officer Isoroku Yamamoto.
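As a minimal sketch of such a page, assume the variant served at http://ww2db.com/person_bio.php?person_id=c1 names http://ww2db.com/person_bio.php?person_id=1 as its canonical. The title and the surrounding skeleton are hypothetical and do not reproduce the site's actual markup.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Hypothetical title; the link element below is the part that matters -->
  <title>Isoroku Yamamoto, Japanese officer</title>
  <!-- Variant page (person_id=c1) pointing to the regular biography
       (person_id=1) as its canonical URL -->
  <link rel="canonical" href="http://ww2db.com/person_bio.php?person_id=1">
</head>
<body>
  <p>This is the biography for the Japanese officer Isoroku Yamamoto.</p>
</body>
</html>
```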
In the above example, the biography of Admiral Yamamoto as a Japanese officer points to his regular biography page as its canonical, so all search-ranking information for the variant page is credited to the original page.
It works much like a 301 redirect, although it is neither as comprehensive nor as absolute. Nevertheless, it may cause fewer headaches for web developers, especially on large websites.
It is important to note that the canonical URL must reside on the same root domain as the variant. For example, the Japanese officer version of the biography listed above cannot point to http://www.combinedfleet.com/officers/Isoroku_Yamamoto as its canonical, because that URL is on a different domain.
The canonical URL tag is a hint for search engine crawlers, not a directive, so search engines are not guaranteed to follow it. However, major search engines such as Google, Yahoo, and MSN Live Search have stated that they will respect this hint.