Canonicalization

URL canonicalization is used when a domain has more than one page with similar content.
The method can be used to ensure that all incoming links are attributed to a single page and to avoid duplicate content penalties for similar pages on a single domain. (When the attribute was first introduced you could not use it to specify a canonical page on a different domain; Google has since announced support for cross-domain canonical links.)

This method is useful if, for instance, you have a website with price lists in different currencies, where there could be several pages with almost identical content. Specifying a canonical page will not only allow you to avoid duplicate content penalties for these pages; Google will also treat all inbound links to the similar pages as though they were linking to the single page you specify in the canonical attribute. This can have a major positive effect on that page's position in Google's results.

How to Specify a Canonical Page

Firstly, determine which page you want to use as the original version of your similar pages. This is the page that all incoming links currently spread across the duplicate content pages will be attributed to.

Once you have determined this, you can go ahead and add the following element to the HEAD section of each of the other pages that has similar content (do not add it to the one you have chosen as your original version):

<link rel="canonical" href="full_original_page_url_goes_here" />
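
If you have many similar pages, you may prefer to generate the tag rather than type it by hand. The following is only a minimal sketch in Python: the function name canonical_link_tag and the example.com price-list URLs are made up for illustration and are not part of any library or of the original method. It escapes the URL so that characters such as & are safe inside the href attribute.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Return a <link rel="canonical"> element pointing at canonical_url."""
    # Escape &, <, > and quotes so the URL is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


# Hypothetical example: the GBP price list is the chosen original version.
original = "http://www.example.com/prices.html?currency=gbp"
duplicates = [
    "http://www.example.com/prices.html?currency=usd",
    "http://www.example.com/prices.html?currency=eur",
]

# Every duplicate page gets the same tag in its HEAD section;
# the original page itself is left out of the list.
for page in duplicates:
    print(page, "->", canonical_link_tag(original))
```

Note that the tag uses plain straight quotes. Curly "smart" quotes pasted from a word processor are not valid attribute delimiters in HTML.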

There is another aspect to canonicalization, which applies to home pages on a domain and the variations of the URL that can be used to link to that page, for instance:

domain.co.uk/my_website/index.html
domain.co.uk/index.html

In this instance it is feasible, and highly likely, that you would have inbound links pointing to both versions of the same page. Although this would not attract a duplicate content penalty, it would result in only one version of the page being indexed, so the link juice pointing to the unindexed version goes to waste. But here's the good bit: you can add the <link rel="canonical" href="full_original_page_url_goes_here" /> element to pass the link juice from the unindexed version to the version that IS indexed. This could potentially result in a large increase in inbound links credited to that page and have a major effect on its search engine position.
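
To check that a page really declares the canonical URL you intended, you can parse its HTML and read the tag back. This is a rough sketch using Python's standard html.parser module; the class name CanonicalFinder and the helper find_canonical are my own hypothetical names, and the page source is hard-coded here rather than downloaded.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None if absent.
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical source of the variant page that should point at the indexed one.
page = (
    "<html><head>"
    '<link rel="canonical" href="http://domain.co.uk/index.html" />'
    "</head><body>Home</body></html>"
)
print(find_canonical(page))
```

Running the same check over every variant of the home page should give the same URL each time; a result of None means the tag is missing or malformed on that page.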

Webmasters who host their pages on Apache servers have an altogether different solution available to them to stop the links attributed to a single page from being split between different URLs, although I would only recommend this technique for brand new domains/websites, as the non-www URLs can no longer be reached directly. This is achieved by editing the .htaccess file in the server's public root folder.

For instance, adding the following directives to the bottom of the .htaccess file:

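# Switch on mod_rewrite for this folder
RewriteEngine on
# Only act when the requested host starts with the bare domain (no www.)
RewriteCond %{HTTP_HOST} ^yourdomain\.com
# Send a permanent (301) redirect to the same path on the www. host
RewriteRule ^(.*)$ http://www.yourdomain.com/$1 [R=permanent,L]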

would permanently redirect (HTTP 301) anybody who requests your site without the www. prefix to the www. version, so nobody would end up on, or create links to, the non-www URL.
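
To see what the rule does to an individual request, its logic can be mirrored in a few lines of Python. This is only an illustration of the matching, not something Apache runs: the function name redirect_target is hypothetical, and the sketch ignores the query string, which Apache carries over to the new URL by default.

```python
import re
from typing import Optional

# Same pattern as the RewriteCond: the host must start with the bare domain.
BARE_HOST = re.compile(r"^yourdomain\.com")


def redirect_target(host: str, path: str) -> Optional[str]:
    """Return the URL a request would be 301-redirected to, or None."""
    if BARE_HOST.search(host):
        # In .htaccess the rule sees the path without its leading slash,
        # so $1 is e.g. "prices.html" for a request to /prices.html.
        return "http://www.yourdomain.com/" + path.lstrip("/")
    # Hosts that already carry the www. prefix are served as normal.
    return None


print(redirect_target("yourdomain.com", "/prices.html"))
print(redirect_target("www.yourdomain.com", "/prices.html"))
```

The first call yields the www. address and the second yields None, which is why a request that already uses the www. prefix is not redirected again.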
