Avoiding URL Canonicalisation With mod_rewrite And Apache

URL canonicalisation problems arise when a website serves the same content from several different URLs. When search engine spiders find identical content at multiple addresses, they may not know which URL to display in search engine result pages. The following URLs are all different, yet they produce the same content.

http://www.example.com
http://example.com
http://www.example.com/
http://www.example.com/index.html

The way to solve this issue is to redirect all of these variants to a single URL using mod_rewrite. Add a .htaccess file to your site's root directory and include the following line to turn on the rewrite engine.

RewriteEngine On

The following rule will redirect the non-www host (domain.com) to the www host (www.domain.com). Replace domain.com with your own domain name.

#Redirecting non-www to www.domain.com:
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
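If you would rather treat the non-www host as the canonical one, the same pattern can be reversed. The following is a minimal sketch, assuming the same placeholder domain.com and that the rule lives in the .htaccess file in the site root. Use only one of the two directions, otherwise the rules will redirect to each other in a loop.

```apache
# Redirecting www.domain.com to domain.com (the reverse of the rule above):
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [R=301,L]
```

Whichever host you pick, any other rules that redirect to an absolute URL should point at that same host, so that a visitor is not redirected twice.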

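Use the following rule to redirect requests for /index.html to the root directory URL (/). The RewriteCond checks the original request line sent by the client, so the rule only fires when /index.html was asked for explicitly. Without it, Apache serving index.html internally for / would trigger the rule again and cause a redirect loop.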

#Redirecting /index.html to /:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html
RewriteRule ^index\.html$ http://www.domain.com/ [R=301,L]
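The rule above only covers index.html in the site root. If subdirectories also have index.html files, a variation along these lines may help. This is a sketch, assuming the .htaccess file sits in the document root, that index.html is the directory index, and that www.domain.com is the canonical host. It should be used in place of the root-only rule, not alongside it.

```apache
# Redirecting /index.html and /any/sub/dir/index.html to the directory URL.
# The condition looks at the original request line, so internal requests
# for the directory index do not trigger the redirect again.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^?\s]*/)?index\.html[?\s]
RewriteRule ^(.*/)?index\.html$ http://www.domain.com/$1 [R=301,L]
```

Here $1 holds the directory part of the path (for example sub/dir/), or nothing at all for a request to the root index.html. Any query string on the original request is carried over to the redirect target by default.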

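If you want the rules to apply only when mod_rewrite is available, you can wrap all of the previous lines in an IfModule block like this. If the module is not loaded, Apache skips the block instead of returning a server error.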

<IfModule mod_rewrite.c>
RewriteEngine On

#Redirecting non-www to www.domain.com:
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]

#Redirecting /index.html to /:
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /index\.html
RewriteRule ^index\.html$ http://www.domain.com/ [R=301,L]
</IfModule>

 
