October 2007 Archives

I recently came across a site that was getting hammered by requests from several different IP addresses. The one thing they had in common was that they were all using the same User Agent, VoilaBot in this case. Instead of blocking each IP with iptables, the following was added to the .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} VoilaBot
RewriteRule .* - [F,L]
Any client with "VoilaBot" in the User Agent now gets a 403 Forbidden response.
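
The RewriteCond pattern is case-sensitive by default, so the rule above matches "VoilaBot" but not "voilabot". If the capitalization might vary between requests, a variant with the [NC] (no case) flag is one way to cover it. This is a sketch of the same rule with that one flag added, not a change the original site necessarily needed:

```apache
RewriteEngine on
# [NC] makes the User Agent match case-insensitive
RewriteCond %{HTTP_USER_AGENT} VoilaBot [NC]
# "-" means no substitution; [F] returns 403 Forbidden
RewriteRule .* - [F,L]
```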

If you want to ban multiple User Agents, you can do the following:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} UserAgent1 [OR]
RewriteCond %{HTTP_USER_AGENT} UserAgent2
RewriteRule .* - [F,L]
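
Since the RewriteCond pattern is a regular expression, the same ban can also be written as a single condition using alternation, which may be easier to maintain as the list grows. A minimal sketch, with UserAgent1 and UserAgent2 as the same placeholder names used above:

```apache
RewriteEngine on
# One condition that matches either User Agent string
RewriteCond %{HTTP_USER_AGENT} (UserAgent1|UserAgent2)
RewriteRule .* - [F,L]
```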

It's not unusual to have multiple domains pointing at one site, possibly for brand protection or vanity domains.

If these extra domains are simply added as aliases of the main domain, they will all appear as separate sites to search engines such as Google or Yahoo. This means the same pages are indexed under several domains, and the site may incur a duplicate content penalty.

Fortunately, this is easy to get around. Presuming that your primary domain name is example.com, the following can be put at the top of the .htaccess file in your document root:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]
The RewriteCond line checks whether the HTTP_HOST¹ is NOT example.com. If that is the case, the RewriteRule does a permanent (301) redirect to http://example.com/$1, where $1 is the requested path captured by (.*), i.e. the part of the URL that came after the host name.

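As written above, example.com will appear as the canonical URL rather than www.example.com. If the user does enter www.example.com, they will get redirected to example.com. To change this behavior, add the www in both the second line (so the pattern becomes ^www\.example\.com$) and the third line. Changing only the third line would cause a redirect loop, because requests for www.example.com would still match the condition and be redirected to themselves.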
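
For reference, here is a sketch of the variant that makes www.example.com the canonical host. Both the condition and the redirect target carry the www prefix; example.com is still the placeholder domain from above:

```apache
RewriteEngine on
# Redirect any host that is not exactly www.example.com
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
```

With this version, a request for example.com or for any other alias gets a 301 redirect to the same path on www.example.com.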

¹HTTP_HOST is the host name sent in the Host header of the request, i.e. the domain name that the user is using to access the site.

About this Archive

This page is an archive of entries from October 2007 listed from newest to oldest.
