I recently came across a situation where a site was getting hammered by a couple of different IP addresses. The one thing they had in common was that they were all using the same User Agent, VoilaBot in this case. Instead of blocking each IP using iptables, the following was added to the .htaccess file:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} VoilaBot
RewriteRule .* - [F,L]
Any client with "VoilaBot" in the User Agent now gets a 403 Forbidden response. The match is case-sensitive unless the [NC] flag is added to the RewriteCond line.

If you want to ban multiple User Agents, you can do the following:
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} UserAgent1 [OR]
RewriteCond %{HTTP_USER_AGENT} UserAgent2
RewriteRule .* - [F,L]
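
As the list of agents grows, the chain of [OR] conditions can be collapsed into one pattern. The following is a minimal sketch rather than a drop-in rule: UserAgent1 and UserAgent2 are the same placeholders used above, and the [NC] flag is an addition that makes the match case-insensitive.

```apache
RewriteEngine on
# One condition with regex alternation replaces the [OR] chain.
# [NC] ignores case, so "useragent1" and "USERAGENT1" also match.
RewriteCond %{HTTP_USER_AGENT} (UserAgent1|UserAgent2) [NC]
# "-" leaves the URL untouched; [F] returns 403 Forbidden and stops rewriting.
RewriteRule .* - [F]
```

The [F] flag already implies [L], so listing both is harmless but not required.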

It's not unusual to have multiple domains pointing at one site, possibly for brand protection or vanity domains.

If these extra domains are simply added as aliases of the main domain, they will all appear as separate sites to search engines such as Google or Yahoo. The same pages are then indexed under several host names, which may incur a duplicate content penalty.

Fortunately, this is easy to get around. Presuming that your primary domain name is example.com, the following can be put at the top of the .htaccess file in your document root:

RewriteEngine on
RewriteCond %{HTTP_HOST} !^example\.com$ [NC]
RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]
The RewriteCond line checks whether HTTP_HOST¹ is NOT example.com. If so, the RewriteRule issues a permanent (301) redirect to http://example.com/$1, where $1 is the requested path, i.e. the part of the URL that came after the host name.

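As written above, example.com will appear as the canonical URL rather than www.example.com. If the user does enter www.example.com, they will get redirected to example.com. To change this behavior, add the www to both the RewriteCond pattern on the second line and the redirect target on the third line. Changing only the third line would cause a redirect loop, because www.example.com would never match the condition.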
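
For reference, here is a sketch of the www variant, assuming www.example.com is the host name you want to keep. Note that both the condition and the redirect target carry the www prefix.

```apache
RewriteEngine on
# Redirect any request whose host is not exactly www.example.com ...
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# ... to the same path on www.example.com with a permanent (301) redirect.
RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
```

With this version, a request for example.com/page is redirected to www.example.com/page, and requests that already use www.example.com pass through untouched.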

¹HTTP_HOST is the host name sent in the request's Host header, i.e. the domain name the user is using to access the site.

Making the most of Apache

While I'll be adding tips and tricks that I've learnt over the years, one book that I'd highly recommend is the Apache Cookbook.

It's jam-packed with really nice tips and tricks to get Apache to do exactly what you want, no matter how crazy that may seem.

Apache Cookbook


I would never claim to be an expert with Apache's mod_rewrite module, but I have come to use it over the years to perform a number of little tricks to make my life easier.

What's this site about?

Apache's mod_rewrite module and how to make use of it.

Who is it aimed at?

Webmasters of all skill levels.

Will these tips work with IIS?

By default, no, but there are some third-party ISAPI modules for IIS that emulate some of mod_rewrite's functionality.


If anyone knows of any cool tricks using mod_rewrite and would like to contribute, just pop me an email to let me know!