A Look At robots.txt Files
A robots.txt file is a simple, static file that you can add to your site in order to stop search engines from crawling the content of certain pages or directories. You can even prevent certain user agents from crawling certain areas of your site.
Let's take a real-world example and look at what you would do if you decided to set up a Feedburner feed in place of your normal RSS feed. I won't go into much detail about why you would do this, other than to say that you get some nice usage statistics. Once you have recoded your blog to issue the Feedburner feed, you then need to stop search engines from indexing the old feed. To do this, put a robots.txt file in place with the following content.
User-agent: *
Disallow: /feed
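If you want to check how a crawler that honours robots.txt would read these two lines, one option is Python's standard urllib.robotparser module. The following is only a rough sketch: the helper name is_allowed, the user agent "ExampleBot" and the www.example.com URLs are placeholders made up for illustration, not anything from a real site.

```python
from urllib.robotparser import RobotFileParser

# The same rules as the robots.txt file shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /feed
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url.

    is_allowed is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines, so no HTTP
    # request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and www.example.com are placeholders.
    for path in ("/feed", "/feed/atom", "/about"):
        url = "http://www.example.com" + path
        print(path, is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

Bear in mind that Disallow rules are matched as path prefixes, so /feed also covers anything beneath it, as well as any other path that happens to start with /feed.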
Philip Norton
Lead Developer, Research and Development








