Block bad bots with a robots.txt file
In an earlier post, I talked about how to block bad bots and spiders from crawling your website using a .htaccess file. In addition to that method, I am also using a similar technique with the robots.txt file.
Which is the better method? Honestly, I don’t know, but I am using both. I know that some bad bots will simply ignore the robots.txt file, and I suppose some bots will ignore the .htaccess file as well. Nothing is foolproof, but using both methods should keep a large percentage of the bad bots out.
So what is the robots.txt file? It’s a text file that you put in the root, or main, directory of your website (not to be confused with the HTML robots meta tag that you use on each web page). It contains directives that bots and spiders read for instructions on how to index your site. My primary interest in robots.txt files is to prevent “bad bots” from scouring my site for things like my content or email addresses, while still letting the “good bots” such as Googlebot index my site.
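Here is a minimal sketch of what such a robots.txt file might look like. The bot names shown are just examples of user-agent strings you might choose to block; substitute the actual bots you see in your server logs:

```
# Example robots.txt - placed at https://yourdomain.com/robots.txt
# Block some hypothetical bad bots entirely (example names only)
User-agent: EmailCollector
Disallow: /

User-agent: SiteSnagger
Disallow: /

# Allow everyone else (including good bots like Googlebot) full access
User-agent: *
Disallow:
```

Each `User-agent` line names a bot, and `Disallow: /` tells that bot to stay out of the entire site. An empty `Disallow:` under `User-agent: *` means all other bots may crawl everything. Remember, though, that compliance is voluntary: well-behaved crawlers honor these rules, but bad bots can simply ignore them, which is why pairing this with the .htaccess method is worthwhile.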