spiders .htaccess & robots.txt



visible soul
11-05-2003, 01:33 PM
Hello all-

Do search engine spiders crawl directories outside of the "www" folder? Does an .htaccess file keep robots out of a directory or do I need to add directories in my root to my robots.txt?

On a related issue: I am running a bulletin board on my VDS. I've noticed Googlebot crawling it a lot lately. Is there something additional that I need to do to protect my SQL database so that robots don't index restricted information?

Also is it possible to add a specific bulletin board category or thread to the robots.txt file so that spiders index only selected categories or threads?

Thanks for your replies.
-DKC-

torrin
11-05-2003, 01:52 PM
Do search engine spiders crawl directories outside of the "www" folder?

Yes, if they're visible to the world. Make sure your directories aren't, by configuring the server correctly and using .htaccess and .htpasswd files.


Does an .htaccess file keep robots out of a directory or do I need to add directories in my root to my robots.txt?

Yes, as long as you set appropriate passwords. As for robots.txt, that will keep search engines out too. Well, it will keep out the search engines that actually pay attention to robots.txt. There are some that don't. .htaccess and .htpasswd are more secure for this, since the server enforces them no matter what the robot chooses to do.
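
To illustrate why robots.txt only works for robots that pay attention to it, here is a minimal sketch using Python's standard urllib.robotparser module. It shows the check a well-behaved spider runs on itself before fetching a page. The robots.txt content and the directory names are made up for the example, and is_allowed is a hypothetical helper, not part of any server setup.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the directory names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def is_allowed(robots_txt: str, user_agent: str, path: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch path."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/private/notes.html"):
        print(path, is_allowed(ROBOTS_TXT, "Googlebot", path))
```

The check runs entirely on the robot's side. A spider that skips it can still request /private/notes.html, and only a server-side control such as .htaccess password protection will refuse the request.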


On a related issue. I am running a bulletin board on my VDS. I've noticed Googlebot crawling it a lot lately. Is there something additional that I need to do to protect my sql database so that robots don't index restricted information?

I don't think the search engines will try to index an SQL database, but I'd have to know more about your setup to answer this.


Also is it possible to add a specific bulletin board category or thread to the robots.txt file so that spiders index only selected categories or threads?

I don't know.
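
For what it's worth, robots.txt Disallow rules match on the start of the URL path, so this can work if each category or thread lives under its own distinct URL. Below is a rough sketch, again with Python's urllib.robotparser, assuming a board with path-style URLs. The paths, the thread number, and the crawlable helper are all hypothetical, and boards that identify threads with query strings may behave differently depending on the robot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: hide one whole category and one single thread.
BOARD_ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/members-only/
Disallow: /forum/general/thread-1234.html
"""


def crawlable(paths, user_agent="Googlebot"):
    """Return the subset of paths a compliant robot may fetch."""
    parser = RobotFileParser()
    parser.parse(BOARD_ROBOTS_TXT.splitlines())
    # Each rule is a prefix match against the requested path.
    return [p for p in paths if parser.can_fetch(user_agent, p)]


if __name__ == "__main__":
    candidates = [
        "/forum/general/",
        "/forum/members-only/topic-1.html",
        "/forum/general/thread-1234.html",
        "/forum/general/thread-1235.html",
    ]
    for path in crawlable(candidates):
        print(path)
```

Everything not listed stays open to spiders, so this hides selected categories or threads rather than whitelisting them. It is still only a request to the robot, so restricted material needs the board's own permissions or password protection as well.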

visible soul
11-05-2003, 09:48 PM
Many thanks, torrin.

I appreciate your input. I thought that the .htaccess file would prohibit all unauthorized access, including spiders, but I wanted to get confirmation here. This board is great.

-DKC-