WestHost: Professional Website Hosting Company

  1. #1

    Use .htaccess to Strip Session ID from Bots

    I have a website, www.zestgourmet.com, that uses osCommerce as a shopping cart. I LOVE osCommerce, but the session ID that it appends to URLs has been a problem with search engines. URLs get an ID attached to them that looks like this: http://www.zestgourmet.com/store/pro...fjsldlfjssf928. In the administration panel I have Prevent Spider Sessions set to True; this helps but is not foolproof. Yahoo has indexed much of my site with the session ID attached. I'm trying to remove these listings from Yahoo and found the following solution, which uses an .htaccess file. I know very little about what the code is trying to accomplish. Can someone tell me whether the following code is safe and would be effective? Thanks!

    # $Id: .htaccess,v 1.3 2003/06/12 10:53:20 hpdl Exp $

    # Set some options
    Options -Indexes
    Options FollowSymLinks

    RewriteEngine on
    RewriteBase /
    #
    # Skip the next two rewriterules if NOT a spider
    RewriteCond %{HTTP_USER_AGENT} !(msnbot|slurp|googlebot) [NC]
    RewriteRule .* - [S=2]
    #
    # case: leading and trailing parameters
    RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+&(.+)$ [NC]
    RewriteRule (.*) $1?%1&%2 [R=301,L]
    #
    # case: leading-only, trailing-only or no additional parameters
    # (%1 holds leading parameters, %2 trailing ones; the unused one is empty)
    RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
    RewriteRule (.*) $1?%1%2 [R=301,L]

    #
    # This is used with Apache WebServers
    #
    # For this to work, you must include the parameter 'Options' to
    # the AllowOverride configuration
    #
    # Example:
    #
    <Directory "/usr/local/apache/htdocs">
    AllowOverride Options
    </Directory>
    #
    # 'All' will also work. (This configuration is in the
    # apache/conf/httpd.conf file)

    # The following makes adjustments to the SSL protocol for Internet
    # Explorer browsers

    <IfModule mod_setenvif.c>
    <IfDefine SSL>
    SetEnvIf User-Agent ".*MSIE.*" \
    nokeepalive ssl-unclean-shutdown \
    downgrade-1.0 force-response-1.0
    </IfDefine>
    </IfModule>

    # Fix certain PHP values
    # (commented out by default to prevent errors occurring on certain
    # servers)

    #<IfModule mod_php4.c>
    # php_value session.use_trans_sid 0
    # php_value register_globals 1
    #</IfModule>

  2. #2
    jalal, Senior Member
    Join Date: May 2003
    Location: Germany
    Posts: 1,377

    It looks safe enough. Whether it's effective, you'll have to try it out.

    I'm a great fan of osCommerce. If the above does work, let us know, it would be useful for many others.

    cheers

  3. #3
    torrin, Senior Member
    Join Date: May 2003
    Location: Vista, CA
    Posts: 534

    Be sure to put the bottom half in the httpd.conf file.

  4. #4

    I was wondering about that... so, the rewrite rules go in the .htaccess file and everything else goes in the httpd.conf file? Where is this file on our servers? Thanks to everyone for your advice!

  5. #5

    I figured out where the httpd.conf file was and saw that the settings were already correct in there.

    I've gone ahead and updated my .htaccess file and will let others know how this works out. By the way, here is what the contribution (http://www.oscommerce.com/community/contributions,2819) does.

    Spider Session Remover v1.0 (Jan 15th 2005)
    ==================================

    This is the official release of the Spider Session Remover.

    This contribution uses Apache mod_rewrite to look for specific spiders, remove the
    session ID (osCsid) from the URL, and return a '301' to the spider. The search
    engines will see the 301 Moved Permanently response, re-fetch the page from the new
    (osCsid-less) URL given in that response, and, after a while, update their
    database to use the new URL.
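
    As a rough illustration of the behaviour described above, here is a minimal Python sketch of the same decision. It is only a model for reasoning about the rules, not part of the contribution: the function name strip_session is made up, the spider list is copied from the RewriteCond in the first post, and on the server the real work is done by mod_rewrite.

```python
import re

# Assumption: spider names copied from the RewriteCond in the posted .htaccess.
SPIDER_RE = re.compile(r"msnbot|slurp|googlebot", re.IGNORECASE)
# One "osCsid=<id>" query-string pair.
SESSION_RE = re.compile(r"^osCsid=[0-9a-z]+$", re.IGNORECASE)


def strip_session(user_agent, path, query):
    """Hypothetical helper: return (status, url) for one request.

    Spiders whose query string carries osCsid get a 301 to the same URL
    without it; every other request is served unchanged (modelled as 200).
    """
    url = path + ("?" + query if query else "")
    if not SPIDER_RE.search(user_agent):
        return 200, url  # ordinary visitors keep their session ID
    pairs = query.split("&") if query else []
    kept = [p for p in pairs if not SESSION_RE.match(p)]
    if len(kept) == len(pairs):
        return 200, url  # spider, but no session ID to remove
    cleaned = path + ("?" + "&".join(kept) if kept else "")
    return 301, cleaned
```

    Calling strip_session with a Slurp user agent, the path /store/faq.php and a query of osCsid followed by an ID gives a 301 to /store/faq.php, which is the redirect the contribution is meant to produce.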

  6. #6

    Okay, here are the results so far. I checked my access log and found the following entry...

    68.142.250.55 - - [04/Mar/2005:11:19:27 -0700] "GET /store/faq.php?osCsid=dbb4717fc8dc6fd14cf49b5043941cdc HTTP/1.0" 301 320 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
    68.142.250.55 - - [04/Mar/2005:11:19:31 -0700] "GET /store/faq.php HTTP/1.0" 200 32880 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

    I think this tells me that the code is working! However, I also found this entry...

    66.196.91.47 - - [04/Mar/2005:11:12:03 -0700] "GET /store/product_info.php?products_id=34%3FosCsid=64ed54b2af74d022f1504bcf42e90cd7 HTTP/1.0" 200 28834 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"

    Because the session ID in this URL is preceded by %3F (a percent-encoded "?") instead of &, will the .htaccess code still match it and issue the 301 redirect?
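
    One way to check this without touching the server is to run the two QUERY_STRING patterns from the first post against sample query strings. The Python sketch below does that. It assumes that Apache hands mod_rewrite the query string still percent-encoded and that Python's re module treats these particular patterns the same way Apache's regex engine does; the session IDs are short placeholders, and matches_posted_rules is a made-up name.

```python
import re

# The two QUERY_STRING patterns from the posted .htaccess, copied as-is.
BOTH_SIDES = re.compile(r"^(.+)&osCsid=[0-9a-z]+&(.+)$", re.IGNORECASE)
ONE_SIDE = re.compile(
    r"^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$", re.IGNORECASE
)


def matches_posted_rules(query):
    """True if either posted pattern matches the raw query string."""
    return bool(BOTH_SIDES.search(query) or ONE_SIDE.search(query))


# Placeholder session IDs, not the ones from the access log.
plain = "osCsid=abc123"
encoded = "products_id=34%3FosCsid=abc123"

print(matches_posted_rules(plain))
print(matches_posted_rules(encoded))
```

    Under those assumptions the first string matches and the second does not, which fits the 200 response in the log: neither pattern has anything that accepts %3F in front of osCsid.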

  7. #7

    Thanks to Jalal for helping me edit the .htaccess file. The final file is below. It now catches both &osCsid and %3FosCsid. I removed everything else from the original file above, keeping only the following. Most of the other settings were already in the httpd.conf file, and the other code above caused my server side includes (SSIs) to fail.

    RewriteEngine on
    RewriteBase /
    #
    # Skip the next four rewriterules if NOT a spider
    RewriteCond %{HTTP_USER_AGENT} !(msnbot|slurp|googlebot|becomebot) [NC]
    RewriteRule .* - [S=4]
    #
    # case: leading and trailing parameters
    RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+&(.+)$ [NC]
    RewriteRule (.*) $1?%1&%2 [R=301,L]
    #
    # case: leading-only, trailing-only or no additional parameters
    # (%1 holds leading parameters, %2 trailing ones; the unused one is empty)
    RewriteCond %{QUERY_STRING} ^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$ [NC]
    RewriteRule (.*) $1?%1%2 [R=301,L]
    #
    # case: leading and trailing parameters, with osCsid introduced by %3F (an encoded "?")
    RewriteCond %{QUERY_STRING} ^(.+)%3FosCsid=[0-9a-z]+%3F(.+)$ [NC]
    RewriteRule (.*) $1?%1&%2 [R=301,L]
    #
    # case: leading-only or trailing-only parameters, with osCsid introduced by %3F
    RewriteCond %{QUERY_STRING} ^(.+)%3FosCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+%3F?(.*)$ [NC]
    RewriteRule (.*) $1?%1%2 [R=301,L]
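
    A note on the one-sided RewriteCond patterns above: each has two alternatives with one capture group apiece, so leading parameters land in %1 and trailing parameters land in %2. The Python sketch below shows which group is filled for each shape of query string. It assumes mod_rewrite expands an unset group to an empty string; the helper names and the placeholder session ID are made up for illustration.

```python
import re

# The one-sided pattern from the .htaccess, copied as-is.
ONE_SIDE = re.compile(
    r"^(.+)&osCsid=[0-9a-z]+$|^osCsid=[0-9a-z]+&?(.*)$", re.IGNORECASE
)


def captured(query):
    """Return the two groups as (%1, %2), using "" for an unset group."""
    m = ONE_SIDE.search(query)
    if m is None:
        return None
    return (m.group(1) or "", m.group(2) or "")


def rebuilt_query(query):
    """Join both groups: whichever alternative matched supplies the rest."""
    groups = captured(query)
    return None if groups is None else groups[0] + groups[1]


for q in ("products_id=34&osCsid=abc123",   # leading-only
          "osCsid=abc123&products_id=34",   # trailing-only
          "osCsid=abc123"):                 # no other parameters
    print(q, captured(q), rebuilt_query(q))
```

    In the trailing-only case the remaining parameters are in the second group, so a substitution that should keep them has to include %2 as well as %1.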
