Re: [PHPLIB] Problems when submitting a PHPLib site to Altavista
From: Claes Månsson (cj <email protected>)
Date: 01/28/00

Vibol Hou wrote:
>
> Claes,
>
> That's actually a pretty good idea. Can you post your code snippet so we
> can take a look at it and possibly use it also? What I do suggest is that
> the code look for users coming from a particular host, somewhat like the way
> Analog does its checking for host engines...

I haven't looked into what Analog is or does. The main reason I check against
$HTTP_USER_AGENT is that it was the simplest way of doing it that I could
think of. I do not have any major concerns about someone spoofing their user
agent string, since any information that can be obtained in "robot" mode is out
in the open anyway in "cookie" and "get" mode.

You wanted raw code? It's a bit ugly, but here goes. It's unfortunately
against PHPLIB 6.1 and not against a current 7.x release though...

In session.inc, add the following function (cut and paste; rejoin the lines if they got broken up in the mail):

  ##
  ## Is the user agent a Robot?
  ##
  function is_robot($user_agent) {

        ## Parentheses and the dot are escaped so that they match literally
        return ereg("Wget|Lycos_Spider_\(T-Rex\)|ia_archiver|Slurp|Scooter|FriendlySpider|DIIbot|Googlebot|PJspider|FAST-WebCrawler|Gulliver|gazz|SiteSnagger|ip3000\.com-crawler|AltaVista|combine|COMBINE|Ultraseek|ArchitextSpider|WiseWire-Spider|Crawler", $user_agent);
  }
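
If your PHP build has the PCRE functions, the same check can probably be
written with preg_match() instead of ereg(). This is only a sketch and not
part of PHPLIB: the name is_robot_pcre() is made up for illustration, and it
assumes that a plain case-sensitive substring match is all you want.

```php
<?php
// Hypothetical standalone variant of is_robot(); not part of PHPLIB.
// Assumes the PCRE functions (preg_match, preg_quote) are available.
function is_robot_pcre($user_agent) {
    // Same robot names as in the ereg() pattern above.
    $robots = array(
        "Wget", "Lycos_Spider_(T-Rex)", "ia_archiver", "Slurp", "Scooter",
        "FriendlySpider", "DIIbot", "Googlebot", "PJspider", "FAST-WebCrawler",
        "Gulliver", "gazz", "SiteSnagger", "ip3000.com-crawler", "AltaVista",
        "combine", "COMBINE", "Ultraseek", "ArchitextSpider",
        "WiseWire-Spider", "Crawler"
    );

    // preg_quote() escapes regex metacharacters such as "(", ")" and ".",
    // so every name is matched literally.
    $parts = array();
    foreach ($robots as $name) {
        $parts[] = preg_quote($name, "/");
    }

    // preg_match() returns 1 when any of the alternatives is found.
    return preg_match("/" . implode("|", $parts) . "/", $user_agent) == 1;
}
```

Keeping the names in an array also makes it easier to add a new robot later
without touching the pattern by hand.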

In function get_id(), add the following case among the others:

        case "robot":
          $id = "";
        break;

In function start(), add the lines marked "ADD:":

  function start($sid = "") {
    global $HTTP_COOKIE_VARS, $HTTP_GET_VARS, $HTTP_HOST, $HTTPS;

    $this->name = $this->cookiename==""?$this->classname:$this->cookiename;

ADD: global $HTTP_USER_AGENT;
ADD: if( $this->is_robot($HTTP_USER_AGENT) ) {
ADD:   $this->mode = "robot";
ADD: } else {

      if ( isset($this->fallback_mode)
        && ( "get" == $this->fallback_mode )
        && ( "cookie" == $this->mode )
        && ( ! isset($HTTP_COOKIE_VARS[$this->name]) ) ) {

        if ( isset($HTTP_GET_VARS[$this->name]) ) {
          $this->mode = $this->fallback_mode;
        } else {
          header("Status: 302 Moved Temporarily");
          $this->get_id($sid);
          $this->mode = $this->fallback_mode;
          if( isset($HTTPS) && $HTTPS == 'on' ){
            ## You will need to fix suexec as well, if you use Apache and CGI PHP
            $PROTOCOL='https';
          } else {
            $PROTOCOL='http';
          }
          header("Location: ". $PROTOCOL. "://". $HTTP_HOST.$this->self_url());
          exit;
        }
      }
ADD: }

And you should be all set.
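
In case the nesting in start() is hard to follow, here is the mode decision
pulled out on its own. It is only a sketch: pick_mode() and its parameters are
names I made up for illustration, nothing like it exists in PHPLIB, and
"redirect" simply stands for the 302 branch.

```php
<?php
// Hypothetical helper, for illustration only; it is not part of PHPLIB.
// Mirrors the decision made in the patched start():
//   robot user agent                     -> "robot" (get_id() then takes
//                                           the new "robot" case)
//   cookie mode, "get" fallback, no cookie:
//       session id present in the URL    -> "get"
//       otherwise                        -> "redirect" (the 302 branch)
//   anything else                        -> the configured mode, unchanged
function pick_mode($is_robot, $mode, $fallback_mode, $has_cookie, $has_get_id) {
    if ($is_robot) {
        return "robot";
    }
    if ("get" == $fallback_mode && "cookie" == $mode && !$has_cookie) {
        return $has_get_id ? "get" : "redirect";
    }
    return $mode;
}
```

The point of checking for a robot first is that a robot never reaches the
redirect branch, so it does not get bounced to a URL carrying a session id.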

Regards
/Claes