By Tim Perdue
Almost every developer knows that search engine placement is critical to the success of a web site. What many people don't know is that a lot of search engines cannot index database-driven pages (basically any page with a '?' or '&' in the URL).
So when I set about building gotoCity.com, one of my goals was to make the site database-driven, but still indexable. I didn't want any use of cookies or mile-long URLs, and the site had to be co-brandable to top it all off. That meant the look-and-feel of the site had to be dependent on which "affiliate" site was being accessed (gotoCity.com is just one incarnation of the "DirectriCity" engine).
To pull this off, I started with a subtle Apache feature that can "force" a script to be called for an entire directory tree. In my case, I wanted all URLs that fall under "/local/" to call a script. This would be MUCH easier than creating 200,000 localized, co-branded web pages and a genuine directory structure to match.
So in Apache's access.conf file, I added the following lines:
<Location /local>
ForceType application/x-httpd-php3
</Location>
This forces every request under "/local/" to be handled by a file named "local" (no extension) in the root of my server, which Apache now treats as a PHP script. "local" then uses PHP to parse the URL and act accordingly:

<?php

/*
Sample URLs:
/local/2/us/IA/a/50613/apartment/ - apartments in Cedar Falls, IA
/local/2/us/IA/a/         - group of cities in Iowa
/local/2/us/              - States in the US
/local/1/us/IA/a/50301/       - Des Moines City.net
*/

$url_array=explode("/",$REQUEST_URI);  //BREAK UP THE URL PATH
                                       //    USING '/' as delimiter
$url_affil_num=$url_array[2];          //Co-Branding info
$url_country=$url_array[3];            //Which Country?
$url_state=$url_array[4];              //Which State?
$url_alpha_state=$url_array[5];        //Cities starting with a-g
$url_zip=$url_array[6];                //If present, build this page
$url_content=$url_array[7];            //If present, build a sub-
                                       //    page for this city

/*
separate includes were designed to fetch the affiliate cobranding
and build appropriate style sheets on the fly. Data validation is
done prior to each query. If a URL is incorrect, bow out gracefully
or redirect home
*/

if ($url_zip) {

    /*
    If present, query the Zip code database and build the page.
    Inside city.inc, we will check for which "content page" we are
    building, if any.
    */

    include('include/city.inc');
    exit;

} elseif ($url_state) {

    /*
    If URL PATH ends here, query Zip code database,
    selecting DISTINCT cities in this state
    */

    if (!$url_alpha_state) {
        $url_alpha_state='a';
    }
    include('include/state.inc');
    exit;

} elseif ($url_country) {

    /*
    If URL PATH ends here, query Zip code database,
    selecting DISTINCT states for this country
    */

    include('include/country.inc');
    exit;

} else {

    /*
    must be mal-formed. Redirect to home
    */

    Header("Location: http://db.gotocity.com/local/2/us/");
    exit;
}

?>
As you can see, if you break everything up into includes, and you have a sensible hierarchy, this task is not terribly difficult. I now have over 200,000 dynamic web pages that can be traversed by *any* web indexing engine! The number increases proportionately with each new affiliate (just create a new affiliate record in the affiliate table).
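The script above mentions that data validation happens before each query, but the checks themselves live in the includes and are not shown. Here is a minimal sketch of how the URL parsing and validation could be pulled into one function. The function name parse_local_url() and the format rules (numeric affiliate, two-letter country and state, five-digit zip, and so on) are my assumptions for illustration, not the actual gotoCity.com code, and it is written for a current PHP rather than PHP3.

```php
<?php
// Hypothetical helper, not from the original script: splits a "/local/..."
// path into named parts and rejects any segment that fails a simple check.
function parse_local_url($request_uri)
{
    // Drop any query string, then trim the leading and trailing slashes.
    $pos   = strpos($request_uri, '?');
    $path  = ($pos === false) ? $request_uri : substr($request_uri, 0, $pos);
    $parts = explode('/', trim($path, '/'));

    // Must be under /local/ and carry at least an affiliate number.
    if (count($parts) < 2 || $parts[0] !== 'local') {
        return null;
    }

    // Keys mirror the $url_* variables above; the patterns are assumptions.
    $rules = array(
        'affil_num'   => '/^[0-9]+$/',
        'country'     => '/^[a-z]{2}$/',
        'state'       => '/^[A-Z]{2}$/',
        'alpha_state' => '/^[a-z]$/',
        'zip'         => '/^[0-9]{5}$/',
        'content'     => '/^[a-z0-9_-]+$/',
    );

    $result = array();
    $i = 1;
    foreach ($rules as $name => $pattern) {
        if (!isset($parts[$i])) {
            $result[$name] = null;        // URL path ended before this part
        } elseif (preg_match($pattern, $parts[$i])) {
            $result[$name] = $parts[$i];
        } else {
            return null;                  // mal-formed segment
        }
        $i++;
    }
    return $result;
}
```

A caller would redirect home when the function returns null, and otherwise branch on the 'zip', 'state' and 'country' entries in the same order as the if/elseif chain above.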
Next week, I will talk about logging. I built a sophisticated PHP logging and redirection engine to power this system.
I would appreciate any comments/criticisms about this article or how my system could be improved.
--Tim