Almost every developer knows that search engine placement is
critical to the success of a web site. What many people don't
know is that a lot of search engines cannot index many database-driven
pages (basically any page with a '?' or '&' in the URL).
So when I set about building gotoCity.com, one of my goals was to make the
site database-driven but still indexable. I didn't want any cookies or
mile-long URLs, and to top it all off, the site had to be co-brandable. That
meant the look and feel of the site had to depend on which "affiliate" site
was being accessed (gotoCity.com is just one incarnation of the
"DirectriCity" engine).
To pull this off, I started with a subtle Apache feature that can "force" a
script to be called for an entire directory tree. In my case, I wanted all
URLs that fall under "/local/" to call a script. This would be MUCH easier
than creating 200,000 localized, co-branded web pages and a genuine directory
structure to match them.
So in Apache's access.conf file, I added the following lines:
<Location /local>
ForceType application/x-httpd-php3
</Location>
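This forces every request under "/local/" to be handled by a script named
"local" (no file extension) in the root of my server. Because of the
ForceType directive, Apache treats that file as PHP, and the rest of the path
is still available in the request URI. "local" then uses PHP to parse the URL
and act accordingly: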
<?php
/*
Sample URLs:
/local/2/us/IA/a/50613/apartment/ - apartments in Cedar Falls, IA
/local/2/us/IA/a/ - group of cities in Iowa
/local/2/us/ - States in the US
/local/1/us/IA/a/50301/ - Des Moines City.net
*/
// Break up the URL path using '/' as the delimiter.
// $url_array[0] is empty and $url_array[1] is "local".
$url_array=explode("/",$REQUEST_URI);
$url_affil_num=$url_array[2];    // Co-branding info
$url_country=$url_array[3];      // Which country?
$url_state=$url_array[4];        // Which state?
$url_alpha_state=$url_array[5];  // Cities starting with a-g
$url_zip=$url_array[6];          // If present, build this page
$url_content=$url_array[7];      // If present, build a sub-page for this city
/*
Separate includes were designed to fetch the affiliate co-branding
and build the appropriate style sheets on the fly. Data validation is
done prior to each query. If a URL is incorrect, bow out gracefully
or redirect home.
*/
if ($url_zip) {
    /*
    If present, query the Zip code database and build the page.
    Inside the city.inc, we will check for which "content page" we are
    building, if any.
    */
    include('include/city.inc');
    exit;
} elseif ($url_state) {
    /*
    If URL PATH ends here, query Zip code database,
    selecting DISTINCT cities in this state
    */
    if (!$url_alpha_state) {
        $url_alpha_state='a';
    }
    include('include/state.inc');
    exit;
} elseif ($url_country) {
    /*
    If URL PATH ends here, query ZIP code database,
    selecting DISTINCT states for this country
    */
    include('include/country.inc');
    exit;
} else {
    /*
    must be mal-formed. Redirect to home
    */
    Header("Location: http://db.gotocity.com/local/2/us/");
    exit;
}
?>
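For readers on a newer PHP, the parsing and validation steps in the script above could be sketched roughly as follows. This is only an illustration, not the code that runs gotoCity.com: it assumes PHP 7.1 or later, the function names parse_local_url() and choose_page() are hypothetical, and the validation patterns are guesses based on the sample URLs (numeric affiliate, two-letter country and state, five-digit ZIP). On those versions the request path comes from $_SERVER['REQUEST_URI'] rather than the $REQUEST_URI global.

```php
<?php
// Hypothetical helper (not from the original script): split a path such as
// /local/2/us/IA/a/50613/apartment/ into named, validated segments.
// Returns null when the path is malformed.
function parse_local_url(string $requestUri): ?array
{
    // Drop any query string and keep only the path.
    $path = parse_url($requestUri, PHP_URL_PATH);
    if (!is_string($path)) {
        return null;
    }

    $parts = explode('/', trim($path, '/'));

    // The first segment must be the forced "local" script.
    if (array_shift($parts) !== 'local') {
        return null;
    }

    // ASSUMPTION: these patterns are inferred from the sample URLs only.
    $rules = [
        'affiliate' => '/^\d+\z/',
        'country'   => '/^[a-z]{2}\z/',
        'state'     => '/^[A-Z]{2}\z/',
        'alpha'     => '/^[a-z]\z/',
        'zip'       => '/^\d{5}\z/',
        'content'   => '/^[a-z]+\z/',
    ];

    // Missing trailing segments become empty strings.
    $parts    = array_slice(array_pad($parts, 6, ''), 0, 6);
    $segments = array_combine(array_keys($rules), $parts);

    $seenEmpty = false;
    foreach ($segments as $name => $value) {
        if ($value === '') {
            $seenEmpty = true;
            continue;
        }
        // Reject gaps (a value after an empty segment) and bad formats.
        if ($seenEmpty || !preg_match($rules[$name], $value)) {
            return null;
        }
    }

    return $segments;
}

// Hypothetical helper: decide which include would build the page.
// 'home' stands for the redirect branch of the original script.
function choose_page(?array $segments): string
{
    if ($segments === null || $segments['country'] === '') {
        return 'home';
    }
    if ($segments['zip'] !== '') {
        return 'city';
    }
    if ($segments['state'] !== '') {
        return 'state';
    }
    return 'country';
}
```

A caller would pass $_SERVER['REQUEST_URI'] to parse_local_url() and then include city.inc, state.inc, or country.inc depending on what choose_page() returns, redirecting home otherwise.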
As you can see, if you break everything up into includes, and you have
a sensible hierarchy, this task is not terribly difficult. I now
have over 200,000 dynamic web pages that can be traversed by *any*
web indexing engine! The number increases proportionately with each
new affiliate (just create a new affiliate record in the affiliate
table).
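Since every page exists once per affiliate, it can help to generate links from the same pieces the script parses. The helper below is a hypothetical sketch, not part of the original system: the name build_local_url() and its parameter order are assumptions that simply mirror the sample URLs.

```php
<?php
// Hypothetical helper: build a /local/ URL from its parts, stopping at the
// first empty segment so the result always matches the URL hierarchy.
function build_local_url(
    int $affiliate,
    string $country = '',
    string $state = '',
    string $alpha = '',
    string $zip = '',
    string $content = ''
): string {
    $url = '/local/' . $affiliate . '/';
    foreach ([$country, $state, $alpha, $zip, $content] as $segment) {
        if ($segment === '') {
            break;
        }
        // Encode each segment so it cannot introduce '/', '?' or '&'.
        $url .= rawurlencode($segment) . '/';
    }
    return $url;
}

// The same city page under two different affiliates:
echo build_local_url(2, 'us', 'IA', 'a', '50613'), "\n";
echo build_local_url(1, 'us', 'IA', 'a', '50613'), "\n";
```

Only the affiliate number changes between the two links, which is why each new affiliate record adds a full set of indexable pages.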
Next week, I will talk about logging. I built a sophisticated
PHP logging and redirection engine to power this system.
I would appreciate any comments or criticisms about this article
or how my system could be improved.
--Tim