In this article I will show how to use a modular system to create a website that is constructed on the fly from dynamic modules, whose output can optionally be cached for the sake of performance. Each module is a PHP script of its own that returns HTML to be included in the final HTML page.

Overview

Walkthrough

To demonstrate how it works, let's see a file called hello.my:

<title>Hello world</title>
<my-style name=test>
hello world
When a request is made for hello.my, the webserver redirects it to the PHP parser, which simply scans for <my- .. > tags. It finds the 'style' tag and looks for the module style.php in its module directory.
The file style.php is included and the function handle_style ($arglist) is called, where $arglist is an associative array of all parameters specified in the tag (here, $arglist["name"] = "test").
The handle_style function must return a string containing HTML. How the module determines that HTML doesn't concern the system. Say the handle_style function returns <font face=Arial size=2 color=yellow>; the parser then includes this in the final HTML.
Now, the final HTML returned to the user is:

<title>Hello world</title>
<!-- my-style cached output (14:32) //-->
<font face=Arial size=2 color=yellow>
<!-- my-style end of output //-->
hello world
You can expand on this model endlessly. Adding modules is easy, as each module only has to conform to the calling convention handle_modulename ($arglist) and return some HTML that replaces the original tag in the .my file.
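As an illustration of this calling convention, a module for the style tag might look like the sketch below. The style table and the fallback comment are assumptions made for this example; only the function name handle_style and the $arglist parameter come from the walkthrough above.

```php
<?php
// res/style/style.php: a minimal sketch of a module.
// The $styles table is a made-up example, not part of the original system.
function handle_style ($arglist) {
    $styles = array (
        "test" => "<font face=Arial size=2 color=yellow>",
    );
    $name = isset ($arglist["name"]) ? $arglist["name"] : "";
    if (isset ($styles[$name])) {
        return $styles[$name];
    }
    // Unknown style: return a harmless HTML comment instead of failing.
    return "<!-- style: unknown name //-->";
}
```

Calling handle_style (array ("name" => "test")) returns the font tag used in the walkthrough.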
Now, why would you do this? There are three important reasons:

Caching

The parser optionally stores the output from the module in its cache directory. The next time someone calls <my-style name=test>, the parser checks if it can use the cached entry by checking on the expiration time of this particular entry. There is a default for all entries, but you can set it for each tag specifically by using the cache=x parameter:
<my-style name=test cache=3600>
Now if there is a cached entry, it will only be used if it is not older than one hour. After this hour, the module will be called again and the newly returned html will update the cache entry.
You will not want to cache every tag, but database-driven modules, or modules that retrieve information from an external webserver, are likely to carry a big performance penalty, so caching can give a big speedup there. And if the underlying database or webserver is down, you still have a cached entry to serve!
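The expiry rule described above can be sketched as a small helper. The function name cache_is_fresh and its parameters are assumptions for illustration; the parser shown later in this article performs the same comparison inline.

```php
<?php
// Hypothetical helper: is a cache entry written at $mtime still usable at $now?
// $ttl is the cache=x value of the tag, or the site-wide default when absent.
function cache_is_fresh ($mtime, $ttl, $now) {
    return ($mtime + $ttl) > $now;
}

// Example: with cache=3600, an entry written 30 minutes ago is still fresh,
// while one written two hours ago has expired.
$now = time ();
var_dump (cache_is_fresh ($now - 1800, 3600, $now)); // bool(true)
var_dump (cache_is_fresh ($now - 7200, 3600, $now)); // bool(false)
```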

Security

Another aspect of using this system is that you hide the php totally from the files, so you can easily let ordinary users use your tags to create dynamic content without security issues:

<title>Joe's homepage</title>
<my-style name=yellow-blue>
<my-guestbook name=joe showentries=5 addtext="Please add to my guestbook" cache=0>
<my-chat room="Joe's chatroom">
Here, my-style is a cached entry that just defines the <body..> tag with a predefined color set. Users don't have to know the exact color definitions; they simply choose one of a set designed by you, the provider.
The my-guestbook tag invokes a script made by you, the provider, which returns the last 5 written entries and provides a means to add new entries. The user doesn't know where the script is stored, and the script can even live outside the document directory, so security is entirely under your control. You only provide a safe, restricted interface to your modules through these tags.
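Because tag parameters are typed in by users, a module should still check them before using them. The sketch below shows one way a guestbook module could turn the name parameter into a file path; the helper name guestbook_path, the allowed character set and the storage directory are all assumptions for this example.

```php
<?php
// Hypothetical helper for a guestbook module: map the user-supplied name
// to a data file, or return false when the name looks unsafe.
function guestbook_path ($arglist, $dir) {
    $name = isset ($arglist["name"]) ? $arglist["name"] : "";
    // Allow only short names made of letters, digits, "_" and "-",
    // so values such as "../../etc/passwd" are rejected.
    if (!preg_match ('/^[A-Za-z0-9_-]{1,32}$/D', $name)) {
        return false;
    }
    return $dir . "/" . strtolower ($name) . ".txt";
}
```

A module would call this with a directory of its own choosing and return an HTML comment instead of opening a file when the result is false.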

Consistency

Another important point is that you can program your modules to use a common, shared set of variables to enforce a certain layout or style in the resulting HTML pages. In the previous example, the my-style module could set a global array $colors containing color definitions that apply to the whole page, e.g. $colors["td_bgcolor"], $colors["td_text"] etc.
Both the my-guestbook and the my-chat module can access these variables to lay out their output as well. So you can have a designer define some color sets, font sets etc., and the creators of the .my files only have to pick the design by providing a simple name in one tag: my-style (in this case).
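A sketch of how two modules might share such a color set follows. The color names and values, and the simplified module bodies, are invented for this example.

```php
<?php
// Sketch only: each function would normally live in its own module file.
function handle_style ($arglist) {
    global $colors;
    // Hypothetical color sets chosen by the designer.
    $sets = array (
        "yellow-blue" => array ("td_bgcolor" => "blue", "td_text" => "yellow"),
    );
    $default = array ("td_bgcolor" => "white", "td_text" => "black");
    $name = isset ($arglist["name"]) ? $arglist["name"] : "";
    $colors = isset ($sets[$name]) ? $sets[$name] : $default;
    return "<body bgcolor=" . $colors["td_bgcolor"] . " text=" . $colors["td_text"] . ">";
}

function handle_guestbook ($arglist) {
    global $colors;
    // Reuse the colors set by my-style so the table matches the page.
    return "<table><tr><td bgcolor=" . $colors["td_bgcolor"] . ">"
         . "<font color=" . $colors["td_text"] . ">guestbook entries go here</font>"
         . "</td></tr></table>";
}
```

Note that this only works when handle_style actually runs. If its output is served from the cache, the function is not called and $colors stays unset, so a tag that sets shared variables would probably need cache=0.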

Implementation

Now it's time for some php. There is only ONE script needed for this, the parser. It reads the .my files and replaces the special tags with the output from the modules, or the cache if applicable. The parser is called by Apache redirecting all calls for .my to this script. To do this, use the following mod_rewrite call in .htaccess:

RewriteEngine on
RewriteRule \.my$    /lib/parse.php
The parser /lib/parse.php can determine which file was originally requested by examining the REDIRECT_URL variable set by Apache (available in PHP as $_SERVER["REDIRECT_URL"]), and use this to call the parse function, which returns the parsed HTML:

<?php

// REDIRECT_URL holds the path of the .my file that was originally requested.
if (!empty ($_SERVER["REDIRECT_URL"]))
    echo parse ($_SERVER["DOCUMENT_ROOT"] . $_SERVER["REDIRECT_URL"]);

// The parse function just reads the file and calls parse_it for every line to build up the output in $buf:

function parse ($file) {
    $buf = "";
    if ($f = fopen ($file, "r")) {
        while (($str = fgets ($f, 4096)) !== false) {
            $buf .= parse_it ($str);
        }
        fclose ($f);
    }
    return $buf;
}

?>
The parse_it function scans the string to see if any tag occurs, and if it does, breaks it down into the tag name $tag, the handler function name $fun and the optional parameters, which go into the associative array $arglist.

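<?php

function parse_it ($str) {
    global $loaded;

    // Match <my-tagname params...>, case-insensitively (eregi no longer exists in current PHP).
    if (preg_match ('/<my-([A-Za-z0-9]*) ([^>]*)/i', $str, $regs)) {
        $tag = $regs[1];
        // Load each module file only once.
        if (empty ($loaded[$tag])) {
            include "res/$tag/$tag.php";
            $loaded[$tag] = 1;
        }

        $fun = "handle_$tag";
        // Arguments are split on spaces, so values must not contain spaces.
        $list = explode (" ", strtolower ($regs[2]));
        $cache_file = "cache/$tag";
        $arglist = array ();

        for ($i = 0; $i < count ($list); $i++) {
            if ($argname = strtok ($list[$i], "=")) {
                $arglist[$argname] = strtok ("=");
                // Every argument except "cache" becomes part of the cache file name.
                if ($argname != "cache") {
                    $cache_file .= "_" . $argname . "=" . $arglist[$argname];
                }
            }
        }

        $buf = "<!-- $tag start here //-->\n";
        $buf .= $fun ($arglist);
        $buf .= "\n<!-- $tag ends here //-->\n";
        return $buf;
    } else {
        return $str;
    }
}

?>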
This function calls the function handle_$tag, defined in the file included from res/$tag/$tag.php, and returns its output to the parse main loop.
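Note that splitting the parameters on spaces cuts a quoted value such as addtext="Please add to my guestbook" into pieces. A more tolerant argument parser could look like the sketch below; the function name parse_tag_args and the regular expression are assumptions, not part of the original parser.

```php
<?php
// Hypothetical replacement for the explode/strtok loop: parse
// name=value pairs where the value may be bare or double-quoted.
function parse_tag_args ($argstring) {
    $arglist = array ();
    // Group 1: name, group 2: quoted value, group 3: bare value.
    $pattern = '/([A-Za-z0-9_-]+)=(?:"([^"]*)"|([^\s">]+))/';
    preg_match_all ($pattern, $argstring, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        // Group 3 is only present (and non-empty) for bare values.
        $value = (isset ($m[3]) && $m[3] !== "") ? $m[3] : $m[2];
        $arglist[strtolower ($m[1])] = $value;
    }
    return $arglist;
}

// Usage with the guestbook tag from the Security section:
$args = parse_tag_args ('name=joe showentries=5 addtext="Please add to my guestbook" cache=0');
```

Unlike the original loop, this sketch lowercases only the parameter names and leaves the values untouched, which matters for values such as URLs.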
As you notice, the variable $cache_file is also being built pointing to cache/$tag... where ... is a string combining all parameters to form a unique cache entry for this tag. To actually use the cache, we must add some code between the end of the for loop and the creation of $buf:

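<?php

global $default_cache_time; // default lifetime in seconds, defined elsewhere

$read_cache = 0;
$write_cache = 0;
// A cache value below 10 seconds switches caching off for this tag.
if (!(isset ($arglist["cache"]) && ($arglist["cache"] < 10))) {
    $write_cache = 1;
    if (file_exists ($cache_file)) {
        if (!isset ($arglist["cache"])) {
            // No cache parameter given: use the default lifetime.
            if ((filemtime ($cache_file) + $default_cache_time) > time ()) {
                $read_cache = 1;
                $write_cache = 0;
            }
        } else {
            // Use the lifetime given in the tag.
            if ((filemtime ($cache_file) + $arglist["cache"]) > time ()) {
                $read_cache = 1;
                $write_cache = 0;
            }
        }
    }
}

?>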
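Here we determine whether $buf can instead be built from the $cache_file by checking the following: caching must not be switched off for this tag (a cache value below 10 seconds disables it), the cache file must exist, and it must be younger than the tag's cache value or, when none is given, than $default_cache_time.
Thus we can now make the $buf creation conditional on the $read_cache variable, by replacing the line that says:
<?php $buf .= $fun ($arglist); ?>
with the following code: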
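<?php

// Use the cache when it is fresh, or when the module returns an empty string.
if ($read_cache || (!strlen ($out = $fun ($arglist)))) {
    $write_cache = 0; // never rewrite the cache with its own contents
    if (file_exists ($cache_file) && ($f = fopen ($cache_file, "r"))) {
        while (($str = fgets ($f, 4096)) !== false) {
            $buf .= $str;
        }
        fclose ($f);
    } else
        $buf .= "<!-- $tag: error - cache is empty //-->\n";
} else
    $buf .= $out;

?>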
Note that a module can force use of the cache (e.g. when the database is down) by returning an empty string. This risks that no cache entry has been written yet, but even then the page shows no database error, just an empty spot with an HTML comment.
The last step is to see whether we should also update the cache, by adding the following code just before return $buf:

<?php

if ($write_cache && ($f = fopen ($cache_file, "w"))) {
    fputs ($f, $buf);
    fclose ($f);
}

?>
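One caveat: the cache file name is built from raw parameter values, so a value containing a slash (a URL, for instance) does not yield a valid file name inside cache/. A possible workaround, offered as an assumption rather than as the original design, is to hash the parameters; the function name cache_file_for is made up for this sketch.

```php
<?php
// Hypothetical helper: build a filesystem-safe cache file name from a tag
// and its arguments. The "cache" argument is left out, as in parse_it.
function cache_file_for ($tag, $arglist) {
    unset ($arglist["cache"]);
    // Sorting is an addition of this sketch: the same arguments
    // written in a different order then share one cache entry.
    ksort ($arglist);
    $key = "";
    foreach ($arglist as $name => $value) {
        $key .= "_" . $name . "=" . $value;
    }
    // md5 yields 32 hexadecimal characters, which are safe in a file name.
    return "cache/" . $tag . "_" . md5 ($key);
}
```

In parse_it, the result of this helper would take the place of the $cache_file string that is built up inside the for loop.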
Now the parser is done and you can start creating both the res/ and cache/ subdirectories and fill res/ with modules res/xxx/xxx.php that each implement the function handle_xxx ($arglist).
An interesting side effect is that in the handle_xxx function you can use the parse_it function again to parse externally retrieved data. Say that you have a <my-include file=..> tag; the included file can itself contain new <my-..> tags if you implement res/include/include.php as follows:

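<?php

function handle_include ($arglist) {
    // The simple tag parser keeps surrounding quotes, so strip them here.
    $file = isset ($arglist["file"]) ? trim ($arglist["file"], "\"'") : "";
    if ($file && ($f = fopen ($file, "r"))) {
        $buf = "";
        // Parse line by line, as parse does, so every nested tag is handled.
        while (($str = fgets ($f, 4096)) !== false)
            $buf .= parse_it ($str);
        fclose ($f);

        return $buf;
    }
    return "<!-- include: error opening " . $file . " //-->";
}

?>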
Using caching here will probably prove useful when including from external sites:
<my-include file="http://remote.server.com/foo.my" cache=3600>
This way, the file is fetched from the remote server only once an hour; all other requests use the cached entry, which is much faster.
-JP