Golden Rules for Optimizing Your Pages
Marion Weerning
PHP invites beginners to insert complicated scripts and database queries in their sites,
while possessing only superficial knowledge of how the internet runs. I often read this and
think of my own web experience: I felt that I was able to create anything (because I believe
that you can materialize ANYTHING if you are creative, determined and patient enough)
but I was not a good PHP builder.
So I began to read manuals, magazines and PHP forums around the web in search of
background information. How could I optimize my portal? It began to get heavier
from week to week with my new features based on queries that made my sites slower
and slower (maybe not for those on ADSL, but at any rate I started to get the
feeling that things could be optimized by applying some golden rules).
Now I have got a whole collection of these rules and I thought that they
could be useful for everybody who writes pages without being a highly
skilled web builder. I divide them into eight sections:
- Optimize your HTML code
- Optimize your PHP code
- Reduce your PHP Code
- Optimize your database
- Optimize your MySQL queries
- Optimize your MySQL query output
- Use caching and buffering techniques
- Benchmark your code (if you want)
Optimize Your HTML Code
This first chapter may not bring anything new for most of you, but the first step in
optimizing your site must be to determine whether your HTML code is optimal.
Optimize Your PHP Code
- Introduce your PHP-code with <?php because a simple <?
may cause interference with XML-code.
- Prefer '.....' to "....." because the server side
PHP engine scans everything it finds between the " " for variables and escape sequences, while it takes
what is written between the two single quotes literally.
But be careful if you use variables. You have to write
<?php echo 'This is my var:', $var, '!!'; ?>
instead of
<?php echo "This is my var: $var !!"; ?>
By the way:
<?php echo 'This is my var:',$var,'!!'; ?>
should be a tiny bit faster than
<?php echo 'This is my var:'.$var.'!!'; ?>
(concatenated by a dot instead of the comma) and faster than
<?php print 'This is my var:'.$var.'!!'; ?>
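If you want to see the difference between the two kinds of quotes for yourself, here is a minimal sketch. The variable $name and its value are only assumptions for illustration:

```php
<?php
// Hypothetical variable, for illustration only.
$name = 'Marion';

// Single quotes: the text is taken literally, $name is NOT replaced.
$single = 'Hello $name';

// Double quotes: the engine looks for variables and replaces $name.
$double = "Hello $name";

// Single quotes plus concatenation give the same result as double quotes.
$joined = 'Hello ' . $name;

echo $single, "\n", $double, "\n", $joined, "\n";
```

The first line is printed with the literal text $name, the other two with the value of the variable.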
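- Use SWITCH instead of IF when you have many options that test the same variable: the expression is evaluated only once and then compared with each case (whether the engine can jump straight to the right case depends on your PHP version).
- Use IF instead of SWITCH if you have only a few options or if each option tests a different condition; a chain of IF/ELSEIF evaluates every condition in turn until it finds a match.
Sometimes a ternary expression may simplify an IF/ELSE-expression because it reduces the written code: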
<?php echo ($var==1)?'var is 1':'var is not 1'; ?>
instead of
<?php if ($var==1){ echo 'var is 1'; } else{ echo 'var is not 1'; } ?>
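To compare the variants side by side, here is a small sketch of the same decision written with SWITCH, with IF/ELSEIF and with a ternary expression. The function names and the fruit values are hypothetical, chosen only for this example:

```php
<?php
// Hypothetical helpers: the same decision written in different ways.
function color_with_switch($fruit) {
    switch ($fruit) {            // $fruit is evaluated once
        case 'apple':  return 'green';
        case 'banana': return 'yellow';
        case 'cherry': return 'red';
        default:       return 'unknown';
    }
}

function color_with_if($fruit) {
    if ($fruit == 'apple') {     // every condition is checked in turn
        return 'green';
    } elseif ($fruit == 'banana') {
        return 'yellow';
    } elseif ($fruit == 'cherry') {
        return 'red';
    }
    return 'unknown';
}

function is_one($var) {
    return ($var == 1) ? 'var is 1' : 'var is not 1';
}

echo color_with_switch('banana'), ' ', color_with_if('banana'), ' ', is_one(1), "\n";
```

Both functions return the same values; which one is faster on your server is something you should measure rather than assume.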
- Define the max-variables you use in your FOR-loops before starting the loop.
Prefer:
<?php
$max=filesize('myfile.dat'); //just to make an example
for ($i=0; $i<$max; $i++){
    //your code
}
?>
to
<?php
for ($i=0; $i<filesize('myfile.dat'); $i++){
    //your code
}
?>
So $max will be defined only once (and not every time the loop condition is checked).
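How often the function in the loop condition really runs can be counted with a little sketch. CallCounter and expensive_max() are hypothetical names that stand in for any function you might call in the condition:

```php
<?php
// Hypothetical counter that records how often expensive_max() is called.
class CallCounter {
    public static $calls = 0;
}

function expensive_max() {
    CallCounter::$calls++;
    return 5;
}

// Variant 1: the function is called before every pass through the loop.
for ($i = 0; $i < expensive_max(); $i++) {
    // your code
}
$calls_inside = CallCounter::$calls;

// Variant 2: the limit is computed once before the loop starts.
CallCounter::$calls = 0;
$max = expensive_max();
for ($i = 0; $i < $max; $i++) {
    // your code
}
$calls_before = CallCounter::$calls;

echo $calls_inside, ' calls versus ', $calls_before, " call\n";
```

With a limit of 5 the first variant calls the function six times (five passes plus the final check that ends the loop), the second variant only once.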
- Define the output-variable before starting the loop and echo the output after ending the loop:
<?php
$var=''; //two single quotes, that means an empty string
for ($i=0; $i<20; $i++){
    $var .= $arr[$i].'<br />'; //in an assignment you have to concatenate with the dot
}
echo $var;
?>
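A runnable sketch of the same idea, with a small made-up array so that you can try it out. The array content is an assumption, and implode() is shown only as an alternative that does the loop for you:

```php
<?php
// Hypothetical data, for illustration only.
$arr = array('one', 'two', 'three');

// Collect the output in a string first ...
$var = '';
$max = count($arr);
for ($i = 0; $i < $max; $i++) {
    $var .= $arr[$i] . '<br />';
}

// ... or let implode() build the same string without a loop.
$same = implode('<br />', $arr) . '<br />';

// One single echo after the loop.
echo $var, "\n";
```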
Apropos loops. The difference between:
<?php
for ($i=0; $i<10; $i++){
    //your code
}
?>
and:
<?php
$i=0;
while ($i<10){
    //your code
    $i++;
}
?>
...should amount to only a few milliseconds.
- Use INCLUDE() when you reuse that code portion more times in your site (e.g. including the variables to get connected to your database). Outsourcing portions of PHP code by include ('yourcode.php'); makes your script more readable, but too many files to include may slow down your site.
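Apropos INCLUDE(). If you need absolute security include only PHP-files (not INC-files): the server parses a PHP-file before sending anything, so no browser is able to read its source, while an INC-file may be delivered as plain text if somebody requests it directly.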
- For replacing strings STR_REPLACE() is faster than PREG_REPLACE(). The slowest solution is EREG_REPLACE() that you should use only for regular expressions.
- For searching for substrings STRPOS() is faster than PREG_MATCH(). The slowest solution is EREG() that you should use only for regular expressions.
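The plain string functions and their regular expression counterparts give the same results for simple cases, as this sketch shows. The sample text is made up, and I compare only str_replace() with preg_replace() and strpos() with preg_match():

```php
<?php
$text = 'PHP is fast. PHP is fun.';

// Plain string replacement: no regular expression needed.
$plain = str_replace('PHP', 'My portal', $text);

// The same replacement with a regular expression (more powerful, but more work for the engine).
$regex = preg_replace('/PHP/', 'My portal', $text);

// Plain substring search. Compare with !== false because position 0 is a valid result.
$found_plain = (strpos($text, 'fast') !== false);

// The same search with a regular expression.
$found_regex = (preg_match('/fast/', $text) === 1);

echo $plain, "\n";
```

Keep the regular expression functions for patterns that the plain functions cannot express.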
- Use UNSET() to unset a variable that is not used anymore to reduce memory usage. This is useful especially for large arrays and resources.
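Whether UNSET() really gives memory back can be checked with memory_get_usage(). This is only a sketch: the array size is arbitrary and the exact numbers depend on your PHP version.

```php
<?php
// A large array, just to make an example.
$big = range(1, 100000);
$with_array = memory_get_usage();

// Free the memory as soon as the array is not needed anymore.
unset($big);
$after_unset = memory_get_usage();

echo 'Bytes freed: ', $with_array - $after_unset, "\n";
```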
- Add @ before those expressions that could provoke a warning message if something doesn't work. An example: "yourincludefile.php" cannot be opened. So your users will read (instead of the contents of the file you wanted to include): WARNING: fopen('yourincludefile.php', 'r') - No such file or directory in http://\www\yourdirectory\yourfile.php on line 28. If you write @fopen('yourincludefile.php', 'r'); no warning message is given: your user simply does not see the content of "yourincludefile.php".
Or use error debugging scripts like John Starkey proposes in "Beginner debugging: For Your Eyes Only", "File based, custom logging" and other articles.
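An alternative to hiding the warning is not to cause it at all. The following sketch checks the file first and writes the problem to the server's error log. The function name read_or_fallback() and the fallback text are my own assumptions, not taken from the articles mentioned above:

```php
<?php
// Hypothetical helper: returns the content of a file, or a fallback text if it cannot be read.
function read_or_fallback($path, $fallback = '') {
    // Check first, so that no warning is raised at all ...
    if (!is_readable($path)) {
        // ... and write the problem to the error log instead of the page.
        error_log('Cannot read file: ' . $path);
        return $fallback;
    }
    $content = file_get_contents($path);
    return ($content === false) ? $fallback : $content;
}

echo read_or_fallback('yourincludefile.php', 'Sorry, this content is not available right now.'), "\n";
```

Your user sees the fallback text, and you find the real reason in your log.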
Reduce Your PHP Code
- Use HTML instead of PHP whenever possible. So you gain speed because the server side PHP engine parses only what it finds between <?php and ?>:
<h3> Today </h3> <p>Today I met <?php echo $person; ?>. That was nice.</p>
instead of
<?php echo "<h3>Today</h3><p>Today I met $person. That was nice.</p>"; ?>
- Use JavaScript instead of PHP ...
Of course, that's a hard decision. PHP code always runs
(but takes longer to be seen on your user's monitor because the PHP engine on your webserver
parses all your PHP code every time and sends the result to your user's
browser only after having finished). JavaScript depends on the client-side browser and even if your user has
enabled JavaScript it does not always run (but it is faster). That's a very big dilemma, so I would say:
Use JavaScript, but only ...
... if the JavaScript Code is not too complicated
... if you are sure that several scripts can't get in conflict situations between each other
... if you tried them with several different browsers, possibly also with some older
version of Netscape and Internet Explorer.
There's no problem mixing JavaScript and PHP code:
<body><p>
The following text has four lines:
The first line contains a <?php $var='PHP'; echo $var; ?> - variable.
<!--Output: The first line contains a PHP - variable. -->
The second line contains a <script> jsvar='JS'; document.write(jsvar); </script> - variable.
<!--Output: The second line contains a JS - variable. -->
The third line contains PHP with inside JS: <?php echo $var, ' with <script> document.write(jsvar); </script>'; ?>.
<!--Output: The third line contains PHP with inside JS: PHP with JS. -->
The fourth line contains JS with inside PHP: <script> txt=jsvar+' with '+'<?php echo $var; ?>'; document.write(txt); </script>.
<!--Output: The fourth line contains JS with inside PHP: JS with PHP. -->
</p></body>
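When a PHP value is written into JavaScript, a quote or a "</script>" inside the value can break the script. A possible way around this, assuming your PHP version offers json_encode() (PHP 5.2 or later), is sketched below. The helper name js_variable() is hypothetical:

```php
<?php
// Hypothetical helper: builds a <script> block that defines a JavaScript variable from a PHP value.
function js_variable($name, $value) {
    // json_encode() adds the quotes and escapes characters that would break the script.
    // JSON_HEX_TAG additionally encodes < and >, so a "</script>" inside the value
    // cannot end the block early.
    return '<script> var ' . $name . ' = ' . json_encode($value, JSON_HEX_TAG) . '; </script>';
}

echo js_variable('jsvar', "It's PHP"), "\n";
```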
An exciting article about how to combine JavaScript and PHP was written
by Luis Argerich and Alejandro Mitrou: "PHP &
JavaScript World Domination Series: Storing data in the client".
- Do you know other golden PHP rules?
Optimize Your Database
The less data has to be searched the faster your database queries will be. That means your
tables have to be as small as possible with regard to the data types and the quantity of data to examine.
Optimize Your MySQL Queries
- Select only the columns you need for your query. If you really don't want to select ALL the columns of your table don't type SELECT * FROM tablename but write all the column names instead of "*" even if this means more writing work.
- Sort your data only if it's really necessary. It wastes a lot of time.
- Keep your filter conditions as simple as possible.
- Avoid mathematical MySQL-operations if possible. Leave them to PHP.
- Use mysql_connect() only if you are sure that your pages will be connected only once and your user won't reconnect.
Use mysql_pconnect() for a persistent connection if your pages query the database again and again: the connection stays open when the script ends and can be reused by the next request.
- Use mysql_fetch_row() instead of mysql_fetch_array(). Sometimes this may be faster (I read somewhere).
mysql_fetch_row() gives back an enumerated array:
<?php while($row = mysql_fetch_row($res)) echo $row[0], $row[1]; ?>
mysql_fetch_array() is its extended version (the result array works with numeric indices
like above and with column names):
<?php while($row = mysql_fetch_array($res)) echo $row['columnname0'], $row['columnname1']; ?>
No idea about mysql_fetch_object() which works only with column names:
<?php while($row = mysql_fetch_object($res)) echo $row->columnname0, $row->columnname1; ?>
Who benchmarked the three variants to compare them? (I have no time ...)
- Do you know other golden MySQL-query rules?
Optimize Your MySQL Query Output
Save your query output in a separate file on your web-server.
Let's suppose: every day you have a new "info" (or something like this) on your site.
Obviously it's much less expensive in time and bandwidth to send an HTML copy instead of connecting
to the database again and again for every single user. Certainly you have to save your copy only for as
long as you need it. In our case at 0:00 o'clock you don't need this copy anymore, but you need a new copy
with a new "info" (the old one can be deleted).
Now step by step.
By the way, let's name our cache file after today's date (e.g. 021229.php) and store it in a
directory called "cache_dir" where no files other than the cache copies may be stored.
- In our main code (e.g. on "index.php") we need the output with our "info"
that we stored in a file with today's date. Our user gets connected: "Does the file with today's date
exist?". If yes this file will be displayed, if not we must first of all display the new "info"
(that means the new query output) and then store it in a new copy (and delete the old one):
<?php
$today=gmdate('ymd'); //e.g. 021229
$cache_dir='cache_dir/';
$file=$cache_dir.$today.'.php'; //e.g. cache_dir/021229.php
if (file_exists($file)) include ($file); //parses the HTML-content of today's file
else{ ... ?>
If the file with today's name does not exist we connect to our database to
query today's current "info". But be careful: don't echo the result out
immediately but store it in variables (possibly without or with few HTML-tags); so (1) you reduce
the connection time with the database and (2) you can utilize it after having
closed the connection not only (a) to echo it out but also (b) to store it
in the cache copy:
<?php
while($row = mysql_fetch_object($result)) {
    $row1=$row->title;
    $row2=$row->info;
    // Be careful: This is possible only if you are sure that you'll get ONE title-row and ONE info-row.
    // Every pass of the loop overwrites $row1 and $row2. So if our query expects more than one result
    // (more titles and more infos) we have to change the code. For example:
    // $row1.=$row->title;
    // $row2.=$row->info;
}
mysql_free_result($result);
mysql_close($connection);
?>
We add the necessary HTML-tags:
<?php $query='<h3>'.$row1.'</h3><p>'.$row2.'</p>'; //don't use commas to concatenate the elements ?>
We echo our query out (for our user who opened our page without finding the cache file with today's date):
<?php echo $query; ?>
We can now delete the old file:
<?php
$f_array=array();
$open_dir=opendir($cache_dir);
//insert all the content of the directory into the array called $f_array:
while (($f=readdir($open_dir)) !== false) array_push($f_array,$f);
closedir($open_dir);
//delete the file with the eldest creation day in this directory
//(this assumes that the entries 0 and 1 are "." and ".."; readdir() does not guarantee any order):
unlink($cache_dir.$f_array[2]); //in our case there was only one file (the file we wanted to delete)!!!
?>
It's not necessary to delete the old file right now. If you like you can transfer this routine to an
independent file which runs periodically and deletes the old files that pile up in the
"cache_dir". (In this case you have to modify this script.)
Finally we store our query output in a new cache file which we name with today's date:
<?php $fd=fopen($file,'w'); fwrite($fd, $query); fclose($fd); ?>
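The single steps can be put together in one function. The following is only a sketch under my own assumptions: the name get_todays_info() is hypothetical, the cache files end in ".html" and are read with file_get_contents() instead of being included, old copies are found with glob(), and the database query is replaced by a function you pass in.

```php
<?php
// Hypothetical helper: returns today's cached HTML, creating the cache file if it is missing.
// $cache_dir must end with a slash; $build is a function that returns the fresh HTML
// (for example by running the database query).
function get_todays_info($cache_dir, $build) {
    $file = $cache_dir . gmdate('ymd') . '.html';    // e.g. cache_dir/021229.html

    if (file_exists($file)) {
        return file_get_contents($file);             // cache hit: no query needed
    }

    $html = call_user_func($build);                  // cache miss: build the output once

    // Delete the outdated copies, then store the new one.
    $old_files = glob($cache_dir . '*.html');
    foreach (($old_files ? $old_files : array()) as $old) {
        unlink($old);
    }
    file_put_contents($file, $html);

    return $html;
}

// Usage with a stand-in for the real query:
$dir = sys_get_temp_dir() . '/cache_dir_demo/';
if (!is_dir($dir)) {
    mkdir($dir);
}
echo get_todays_info($dir, function () {
    return '<h3>Title</h3><p>Info</p>';
}), "\n";
```

The first visitor of the day pays for the query, everybody else gets the stored copy.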
Attention. If you create, overwrite and delete files you obviously need the permission
to do that (read, write, execute). You can set them with most FTP programs. With my
WS_FTP it works like this:
- I send my directory structure from my computer to my server, but - be careful - with binary mode
and not (as I do usually) in ASCII mode.
- Right-clicking on the directory (or on the files in question) I choose my permissions.
Use Caching and Buffering Techniques
- Add HTTP response headers to tell your user's browser when it has to communicate with your server.
You can do this with the PHP function header(). An example using the Expires header:
<?php
$now=gmdate('D, d M Y H:i:s');
$expire=gmdate('D, d M Y 22:00:00'); //Greenwich Mean Time
header('HTTP/1.1 200 OK');
header('Date: '.$now.' GMT');
header('Expires: '.$expire.' GMT');
//Attention: it doesn't work with sessions
?>
That means: if somebody in Italy in August opens your page for the first time at 6 o'clock local time
and then for the second time at 10 o'clock in the evening local time your server will tell your user's browser:
"Open the cache file that you already stored. It's only 22:00 local time (= 20:00 Greenwich Mean Time)."
If he opens your page for the third time at 1 o'clock in the morning local time your server will tell your user's
browser: "Expired at midnight (= 22:00 GMT)! I will give you a fresh copy of the page." The expiry time
is always 22:00 GMT (not local time) for everybody who opens your page all over the world.
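If you want to play with the expiry time without waiting for the evening, you can compute the header for any moment. This sketch is a variation of the example, not the same code: the function name expires_header_22_gmt() is hypothetical, and after 22:00 GMT it moves the expiry to the next day instead of leaving it in the past.

```php
<?php
// Hypothetical helper: builds an Expires header for the next 22:00 GMT after $now (a Unix timestamp).
function expires_header_22_gmt($now) {
    // 22:00 GMT on the same (GMT) day as $now
    $expire = gmmktime(22, 0, 0, (int) gmdate('n', $now), (int) gmdate('j', $now), (int) gmdate('Y', $now));
    if ($expire <= $now) {
        $expire += 86400; // already past 22:00 GMT: use tomorrow instead
    }
    return 'Expires: ' . gmdate('D, d M Y H:i:s', $expire) . ' GMT';
}

echo expires_header_22_gmt(time()), "\n";
```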
Cache images and pages that don't often change by specifying a far-away Expires header.
But don't forget: if you modify something you have to change the name of the file (because - logically -
otherwise the old copy, cached until the far-away date, continues to be displayed by your user's browser
instead of the new modified copy you stored on your server).
Another interesting header is the Cache-Control header. It is possible to force your
user's browser to request a fresh copy from your server every time your user opens the page (even at any RELOAD):
<?php
header ("Expires: Tue, 10 Sep 2002 00:00:00 GMT"); // date in the past
header ("Last-Modified: " . gmdate ("D, d M Y H:i:s") . " GMT"); // changes always
header ("Cache-Control: no-cache, must-revalidate"); // HTTP/1.1
header ("Pragma: no-cache");
?>
Or you can force your user's browser to use its own cache copy for a limited time (e.g. 24 hours):
<?php header("Last-Modified: ". gmdate ("D, d M Y H:i:s") . " GMT"); header("Cache-Control: max-age=86400, must-revalidate"); //86400 seconds=24 hours ?>
Interesting Cache-Control response headers are:
- max-age=[seconds]: it specifies the maximum
amount of time that an object will be considered fresh. Similar to
Expires, this directive allows more flexibility. [seconds] is the
number of seconds from the time of the request you wish the object
to be fresh for. Note: if a response includes a Cache-Control field with the max-age directive it overrides the Expires field.
- public: it marks the response as cacheable, even
if it would normally be uncacheable. For instance, if your pages
are authenticated, the public directive makes them cacheable.
- no-cache: it forces caches (both proxy and
browser) to submit the request to the origin server for validation
before releasing a cached copy, every time. This is useful to
assure that authentication is respected (in combination with
public), or to maintain rigid object freshness, without sacrificing
all of the benefits of caching.
- must-revalidate: tells caches that they must
obey any freshness information you give them about an object.
HTTP allows caches to take liberties with the freshness of objects;
by specifying this header, you're telling the cache that you want
it to strictly follow your rules.
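The directives can be combined in one header, separated by commas. A small sketch of a helper that puts such a header together. The name cache_control() and its parameters are my own invention:

```php
<?php
// Hypothetical helper: builds a Cache-Control header from a max-age and a list of directives.
function cache_control($max_age = null, $directives = array()) {
    $parts = array();
    if ($max_age !== null) {
        $parts[] = 'max-age=' . (int) $max_age;   // seconds the object stays fresh
    }
    foreach ($directives as $directive) {
        $parts[] = $directive;                    // e.g. 'public', 'no-cache', 'must-revalidate'
    }
    return 'Cache-Control: ' . implode(', ', $parts);
}

// A page that may be reused for 24 hours, also by shared caches:
$header = cache_control(86400, array('public', 'must-revalidate'));
echo $header, "\n";
// In a real page you would send it before any output with header($header);
```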
Important: The HTTP 1.1 response headers must be generated by PHP as FIRST lines of your file. The HTML document follows these headers, separated by a blank line.
If you have a page with two different frames (a static and a dynamic one like in the above example) you can use different headers for them.
Some other HTTP tips:
- Don't POST unless it's appropriate. The POST method is (practically) impossible to cache; if you send information in the path or query (via GET), caches can store that information for the future. POST, on the other hand, is good for sending large amounts of information to the server (which is why it won't be cached; it's very unlikely that the same exact POST will be made twice).
- Don't embed user-specific information in the URL unless the content generated is completely unique to that user.
- Don't count on all requests from a user coming from the same host, because caches often work together.
- Generate Content-Length response headers. It allows the response of your script to be used in a persistent connection (for some browsers like Netscape this is the only way to signal the end of the object to shutdown the connection).
- Use cookies only where necessary - cookies are difficult to cache, and aren't needed in most situations. If you must use a cookie, limit its use to dynamic pages.
- Minimize use of SSL - because encrypted pages are not stored by shared caches, use them only when you have to, and use images on SSL pages sparingly.
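To generate a Content-Length header your script has to know the size of its output before it sends anything. One possible way, sketched here with a hypothetical helper called render_with_length(), is to collect the output in a buffer first:

```php
<?php
// Hypothetical helper: runs $render, captures its output and returns the headers plus the body.
function render_with_length($render) {
    ob_start();
    call_user_func($render);
    $body = ob_get_clean();
    $headers = array('Content-Length: ' . strlen($body)); // length in bytes
    return array($headers, $body);
}

list($headers, $body) = render_with_length(function () {
    echo '<p>Hello</p>';
});

// Send the headers first, then the body (header() must be called before any output).
if (!headers_sent()) {
    foreach ($headers as $h) {
        header($h);
    }
}
echo $body;
```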
That was too much? Read Mark Nottingham's "Caching Tutorial" to understand more.
If you want to know something about the header of a file opened by your browser you can get some information only with Netscape. Select 'Page Info' from the 'View' menu to see what the Expires and Last-Modified headers are. For details you need telnet, ssh, wget, lynx or some other browser that will show the complete source.
- But that's not all. Use output buffering functions. This will speed up your PHP code by 5-15% if you frequently print or echo in your code. When output buffering is enabled it pipes the output into a dynamically growing buffer and sends the buffer content with the output in one shot at the end of the script (or when you decide). Output buffering reduces networking overhead substantially at the cost of more memory and an increase in latency.
These are the main functions to control output buffering:
- ob_start() enables output buffering. (Output buffering supports multiple levels, that means you can call ob_start() several times.)
- ob_end_flush() sends the output buffer and disables output buffering.
- ob_end_clean() cleans the output buffer without sending it and disables output buffering.
- ob_get_contents() returns the current output buffer as a string that you can echo to send the accumulated output to the browser (after turning buffering off).
- ob_get_length() returns the length of the output buffer.
An example:
<?php
ob_start(); //start buffering
ob_implicit_flush(0); //turn off implicit flushing
//now your output stuff:
echo 'This is the 1st line with compressed output.';
$str='<br /> This is the 2nd line.';
echo $str;
$var1=10; $var2=7; $var3=$var1-$var2;
echo '<br /> This is the ',$var3, 'rd line.';
$contents = ob_get_contents();
//if you don't add "ob_end_clean()" all that you wanted to echo out will be echoed out NOW
ob_end_clean();
//if you add this line nothing will be echoed out; only if you add "echo $contents" after this line the content will be echoed out.
?>
This example was quite stupid. A more intelligent output example would have been all the query stuff from the last chapter: ob_start(); ob_implicit_flush(0); should be added between mysql_close($connection); and $query='<h3>'.$row1.'</h3><p>'.$row2.'</p>';.
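Output buffers can be nested, and it helps to see what each level holds. A minimal sketch with made-up texts:

```php
<?php
ob_start();                   // level 1
echo 'outer ';

ob_start();                   // level 2
echo 'inner';
$inner = ob_get_contents();   // only what was echoed at level 2
ob_end_flush();               // hands the content of level 2 over to level 1

$outer = ob_get_contents();   // level 1 now holds both texts
ob_end_clean();               // throw the buffer away: nothing was sent so far

echo $outer, "\n";            // now we decide to send it
```

Flushing an inner buffer does not send anything to the browser as long as an outer buffer is still active.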
If your version of PHP is compiled with zlib support and is 4.0.4 or higher, you can also try this:
<?php ob_start('ob_gzhandler'); // rest of your script ?>
- Compress your data. Many browsers support transparent compression using gzip. The server detects that the browser supports gzip encoding, compresses and sends the data (using the Content-Encoding: gzip header line) instead of sending data in plain text form. When the compressed data arrives at the browser, it will be decompressed. This shortens the load time for the page.
Here is the code with the function that compresses the output:
<?php
//the above example continues; we added "ob_end_clean()" and nothing was echoed out.
function gzip_PrintFourChars($Val) {
    for ($i = 0; $i < 4; $i ++) {
        echo chr($Val % 256);
        $Val = floor($Val / 256);
    }
}
function compress_output($contents){
    //tell the browser that gzip data will be sent:
    header("Content-Encoding: gzip");
    echo "\x1f\x8b\x08\x00\x00\x00\x00\x00";
    // figure out the size and CRC (Cyclic Redundancy Check) of the original for later; if you want to know
    // more about this read Luis Argerich's article cited at the bottom of this chapter. That's where I found this code
    $Size = strlen($contents);
    $Crc = crc32($contents);
    $contents = gzcompress($contents, 9); // compress the data
    $contents = substr($contents, 0, strlen($contents) - 4); // cut off the last four bytes (the zlib checksum); CRC and size follow instead
    echo $contents;
    gzip_PrintFourChars($Crc);
    gzip_PrintFourChars($Size);
}
//check if the browser supports gzip encoding:
if(isset($_SERVER['HTTP_ACCEPT_ENCODING']) && strpos($_SERVER['HTTP_ACCEPT_ENCODING'],'gzip') !== false){
    compress_output($contents);
}
else{
    echo $contents;
}
?>
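If your PHP has zlib support, gzencode() builds the complete gzip stream (header, data, CRC and size) in one call, so you don't have to write the bytes yourself. The following sketch uses it in place of the hand-made code. The helper name maybe_gzip() is hypothetical, and the Accept-Encoding value is passed in as a parameter so that you can try the function without a browser:

```php
<?php
// Hypothetical helper: returns the body to send plus the extra headers it needs.
// $accept_encoding is the value of the Accept-Encoding request header (may be empty).
function maybe_gzip($contents, $accept_encoding) {
    if (function_exists('gzencode') && stripos($accept_encoding, 'gzip') !== false) {
        // gzencode() produces a complete gzip stream
        return array(gzencode($contents, 9), array('Content-Encoding: gzip'));
    }
    return array($contents, array()); // plain text for everybody else
}

$page = str_repeat('<p>Golden rules!</p>', 100);
list($body, $headers) = maybe_gzip($page, 'gzip, deflate');
echo strlen($page), ' bytes before, ', strlen($body), " bytes after\n";
```

In a real page you would send every entry of $headers with header() and then echo $body.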
Output buffering and compression is discussed by Luis Argerich:
"Caching and compressing dynamic pages". Read also Zeev Suraski's "Output buffering and how it can change your life".
- What's your experience?
- Do you know other golden rules to optimize our output?
At any rate, caching and buffering techniques are advanced techniques. But the aim of this article
was to help you become a good web builder. So we all have to begin to think about new techniques ...
Benchmark Your Code (If You Want)
Applying some of the golden rules I collected for you, your pages will surely gain in speed and performance. Do you want to benchmark your results and trials on your PC? Here is the script:
<?php $timeparts = explode(" ",microtime()); $starttime = $timeparts[1].substr($timeparts[0],1);
//insert your code here
$timeparts = explode(" ",microtime()); $endtime = $timeparts[1].substr($timeparts[0],1); echo bcsub($endtime,$starttime,6) ?>
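On PHP 5 and later microtime(true) gives back the time as a float, which makes the script shorter and doesn't need bcsub(). A sketch with a hypothetical helper called benchmark() that repeats the code many times, because a single run is usually too short to measure:

```php
<?php
// Hypothetical helper: runs $code $runs times and returns the elapsed time in seconds.
function benchmark($code, $runs = 1000) {
    $start = microtime(true);           // current time as a float
    for ($i = 0; $i < $runs; $i++) {
        call_user_func($code);
    }
    return microtime(true) - $start;
}

// Example: echo with commas against echo with dots (the output is buffered and thrown away).
$name  = 'world';
$comma = benchmark(function () use ($name) { ob_start(); echo 'Hello ', $name, '!'; ob_end_clean(); });
$dot   = benchmark(function () use ($name) { ob_start(); echo 'Hello ' . $name . '!'; ob_end_clean(); });

printf("comma: %.6f s, dot: %.6f s\n", $comma, $dot);
```

Run it several times: the results vary from run to run, so small differences don't prove anything.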
Maybe you have particular (positive or negative) experiences in trying out what I proposed? Let's discuss them!
I close this article as usual:
- Do you know other golden rules?