While working on this article, I received a security bulletin highlighting a critical flaw in phpBB. This is a fairly mature open source forum written in PHP, and one that's had its fair share of critical flaws. The fact that more are still being found, and more are likely to be found, shows how difficult it is to write completely secure software, even for an experienced team of developers.


Ensure your software is up to date

It seems obvious, but many installations still contain old versions of PHP, with known exploits. By keeping up-to-date with security bulletins (for example from Security Tracker) you'll be informed when an exploit is discovered, and be able to take action, hopefully before the exploit is used against you. Exploits can be for PHP itself, other applications on your servers such as MySQL or Apache, or, more commonly, PHP applications, such as phpBB.


Register Globals

Any article on security has to start with the register_globals setting in your PHP configuration. Until version 4.2.0, this setting was on by default, and most applications were written with the assumption that the setting was on. Even now, many versions later, many administrators still change the setting back to on to allow legacy applications to work painlessly. Some high profile applications, such as OSCommerce (as of June, 2005) still only work with register_globals on.


Having register_globals on is a gaping security hole. It allows users to set global variables inside a script by passing them in from outside (usually via the GET or POST method). Since a feature of PHP is that variables don't need to be initialized, they rarely are, and this lapse allows the attacker to initialize them on your behalf. It is possible to code securely with register_globals on, but it makes the task that much more difficult, and the risks much greater. Here's an example of some exploitable pseudocode:


 <?php
  // example1.php
  if (some or other condition) {
      $authenticate = 1;
  }
  //
  if ($authenticate == 1) {
      // allow access to something important
  }
 ?>


What's wrong with the above? If register_globals is on, a malicious user can pass a parameter, as follows: www.example.co.za/example1.php?authenticate=1. Since the variable $authenticate was never initialized, the user gains access where they shouldn't. Here's how the code can be secured.


 <?php
  // example2.php
  
  // first initialize the authentication variable
  $authenticate = 0;
  
  if (some or other condition) {
      $authenticate = 1;
  }
  //
  if ($authenticate == 1) {
      // allow access to something important
  }
 ?>


However, by simply having register_globals off, the first script would also be immune from that sort of attack. A script called from the URL above would not run with $authenticate set to 1 in the global namespace. Rather, the variable would only be accessible as $_GET['authenticate'], where it can do much less harm.


Include files and the web tree

Include files are often used to store passwords and other confidential data. When they are stored in the web tree, this creates a chain of problems. If they have the extension .inc, poorly configured Apache webservers will display the file as plain text. For example, www.example.co.za/conf.inc could display a database username and password. You can (and always should) configure Apache to bar the display of .inc files. However, you can't assume this will be the case in the particular environment where your code is deployed. The configuration is also not always under the direct control of the programmer, and relies upon the system administrator.

Many applications now use the .php extension to pre-empt this, or .inc.php. These files should never be displayed in plain text. At the very least, PHP will interpret the file and return a blank page. However, include files with a .php extension have also been exploited before. They're not meant to be points of entry, which means they aren't always checked thoroughly. Ideally, any file that's not meant to be a point of entry (such as libraries and includes) should sit outside the web tree.
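
A minimal sketch of the layout this implies, assuming a hypothetical directory named private that sits beside (not inside) the web root. The helper name private_path() and both directory names are illustrative only, not taken from any particular application:

```php
<?php
// Hypothetical layout (names are assumptions for this sketch):
//   /var/www/html/index.php     <- web root, reachable by URL
//   /var/www/private/conf.php   <- outside the web tree, not reachable by URL

// Build the path to a file in the private directory beside the web root.
function private_path(string $webRoot, string $file): string
{
    // basename() drops any directory components, so "../" cannot climb out.
    return dirname($webRoot) . '/private/' . basename($file);
}

echo private_path('/var/www/html', 'conf.php'), PHP_EOL;
```

An entry-point script could then load its configuration with something like require private_path($_SERVER['DOCUMENT_ROOT'], 'conf.php'); so that the file holding the passwords never has a URL of its own.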

Avoiding SQL injection attacks with Magic Quotes or addslashes()

Another php.ini configuration is magic_quotes_gpc. Its default has varied (PHP's built-in default was on, while the recommended php.ini file shipped with PHP turned it off), and many encourage turning it on for security reasons. While it does have benefits, I personally dislike the setting, and am loath to use it. Its main benefit is that it prevents some types of SQL injection attacks. For example:


 <?php
  // example3.php
  
  // search for the company name supplied by the user
  $query = "SELECT * FROM tablename WHERE field1 = '".$_POST['company_name']."'";
 ?>


Without any validation (see the next point), whatever the user enters is simply passed to the database. A common exploit (and, more commonly, simply a bug) is to enter a name containing a single quote, for example Miriam's. The single quote makes the SQL passed to the database invalid (effectively SELECT * FROM tablename WHERE field1 = 'Miriam's'). A more malicious user can end up with a query such as SELECT * FROM tablename WHERE field1 = 'Miriam' OR 1, revealing everything, or even worse, SELECT * FROM tablename WHERE field1 = 'Miriam'; DELETE FROM tablename. The magic_quotes_gpc setting counteracts this by automatically escaping any quotes with a backslash. However, I find this affects the integrity of the data, which could now contain unnecessary slashes, and it requires frequent use of the stripslashes() function to remove them.

The alternative is to use the addslashes() function to escape when needed. This requires the programmer to be sure that it is used in every case, but I find this a better alternative than escaping absolutely everything. It's a challenging task though. The critical flaw I mentioned earlier that's been patched for phpBB 2.0.16 involved a missing addslashes().
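
As a rough illustration of the addslashes() approach, here is the earlier query built with the value escaped at the point of use. The variable $company_name stands in for $_POST['company_name'] and its value is an assumption for the demo:

```php
<?php
// Stand-in for $_POST['company_name']; the value is made up for this demo.
$company_name = "Miriam's";

// Escape the value just before it is placed inside the quoted SQL string.
$query = "SELECT * FROM tablename WHERE field1 = '"
       . addslashes($company_name) . "'";

// The quote inside the name is now preceded by a backslash.
echo $query, PHP_EOL;
```

Backslash escaping matches what MySQL expects by default; other databases escape quotes differently, so a database-specific escaping function may be the better fit where one is available.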



Validation

Almost all PHP vulnerabilities, no matter how insecure the configuration, can be avoided if you adhere to the principle of not trusting any data that originates outside the code. Many suggest that all user data should be validated. Validation is not as easy as it sounds: it needs to be consistent and strict. PHP is loosely typed, which means that a variable set to a blank string also converts to the numeric value 0. This can matter in certain cases, such as with PHP's associative arrays, where the element $phparray["00"] is not the same as $phparray["0"], because "00" is kept as a string key rather than being converted to the number 0.
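
The array-key behaviour is easy to check for yourself. This short sketch reuses the $phparray name from the text; the stored values are placeholders:

```php
<?php
$phparray = [];
$phparray["0"]  = 'first';   // "0" is a canonical integer string, so the key becomes int 0
$phparray["00"] = 'second';  // "00" is not, so it stays the string "00"

// Two distinct elements exist, under the keys 0 and "00".
var_dump(array_keys($phparray));
var_dump(count($phparray));

// A blank string converts to the number 0 when cast.
var_dump((int) '');
```

The exact result of loosely comparing a blank string with 0 using == depends on the PHP version, which is one more reason to validate with strict, explicit checks.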

Many programmers also rely upon JavaScript validation on a form, and don't do any validation inside the PHP code. This is not sufficient for deflecting attacks. Don't think an attacker isn't going to recreate and customize the form or URL to do just what they want it to do!

Here's an example of validating an email address:



<?php
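// preg_match() is used because eregi() was removed in PHP 7; /i ignores case, /D pins $ to the very end
if (!preg_match('/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/iD', $email)) {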
    //invalid email
}
?>


Doing something similar for every single external input (which includes $_POST, $_GET, $_COOKIE, $_SESSION and $_REQUEST) can be time-consuming. Although it is a great way of enhancing security, following each variable through the code, knowing where it will be used, and making sure every use is secure is just as effective.
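
The same strict approach can be applied to other kinds of input. Below is a minimal sketch for a numeric ID, assuming a hypothetical helper named read_positive_int(); the name, the nine-digit limit and the 'id' key are illustrative choices, not part of any PHP API:

```php
<?php
// Return the value stored under $key as a positive integer, or null if it
// is missing or is not strictly a run of decimal digits.
function read_positive_int(array $input, string $key): ?int
{
    // Reject missing values and non-strings (e.g. arrays sent as id[]=1).
    if (!isset($input[$key]) || !is_string($input[$key])) {
        return null;
    }
    // One to nine digits, no leading zero, nothing before or after.
    if (!preg_match('/^[1-9][0-9]{0,8}\z/', $input[$key])) {
        return null;
    }
    return (int) $input[$key];
}

var_dump(read_positive_int(['id' => '42'], 'id'));
var_dump(read_positive_int(['id' => '42; DROP TABLE x'], 'id'));
```

In a real script the first argument would be $_GET or $_POST. Returning null for anything unexpected forces the caller to handle the invalid case explicitly.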


Outputting HTML

Here's a sample PHP snippet that displays some HTML:
<?php
// Welcome a user who's just entered their name
echo '<p>Hello '.$_POST['name'].'</p>';
?>

Now what happens if the name they've entered happens to be:

<script language ='JavaScript'>
document.location.href='http://www.maliciouswebsite.co.za';
</script>


Your page now redirects visitors to another site. This is called a cross-site scripting vulnerability. There are two places to guard against problems like this: you can try to detect the problem at source (by validating the data), and you can avoid pitfalls at the point where you next use the variable. If you plan to output the variable on a web page, as in the above example, the PHP function htmlentities() converts characters that are special in HTML into their harmless HTML entity equivalents; for example, < becomes &lt; and > becomes &gt;.

See the PHP documentation on htmlentities() for more details on the options. The ENT_QUOTES option is the strictest, converting both single and double quotes. In PHP versions before 8.1, only double quotes are converted by default. Here's how the same code would look with htmlentities():



<?php
// Welcome a user who's just entered their name
echo '<p>Hello '.htmlentities($_POST['name']).'</p>';
?>


Other functions you can use depending on the context include htmlspecialchars() or strip_tags(), which removes all PHP and HTML code (leaving whatever's between the tags as plain text).
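
To compare the three functions side by side, here is a small sketch. The $name value stands in for $_POST['name'], and the payload is an assumption chosen for the demo:

```php
<?php
// Stand-in for $_POST['name']; the payload is made up for this demo.
$name = "<script>alert('x')</script>";

// Convert characters that are special in HTML, including both quote styles.
$entities = htmlentities($name, ENT_QUOTES, 'UTF-8');
$special  = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');

// Remove the tags themselves and keep the text between them.
$stripped = strip_tags($name);

echo $entities, PHP_EOL;
echo $special, PHP_EOL;
echo $stripped, PHP_EOL;
```

For this input htmlentities() and htmlspecialchars() give the same result, because the payload contains only the characters both functions handle. Note that strip_tags() leaves the script body behind as text.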


Error reporting

By default, PHP error reporting is set to E_ALL & ~E_NOTICE. The PHP manual recommends that you turn E_NOTICE on during testing as well. It's one of those things that tends to get forgotten when switching between live and testing, so I suggest simply using E_ALL everywhere (which reports all errors, including those from E_NOTICE). Of course you'll want to log errors to a file rather than display them, and you can do so with the log_errors php.ini configuration directive. PHP 5 has a further setting, E_STRICT, that is worth using (it's the only kind of reporting not included in E_ALL). This gives suggestions that are useful in development, such as informing you that you're using a deprecated function. Of course, if you ignore the warning and continue to use the deprecated function, there's no point in having this setting active in your production environment, as you'll just be clogging your logfile.
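
The same settings can also be applied at runtime from a script. This is a sketch rather than a recommended production setup; the log file location is a hypothetical placeholder:

```php
<?php
// Report every class of error, including notices.
error_reporting(E_ALL);

// Keep error details out of the page the visitor sees...
ini_set('display_errors', '0');

// ...and write them to a log file instead.
ini_set('log_errors', '1');

// Hypothetical log location; any writable path outside the web tree would do.
ini_set('error_log', sys_get_temp_dir() . '/php_errors.log');
```

Setting the equivalent directives in php.ini is usually preferable where you control it, since errors raised before these lines run are not affected by them.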


Running Shell commands

Shell commands are a big risk. Most attackers would love to be able to execute shell commands on your server, and running shell commands based upon external input is another potential point of entry. Of course, sometimes you just have to do things that way, but there's no equivalent of magic_quotes_gpc for shell commands. Here's an example:


<?php
//list the contents of a directory based upon user input
echo "Directory listing: ".system("ls -l {$_POST['directory']}");
?>


You're meant to be running something like ls -l /home/ian/, but a malicious user may be able to manipulate an unvalidated $_POST['directory'] to read something like ls -l / ; rm -rf * and delete all files in and below your current directory.

However, there is an equivalent to addslashes() called escapeshellarg(), which quotes and escapes its input as necessary, ensuring you're only passing a single argument to the shell command. Use it with any of the functions that run shell commands, which include exec(), popen(), system() and the backtick operator.
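
Here is a sketch of the directory listing command built with escapeshellarg(). The helper name build_listing_command() is hypothetical, and the quoting described in the comments is what a Unix-like system produces (Windows quotes differently):

```php
<?php
// Build the ls command with the user-supplied directory as ONE quoted argument.
function build_listing_command(string $directory): string
{
    return 'ls -l ' . escapeshellarg($directory);
}

// The malicious input from the text is now a single (nonexistent) path name,
// wrapped in single quotes, rather than a second command.
echo build_listing_command('/ ; rm -rf *'), PHP_EOL;
```

The resulting string can then be handed to system() or exec(). Validating the directory against a list of permitted paths first would narrow things further.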



File Uploads

Allowing files to be uploaded is a serious security risk. It's been exploited in the past in PHP, and allowing arbitrary files onto your server, possibly containing malicious code, is always going to be dangerous. Many attacks rely upon files being uploaded to the local server in order to cause serious damage. Disable file uploads if you don't need them (by setting file_uploads = off in php.ini). If you do need them, only use the $_FILES variables, as described in the PHP manual. Older versions supported alternative, less secure methods.
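
A minimal sketch of handling one $_FILES entry defensively. The function name accept_upload() is hypothetical, and storing the file under a generated name is a design choice of this sketch, not something the PHP manual mandates:

```php
<?php
// Accept one entry from $_FILES and move it into $targetDir.
// Returns the stored path, or null if the entry is not a genuine upload.
function accept_upload(array $file, string $targetDir): ?string
{
    // The entry must describe a single, successfully uploaded file.
    if (!isset($file['error'], $file['tmp_name'])
        || !is_string($file['tmp_name'])
        || $file['error'] !== UPLOAD_ERR_OK) {
        return null;
    }
    // Reject any path that PHP did not create for an upload in this request.
    if (!is_uploaded_file($file['tmp_name'])) {
        return null;
    }
    // Store under a generated name rather than the client-supplied one.
    $target = rtrim($targetDir, '/') . '/' . bin2hex(random_bytes(16));

    return move_uploaded_file($file['tmp_name'], $target) ? $target : null;
}
```

A form handler might call accept_upload($_FILES['userfile'], '/var/uploads'), where both the field name and the directory are placeholders. Keeping the upload directory outside the web tree means an uploaded script cannot be requested by URL.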


Remote files

PHP's convenience extends to the ability to work with files on remote servers, which is often very useful. However, this too creates potential hazards, as it opens the door for malicious code sitting on remote servers.
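
One place this hazard tends to surface is when a file name taken from user input reaches include() or fopen(), since, depending on configuration, such a name may be a URL. The following sketch maps the input onto a fixed list of local files instead; the page names, the pages/ directory and the helper resolve_page() are all assumptions made for illustration:

```php
<?php
// Hypothetical page names; only these may ever be included.
const ALLOWED_PAGES = ['home', 'about', 'contact'];

// Map a requested page name to a local file, or null if it is not on the list.
function resolve_page(string $requested): ?string
{
    if (!in_array($requested, ALLOWED_PAGES, true)) {
        return null;
    }
    return 'pages/' . $requested . '.php';
}

var_dump(resolve_page('about'));
var_dump(resolve_page('http://www.maliciouswebsite.co.za/evil'));
```

Because the path is built only from names on the list, a URL or a ../ sequence supplied by the user never reaches include().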


Conclusion

As you can see, writing secure code in PHP is not trivial. PHP programmers often rely upon external libraries, or hook into code that's been written by developers (or designers) who don't consider security. Ensuring the entire application is secure could involve trawling through reams of code you didn't write yourself. But consider the consequences. Books and websites sometimes encourage programmers to optimize their code to the extreme, spending hours gaining a fraction of a second in performance. This is usually frowned upon by employers, who are paying for the time. In the case of security, though, the consequences of neglect are much more serious, and employers should be far more willing to pay for the time it takes.


Other resources