A Web developer's typical day is anything but boring, with any number of fascinating technologies to explore. Over time, the breadth of knowledge that any cutting-edge developer is expected to accumulate has continued to grow, with the proliferation of Ajax-, iPad- and location-based applications becoming an omnipresent part of any organization's online strategy.
With so much potential for distraction, it's no wonder that developers continue to fall victim to the very same security gaffes that have afflicted the community for well over a decade. Notably, failure to properly validate user input remains the single most serious security issue, with several of the Open Web Application Security Project's top ten security risks originating directly from this oversight.

A Recipe for Disaster

How serious is the problem of improperly validating user data? Consider the commonly applied approach of sending a one-time URL to a user's email address in order to recover their password. Many such solutions allow users to paste the random key into a password recovery form rather than click on the URL. Because this random key is usually quite lengthy, often 32 characters, the chances of somebody exploiting this feature to attack the website are negligible, right? You're so confident of this impossibility that you quickly create the following PHP script, which updates the account that is associated with the provided key:
$db = new mysqli("localhost", "webuser", "secret", "corporate_prod");
$key = $_POST['key'];
$password = $_POST['password'];
$query = "UPDATE accounts SET password = '{$password}' WHERE recovery_key = '{$key}'";
$result = $db->query($query);
However, a malicious user armed with a basic understanding of how the one-time URL works passes not the random recovery key but the string ' OR ''=' into the recovery form, meaning that the query sent to the MySQL database looks like this:
UPDATE accounts SET password = 'iownyou' WHERE recovery_key = '' OR ''=''
When executed, this odd-looking SQL statement will change every account password in the accounts table, because the condition ''='' is true for every row. Suffice it to say, this isn't the sort of password recovery feature you had in mind.
While this particular dilemma could be avoided by using prepared statements, it nonetheless highlights the considerable danger of not properly validating user input. Fortunately, with PHP 5.2 came an incredibly easy way to ensure that user input fits expectations!

Introducing the Filter Extension

An official part of the PHP distribution as of the 5.2.0 release, the Filter extension offers developers an easy way to validate and sanitize user input. Validation is useful in instances where input absolutely must fit a certain requirement such as a syntactically valid email address or an integer value such as a user's age. Sanitization is useful in cases where the input might need to be cleaned up a bit before it's accepted, such as removing disallowed HTML tags from a blog comment.
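The difference between the two modes can be sketched as follows. This is a minimal illustration, not a complete input-handling routine; the sample strings and variable names are made up for the example. Validation rejects input that doesn't fit, while sanitization strips the characters a given filter disallows and hands back what is left.

```php
<?php
// Validation: the input either fits the rule or it is rejected (false).
$candidate = "jason(at)example.com";
$isValid = filter_var($candidate, FILTER_VALIDATE_EMAIL) !== false;

// Sanitization: characters that are not legal in an email address
// (here, the surrounding whitespace) are removed.
$cleanEmail = filter_var(" jason@example.com\n", FILTER_SANITIZE_EMAIL);

// Sanitization: everything except digits and the + and - signs is removed.
$cleanAge = filter_var("42 years", FILTER_SANITIZE_NUMBER_INT);

var_dump($isValid);     // bool(false)
var_dump($cleanEmail);  // string(17) "jason@example.com"
var_dump($cleanAge);    // string(2) "42"
```

Sanitized output is still a string and may still be invalid, so it is usually worth validating the result afterwards.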

Validating Data

Validating data using the Filter extension is accomplished using the filter_var() function in conjunction with one of the extension's validation filters (seven at the time of the 5.2 release). For instance, to validate an email address, use the FILTER_VALIDATE_EMAIL validation filter:
$email = "jason@example.com";
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
    echo "Valid email address!";
} else {
    echo "Invalid email address!";
}
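The same pattern applies to the other validation filters. The sketch below tries a few of them on made-up sample values. Note that filter_var() returns the filtered value on success and false on failure, so comparing the result against false with !== is the safest test.

```php
<?php
// Each call returns the (possibly type-converted) value, or false on failure.
$checks = array(
    'url'   => filter_var("http://www.example.com/", FILTER_VALIDATE_URL),
    'ip'    => filter_var("192.168.1.10", FILTER_VALIDATE_IP),
    'float' => filter_var("3.14", FILTER_VALIDATE_FLOAT),
    'badIp' => filter_var("999.1.1.1", FILTER_VALIDATE_IP),
);

foreach ($checks as $label => $result) {
    // A strict comparison avoids mistaking values such as 0 for a failure.
    $status = ($result !== false) ? "valid" : "invalid";
    echo $label, ": ", $status, PHP_EOL;
}
```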

Validating an Alphanumeric String

Oddly, the Filter extension doesn't offer a filter for validating or sanitizing strings consisting solely of alphanumeric characters. However, thanks to the FILTER_VALIDATE_REGEXP filter, it's trivial to create your own solution. Returning to the maliciously malformed recovery key used in the opening example, you can create a regular expression that lets the value through only if the provided string consists of letters and numbers (filter_var() returns the string on a match and FALSE otherwise), as demonstrated here:
$recoveryKey = "' OR ''='";
if (filter_var($recoveryKey, FILTER_VALIDATE_REGEXP, array('options' => array('regexp' => "/^[a-zA-Z0-9]+$/")))) {
    echo "Valid recovery key";
} else {
    echo "Invalid recovery key";
}
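If the key format is known in advance, the check can be tightened and wrapped in a reusable function. The following is a sketch under stated assumptions: the helper name isValidRecoveryKey() is hypothetical, and the 32-character length is taken from the "often 32 characters" figure mentioned earlier, so adjust it to match however your keys are generated. It also uses the \A and \z anchors, because in PCRE a bare $ will also match just before a trailing newline.

```php
<?php
// Hypothetical helper: accepts only a key of exactly 32 letters and digits.
function isValidRecoveryKey($key)
{
    $options = array(
        'options' => array('regexp' => '/\A[a-zA-Z0-9]{32}\z/'),
    );

    // filter_var() returns the string on a match and false otherwise.
    return filter_var($key, FILTER_VALIDATE_REGEXP, $options) !== false;
}

var_dump(isValidRecoveryKey(str_repeat("a1", 16))); // bool(true)
var_dump(isValidRecoveryKey("' OR ''='"));          // bool(false)
var_dump(isValidRecoveryKey("abc123"));             // bool(false), too short
```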

Validating an Integer Value

The Filter extension also offers a validation filter named FILTER_VALIDATE_INT, which can be used to ensure that a value is a valid integer. You can optionally pass an integer range to ensure that the value falls within a defined boundary. For instance, if you wanted to collect age-related information from your users, you'd presumably want to allow only users aged between 13 and 100 or so years (the lower limit complies with COPPA, and the upper limit is simply a reasonable bound on practical life span). You can set this range using the FILTER_VALIDATE_INT filter like this:
// $age holds the user-supplied value
if (filter_var($age, FILTER_VALIDATE_INT, array('options' => array('min_range' => 13, 'max_range' => 100)))) {
    echo "Valid age!";
} else {
    echo "Invalid age!";
}
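One subtlety is worth a short sketch. FILTER_VALIDATE_INT returns the integer itself, so a perfectly valid 0 is falsy in a plain if test. That cannot happen with the 13 to 100 range above, but it matters for ranges that include zero. The hypothetical parseAge() helper below sidesteps the issue by comparing strictly against false and returning null for invalid input.

```php
<?php
// Hypothetical helper: returns the age as an int, or null when invalid.
function parseAge($input)
{
    $options = array(
        'options' => array('min_range' => 13, 'max_range' => 100),
    );
    $age = filter_var($input, FILTER_VALIDATE_INT, $options);

    return ($age === false) ? null : $age;
}

var_dump(parseAge("42"));    // int(42)
var_dump(parseAge("12"));    // NULL, below the minimum
var_dump(parseAge("13.5"));  // NULL, not an integer

// Without a range, "0" is valid but evaluates as false in a loose test.
var_dump(filter_var("0", FILTER_VALIDATE_INT)); // int(0)
```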

Other Validation Resources

PHP's Filter extension is only one of the latest of many validation-specific solutions at your disposal. A great number of useful string-related functions are found in PHP's string function library. When sending data to a MySQL query, be sure to use prepared statements, which transmit user-supplied values separately from the SQL itself so that special characters cannot interfere with the query's proper functioning. If you're a Zend Framework user, then I highly recommend checking out the Zend_Filter component.
With so many options at your disposal, incorporating a sound approach to data validation within your next project should be a trivial part of the process, allowing you to spend even more time investigating new technologies!
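Applied to the recovery query from the opening example, the prepared-statement advice might look like the sketch below. The function name resetPasswordByKey() is hypothetical, the table and column names are carried over from the earlier script, and hashing the password with password_hash() (available from PHP 5.5) is an added assumption rather than something the original script did.

```php
<?php
// Hypothetical helper; table and column names follow the opening example.
function resetPasswordByKey(mysqli $db, $key, $password)
{
    // The ? placeholders keep user input out of the SQL text entirely.
    $stmt = $db->prepare(
        "UPDATE accounts SET password = ? WHERE recovery_key = ?"
    );
    if ($stmt === false) {
        return false;
    }

    // Assumption: store a hash rather than the raw password.
    $hash = password_hash($password, PASSWORD_DEFAULT);

    // "ss" declares both bound parameters as strings.
    $stmt->bind_param("ss", $hash, $key);
    $stmt->execute();

    // Exactly one account should match a genuine recovery key.
    $updated = ($stmt->affected_rows === 1);
    $stmt->close();

    return $updated;
}
```

With this in place, the string ' OR ''=' is treated as a literal key value that matches no row, so nothing is updated. Validating the key format first remains worthwhile as a second line of defense.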

About the Author

Jason Gilmore is the founder of the publishing and consulting firm WJGilmore.com. He also is the author of several popular books, including "Easy PHP Websites with the Zend Framework", "Easy PayPal with PHP", and "Beginning PHP and MySQL, Fourth Edition". Follow him on Twitter at @wjgilmore.