Chris Snyder and Michael Southwell
Chapter 12 is reprinted with permission by Apress
PHP is an extremely powerful yet easy-to-learn scripting language, affording even relatively inexperienced programmers the opportunity to create complex, dynamic websites. It is, however, notoriously difficult to ensure privacy and security of internet services. In this book, we will provide you with the security background every web developer needs, along with PHP-specific knowledge and code that you can use to protect the integrity of your own applications. We begin with an overview of server security that shows you how to assess privacy in a shared hosting environment, keep developers out of production servers, maintain up-to-date software, provide encrypted channels, and control access to your systems.
The discussion then turns to preventing common vulnerabilities in PHP scripts. We explain how to secure your scripts against SQL injection, prevent cross-site scripting and remote execution, and stop the hijacking of temporary files and sessions. The final part of the book is devoted to implementing secure applications. You'll learn how to verify user identities, authorize and track application use, avoid data loss, safely execute high-risk system commands, and use web services securely. Whether you have learned just enough PHP to be dangerous, or have years of experience dealing with security issues, this book offers a wealth of information that can help you to make your online applications more secure.
We began Part 3 with a discussion in Chapter 11 of keeping your PHP scripts secure by careful validation of user input. We continue that discussion here, focusing on user input that participates in your scripts' interaction with your databases. Your data is, after all, probably your most treasured resource. Your primary goal in writing scripts to access that data should be to protect your users' data at all costs. In the rest of this chapter, we'll show you ways to use PHP to do that.

What SQL Injection Is

There is no point to putting data into a database if you intend never to use it; databases are designed to promote the convenient access and manipulation of their data. But the simple act of doing so carries with it the potential for disaster. This is true not so much because you yourself might accidentally delete everything rather than selecting it. Instead, it is that your attempt to accomplish something innocuous could actually be hijacked by someone who substitutes his own destructive commands in place of yours. This act of substitution is called injection.
Every time you solicit user input to construct a database query, you are permitting that user to participate in the construction of a command to the database server. A benign user may be happy enough to specify that he wants to view a collection of men's long-sleeved burgundy-colored polo shirts in size large; a malicious user will try to find a way to contort the command that selects those items into a command that deletes them, or does something even worse. Your task as a programmer is to find a way to make such injections impossible.
How SQL Injection Works
Constructing a database query is a perfectly straightforward process. It typically proceeds something like this (for demonstration purposes, we'll assume that you have a database of wines, where one of the fields is the grape variety):
  1. You provide a form that allows the user to submit something to search for. Let's assume that the user chooses to search for wines made from the grape variety "lagrein."
  2. You retrieve the user's search term, and save it by assigning it to a variable, something like this:
    $variety = $_POST['variety'];
    
    So that the value of the variable $variety is now this:
    lagrein
    
  3. You construct a database query, using that variable in the WHERE clause, something like this:
    $query = "SELECT * FROM wines WHERE variety='$variety'";
    
    so that the value of the variable $query is now this:
    SELECT * FROM wines WHERE variety='lagrein'
    
  4. You submit the query to the MySQL server.
  5. MySQL returns all records in the wines table where the field variety has the value "lagrein."
So far, this is very likely a familiar and comfortable process. Unfortunately, sometimes familiar and comfortable processes lull us into complacency. So let's look back at the actual construction of that query.
  1. You created the invariable part of the query, ending it with a single quotation mark, which you will need to delineate the beginning of the value of the variable:
    $query = "SELECT * FROM wines WHERE variety = '";
    
  2. You concatenated that invariable part with the value of the variable containing the user's submitted value:
    $query .= $variety;
    
  3. You then concatenated the result with another single quotation mark, to delineate the end of the value of the variable:
    $query .= "'";
    
    The value of $query was therefore (with the user input in bold type) this:
    SELECT * FROM wines WHERE variety = 'lagrein'
    
    The success of this construction depended on the user's input. In this case, you were expecting a single word (or possibly a group of words) designating a grape variety, and you got it. So the query was constructed without any problem, and the results were likely to be just what you expected, a list of the wines for which the grape variety is "lagrein." Let's imagine now that your user, instead of entering a simple grape variety like "lagrein" (or even "pinot noir"), enters the following value (notice the two included punctuation marks):
    lagrein' or 1=1;
    
You now proceed to construct your query with, first, the invariable portion (we show here only the resultant value of the $query variable):
SELECT * FROM wines WHERE variety = '
You then concatenate that with the value of the variable containing what the user entered (here shown in bold type):
SELECT * FROM wines WHERE variety = 'lagrein' or 1=1;
And finally you add the closing quotation mark:
SELECT * FROM wines WHERE variety = 'lagrein' or 1=1;'
The resulting query is very different from what you had expected. In fact, your query now consists of not one but rather two instructions, since the semicolon at the end of the user's entry closes the first instruction (to select records) and begins another one. In this case, the second instruction, nothing more than a single quotation mark, is meaningless. But the first instruction is not what you intended, either. When the user put a single quotation mark into the middle of his entry, he ended the value of the desired variable, and introduced another condition. So instead of retrieving just those records where the variety is "lagrein," in this case you are retrieving those records that meet either of two criteria, the first one yours and the second one his: the variety has to be "lagrein" or 1 has to be 1. Since 1 is always 1, you are therefore retrieving all of the records!
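The construction just described can be reproduced without a database at all. The following is only a sketch: build_wine_query() is a hypothetical helper, the three sample rows are made up, and the array_filter() calls merely imitate how a server would evaluate each WHERE condition (no SQL is actually executed).

```php
<?php
// Hypothetical helper: builds the query exactly as the text describes,
// by wrapping the raw user input in single quotation marks.
function build_wine_query(string $variety): string
{
    return "SELECT * FROM wines WHERE variety = '" . $variety . "'";
}

$expected = build_wine_query("lagrein");
$injected = build_wine_query("lagrein' or 1=1;");

echo $expected, "\n"; // SELECT * FROM wines WHERE variety = 'lagrein'
echo $injected, "\n"; // SELECT * FROM wines WHERE variety = 'lagrein' or 1=1;'

// Made-up rows standing in for the wines table (no database involved).
$wines = [
    ['name' => 'Wine A', 'variety' => 'lagrein'],
    ['name' => 'Wine B', 'variety' => 'pinot noir'],
    ['name' => 'Wine C', 'variety' => 'riesling'],
];

// The condition you intended: variety = 'lagrein'
$intended = array_filter($wines, function (array $row): bool {
    return $row['variety'] === 'lagrein';
});

// The condition the attacker created: variety = 'lagrein' or 1=1
$hijacked = array_filter($wines, function (array $row): bool {
    return $row['variety'] === 'lagrein' || 1 === 1;
});

echo count($intended), "\n"; // 1
echo count($hijacked), "\n"; // 3
```

The second half of the injected condition is true for every row, so the filter on variety no longer restricts anything.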
You may object that you are going to be using double rather than single quotation marks to delineate the user's submitted variables. This slows the attacker down only for as long as it takes for his first attempt to fail and for him to retry the exploit, this time using the double quotation mark that permits it to succeed. (We remind you here that, as we discussed in Chapter 11, all error notification to the user should be disabled. If an error message were generated here, it would have just helped the attacker by providing a specific explanation for why his attack failed.)
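To see how little the change of delimiter buys you, consider this sketch. The helper build_wine_query_dq() is a hypothetical name of our own; it does nothing but wrap the submitted value in double quotation marks.

```php
<?php
// Hypothetical helper: same query, but the value is delimited with
// double quotation marks instead of single ones.
function build_wine_query_dq(string $variety): string
{
    return 'SELECT * FROM wines WHERE variety = "' . $variety . '"';
}

// First attempt: the single quotation mark no longer closes the value,
// so the whole entry is treated as one (nonexistent) grape variety.
$firstTry = build_wine_query_dq("lagrein' or 1=1;");
echo $firstTry, "\n";
// SELECT * FROM wines WHERE variety = "lagrein' or 1=1;"

// Retry: swapping in a double quotation mark closes the value again.
$secondTry = build_wine_query_dq('lagrein" or 1=1;');
echo $secondTry, "\n";
// SELECT * FROM wines WHERE variety = "lagrein" or 1=1;"
```

The attacker needs to change only one character in his entry to get the same result as before.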
As a practical matter, for your user to see all of the records rather than just a selection of them may not at first glance seem like such a big deal, but in actual fact it is; viewing all of the records could very easily provide him with insight into the structure of the table, an insight that could easily be turned to more nefarious purposes later. This is especially true if your database contains not something apparently innocuous like wines, but rather, for example, a list of employees with their annual salaries.
And as a theoretical matter, this exploit is a very bad thing indeed. By injecting something unexpected into your query, this user has succeeded in turning your intended database access around to serve his own purposes. Your database is therefore now just as open to him as it is to you.
PHP and MySQL Injection
As we have mentioned previously, PHP, by design, does not do anything except what you tell it to do. It is precisely that hands-off attitude that permits exploits such as the one we have just described.
We will assume that you will not knowingly or even accidentally construct a database query that has destructive effects; the problem is with input from your users. Let's therefore look now in more detail at the various ways in which users might provide information to your scripts.
Kinds of User Input
The ways in which users can influence the behavior of your scripts are more numerous, and more complex, than they may appear at first glance.
The most obvious source of user input is of course a text input field in a form. With such a field, you are deliberately soliciting a user's input. Furthermore, you are providing the user with a wide-open field; there is no way that you can limit ahead of time what a user can type (although you can limit its length, if you choose to). This is the reason why by far the most common source of injection exploits is the unguarded form field.
But there are other sources as well, and a moment's reflection on the technology behind forms (the transmission of information via the POST method) should bring to mind another common source for the transmission of information: the GET method. An observant user can easily see when information is being passed to a script simply by looking at the URI displayed in the browser's navigation toolbar. Although such URIs are typically generated programmatically, there is nothing whatsoever stopping a malicious user from simply typing a URI with an improper variable value into a browser, and thus potentially opening a database up for abuse.
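A hand-typed URI reaches your script just as form input does. In the sketch below, the host name and script name are placeholders, and parse_url() together with parse_str() stand in for the work PHP itself does when it fills $_GET for a real request.

```php
<?php
// Hypothetical URI; the host and script name are placeholders.
// %27 is ', %20 is a space, %3D is =, and %3B is ; in URL encoding.
$uri = 'http://example.com/wines.php?variety=lagrein%27%20or%201%3D1%3B';

// Stand-in for what PHP does when it fills $_GET for a real request.
$queryString = parse_url($uri, PHP_URL_QUERY);
parse_str($queryString, $get);

$variety = $get['variety'];
echo $variety, "\n"; // lagrein' or 1=1;

// The same unguarded construction as before, now fed by the URI.
$query = "SELECT * FROM wines WHERE variety = '$variety'";
echo $query, "\n"; // SELECT * FROM wines WHERE variety = 'lagrein' or 1=1;'
```

No form is involved at any point; the address bar is all the attacker needs.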
One common strategy to limit users' input is to provide an option box rather than an input box in a form. This control forces the user to choose from among a set of predetermined values, and would seem to prevent the user from entering anything unexpected. But just as an attacker might spoof a URI (that is, create an illegitimate URI that masquerades as an authentic one), so might she create her own version of your form, with illegitimate rather than predetermined safe choices in the option box. It's extremely simple to do this; all she needs to do is view the source and then cut-and-paste the form's source code, which is right out in the open for her.
After modifying the choices, she can submit the form, and her illegal instruction will be carried in as if it were original. So users have many different ways of attempting to inject malicious code into a script.
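The following sketch shows what your script actually receives in each case. The list of offered choices and both simulated submissions are made up for illustration; the in_array() call is there only to show that the second value was never one of your options.

```php
<?php
// The choices your form offers in its option box (made-up values).
$offered = ['lagrein', 'pinot noir', 'riesling'];

// Simulated request bodies; in a real script these arrive in $_POST.
$fromYourForm = ['variety' => 'pinot noir'];
$fromHerCopy  = ['variety' => "lagrein' or 1=1;"];

$results = [];
foreach ([$fromYourForm, $fromHerCopy] as $post) {
    $variety = $post['variety'];
    // The script receives a plain string either way; the option box
    // existed only in the browser.
    $results[$variety] = in_array($variety, $offered, true);
    echo $variety, ' (offered: ', $results[$variety] ? 'yes' : 'no', ")\n";
}
// pinot noir (offered: yes)
// lagrein' or 1=1; (offered: no)
```

Nothing in the submitted data itself distinguishes a value chosen from your option box from one sent by a modified copy of the form.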
Come back next week when we continue with our excerpt from Pro PHP Security!