Thursday, August 7, 2008

Secure PHP Programming

Secure PHP Programming 101

By Michael McCann

Last Updated: January 10, 2008


Writing insecure code is easy. Everybody does it. Sometimes we do it accidentally because we don’t realize that the security issue exists, and sometimes we do it on purpose because we suspect the bad guys won’t notice one little vulnerability. Secure programming is often overlooked because of ignorance, time constraints, or any number of other factors. Since security isn’t flashy until something goes wrong, it is often easy to put it off.

Once your application is compromised, you will realize there’s nothing more important. The best case scenario is that you lose days of productivity and suffer downtime while you fix what was damaged. The worst case scenario is that your data is compromised and you have no idea if it is correct, much less what the hackers managed to copy and read. Did you expose usernames and passwords to the world? Did you happen to release the credit card information for thousands into the den of identity thieves? You’ll never really be able to know. It’s best to practice secure programming so you never need to ask yourself these questions.

With this in mind, let’s examine three different classes of secure programming "no-nos" (storage risks, system risks, and exposure risks) and discuss how we can prevent each of them. Server configuration and data transmission security are beyond the scope of this article, but the reader should be aware that they also play a major role in securing a web application.

Storage risks are those risks involved in storing data and interacting with a database server or file system. The most widely known of these is the infamous SQL injection attack. SQL injection happens when you allow the user to input data into a query and, instead of a value, he adds his own SQL to the query. The easiest way to prevent this type of attack is to escape every user variable that could touch your queries. Luckily, PHP has several built-in functions for handling this, such as mysql_real_escape_string() (the older mysql_escape_string() ignores the connection’s character set and is deprecated). Essentially, these work by escaping characters in a string that could conceivably be used to terminate your query and run a user-specified query.

When should you escape user data? It all depends on who you talk to. Some programmers prefer to escape data as soon as it enters the application, while others prefer to wait until just before it is placed into the query. Personally, I prefer to escape right before it is inserted into the query. I do this because I can always look at the code, see the database interaction, and see that the data was escaped before it was used. I don’t need to search the entire source to make sure something was escaped.
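As a point of comparison, the sketch below keeps user input out of the SQL text altogether by using a PDO prepared statement instead of an escaping function. This is a swapped-in technique, not the one the article describes, chosen because the mysql_* functions were removed in PHP 7. The in-memory SQLite database, the users table, and the find_user_ids() helper are all assumptions made for illustration.

```php
<?php
// Assumption: an in-memory SQLite database stands in for a real server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
$pdo->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

// Hypothetical helper: the user value is bound as a parameter right at the
// point of database interaction, so it is never parsed as SQL.
function find_user_ids(PDO $pdo, string $name): array
{
    $stmt = $pdo->prepare('SELECT id FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// A classic injection attempt is treated as a literal (non-matching) name.
$attack = "alice' OR '1'='1";

var_dump(count(find_user_ids($pdo, 'alice')));  // one matching row
var_dump(count(find_user_ids($pdo, $attack)));  // no matching rows
```

The binding happens in the same place as the query, which fits the preference above for handling user data where the database interaction is visible.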

The second storage risk we’ll talk about is storing passwords as plain text (hereafter referred to as clear text). I know you guys do it; I’ve seen too many open source applications and too many in-house applications to believe that it doesn’t go on. Simply put, there is never any reason to store a password in clear text. It doesn’t matter if you’re storing the password in a database or a flat file; always store passwords as a hash. You can accomplish this simply enough by using PHP’s md5() function to transform the password before you insert it into your storage medium. Since md5 is repeatable, you can validate a password by simply hashing the submitted password and comparing the result with the stored hash.

When should you transform the password to a hash? You should do it as soon as possible. Don’t let the password variable float around your application at all. As soon as you grab the password input, convert it into a hash. I prefer to do this by setting the password variable to its own hash; this avoids the chance of using the wrong variable in later code.
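Here is a minimal sketch of the hash-early, compare-later flow described above. It swaps md5() for PHP’s password_hash() and password_verify() (available since PHP 5.5), because an unsalted MD5 hash is fast to brute-force. The sample password is a placeholder for whatever your form submits.

```php
<?php
// Placeholder for the submitted value, e.g. $_POST['password'].
$password = 'correct horse battery staple';

// Overwrite the variable with its own hash right away, so the clear text
// cannot be used by mistake in later code.
$password = password_hash($password, PASSWORD_DEFAULT);

// Store $password. At login, check the new submission against the stored hash.
$loginOk  = password_verify('correct horse battery staple', $password);
$loginBad = password_verify('wrong guess', $password);

var_dump($loginOk, $loginBad);
```

Unlike md5(), password_hash() salts each hash, so two hashes of the same password differ. That is why the comparison goes through password_verify() rather than hashing again and comparing strings.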

Next, let’s talk about the usernames and passwords your program needs in order to interact with other applications (like database servers). You should always separate these out into a different PHP file than the rest of your code, and reference them as constants or variables. This not only makes your code easier to maintain (if you need to change a password, you know exactly where to look), it also means that, in the event that your source gets released, the password isn’t in that file. While it’s certainly true that they could grab your password file, it does reduce the risk considerably.

Before we leave usernames behind, I want to touch on the concept of division of power. We’re not talking about the government in this case, but about database users. The database user accounts your program uses should have the minimum level of access they need in order to function correctly.

If your application only reads from a database, then the database account it uses should only have SELECT permission on that particular database, and no access to any other database.

To take this concept a step further, I prefer to create multiple database accounts for my web applications. Typically I create one account that only has INSERT permissions for the particular tables the software needs to write to, and a completely separate account that only has SELECT access. This makes sure that no INSERT queries are accidentally performed and mitigates the possible damage done by SQL injections.

Of course, multiple accounts work best when there’s a clear separation between those who can write to a database and those who can read it (such as a CMS). In theory, you could use multiple accounts in any application but you run into problems with the number of open connections to the database. This is simply something that should be considered as a possibility during the design phase of your software.

I’m a big advocate, as are most programmers, of breaking source code down into multiple files at every logical opportunity. However, I’ve noticed that a lot of PHP programmers have a nasty habit of naming PHP files they intend to use as libraries or other include types with the extension .inc, or .config, or some other non-.php extension. This is a horrible idea because the server it’s running on might not be set up to parse these extensions as PHP, so anyone requesting the file directly would see its source code (and potentially passwords, usernames, and other protected information). I prefer to prefix filenames myself, using inc_ or class_ when needed.

While we’re discussing included files, I would like to talk about two other security precautions. If you have a PHP file that you intend to use only as part of a larger PHP application, add a one-line check of __FILE__ against $_SERVER['PHP_SELF'] to the beginning of the file.

This will cause the file to immediately terminate if someone tries to run it directly. A well written include or class file shouldn’t do anything when loaded on its own, but you can never be too careful, especially when a one-line cut and paste can potentially save you so much heartache.
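The author’s exact line is not reproduced here, so the following is only one way such a guard could look, assuming that a direct request makes $_SERVER['PHP_SELF'] a suffix of __FILE__. The requested_directly() helper and the sample paths are hypothetical.

```php
<?php
// Hypothetical helper (not from the article): true when the script named by
// $phpSelf (a URL path such as "/lib/inc_db.php") is the file at $file.
function requested_directly(string $file, string $phpSelf): bool
{
    if ($phpSelf === '') {
        return false;
    }
    $file = str_replace('\\', '/', $file);  // normalize Windows separators
    // __FILE__ is a filesystem path and PHP_SELF is relative to the document
    // root, so on a direct request PHP_SELF is the tail end of __FILE__.
    return substr($file, -strlen($phpSelf)) === $phpSelf;
}

// At the top of an include file, the guard itself would be one line:
// if (requested_directly(__FILE__, $_SERVER['PHP_SELF'])) { exit; }

// Sample values showing both outcomes.
var_dump(requested_directly('/var/www/site/lib/inc_db.php', '/lib/inc_db.php'));
var_dump(requested_directly('/var/www/site/lib/inc_db.php', '/index.php'));
```

The guard line is left as a comment so the sketch can be run from the command line, where PHP_SELF names the script being executed and the guard would stop it immediately.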

The other include-related item I’d like to talk about is the difference between include() and readfile(). Include tells the server to parse the file as PHP, while readfile tells the server to output the file as straight text. You should never use include on a file that is publicly writable (for example, if you have an application that appends user-submitted data to the end of a file in order to simulate a graffiti wall or guest book) or on a file that you don’t control (files on other servers, or files that others can edit). A malicious user could easily inject his own PHP into your system, causing untold amounts of havoc. At the same time, you should never execute readfile on a file that ends in .php. Because readfile outputs the file without parsing it, this runs the risk of exposing your source code to the world. To summarize, use readfile() on HTML, text, and remote files. Use include on local files with PHP code you want to execute.
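The difference is easy to see with a throwaway file. This sketch writes a one-line PHP script to a temporary path (the file and its contents are made up for the demonstration), then captures what include and readfile each send to the output.

```php
<?php
// Write a small file containing PHP code to a temporary location.
$path = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($path, '<?php echo 1 + 1; ?>');

// include parses the file and executes the code inside it.
ob_start();
include $path;
$included = ob_get_clean();

// readfile passes the bytes through untouched, source code and all.
ob_start();
readfile($path);
$raw = ob_get_clean();

unlink($path);

var_dump($included);  // the result of running the code
var_dump($raw);       // the source text itself
```

If the file had been writable by visitors, include would have executed whatever they put there; if it had held real application code, readfile would have shown it to them.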

Now let’s talk about system risks. I think of system risks as those things related to the way code executes. The primary system risk in any application is invalid data. You can never validate data enough. As soon as user data enters the system, you should immediately verify that it exists and that it is what you want it to be. If not, your program should halt and prompt the user for better input.

When validating data, you should use the tightest filter possible. For example, if your program is expecting a percentage, you should not simply verify that they entered something. Your program should verify that it is numeric and between 0 and 100.
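A tight filter for the percentage example might look like the sketch below. The parse_percentage() name and the choice to accept only plain decimal strings (no signs, no exponents, no surrounding whitespace) are assumptions; adjust the pattern to whatever your input really allows.

```php
<?php
// Hypothetical helper: returns the percentage as a float, or null when the
// input is not a plain decimal number between 0 and 100.
function parse_percentage($raw): ?float
{
    // Accept only strings such as "7", "42.5" or "100".
    if (!is_string($raw) || preg_match('/\A\d{1,3}(\.\d+)?\z/', $raw) !== 1) {
        return null;
    }

    $value = (float) $raw;

    // The pattern already rules out negatives, so only the upper bound remains.
    return $value <= 100 ? $value : null;
}

var_dump(parse_percentage('42.5'));  // accepted
var_dump(parse_percentage('101'));   // rejected: out of range
var_dump(parse_percentage('ten'));   // rejected: not numeric
```

Returning null for anything unexpected leaves the caller with one easy check before it prompts the user for better input.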

You should also validate at every level. Every time a function accepts input, verify that the data is what you expected it to be and react accordingly if the data is bad. This will make it more likely that you will catch bad data due to a programming oversight; it also has the added advantage of catching logic errors in your software.
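Validating at every level can be as simple as having each function check its own arguments, even when the caller is supposed to have checked them already. The apply_discount() function below is a made-up example of that habit, not code from the article.

```php
<?php
// Hypothetical function: re-checks its inputs instead of trusting the caller.
function apply_discount(float $price, float $percent): float
{
    if ($price < 0) {
        throw new InvalidArgumentException('Price must not be negative.');
    }
    if ($percent < 0 || $percent > 100) {
        throw new InvalidArgumentException('Percent must be between 0 and 100.');
    }

    return $price * (1 - $percent / 100);
}

var_dump(apply_discount(200.0, 25.0));

try {
    apply_discount(200.0, 150.0);  // a logic error somewhere upstream
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```

The exception points straight at the call that passed bad data, which is how this kind of check ends up catching logic errors as well as hostile input.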

Next, I’d like to talk about eval(), exec(), and their ilk (shell_exec(), system(), passthru(), and pcntl_exec()). Visit their respective pages in the PHP manual to find out more about them, but in practice there is very rarely any reason to use them. Eval will run any PHP code passed to it as a string. This is inherently dangerous because you no longer have absolute control over what code is executed. If you must use eval(), don’t ever run it with a variable that has been derived from a user-determined value; otherwise you run the risk of a hacker injecting his code. Exec() and the like pose similar threats. Allowing your script to interact with the command line is a level of power you should rarely, if ever, need.
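Many uses of eval() on user input can be replaced with a lookup in a fixed list of things the program is willing to do. The calculator below is a hypothetical illustration of that idea: the user supplies only a name, and the code that runs is always code you wrote.

```php
<?php
// Hypothetical calculator: the operation name comes from the user, but it is
// only ever used as a key into a fixed list, never evaluated as code.
function calculate(string $operation, float $a, float $b): float
{
    $allowed = [
        'add'      => static function (float $x, float $y): float { return $x + $y; },
        'subtract' => static function (float $x, float $y): float { return $x - $y; },
        'multiply' => static function (float $x, float $y): float { return $x * $y; },
    ];

    if (!isset($allowed[$operation])) {
        throw new InvalidArgumentException('Unknown operation: ' . $operation);
    }

    return $allowed[$operation]($a, $b);
}

var_dump(calculate('add', 2, 3));

try {
    calculate('system("ls")', 2, 3);  // rejected instead of executed
} catch (InvalidArgumentException $e) {
    echo 'Rejected: ', $e->getMessage(), "\n";
}
```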

Finally, let’s talk about a couple of exposure risks. Usually, you don’t want to show your error messages to the world. For one thing, they freak people out. For another, they give hackers a wealth of information about potential bugs in your code. On production systems, always turn the display of errors off and record them with PHP’s error_log() function instead.
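One way to set this up at runtime is sketched below, assuming you cannot or do not want to change php.ini. The fail_quietly() helper and its wording are made up for illustration; the ini settings and error_log() are standard PHP.

```php
<?php
// Keep reporting everything, but send it to the log rather than the page.
error_reporting(E_ALL);
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Hypothetical helper: records the details for the developer and returns a
// message that is safe to show to a visitor.
function fail_quietly(Throwable $e): string
{
    error_log(sprintf('%s in %s:%d', $e->getMessage(), $e->getFile(), $e->getLine()));
    return 'Something went wrong. Please try again later.';
}

try {
    throw new RuntimeException('Database connection refused');
} catch (RuntimeException $e) {
    echo fail_quietly($e), "\n";
}
```

The visitor sees only the generic message, while the file name, line number, and real cause go to wherever the server’s error log is configured to write.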

The last risk we’ll talk about is using session IDs. Simply put, try not to ever send the session ID to the user in anything visible, such as a URL or a form field. Sessions aren’t secure, but if you transmit the session ID this way you run an even greater risk of someone other than the expected user acting as a "man in the middle" (to steal an analogy) and piggy-backing off of the legitimate user’s session. An example of this would be using a session ID to hijack someone’s shopping cart and change a delivery address, get credit card information, or do something even more malicious depending on the system.
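PHP has a few session settings that keep the ID out of URLs and out of reach of page scripts. The sketch below sets them at runtime before the session starts. Which of them you need depends on your server’s defaults, so treat the selection as an assumption and not a complete hardening recipe.

```php
<?php
// These must be set before session_start() and before any output is sent.

// Accept the session ID from a cookie only, never from the query string.
ini_set('session.use_only_cookies', '1');

// Do not let PHP rewrite links and forms to carry the session ID.
ini_set('session.use_trans_sid', '0');

// Hide the session cookie from JavaScript running in the page.
ini_set('session.cookie_httponly', '1');

// In a real request you would now call session_start(), and call
// session_regenerate_id(true) after a successful login so that an ID
// seen before authentication stops being useful.
```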
