Setting a Bad Example: How Not To Validate User Input

September 20, 2010

(Disclaimer: The code provided below is offered only for purposes of illustrating the writer’s thoughts and should be considered INSECURE. The included code SHOULD NOT be used on ANY Web server except when it has been carefully isolated, and even then only for the specific purpose of conducting research.)

Much of the time that I spend writing exploits for Web application security flaws is spent bypassing filters of various types.

One example of an insecure Web application that I commonly encounter is a file upload script. If an uploaded file is not properly validated and ends up somewhere Web-accessible, it generally creates an opportunity for code execution: an attacker can upload a malicious file which the server will then treat as code.

Surprisingly, it's rather difficult to handle file uploads in a secure manner, and many developers don't seem to have the understanding needed to do so.

Some of the ways that people generally try to protect against this pervasive issue are to:

1) Filter uploads based on file extensions

2) Check the Content-type header of uploaded files

3) Ensure that the file is a valid example of the expected file type

All of these help, but by themselves each approach still has issues that need addressing.

(The following code examples are borrowed from scanit -- and you can read all about file upload insecurity at:

To explore this idea a little further, let's start by discussing the file extension approach.

First, it's worth mentioning that any filtering mechanism that has been implemented in JavaScript or any other client-side technology (Flash, for instance) is entirely useless as any attacker can write their own interface without the protection mechanisms.

Doing this sort of filtering on the server-side has issues too, though.

Here's a PHP script which takes a file upload and checks the file extension against a list of "bad" file extensions:


$blacklist = array(".php", ".phtml", ".php3", ".php4");

// Reject the upload if the file name ends in a blacklisted extension.
foreach ($blacklist as $item) {
    if (preg_match("/$item\$/i", $_FILES['userfile']['name'])) {
        echo "We do not allow uploading PHP files\n";
        exit;
    }
}

// Otherwise, move the file into the Web-accessible upload directory.
$uploaddir = 'uploads/';
$uploadfile = $uploaddir . basename($_FILES['userfile']['name']);

if (move_uploaded_file($_FILES['userfile']['tmp_name'], $uploadfile)) {
    echo "File is valid, and was successfully uploaded.\n";
} else {
    echo "File uploading failed.\n";
}

That’s a nice try, but it won't work on Web sites where the server is set to execute any file other than those with the standard PHP file extensions. Extensions like "cgi" or "exe" are frequently set to be executed by the server, and if another application framework is installed other file extensions may execute as code.

Additionally, if the Web admin has enabled any non-standard extension to execute as code, everyone say "OOPS!"
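To make the gap concrete, here is a minimal sketch that wraps the blacklist check from the script above in a function and runs a few file names through it. The function name is_blocked_by_blacklist and the candidate file names are my own, chosen for illustration. Whether any of the allowed names actually executes depends entirely on how the target server is configured.

```php
<?php
// Hypothetical helper: the same blacklist test as the upload script above,
// returning a boolean instead of exiting.
function is_blocked_by_blacklist(string $filename): bool
{
    $blacklist = array(".php", ".phtml", ".php3", ".php4");
    foreach ($blacklist as $item) {
        if (preg_match("/$item\$/i", $filename)) {
            return true;
        }
    }
    return false;
}

// Illustrative names only; none of these is guaranteed to run as code.
$candidates = array("shell.php", "shell.PHP", "shell.php5", "shell.cgi", "shell.pl");
foreach ($candidates as $name) {
    echo $name, ": ", is_blocked_by_blacklist($name) ? "blocked" : "allowed", "\n";
}
```

Only the first two names are stopped. Everything else passes the filter untouched, which is exactly the problem with enumerating "bad" extensions.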

Now take a look at this script, which instead checks the Content-type header of any uploaded file to ensure that it's consistent with the type of file the upload script is intended for, which in my experience tends to be images.

In this case, only "image/gif" and "image/jpeg" are valid values for the Content-type header:


// $_FILES[...]['type'] holds the Content-type value supplied by the client.
$type = $_FILES['userfile']['type'];

if ($type != 'image/gif' && $type != 'image/jpeg') {
    echo "Sorry, we only accept GIF and JPEG images\n";
    exit;
}

$uploaddir = 'uploads/';
$uploadfile = $uploaddir . basename($_FILES['userfile']['name']);

if (move_uploaded_file($_FILES['userfile']['tmp_name'], $uploadfile)) {
    echo "File is valid, and was successfully uploaded.\n";
} else {
    echo "File uploading failed.\n";
}

Luckily (and somewhat comically) for us, the Content-type header can be set arbitrarily and will only affect the logic of the application, which in this case is simply checking it to ensure that it is some particular value. We can send a Content-type header of "lol/wut" if we like, or "ilovemydoggy/heissocute" or even "hacknaked/bowtomyfirewallahh".

It simply doesn't matter, and as such we can very easily satisfy the application with a Content-type header of "image/gif" despite the fact that we, as pen testers, are likely uploading a file using a "php" extension.
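As a rough illustration of why the header proves nothing, the sketch below builds the raw multipart/form-data body that a client sends for a file upload. The helper name build_upload_body, the boundary string, and the placeholder payload are all assumptions made for this example. The point is that the Content-Type line is just text the client writes.

```php
<?php
// Hypothetical helper: assembles a single-file multipart/form-data body by hand.
function build_upload_body(string $boundary, string $filename, string $contentType, string $data): string
{
    return "--$boundary\r\n"
        . "Content-Disposition: form-data; name=\"userfile\"; filename=\"$filename\"\r\n"
        . "Content-Type: $contentType\r\n"   // chosen freely by the sender
        . "\r\n"
        . $data . "\r\n"
        . "--$boundary--\r\n";
}

// A ".php" file name paired with an "image/gif" label.
$body = build_upload_body(
    "XBOUNDARYX",
    "backdoor.php",
    "image/gif",
    "<?php /* payload placeholder */ ?>"
);
echo $body;
```

Nothing ties the declared type to the file name or to the bytes that follow it, so a server-side check against this value only tests what the sender chose to claim.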

Perhaps you're saying to yourself at this point:

"Self, it seems it's not effective to check things like file extensions and user-provided data. What I need to do is to check the content of the file itself!" (And perhaps even: "I should seek mental help considering the fact that I'm talking to myself...")

While I can't provide advice on your mental well-being, I can recommend that you think again about your overall approach to addressing this security challenge. Due to the way certain file formats work, a file can be simultaneously interpreted as a valid example of two or more different types of files very easily.

For instance, the JPEG file format specifies start and end "magic numbers" to delimit where the image begins and ends (0xFFD8 and 0xFFD9). Content outside those markers is ignored when processing the file as a JPEG.

The RAR format specifies a start "magic number": 0x526172211A0700 ("Rar!" followed by three more bytes). However, the format does not specify an end marker. By appending a RAR archive to the end of a JPEG image, the file becomes both a valid JPEG image and a valid RAR archive.

Try it yourself on Unix or Windows:

Unix: "cat archive.rar >> image.jpeg"

Windows: "type archive.rar >> image.jpeg"

After this command, image.jpeg is both a valid JPEG and RAR. Similarly, a PHP file can be appended to a JPEG.

PHP files also have start and end markers: "<?php" and "?>". As such, PHP scripting can also be inserted into comment fields in many file formats. JPEG and GIF both have comment fields, as do most archive formats and other file formats with metadata.

As you may have already guessed, it's possible to bypass a content-checking filter by creating a file which is both a valid JPEG or GIF and a PHP backdoor. Using a command similar to the one listed earlier, you can append a PHP file to a JPEG image and try this attack yourself.
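The following sketch shows the idea in miniature. The looks_like_jpeg function is a hypothetical, deliberately naive validator that only looks for the JPEG start and end markers, and the "image" is just the two markers with filler between them rather than a real picture. A real attack would start from a complete, viewable image.

```php
<?php
// Hypothetical naive validator: checks only for the JPEG start and end markers.
function looks_like_jpeg(string $bytes): bool
{
    return substr($bytes, 0, 2) === "\xFF\xD8"
        && strpos($bytes, "\xFF\xD9") !== false;
}

// Stand-in for image data: start marker, filler bytes, end marker.
$jpeg = "\xFF\xD8" . str_repeat("\x00", 16) . "\xFF\xD9";

// Harmless placeholder standing in for a backdoor.
$payload = "<?php echo 'code ran'; ?>";

// Same effect as appending one file to the other with cat or type.
$polyglot = $jpeg . $payload;

var_dump(looks_like_jpeg($polyglot));             // the marker check still passes
var_dump(strpos($polyglot, "<?php") !== false);   // and the PHP is still in there
```

Both checks come back true: the data after the end marker is ignored by anything treating the file as a JPEG, but a PHP interpreter handed the same file would find and run the script.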

You should avoid uploading files to a Web-accessible folder with execute permissions enabled. Storing them elsewhere prevents uploaded files from being executed by the Web server. Checking the Content-type header and checking for a well-formed image can still prove valuable as additional layers, just not as the only defense.
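One way to combine these ideas might look like the sketch below. It is an illustration under stated assumptions, not a complete upload handler: the function name safe_stored_name, the allowlist contents, and the storage path are all hypothetical, and the directory is assumed to sit outside the document root with no script execution enabled.

```php
<?php
// Hypothetical helper: returns a server-generated file name with an
// allowlisted extension, or null if the client's extension is not allowed.
function safe_stored_name(string $clientName, array $allowedExtensions): ?string
{
    $ext = strtolower(pathinfo($clientName, PATHINFO_EXTENSION));
    if (!in_array($ext, $allowedExtensions, true)) {
        return null; // reject anything not explicitly allowed
    }
    // The client never controls the stored name, only (indirectly) the extension.
    return bin2hex(random_bytes(16)) . "." . $ext;
}

// Assumed location outside the document root; adjust for the real server layout.
$storageDir = "/var/uploads-not-web-accessible/";
$allowed = array("gif", "jpg", "jpeg");

$name = safe_stored_name("holiday.JPG", $allowed);
echo $name === null ? "rejected\n" : "would store as " . $storageDir . $name . "\n";

var_dump(safe_stored_name("backdoor.php", $allowed)); // NULL: not on the allowlist
```

The design choice here is to allow a short list of known-good extensions rather than block known-bad ones, and to discard the client-supplied name entirely.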

Some people take an entirely different approach to protecting their applications, and it's what some refer to as "data massaging." Data massaging is the process of sanitizing data before using it, despite the presence of suspicious data.

This approach is dangerous; if suspicious data is detected, the attempt should be logged and denied.
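A minimal sketch of the "log and deny" alternative follows. The function name find_suspicious and the three-item keyword list are assumptions for illustration only. Keyword matching is a weak defense on its own; the sketch is only meant to contrast refusing a request with quietly rewriting it.

```php
<?php
// Hypothetical helper: returns the first suspicious keyword found, or null.
function find_suspicious(string $input, array $keywords): ?string
{
    foreach ($keywords as $keyword) {
        if (stripos($input, $keyword) !== false) {
            return $keyword;
        }
    }
    return null;
}

$keywords = array("union", "select", "<script"); // illustrative list only
$input = "' union select * from mysql.users#";

$hit = find_suspicious($input, $keywords);
if ($hit !== null) {
    error_log("Rejected input containing '$hit'"); // log the attempt
    echo "Request denied.\n";                       // and refuse it outright
} else {
    echo "Input accepted.\n";
}
```

Because the input is never modified, there is no rewritten string for an attacker to steer toward a useful result.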

Here's an example of how things can go wrong, in the form of a PHP function.

function remove_bad_chars($str_words)
{
    $found = false;

    $bad_string = array("select", "drop", ";", "--", "insert","delete", "xp_", "%20union%20", "/*", "*/union/*", "+union+", "load_file", "outfile", "document.cookie", "onmouse", "<script", "<iframe", "<applet", "<meta", "<style", "<form", "<img", "<body", "<link", "_GLOBALS", "_REQUEST", "_GET", "_POST", "include_path", "prefix", "http://", "https://", "ftp://", "smb://", "onmouseover=", "onmouseout=");

    // Strip each keyword once, in list order.
    for ($i = 0; $i < count($bad_string); $i++) {
        $str_words = str_replace($bad_string[$i], "", $str_words);
    }

    return $str_words;
}

This is a function in an actual product called APPHP MicroCMS. Its purpose, as you might imagine, is to perform data massaging with any strings it receives. The comically named "bad_string" array contains keywords which might be used in various attacks, ranging from SQL injection to cross-site scripting attacks to file inclusion flaws.

However, there's a serious problem here: this check is run only once, and the strings are simply removed.

For any keyword two or more characters long, we can split it by inserting another "bad" keyword in the middle. The filter removes the inserted keyword and leaves our original keyword intact. Because the keywords are removed from the string in list order, the keyword we insert must appear later in the list than the keywords we want to preserve; "smb://" fits the bill.

So, for an SQL injection flaw, the original attack might be:

"' union select * from mysql.users#"

Thus, our attack string would become:

"' unsmb://ion selsmb://ect * from mysql.users#"

And after filtering, would return to:

"' union select * from mysql.users#"

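The bypass can be checked directly. The sketch below repeats the filter function as shown above (with its braces restored and the unused $found variable dropped) so that the example is self-contained, then feeds it the split attack string.

```php
<?php
// The filter from the article, reproduced so this example runs on its own.
function remove_bad_chars($str_words)
{
    $bad_string = array("select", "drop", ";", "--", "insert","delete", "xp_", "%20union%20", "/*", "*/union/*", "+union+", "load_file", "outfile", "document.cookie", "onmouse", "<script", "<iframe", "<applet", "<meta", "<style", "<form", "<img", "<body", "<link", "_GLOBALS", "_REQUEST", "_GET", "_POST", "include_path", "prefix", "http://", "https://", "ftp://", "smb://", "onmouseover=", "onmouseout=");

    for ($i = 0; $i < count($bad_string); $i++) {
        $str_words = str_replace($bad_string[$i], "", $str_words);
    }

    return $str_words;
}

// "select" is checked first and does not match "selsmb://ect";
// by the time "smb://" is stripped, the filter never looks for "select" again.
$attack = "' unsmb://ion selsmb://ect * from mysql.users#";
echo remove_bad_chars($attack), "\n";
// prints: ' union select * from mysql.users#
```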
On a slight tangent, I wanted to mention that APPHP MicroCMS considers "--" a bad keyword due to its use as an SQL comment delimiter in MSSQL. Other MSSQL-specific keywords such as "xp_" can also be seen. However, MicroCMS uses MySQL, where "xp_" has no special meaning and "--" starts a comment only when it is followed by whitespace.

That's all for now.

Until next time, I'm...

--Dan Crowley, Technical Specialist


