Module
Improper Input Validation
Summary
Any input that comes into a program from an external source - such as a user typing at a keyboard or a network connection - can be the source of security concerns and disastrous bugs. All input should be treated as potentially dangerous.
Description
Most software packages rely on external input, either via the keyboard, network, or other external sources. Generally, this input will be of a specific type: for example, a user interface that requests a person’s name expects a series of alphabetic characters. If programs are not carefully written, attackers can construct inputs that can cause malicious code to be executed.
Risk
How Can It Happen?
All input data is a potential source of problems. If input is not checked to verify that it has the correct type, format, and length, the program may misbehave or become exploitable. Failure to validate input can lead to serious security risks such as integer errors, buffer overflows, and SQL injection, among others.
Example
A Norwegian woman mistyped her account number on an internet banking system. Instead of typing her 11-digit account number, she accidentally typed an extra digit, for a total of 12 digits. The system discarded the extra digit and transferred $100,000 to the (incorrect) account. A simple dialog box informing her that she had typed too many digits would have helped avoid this expensive error.
Example Code (Bad Code)
The following examples demonstrate this weakness.
- Example 1 demonstrates a shopping interaction in which the user is free to specify the quantity of items to be purchased and a total is calculated.
...
public static final double price = 20.00;
int quantity = currentUser.getAttribute("quantity");
double total = price * quantity;
chargeUser(total);
...
The user has no control over the price variable; however, the code does not prevent a negative value from being specified for quantity. If an attacker were to provide a negative value, the total would be negative and the user's account would be credited instead of debited.
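One way to close this gap is to parse the quantity as text and reject anything outside an allowed range before computing the total. The following is a minimal sketch, not the shop's actual code: it assumes the quantity arrives as a string, and the class name, method names, and the MAX_QUANTITY limit are hypothetical.

```java
// Hypothetical helper: class name, method names, and MAX_QUANTITY are assumptions.
class OrderValidator {
    static final double PRICE = 20.00;      // same fixed price as the example above
    static final int MAX_QUANTITY = 1000;   // assumed business limit

    // Parses the raw text and rejects anything outside 1..MAX_QUANTITY.
    static int parseQuantity(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("Quantity is required");
        }
        int quantity;
        try {
            quantity = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Quantity must be a whole number");
        }
        if (quantity < 1 || quantity > MAX_QUANTITY) {
            throw new IllegalArgumentException(
                "Quantity must be between 1 and " + MAX_QUANTITY);
        }
        return quantity;
    }

    // Computes the total only after the quantity has passed validation.
    static double total(String rawQuantity) {
        return PRICE * parseQuantity(rawQuantity);
    }
}
```

With this check in place, a negative quantity is rejected before any charge is computed, so the total can never become a credit.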
- This example asks the user for the height and width of an m X n game board with a maximum dimension of 100 squares.
...
#define MAX_DIM 100
...
/* board dimensions */
int m, n, error;
board_square_t *board;
printf("Please specify the board height: \n");
error = scanf("%d", &m);
if ( EOF == error ){
    die("No integer passed: Die evil hacker!\n");
}
printf("Please specify the board width: \n");
error = scanf("%d", &n);
if ( EOF == error ){
    die("No integer passed: Die evil hacker!\n");
}
if ( m > MAX_DIM || n > MAX_DIM ) {
    die("Value too large: Die evil hacker!\n");
}
board = (board_square_t*) malloc( m * n * sizeof(board_square_t));
...
Although this code ensures that the user cannot specify large positive integers and consume too much memory, it does not check for negative values. As a result, an attacker can mount a resource consumption attack by specifying two large negative values whose product does not overflow, resulting in a very large memory allocation and possibly a system crash. Alternatively, an attacker can provide very large negative values that cause an integer overflow, after which the program's behavior depends on how the values are treated in the remainder of the code.
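The missing lower bound can be illustrated with a Java analog of the same check. This is only a sketch: MAX_DIM mirrors the C example, while the class and method names are hypothetical.

```java
// Java analog of the C example's check; MAX_DIM mirrors the original,
// the class and method names are hypothetical.
class BoardDimensions {
    static final int MAX_DIM = 100;

    // True only when the dimension lies in the closed range 1..MAX_DIM.
    static boolean isValidDimension(int value) {
        return value >= 1 && value <= MAX_DIM;
    }

    // Returns the number of squares, or throws if either dimension is out of range.
    static int squareCount(int m, int n) {
        if (!isValidDimension(m) || !isValidDimension(n)) {
            throw new IllegalArgumentException(
                "Board dimensions must be between 1 and " + MAX_DIM);
        }
        // With both values in 1..100 the product cannot overflow; multiplyExact
        // still fails loudly if the limits are ever raised carelessly.
        return Math.multiplyExact(m, n);
    }
}
```

Checking both ends of the range before multiplying means negative values never reach the allocation.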
- The third example shows a PHP application in which the programmer attempts to display a user’s birthday and homepage.
$birthday = $_GET['birthday'];
$homepage = $_GET['homepage'];
echo "Birthday: $birthday<br>Homepage: <a href=$homepage>click here</a>";
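The programmer intended for $birthday to be in a date format and $homepage to be a valid URL. However, since the values are derived from an HTTP request, if an attacker can trick a victim into clicking a crafted URL with <script> tags providing the values for birthday and/or homepage, then the script will run in the victim's browser when the web server echoes the content. Even if the programmer were to defend the $birthday variable by restricting input to integers and dashes, an attacker could still provide a string of the form:
2009-01-09--
If this data were used in a SQL statement, the database would treat the remainder of the statement as a comment. The comment could disable other security-related logic in the statement. In this case, encoding combined with input validation would be a more useful protection mechanism.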
Furthermore, XSS and SQL injection are just two of the potential consequences when input validation is not used. Depending on the context of the code, CRLF injection, argument injection, or command injection may also be possible.
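Format validation for the birthday and homepage values could look like the following Java sketch. It assumes an ISO date format (such as 2009-01-09) and only http or https links; the class and method names are hypothetical. Validation of this kind complements output encoding and does not replace it.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper: names and the accepted formats are assumptions.
class ProfileInput {
    // Accepts only a strict ISO-8601 calendar date such as 2009-01-09.
    static Optional<LocalDate> parseBirthday(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDate.parse(raw));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    // Accepts only syntactically valid http/https URLs that name a host.
    static Optional<URI> parseHomepage(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            URI uri = new URI(raw);
            String scheme = uri.getScheme();
            boolean web = "http".equalsIgnoreCase(scheme)
                       || "https".equalsIgnoreCase(scheme);
            return (web && uri.getHost() != null) ? Optional.of(uri) : Optional.empty();
        } catch (URISyntaxException e) {
            return Optional.empty();
        }
    }
}
```

Because the date is parsed rather than merely checked for digits and dashes, a value such as 2009-01-09-- is rejected.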
- This example takes a user-supplied value to allocate an array of objects and then operates on the array.
private void buildList ( int untrustedListSize ){
    if ( 0 > untrustedListSize ){
        die("Negative value supplied for list size, die evil hacker!");
    }
    Widget[] list = new Widget [ untrustedListSize ];
    list[0] = new Widget();
}
This example attempts to build a list from a user-specified value, and even checks to ensure a non-negative value is supplied. If, however, a 0 value is provided, the code will build an array of size 0 and then try to store a new Widget in the first location, causing an exception to be thrown.
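A corrected version might require at least one element and also cap the size. The sketch below is illustrative only: the MAX_LIST_SIZE limit, the empty Widget class, and the use of an exception in place of die() are assumptions.

```java
// Hypothetical rewrite of buildList; MAX_LIST_SIZE and the stub Widget are assumptions.
class WidgetListBuilder {
    static final int MAX_LIST_SIZE = 10_000; // assumed upper limit

    static final class Widget { }

    static Widget[] buildList(int untrustedListSize) {
        // Reject zero and negative sizes as well as unreasonably large ones.
        if (untrustedListSize < 1 || untrustedListSize > MAX_LIST_SIZE) {
            throw new IllegalArgumentException(
                "List size must be between 1 and " + MAX_LIST_SIZE);
        }
        Widget[] list = new Widget[untrustedListSize];
        list[0] = new Widget(); // safe: the array has at least one slot
        return list;
    }
}
```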
- This example shows an Android mobile application that has registered to handle a URL when sent an intent:
...
IntentFilter filter = new IntentFilter("com.example.URLHandler.openURL");
UrlHandlerReceiver receiver = new UrlHandlerReceiver();
registerReceiver(receiver, filter);
...
public class UrlHandlerReceiver extends BroadcastReceiver {
    @Override
    public void onReceive(Context context, Intent intent) {
        if ("com.example.URLHandler.openURL".equals(intent.getAction())) {
            String URL = intent.getStringExtra("URLToOpen");
            int length = URL.length();
            ...
        }
    }
}
The application assumes the URL will always be included in the intent. When the URL is not present, the call to getStringExtra() will return null, thus causing a null pointer exception when length() is called.
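The presence check that is missing here can be sketched in plain Java. Android classes are not available outside an Android project, so in this sketch a Map stands in for the intent's extras; the class and method names are hypothetical. In a real receiver, the same null test would be applied to the result of getStringExtra().

```java
import java.util.Map;
import java.util.Optional;

// Plain-Java stand-in: a Map plays the role of the intent's extras (an assumption).
class UrlExtraReader {
    // Returns the URL's length only when the extra is present and non-empty.
    static Optional<Integer> urlLength(Map<String, String> extras) {
        String url = extras.get("URLToOpen"); // null when the key is absent
        if (url == null || url.isEmpty()) {
            return Optional.empty();          // caller decides how to recover
        }
        return Optional.of(url.length());
    }
}
```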
Addressing Improper Input Validation
How would we address potential improper input validation errors in our code? The basic rule for input validation is to check that input data matches all of the constraints that it must meet to be used correctly in the given circumstance. In many cases, this can be very difficult: confirming that a set of digits is, in fact, a telephone number may require consideration of the many different phone number formats used by countries around the world. To lessen these challenges, here are some checks that you might use to validate input data:
- Type: Input data should be of the right type. Names should generally be alphabetic, numbers numeric. Punctuation and other uncommon characters are particularly troubling, as they can often be used to form the basis of code-injection attacks. Many programs will handle input data by assuming that all input is of string form, verifying that the string contains appropriate characters, and then converting the string into the desired data type.
- Range: Verify that numbers are within a range of possible values: For example, the month of a person’s date of birth should lie between 1 and 12. Another common range check involves values that may lead to division by zero errors.
- Plausibility: Check that values make sense, e.g., a person’s age shouldn’t be less than 0 or more than 150.
- Presence check: Guarantee presence of important data - the omission of important data can be seen as an input validation error.
- Length: Input that is either too long or too short will not be legitimate. Phone numbers generally don’t have 20 digits; Social Security Numbers have exactly 9.
- Format: Dates, credit card numbers, and other data types have limitations on the number of digits and any other characters used for separation. For example, dates are usually specified by 2 digits for the month, one or two for the day, and either two or four for the year.
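Several of the checks listed above can be written as small predicates. The following Java sketch is one possible arrangement under stated assumptions: the class name, the 50-character name limit, the decision to allow spaces, apostrophes, and hyphens in names, and the slash-separated date shape are all illustrative choices.

```java
import java.util.regex.Pattern;

// Hypothetical validators; the limits and patterns are illustrative assumptions.
class FieldChecks {
    // Type: letters plus a few name punctuation marks, at most 50 characters.
    private static final Pattern NAME = Pattern.compile("[\\p{L}][\\p{L} '\\-]{0,49}");
    // Length: exactly nine digits, as for a Social Security Number without dashes.
    private static final Pattern SSN_DIGITS = Pattern.compile("\\d{9}");
    // Format: MM/D/YY, MM/DD/YY, MM/D/YYYY, or MM/DD/YYYY (shape only).
    private static final Pattern DATE_SHAPE = Pattern.compile("\\d{2}/\\d{1,2}/(\\d{2}|\\d{4})");

    static boolean isPresent(String s)        { return s != null && !s.trim().isEmpty(); }
    static boolean isValidMonth(int month)    { return month >= 1 && month <= 12; }
    static boolean isPlausibleAge(int age)    { return age >= 0 && age <= 150; }
    static boolean isValidName(String s)      { return isPresent(s) && NAME.matcher(s).matches(); }
    static boolean isValidSsnDigits(String s) { return s != null && SSN_DIGITS.matcher(s).matches(); }
    static boolean hasDateShape(String s)     { return s != null && DATE_SHAPE.matcher(s).matches(); }
}
```

A shape check such as hasDateShape only confirms the layout of the digits; the range and plausibility checks are still needed to reject a month of 13 or a day of 40.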
To further reduce the challenges of input validation, consider the following:
- Use a library: Many programming languages have libraries that can help with input validation. For example, Java has the java.util.regex package, which provides support for regular expressions; the United States Postal Service (USPS) provides a library for validating ZIP codes; Google’s libphonenumber is a library that parses, formats, stores and validates international phone numbers. Locating and using a library that is appropriate for your needs can save you a lot of time and effort. In addition, libraries are often more thoroughly tested than code that you write yourself, and they are often maintained by a community of developers who are interested in keeping them up to date.
- Use appropriate language functions/APIs. The safety of functions/APIs that read user input varies across programming languages and systems. Some languages, such as C and C++, have library calls that read user input into a character buffer without checking the bounds of that buffer, causing both a buffer overflow and an input validation problem. Alternative libraries specifically designed with security in mind are often more robust.
- Use appropriate programming languages. The choice of programming language can play a role in the potential severity of input validation vulnerabilities. As statically typed languages, Java and C++ require that the type of data stored in a variable be known ahead of time. This requirement leads to a type mismatch problem when, e.g., a string such as "abcd" is typed in response to a request for an integer. Dynamically typed languages such as Perl and Ruby have no such requirement: any variable can store any type of value. Of course, statically typed languages do not eliminate validation problems. You may still run into trouble, e.g., if you use a string to retrieve an item from an integer-indexed array. Some languages provide additional help in the form of built-in procedures that can be used to remove potentially damaging characters from input strings.
- Recover appropriately. A robust program will respond to invalid input in a manner that is appropriate, correct, and secure. For user input, this will often mean providing an informative error message and requesting re-entry of the data. Invalid input from other sources, such as a network connection, may require alternate measures. Arbitrary decisions such as truncating or otherwise reformatting data to “make it fit” should be avoided.
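Two of the points above, handling a type mismatch such as "abcd" and reporting an error instead of truncating, can be sketched in Java as follows. The class and method names are hypothetical, and the 11-digit rule is taken from the banking example earlier in this module.

```java
import java.util.Optional;
import java.util.OptionalInt;

// Hypothetical helpers; names and message wording are assumptions.
class InputRecovery {
    static final int ACCOUNT_DIGITS = 11; // from the banking example

    // Returns the parsed value, or empty when the text is not an integer.
    static OptionalInt tryParseInt(String raw) {
        if (raw == null) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(raw.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // Returns a message to show the user, or empty when the input is acceptable.
    // The input is never truncated or otherwise altered to make it fit.
    static Optional<String> accountNumberError(String raw) {
        if (raw == null || raw.isEmpty()) {
            return Optional.of("Account number is required.");
        }
        boolean digitsOnly = raw.chars().allMatch(c -> c >= '0' && c <= '9');
        if (!digitsOnly) {
            return Optional.of("Account number may contain digits only.");
        }
        if (raw.length() != ACCOUNT_DIGITS) {
            return Optional.of("Account number must have exactly " + ACCOUNT_DIGITS
                + " digits; you entered " + raw.length() + ".");
        }
        return Optional.empty();
    }
}
```

A caller would typically display the returned message and ask for the value again, rather than guessing what the user meant.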
Fixing Weaknesses
In this lab, you will be given a code snippet that contains an Improper Input Validation vulnerability. Your task is to identify the vulnerability and fix it.
Examine the code snippet below.
import java.util.Scanner;

public class Input {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        int sz = getArraySize(scan);
        String[] names = getNames(scan, sz);
        int which = getWhich(scan);
        String aName = getName(which, names);
        System.out.println("You chose name: " + aName);
    }
    public static int getArraySize(Scanner scan) {
        System.out.print("How many names? ");
        int n = scan.nextInt();
        scan.nextLine();
        return n;
    }
    public static String[] getNames(Scanner scan, int sz) {
        String[] names = new String[sz];
        for (int i = 0; i < sz; i++ ) {
            System.out.print("type name #" + (i + 1) + ": ");
            names[i] = scan.nextLine();
        }
        return names;
    }
    public static int getWhich(Scanner scan) {
        System.out.print("Which name: ");
        int x = scan.nextInt();
        return x;
    }
    public static String getName(int n, String[] vals) {
        return vals[n - 1];
    }
}
- Locate an input that is not properly validated.
- Fix the security weakness using the skills you just learned, i.e., revise the program to properly validate the input and gracefully recover from errors.
- Complete the survey on the next page. One of the questions in the survey asks for your solution to the lab. Copy your solution code snippet (snippet only) and paste it into the survey.
Acknowledgement
This page is derived from the Security Injection@Towson project.