What is Undefined Behavior?

Tuesday, July 09, 2013

In computer programming, undefined behavior refers to code whose behavior the language specification does not prescribe, so its result is unpredictable. It is a feature of some programming languages, most famously C.[1] In these languages, to simplify the specification and allow some flexibility in implementation, the specification deliberately leaves the results of certain operations undefined, meaning that the programmer cannot predict what will happen.

For example, in C the use of any automatic variable before it has been initialized yields undefined behavior, as does integer division by zero or indexing an array outside of its defined bounds (see buffer overflow). This frees the compiler to do whatever is easiest or most efficient, should such a program be submitted. In general, any behavior afterwards is also undefined. The compiler is never required to diagnose undefined behavior; programs invoking it may therefore appear to compile and even run without errors at first, only to fail on another system, or even on another date. When an instance of undefined behavior occurs, so far as the language specification is concerned anything could happen, maybe nothing at all. "Anything" may include apparently impossible behavior: the compiler is entitled to assume that undefined behavior never occurs, and the code it generates under that assumption may not match what the source code appears to say.
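Two of the cases named above, reading an uninitialized variable and indexing outside an array's bounds, can be avoided with explicit initialization and an explicit range check. The following is only a minimal sketch; the helper names checked_get and sum_first are hypothetical and chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reads arr[idx] only when idx is within bounds.
   Returns true and stores the element in *out on success; returns false
   and leaves *out untouched when idx is out of range. */
bool checked_get(const int *arr, size_t len, size_t idx, int *out)
{
    if (idx >= len) {
        return false;           /* an out-of-bounds read would be undefined */
    }
    *out = arr[idx];
    return true;
}

/* Hypothetical helper: sums the first n elements, clamping n to len so
   that no read goes past the end of the array. */
int sum_first(const int *arr, size_t len, size_t n)
{
    int total = 0;              /* initialized before its first use */
    if (n > len) {
        n = len;
    }
    for (size_t i = 0; i < n; i++) {
        total += arr[i];
    }
    return total;
}
```

The cost is one comparison per access. Whether that is acceptable depends on the program, which is part of why the language itself does not mandate the check.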

Under some circumstances there can be specific restrictions on undefined behavior. For example, the instruction set specifications of a CPU might leave the behavior of some forms of an instruction undefined, but if the CPU supports memory protection then the specification will probably include a blanket rule stating that no user-accessible instruction may cause a hole in the operating system's security; so an actual CPU would be permitted to corrupt any or all user registers in response to such an instruction but would not be allowed to, for example, switch into supervisor mode.
C and C++ also define implementation-defined behavior, which requires the implementation to document what it does and is therefore more restrictive than undefined behavior.
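As a rough illustration of implementation-defined behavior, the sketch below queries two properties that the C standard leaves to the implementation but requires it to document: the size of int and the signedness of plain char. The function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Number of bits of storage occupied by an int on this implementation.
   The value is implementation-defined, but the standard guarantees that
   an int can represent at least a 16-bit range. */
size_t int_storage_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}

/* Returns 1 if plain char is signed on this implementation, 0 otherwise.
   Either answer is permitted; the implementation must document its choice. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Unlike the undefined cases, a program that calls these functions gets a well-defined answer on every conforming implementation; only the value differs from one implementation to another.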

Examples in C and C++:

Attempting to modify a string literal causes undefined behavior:[2]
char * p = "wikipedia"; // ill-formed C++11, deprecated C++98/C++03
p[0] = 'W'; // undefined behavior
One way to prevent this is defining it as an array instead of a pointer.
char p[] = "wikipedia"; /* RIGHT */
p[0] = 'W';
In C++ one can use std::string as follows.
std::string s = "wikipedia"; /* RIGHT */
s[0] = 'W';
Integer division by zero results in undefined behavior:[3]
return x/0; // undefined behavior
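One way to stay within defined behavior is to test the divisor before dividing. The sketch below assumes a hypothetical helper named safe_div; it also rejects INT_MIN / -1, whose quotient does not fit in an int and is therefore undefined as well.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: divides num by den only when the result is defined.
   Returns true and stores the quotient in *quot on success; returns false
   and leaves *quot untouched otherwise. */
bool safe_div(int num, int den, int *quot)
{
    if (den == 0) {
        return false;                   /* division by zero */
    }
    if (num == INT_MIN && den == -1) {
        return false;                   /* quotient would overflow int */
    }
    *quot = num / den;
    return true;
}
```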
Certain pointer operations may result in undefined behavior:[4]
int arr[4] = {0, 1, 2, 3};
int* p = arr + 5;  // undefined behavior
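By contrast, a pointer to one past the last element (arr + 4 for the array above) may be formed and compared, as long as it is not dereferenced. A minimal sketch of the usual loop idiom, with a hypothetical function name:

```c
#include <stddef.h>

/* Sums len ints starting at arr, using a one-past-the-end pointer as the
   loop bound. The end pointer is computed and compared, never dereferenced. */
int sum_range(const int *arr, size_t len)
{
    const int *end = arr + len;     /* one past the last element: valid */
    int total = 0;
    for (const int *p = arr; p != end; ++p) {
        total += *p;
    }
    return total;
}
```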
Reaching the end of a value-returning function (other than main()) without a return statement may result in undefined behavior:
int f()
{
}  /* undefined behavior */
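The straightforward remedy is to make every control path end in a return statement. A small sketch, with a hypothetical function name:

```c
/* Returns -1, 0, or 1 according to the sign of x. Every control path ends
   in a return statement, so execution never reaches the closing brace. */
int sign_of(int x)
{
    if (x > 0) {
        return 1;
    }
    if (x < 0) {
        return -1;
    }
    return 0;
}
```

Many compilers can warn about a missing return when warnings are enabled, but as noted earlier they are not required to.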
Section 2.12 of The C Programming Language cites the following examples of code with undefined behavior. The order in which function arguments are evaluated is not specified, so the statement
printf("%d %d\n", ++n, power(2, n));    /* WRONG */
results in undefined behavior. The following statement is ambiguous, as it is not clear whether the array index is the old value of i or the new:
a[i] = i++;
This also results in undefined behavior.
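Both statements become well defined once the side effect is moved into a statement of its own. The sketch below is one possible rewrite, not the book's code: power is a stand-in written for this example, and the other function names are hypothetical. snprintf is used instead of printf only so that the result can be inspected.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the power function used in the text: base raised to a
   non-negative integer exponent. */
int power(int base, int exp)
{
    int result = 1;
    for (int k = 0; k < exp; k++) {
        result *= base;
    }
    return result;
}

/* Increments *n first, then formats it together with power(2, *n).
   The increment finishes before any argument is evaluated. */
int format_after_increment(char *buf, size_t size, int *n)
{
    ++*n;
    return snprintf(buf, size, "%d %d\n", *n, power(2, *n));
}

/* Stores the old value of *i at index *i, then increments *i. */
void store_then_increment(int *a, int *i)
{
    a[*i] = *i;     /* index and value both use the old *i */
    *i += 1;
}
```

Choosing the old or the new value of i is now an explicit decision in the source rather than something left to the compiler.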