My rusty corner on the Web.
by EchoStone
2023 Fall Semester
The following code
#include <stdio.h>
int main() {
    int x = 50000;
    printf("%d\n", x*x);
    printf("%d\n", x*x >= 0);
    printf("%d\n", x*x > 0);
    printf("%d\n", x*x == 0);
    return 0;
}
compiled with gcc 8.0 or later, will produce the following output:
-1794967296
1
0
0
This behavior is rather surprising, and is very much specific to gcc (if you compile with clang, maybe because you are on macOS, then the second line will be 0). However, this behavior actually conforms to the C standard, which deliberately does not define what should happen when a signed integer overflows.
In other words,
Signed integer overflow is UB (Undefined Behavior) in the C standard.
Because of that, the compiler is free to do anything when such an event occurs. In this case, gcc chooses to apply a nice optimization: it assumes that an integer squared can never be less than zero, so the comparison x*x >= 0 is simply replaced by 1.
You can see this in action by passing the -Wstrict-overflow=2 flag to gcc. It should now emit the following warning:
<source>: In function 'main':
<source>:5:5: warning: assuming signed overflow does not occur when simplifying comparison of absolute value and zero [-Wstrict-overflow]
5 | printf("%d\n", x*x >= 0);
| ^~~~~~
This is the realm of UB (Undefined Behavior).
To reiterate: gcc is not technically doing anything wrong. The fault is on you. gcc simply assumes that the programmer never writes code that triggers UB. To quote a gcc developer:
“I’m sorry, but you wrote broken code to start with. gcc will not change.”
In fact, the behavior above was only introduced into gcc with its 8.0 update. You may find the following in the changelog:
-fno-strict-overflow is now mapped to -fwrapv -fwrapv-pointer and signed integer overflow is now undefined by default at all optimization levels. Using -fsanitize=signed-integer-overflow is now the preferred way to audit code, -Wstrict-overflow is deprecated.
If you pass the -fwrapv flag to gcc, the output becomes “normal”. The output is also “normal” with gcc earlier than 7.5, or with pretty much all clang versions, as llvm has its own rules.
The truth is: neither output is really the “normal” one. For what it’s worth, this example just goes to show that we, as programmers, should never rely on UB working in “the assumed way”.
tags: ICS