My rusty corner on the Web.
by EchoStone
2023 Fall Semester
The following code
#include <stdio.h>
int main() {
    int x = 50000;
    printf("%d\n", x*x);
    printf("%d\n", x*x >= 0);
    printf("%d\n", x*x >  0);
    printf("%d\n", x*x == 0);
    return 0;
}
when compiled with gcc 8.0 or later, produces the following output:
-1794967296
1
0
0
This behavior is rather surprising, and it is very much specific to gcc (if you compile with clang, perhaps because you are on macOS, the second line will be 0). However, it actually conforms to the C standard, which deliberately leaves undefined what happens when a signed integer overflows.
In other words,
Signed integer overflow is UB (Undefined Behavior) in the C standard.
Because of that, the compiler is free to do anything when such an overflow occurs. In this case, gcc chooses to apply a nice optimization: it assumes that the square of an integer can never be less than zero.
You can see this in action by passing the -Wstrict-overflow=2 flag to gcc. It should now emit the following warning:
<source>: In function 'main':
<source>:5:5: warning: assuming signed overflow does not occur when simplifying comparison of absolute value and zero [-Wstrict-overflow]
    5 |     printf("%d\n", x*x >= 0);
      |     ^~~~~~
This is the realm of UB (Undefined Behavior).
To reiterate: gcc is not technically doing anything wrong. The fault lies with you. gcc simply assumes that the programmer will never write code that triggers UB. To quote a gcc developer:
“I’m sorry, but you wrote broken code to start with.
gcc will not change.”
In fact, the behavior above was only introduced in gcc 8.0. You can find the following in the changelog:
-fno-strict-overflow is now mapped to -fwrapv -fwrapv-pointer and signed integer overflow is now undefined by default at all optimization levels. Using -fsanitize=signed-integer-overflow is now the preferred way to audit code, -Wstrict-overflow is deprecated.
If you pass the -fwrapv flag to gcc, the output becomes “normal”. The output is also “normal” with gcc 7.5 or earlier, and with pretty much all clang versions, as LLVM has its own rules.
The truth is that neither output is really the “normal” one. For what it’s worth, this example goes to show that we, as programmers, should never rely on UB working in “the assumed way”.
tags: ICS