
There isn't a "find UB branches" pass that is seeking out this stuff.

Instead, what happens is that you have something like a constant-folding or value-constraint pass that computes the set of possible values a variable can hold at various program points, by applying the constraints implied by various operations. Then you have a dead-code-elimination pass that identifies dead branches. This pass doesn't know why the "dest" variable can't hold NULL at the branch. It just knows that it can't, so it kills the branch.
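A minimal C sketch of how those two passes interact (the function and its names are hypothetical, chosen to mirror the memcpy/"dest" case under discussion):

```c
#include <stdio.h>
#include <string.h>

/* After the memcpy, a value-constraint pass may record "dest != NULL"
   (dereferencing/passing NULL to memcpy with n > 0 is UB), so a later
   dead-code pass can delete the branch below without ever "looking for
   UB" as such. */
void copy_and_check(char *dest, const char *src, size_t n) {
    memcpy(dest, src, n);      /* UB if dest == NULL and n > 0 */
    if (dest == NULL) {        /* provably dead given the constraint */
        fprintf(stderr, "null dest\n");
    }
}
```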

Imagine the following code:

   int x = abs(get_int());
   if (x < 0) {
     // do stuff
   }
Can the compiler eliminate the branch? Of course. All that's happened here is that the constraint propagation feels "reasonable" to you in this case and "unreasonable" to you in the memcpy case.


Why is it allowed to eliminate the branch? On most architectures abs(INT_MIN) returns INT_MIN, which is negative.


Calling abs(INT_MIN) on a two's-complement machine is not allowed by the C standard. The behavior of abs() is undefined if the result cannot be represented in the return type.


Where does it say that? I thought this was a famous example from formal methods showing why something really simple could be wrong. It would be strange for the standard to say to ignore it. The behavior is also well defined in two’s complement. People just don’t like it.


https://busybox.net/~landley/c99-draft.html#7.20.6.1

"The abs, labs, and llabs functions compute the absolute value of an integer j. If the result cannot be represented, the behavior is undefined. (242)"

242 The absolute value of the most negative number cannot be represented in two's complement.


I didn't believe this so I looked it up, and yup.

Because of 2's complement limitations, abs(INT_MIN) can't actually be represented and it ends up returning INT_MIN.
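A well-defined way to see the missing value, assuming the usual model where long long is wider than a two's-complement 32-bit int (the helper name here is made up for illustration):

```c
#include <limits.h>
#include <stdlib.h>

/* Widen before taking the absolute value, so no overflow occurs.
   On two's complement, |INT_MIN| is exactly INT_MAX + 1, i.e. one
   more than any int can hold. */
long long abs_of_int_min(void) {
    return llabs((long long)INT_MIN);
}
```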


It's possible that there is an edge case in the output bounds here. I'm just using it as an example.

Replace it with "int x = foo() ? 1 : 2;" if you want.
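Spelled out, that variant has no edge cases at all (foo() here is a hypothetical opaque call; its return value doesn't matter):

```c
/* Stand-in for any call the compiler can't see through. */
int foo(void) { return 0; }

int branch_demo(void) {
    int x = foo() ? 1 : 2;   /* range analysis: x is 1 or 2 */
    if (x < 0) {             /* dead by ordinary range reasoning, no UB involved */
        return -1;
    }
    return x;
}
```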


> value constraint pass that computes a set of possible values that a variable can hold

Surely that value constraint pass must be using reasoning based on UB in order to remove NULL from the set of possible values?

Being able to disable all such reasoning, then comparing the generated code with and without it enabled would be an excellent way to find UB-related bugs.


There are many such constraints, and often ones that you want.

"These two pointers returned from subsequent calls to malloc cannot alias" is a value constraint that relies on UB. You are going to have a bad time if your compiler can't assume this to be true and comparing two compilations with and without this assumption won't be useful to you as a developer.

There are a handful of cases that people do seem to look at and say "this one smells funny to me", even if we cannot articulate some formal reason why it feels okay for the compiler to build logical conclusions from one assumption and not another. Eliminating null checks that are "dead" because they are dominated by some operation that is illegal if performed on null is the most widely expressed example. Eliminating signed integral bounds checks by assuming that arithmetic operations are non-overflowing is another. Some compilers support explicitly disabling some (but not all) optimizations derived from deductions from these assumptions.
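The signed-overflow case can be sketched in a few lines. (As one example of the opt-outs mentioned above, GCC and Clang accept -fwrapv, which makes signed overflow wrap and so keeps this check meaningful; GCC also has -fno-delete-null-pointer-checks for the null-check case.)

```c
/* Calling this with i == INT_MAX is UB, so the compiler may assume
   i + 1 never overflows, conclude i + 1 < i is always false, and
   delete the "wrap check" branch entirely. */
int increment_checked(int i) {
    if (i + 1 < i) {   /* dead under the no-overflow assumption */
        return -1;
    }
    return i + 1;
}
```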

But if you generalize this to all UB you probably won't end up with what you actually want.



