There isn't a "find UB branches" pass that is seeking out this stuff.
Instead what happens is that you have something like a constant folding or value constraint pass that computes the set of possible values a variable can hold at various program points by applying constraints from various operations. Then you have a dead code elimination pass that identifies dead branches. This pass doesn't know why the "dest" variable can't hold the NULL value at the branch. It just knows that it can't, so it kills the branch.
Imagine the following code:
    int x = abs(get_int());
    if (x < 0) {
        // do stuff
    }
Can the compiler eliminate the branch? Of course. All that's happened here is that the constraint propagation feels "reasonable" to you in this case and "unreasonable" to you in the memcpy case.
Calling abs(INT_MIN) on a two's-complement machine is not allowed by the C standard. The behavior of abs() is undefined if the result cannot be represented in the return type.
Where does it say that? I thought this was a famous example from formal methods showing why something really simple could be wrong. It would be strange for the standard to say to ignore it. The behavior is also well defined in two’s complement. People just don’t like it.
> value constraint pass that computes a set of possible values that a variable can hold
Surely that value constraint pass must be using reasoning based on UB in order to remove NULL from the set of possible values?
Being able to disable all such reasoning, then comparing the generated code with and without it enabled would be an excellent way to find UB-related bugs.
There are many such constraints, and often ones that you want.
"These two pointers returned from subsequent calls to malloc cannot alias" is a value constraint that relies on UB. You are going to have a bad time if your compiler can't assume this to be true and comparing two compilations with and without this assumption won't be useful to you as a developer.
There are a handful of cases that people do seem to look at and say "this one smells funny to me", even if we cannot articulate some formal reason why it feels okay for the compiler to build logical conclusions from one assumption and not another. Eliminating null checks that are "dead" because they are dominated by some operation that is illegal if performed on null is the most widely expressed example. Eliminating signed integral bounds checks by assuming that arithmetic operations are non-overflowing is another. Some compilers support explicitly disabling some (but not all) optimizations derived from deductions from these assumptions.
But if you generalize this to all UB you probably won't end up with what you actually want.