That's only when an exception is thrown, though. If an exception isn't thrown, "unroll the stack" is just a normal `ret` instruction. There's no exception handling code at all when a function returns normally without an exception, which is the point. By contrast, when an error sum type returns without an error, you're still doing a branch at the call site to verify that.
In the non-throwing exception path, there's literally no error or exception handling code executed at all. Whereas in the sum-type error-returning version, you have a branch at every call site that's always executed, regardless of whether there's an error or not.
Now, the exception-handling version does generate ".cold" clones of the function, so the total assembly for the exception handling one is larger. However, that assembly isn't ever executed if an exception isn't thrown, which is the broader point. So it's not taking up CPU cache space & it's not taking up branch predictor slots.
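To make the comparison concrete, here's a minimal sketch of the two styles side by side. The function names (`parse_digit_opt`, `sum_digits_exc`, etc.) are made up for illustration, and `std::optional` is just standing in for whatever sum type you'd actually use. Also note that a compiler will probably inline something this small, so the difference really shows up across non-inlined call boundaries.

```cpp
#include <optional>
#include <stdexcept>

// Sum-type style (hypothetical helper): the result carries "value or error".
std::optional<int> parse_digit_opt(char c) {
    if (c < '0' || c > '9') return std::nullopt;
    return c - '0';
}

// The caller has to test the result after every call, even when nothing
// went wrong, and has to forward the error to its own caller by hand.
std::optional<int> sum_digits_opt(const char* s) {
    int total = 0;
    for (; *s; ++s) {
        std::optional<int> d = parse_digit_opt(*s);
        if (!d) return std::nullopt;  // check executed on success too
        total += *d;
    }
    return total;
}

// Exception style (hypothetical helper): the error leaves via throw.
int parse_digit_exc(char c) {
    if (c < '0' || c > '9') throw std::invalid_argument("not a digit");
    return c - '0';
}

// No check at the call site: on success the callee just returns, and on
// failure the unwinder propagates the exception past this frame.
int sum_digits_exc(const char* s) {
    int total = 0;
    for (; *s; ++s) total += parse_digit_exc(*s);
    return total;
}
```

The validity check inside `parse_digit_*` is there in both versions. What the sum-type version adds is the `if (!d)` test in every caller along the propagation path.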
That's a bad point? Exceptions are not normal control flow. They are rare or, as one might say, exceptional. Their performance when thrown isn't the key concern; what matters is the performance when they are not thrown, since that's the >90% case. And in that case, code using exceptions is faster than code using sum-type return values, especially if those errors propagate deeply across the call stack, which they very often do.
You mean destructors? An exception handler would be a catch block.
Anyway, the typical implementation involves two phases: a search phase, which uses a table to identify the matching catch clause, then a cleanup phase, which goes through the landing pads for each frame of the stack. Just consult the exception handling part of the Itanium C++ ABI spec for technical details.
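If it helps, here's a small sketch of the destructor vs. catch clause distinction and the order things run in during unwinding. `Guard`, `events`, and `run` are hypothetical names for this example, not anything from the ABI.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log used to record the order of operations.
std::vector<std::string> events;

// An object whose destructor must run when its frame is unwound.
struct Guard {
    std::string name;
    explicit Guard(std::string n) : name(std::move(n)) {}
    ~Guard() { events.push_back("~" + name); }
};

void inner() {
    Guard g("inner");
    throw std::runtime_error("boom");
}

// No try/catch here, but this frame still needs cleanup code
// (a landing pad) so that g is destroyed during unwinding.
void middle() {
    Guard g("middle");
    inner();
}

// The only actual exception handler (catch clause) is in this function.
std::vector<std::string> run() {
    events.clear();
    try {
        middle();
    } catch (const std::runtime_error&) {
        events.push_back("catch");
    }
    return events;
}
```

Calling `run()` records `~inner`, `~middle`, `catch`, in that order: the destructors in the intermediate frames run first, innermost frame first, and only then is the body of the catch clause entered.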