Autovectorization basically doesn't work. Testing it properly (checking that the loop got vectorized the way you expect, or at all) is more maintenance than writing the SIMD code yourself.
If you insist on abstractions, autoscalarization (the opposite approach: write array-level code and let the compiler lower it to scalar loops where needed) would be better, which is kind of how Fortran works… but I unironically recommend just writing assembly like ffmpeg does.
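Doing the feature detection is fairly trivial. Unsupported instructions only fault if they actually get executed, so they can sit in the binary as long as you never dispatch to them on a CPU that lacks them.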
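
To make the feature detection point concrete, here is a rough sketch of runtime dispatch in C. It uses intrinsics rather than hand-written assembly, and it assumes GCC or Clang on x86 (`__builtin_cpu_supports` and the `target` attribute are compiler extensions). The names `add_f32`, `add_f32_avx`, `add_f32_scalar` and `pick_add_f32` are made up for illustration and are not from ffmpeg or any other project.

```c
#include <stddef.h>

#if (defined(__x86_64__) || defined(__i386__)) && \
    (defined(__GNUC__) || defined(__clang__))
#include <immintrin.h>
#define HAVE_X86_DISPATCH 1

/* Only this function is compiled with AVX enabled; the rest of the file
   stays at the baseline target, so nothing else can emit AVX by accident. */
__attribute__((target("avx")))
static void add_f32_avx(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(dst + i, _mm256_add_ps(va, vb));
    }
    for (; i < n; i++)          /* scalar tail for the last n % 8 elements */
        dst[i] = a[i] + b[i];
}
#endif

/* Baseline version that runs on any CPU. */
static void add_f32_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

typedef void (*add_f32_fn)(float *, const float *, const float *, size_t);

/* Pick an implementation once, based on what the running CPU reports. */
static add_f32_fn pick_add_f32(void)
{
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return add_f32_avx;
#endif
    return add_f32_scalar;
}

/* Public entry point. The lazy init here is not thread-safe; a real
   project would resolve the pointer once at startup. */
void add_f32(float *dst, const float *a, const float *b, size_t n)
{
    static add_f32_fn impl;
    if (!impl)
        impl = pick_add_f32();
    impl(dst, a, b, n);
}
```

The AVX code is always present in the binary, but on a CPU without AVX the function pointer never points at it, so the unsupported instructions are never reached.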