I forget the math behind it, and I could be totally wrong, but I think that for every additional caveat you add to a hypothesis to make it fit the data post hoc, you cut the odds of your new hypothesis in half.
No, not quite. It depends on your prior assumptions. Basically, imagine that you have these preconceptions:
| Possibility | Prior probability |
|---|---|
| No aliens | 0.55 |
| Aliens at the perceptual limit | 0.10 |
| Completely invisible aliens | 0.20 |
| Obviously visible aliens | 0.15 |
Now the math is simple. You cross out the impossible option: obviously visible aliens. Then you need the remaining options to sum to 1 again, which you do by dividing each surviving probability by their total (0.55 + 0.1 + 0.2 = 0.85). There's a quick code sketch of this step after the second table.
| Possibility | Posterior probability |
|---|---|
| No aliens | 0.55 / 0.85 ≈ 0.65 |
| Aliens at the perceptual limit | 0.1 / 0.85 ≈ 0.12 |
| Completely invisible aliens | 0.2 / 0.85 ≈ 0.24 |
| Obviously visible aliens | 0 |
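If it helps, here is that renormalization step as a few lines of Python. It's just a sketch of the tables above; the hypothesis labels and numbers are the made-up ones from my example.

```python
# Made-up priors from the table above.
priors = {
    "no aliens": 0.55,
    "aliens at the perceptual limit": 0.10,
    "completely invisible aliens": 0.20,
    "obviously visible aliens": 0.15,
}

# The evidence rules out obviously visible aliens, so that probability goes to 0.
priors["obviously visible aliens"] = 0.0

# Renormalize: divide every surviving probability by the remaining total (0.85).
total = sum(priors.values())
posteriors = {hypothesis: p / total for hypothesis, p in priors.items()}

for hypothesis, p in posteriors.items():
    print(f"{hypothesis}: {p:.3f}")
# no aliens: 0.647
# aliens at the perceptual limit: 0.118
# completely invisible aliens: 0.235
# obviously visible aliens: 0.000
```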
Now you may wonder where the prior probabilities come from. Well, they are simply your best guesses, based on reasoning and thinking before you have looked at the data. For instance, they may be informed by your knowledge of physics and of how easy it is to make things invisible. They may be informed by extrapolating human psychology onto aliens (see the other response about a human playing with a cat using a laser pointer). You may also consider the precedent of bacteria, which existed all around us for a very long time without detection.
Which gives my hypothesis (It's Aliens) a posterior probability of 25%: (50% × 30%) / (50% × 30% + 50% × 90%), i.e. a 50% prior on aliens, a 30% chance of the evidence if it's aliens, and a 90% chance of the evidence if it isn't.
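To make that arithmetic easy to check, here's the same two-hypothesis update as a small Python sketch (the 50%/30%/90% figures are just my guesses, and `bayes_posterior` is a helper name I made up):

```python
def bayes_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | Evidence) for a hypothesis H versus its complement."""
    return (prior * p_e_given_h) / (
        prior * p_e_given_h + (1 - prior) * p_e_given_not_h
    )

# P(ALIENS) = 50%, P(Evidence | ALIENS) = 30%, P(Evidence | ~ALIENS) = 90%
print(bayes_posterior(0.5, 0.3, 0.9))  # 0.25
```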
I want it to be aliens, so I say that they're conducting studies on us, which juices P(Evidence | ALIENS & They Study Us) up to 100%. But because I have no evidence to back that up, the odds that they're studying us (given aliens) are no better than 50/50, and I have to drop the prior of my new hypothesis (UFOs are aliens who study us) to 25% (50% × 50%). This nets out to a new posterior probability of around 27%:
P(ALIENS & They Study Us) = 25%
P(~(ALIENS & They Study Us)) = 75%
P(Evidence | ALIENS & They Study Us) = 100%
P(Evidence | ~(ALIENS & They Study Us)) = 90%
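Plugging those numbers into the same sketch from above gives roughly the 27%:

```python
# Prior drops to 25% (50% x 50%), likelihood under the hypothesis rises to 100%.
print(bayes_posterior(0.25, 1.0, 0.9))  # ~0.27
```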
I'm new to Bayesian logic and am still shaky with it, so there could totally be glaring errors I'm overlooking, but as I understand it, that's what I was trying to get across: every new caveat you add to explain away a poor data fit shouldn't juice your odds.
If anyone wonders why our approaches look so different, it's because I assumed ideal evidence that cleanly disproved one hypothesis without affecting any of the others. I did choose the list of hypotheses to ensure that was (approximately) valid.