
> Which rate? The rate you failed to mix the balls? The rate you failed to count a ball? The rate you misclassified the ball? The rate you repeatedly counted the same ball? The rate you started with an incorrect count? The rate you did the math wrong? etc

This is called modelling error. Both Bayesian and frequentist approaches suffer from modelling error. That's what TFA talks about when mentioning the normality assumptions behind the paper's GLM. Moreover, if errors are additive, certain distributions combine easily algebraically, meaning it's easy to "marginalize" over them as a single error term. In most GLMs, there's a normally distributed error term meant to marginalize over multiple i.i.d. normally distributed error terms.
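To make the additivity point concrete, here's a quick sketch (the variances are made up): independent normal error sources sum to a single normal error whose variance is the sum of the component variances, which is why a GLM can fold them into one marginal error term.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Three independent additive error sources (hypothetical variances).
e1 = rng.normal(0, 0.5, n)   # e.g. measurement error
e2 = rng.normal(0, 1.2, n)   # e.g. counting error
e3 = rng.normal(0, 0.3, n)   # e.g. recording error

total = e1 + e2 + e3

# Variances add for independent terms, so the combined error behaves
# like a single N(0, sqrt(0.5^2 + 1.2^2 + 0.3^2)) error term.
print(total.std())                        # ~1.334 empirically
print((0.5**2 + 1.2**2 + 0.3**2) ** 0.5) # ~1.334 algebraically
```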

> Plenty of downvotes and comments, but nothing addressing the point of the argument might suggest something.

I don't understand the point of your argument. Please clarify it.

> Here’s the experiment and here’s the data is concrete it may be bogus but it’s information. Updating probabilistic based on recursive estimates of probabilities is largely restating your assumptions.

What does this mean, concretely? Run me through an example of the problem you're bringing up. Are you saying that posterior predictive distributions are "bogus" because they're based on prior distributions? Why? They're just the result of applying Bayes' law.

> Black swans can really throw a wrench into things

A "black swan," as Taleb defines it, is a tail event, and this sort of analysis is definitely performed (see: https://en.wikipedia.org/wiki/Extreme_value_theory). In the case of Bayesian stats, you're specifically calculating the entire posterior distribution of the data. Tail events are visible in the tails of the posterior predictive distribution (and thus calculable), which should be able to tell you the consequences of a misprediction.
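As a concrete sketch (a toy Beta-Binomial model with made-up numbers): tail-event probabilities can be read straight off draws from the posterior predictive distribution.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: prior Beta(2, 2), then observe 30 successes
# in 100 trials. Conjugacy gives the posterior Beta(32, 72).
a, b = 2 + 30, 2 + 70

# Posterior predictive for the next 100 trials, by simulation:
# draw p from the posterior, then draw new data given p.
p = rng.beta(a, b, 200_000)
y_rep = rng.binomial(100, p)

# Tail events are directly calculable from the predictive draws.
print(np.mean(y_rep >= 50))                # P(>= 50 successes next run)
print(np.quantile(y_rep, [0.005, 0.995]))  # 99% predictive interval
```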



You don’t find black swans from the data; you find them by building better models. You can look at 100 years of local flood and weather data to build up a flood assessment, but that’s not going to include mudslides or earthquakes, etc. The same applies to studies.

My point is this: you can’t combine them using Bayesian statistics while adjusting for the possibility of research fraud; that possibility is simply not in the data.

They’re great for well-understood domains, less so for research. Frequentist models don’t work here either, but then they don’t even try.

PS: Math errors don’t really fall into modeling error.


I wouldn't think of Black Swan events as tail events so much as model failures or regime changes. As in: 'we modeled this as a time-invariant Gaussian distribution, but it's actually a mixture model whose second hidden mode was triggered in the aftermath of an asteroid strike that we didn't model for, because of course we didn't.'
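A quick illustration of that failure mode (hypothetical numbers): fit a single Gaussian to data that is secretly a mixture, and the hidden regime mostly vanishes into an inflated mean and variance.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical "regime change": 95% of draws from the normal regime,
# 5% from a hidden second mode far out in the tail.
normal_regime = rng.normal(0, 1, 9_500)
hidden_mode = rng.normal(8, 1, 500)
data = np.concatenate([normal_regime, hidden_mode])

# A single time-invariant Gaussian fit absorbs the second mode into
# an inflated mean/variance and describes neither regime well.
mu, sigma = data.mean(), data.std()
print(mu, sigma)       # ~0.4 and ~2.0, vs the true regimes' 0/8 and 1/1
print((8 - mu) / sigma)  # hidden mode sits only ~3.8 sigma out
```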

In re, the arguey-person you were responding to, frequentist modeling is just as bad or worse for these sorts of situations.


Frequentist modeling isn’t useful here, but that’s not how studies are evaluated anyway. Suppose you’re looking at a bunch of COVID studies and you ask yourself: what if one or more of them was fraudulent?

Your investigation isn’t limited to the data the papers provide; it’s going to look for information beyond them. This isn’t a failure of frequentist models, because they evaluate the study and its output separately.


Ah yeah fair enough, I see what you mean. This is a general problem with all models though. Fundamental modeling issues will tank your conclusion.



