# The price and payoff of Bayesian statistics

I’ve never totally understood why people complain so much about having to specify prior distributions in order to do Bayesian inference. Even if you’re doing frequentist statistics, you have to make some assumptions about the world and about your data. If you’re using some maximum-likelihood-based approach, you’re counting on asymptotics to get you to multivariate normality — and many data analysis problems just don’t have the sample size for that.

The big payoff with Bayesian statistics, it seems to me, is that you get full-on probability distributions as output, not just a mean and a standard error. But everyone focuses on specification of the prior.

Johnson & Albert in Ordinal Data Modeling:

The additional “price” of Bayesian inference is, thus, the requirement to specify the marginal distribution of the parameter values, or the prior. The return on this investment is substantial. We are no longer obliged to rely on asymptotic arguments when performing inferences on the model parameters, but instead can base these inferences on the exact conditional distribution of the model parameters given observed data–the posterior.
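To make the contrast in that quote concrete, here's a minimal sketch (with invented numbers: 1 success in 10 trials, and a flat Beta(1, 1) prior) comparing the exact conjugate posterior interval for a proportion against the asymptotic Wald interval. At this sample size the asymptotic interval dips below zero, while the exact posterior interval stays where a probability belongs.

```python
import math

# Hypothetical small-sample data: y = 1 success in n = 10 trials,
# with a flat Beta(1, 1) prior on the success rate p.
y, n = 1, 10
a, b = 1 + y, 1 + (n - y)  # exact conjugate posterior: Beta(2, 10)

def beta_pdf(p, a, b):
    """Density of the Beta(a, b) distribution at p."""
    logc = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(logc + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))

# Crude numerical CDF inversion to get an exact 95% credible interval.
N = 100_000
grid = [i / N for i in range(1, N)]
cdf, total = [], 0.0
for p in grid:
    total += beta_pdf(p, a, b) / N
    cdf.append(total)
lo = grid[next(i for i, c in enumerate(cdf) if c >= 0.025)]
hi = grid[next(i for i, c in enumerate(cdf) if c >= 0.975)]

# Asymptotic (Wald) 95% confidence interval for comparison.
phat = y / n
se = math.sqrt(phat * (1 - phat) / n)
wald = (phat - 1.96 * se, phat + 1.96 * se)

print(f"exact posterior 95% interval: ({lo:.3f}, {hi:.3f})")  # inside [0, 1]
print(f"asymptotic Wald 95% interval: ({wald[0]:.3f}, {wald[1]:.3f})")  # lower end < 0
```

The point isn't the particular numbers; it's that the posterior is an exact probability distribution given the data, with no appeal to what would happen as n grows without bound.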

That is a huge payoff. But even more important than that, Bayesian statistics is so much more believable than classical. I am almost happy that I spent 15 years ignorant of what was going on in academic statistics so I could jump on the Bayesian train now.

Here’s one of the first “pop statistics” articles I’ve seen — an attempt to clarify for the layperson what is going on with statistical practice in academic research. It’s a good article. I learned a few things and found a few interesting references.

Reporter Siegfried misses a couple of important points, though. He doesn’t note that frequentist statistics are grounded in repeated sampling, on into infinity, and that confidence intervals cannot be interpreted except with reference to that long run. This is endlessly confusing to intro stats students. Most of them probably never absorb it.

And what about Bayesian statistics? Siegfried, like so many others, focuses on specifying the prior:

Bayesian math seems baffling at first, even to many scientists, but it basically just reflects the need to include previous knowledge when drawing conclusions from new observations. To infer the odds that a barking dog is hungry, for instance, it is not enough to know how often the dog barks when well-fed. You also need to know how often it eats — in order to calculate the prior probability of being hungry. Bayesian math combines a prior probability with observed data to produce an estimate of the likelihood of the hunger hypothesis.
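Siegfried's barking-dog example is just Bayes' rule. Here's a quick sketch with invented probabilities (none of these numbers come from the article) showing the combination he describes: prior times likelihood, normalized by the overall chance of barking.

```python
# All probabilities here are invented for illustration.
p_hungry = 0.2        # prior: how often the dog is hungry
p_bark_hungry = 0.9   # likelihood: P(bark | hungry)
p_bark_fed = 0.3      # P(bark | well-fed)

# Bayes' rule: P(hungry | bark) = P(bark | hungry) * P(hungry) / P(bark)
p_bark = p_bark_hungry * p_hungry + p_bark_fed * (1 - p_hungry)
posterior = p_bark_hungry * p_hungry / p_bark
print(f"P(hungry | bark) = {posterior:.3f}")  # ≈ 0.429
```

Note that knowing how often the dog barks when well-fed (`p_bark_fed`) is not enough on its own; without the prior `p_hungry`, the posterior can't be computed at all.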

This describes Bayesian stats mostly correctly (in my novice opinion) but focuses too much on the price (the need to specify the prior) rather than the payoff you get (probability distributions that are easily interpretable under conventional notions of probability).

Here, I think, Siegfried further obscures what’s going on with the enthusiasm for Bayesian ways of analyzing data:

But Bayesian methods introduce a confusion into the actual meaning of the mathematical concept of “probability” in the real world. Standard or “frequentist” statistics treat probabilities as objective realities; Bayesians treat probabilities as “degrees of belief” based in part on a personal assessment or subjective decision about what to include in the calculation. That’s a tough placebo to swallow for scientists wedded to the “objective” ideal of standard statistics. “Subjective prior beliefs are anathema to the frequentist, who relies instead on a series of ad hoc algorithms that maintain the facade of scientific objectivity,” Diamond and Kaul wrote.

No, no, no. Bayesian methods do not introduce confusion into the concept of probability. Classical statistics did that. Bayesian statistics clarifies probability — makes it into a human measure, not some pseudo-objective long-run construction.