Friday, September 01, 2006

What is probability?

I happened to come across a somewhat off-hand question - "what exactly does it mean to assign probabilities for a single event?" - during some random blog-surfing a few days ago. I thought it was widely accepted that such probabilities are essentially Bayesian, that is, subjective expressions of a person's degree of belief in the proposition in question (eg as Stefan Rahmstorf writes). There are, to be sure, practical difficulties in eliciting this belief in a precise and consistent manner (especially if people are prepared to lie), and personal probabilities may change from minute to minute and day to day, but the basic theory seems clear enough and forms the foundation of a large field of research with many practically useful outputs. One thing that is certainly clear (and I believe undisputed) is that the main competing interpretation (frequentism) cannot apply at all in such situations. So if you want to talk in probabilistic terms, you've simply got to go outside that framework, and the standard Bayesian angle seems the obvious one.

Anyway, today I finally got the Reply from Allen and Frame to our attempted Comment. [This had been accidentally omitted from the set of reviews that were sent a couple of weeks ago.] I don't intend to publish and fisk it in detail - that would be tedious, lengthy and no-one would care. However, since it was offered for publication, they can hardly complain about me making a couple of comments on it.

One striking sentence in particular jumped out at me:
"We do not think most scientists interpret probabilistic forecasts purely as expressions of degrees of belief."
(And just to clarify, the context makes it clear that this is not intended as a snide comment about the ignorance of "most scientists", but rather as support for A&F taking this same position.)

While they are being admirably clear and frank in acknowledging that they do not actually believe the estimates that they have published, it does rather raise the issue of what they consider the status of their probabilistic estimates to be.

Although I do favour what I understand to be the standard subjective Bayesian viewpoint for non-frequentist probability, I'm not dogmatically going to insist that it is the only possible one - philosophers and mathematicians have argued for centuries over probability, and I don't pretend to have all the answers or to have covered all the bases. Note, however, that Wikipedia only mentions two broad categories, Bayesian and frequentist - any others seem to be rather esoteric philosophical finesses of these two, not major revolutions (excluding imprecise probability, which is a whole new can of worms wholly irrelevant to this discussion). Salmon (1966) proposes three criteria for any proposed interpretation of probability:
  1. Admissibility, or coherence (it must satisfy the Kolmogorov axioms).
  2. Ascertainability (there is a method for calculating it).
  3. Applicability (it is useful in real-life applications).
Obviously, whatever A&F's interpretation is, it fails on admissibility - a point which they have explicitly acknowledged (indeed they claim it as a feature rather than a bug). Failure on point 2 follows almost immediately: through being multi-valued (see my previous example of P(x > 4) ≠ P(x⁴ > 4⁴), two logically equivalent statements assigned different probabilities), their methods also fail ascertainability, since any answer can be generated by reformulating the question in logically equivalent ways (hmm...I can see a semantic dodge here - does the answer "whatever you want it to be" count as a method for calculating their probability? I'll leave them to decide on that). All that remains to be shown is that their results are not useful, and they'll have a 0/3 score :-) Of course all this just means that they think Salmon is wrong too, I guess...but more importantly, it leaves unanswered the question of what their version of probability actually is. What axioms does it satisfy (if any)? What does it mean?
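To make that multi-valuedness concrete, here is a minimal sketch (my own illustration, not A&F's calculation) in which "uniform ignorance" over an arbitrary range [0,10] is assigned first to x and then to the logically equivalent variable x⁴, giving two different answers to the same question:

    # Two "ignorant" uniform priors over logically equivalent variables
    # disagree about the same event - so there is no single method for
    # ascertaining "the" probability. The range [0,10] is arbitrary.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000

    x = rng.uniform(0.0, 10.0, n)      # uniform prior on x
    p1 = (x > 4).mean()                # P(x > 4) ~ 0.60

    y = rng.uniform(0.0, 10.0**4, n)   # uniform prior on y = x^4
    p2 = (y > 4.0**4).mean()           # P(x^4 > 4^4) ~ 0.97

    print(p1, p2)                      # same event, two different answers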

According to their Reply (and indeed the referee who supported them), all this is entirely clear to all climate scientists (except us, I guess) and needs no further clarification. I'd be interested to hear from anyone, climate scientist or not, who can make head or tail of it!

There's a further funny point which I can't resist mentioning. Their Reply makes much of the fact that the D&A (detection and attribution) literature (to which Myles Allen is a major contributor) routinely commits the Prosecutor's Fallacy in turning the (frequentist) confidence intervals that classical D&A methods produce into the (Bayesian) probability intervals that people really want to see. But rather than being embarrassed by this, they use it to justify their claim that a uniform prior is in fact the appropriate choice! It really is Emperor's New Clothes stuff.

19 comments:

CapitalistImperialistPig said...

James,

I realize that this won't help, but I can't understand the degree of belief interpretation. Whose degree of belief? Most people believe at least a few absurd things, and some people hardly believe anything else.

Ultimately all events are unique - each coin flip encounters slightly different air currents, is triggered by slightly different muscle patterns, etc. (For me) unique events can be assigned a probability only if I imagine them to be part of an ensemble of similar events. Such an ensemble is a theoretical construct, but if the theoretical concept allows assignment and calculation of probabilities, it can be useful.

I've read some of your previous explanations, but I don't seem to really comprehend them until I recast them in the theory-plus-frequency interpretation.

I suspect from your referee's comments that I'm not unique in this regard.

James Annan said...

Whose degree of belief?

The person who calculates their probability, of course (and certainly the person who specifies their prior).

Most people believe at least a few absurd things, and some people hardly believe anything else.

Sure. But that's the way it works, and the point of assessments such as the IPCC is to form a reasonable consensus about the level of belief that is judged credible. I hope (and expect) that they have the nous to discount the claims of those they judge to be clueless. Lindzen claims to believe that climate sensitivity is 0.5C (at least, he used to) and no evidence will change his mind.

If you are not prepared to assign probabilities to future uncertain events such as, say, Hillary Clinton winning the next election, then don't. That is your choice - no-one is forcing you to use Bayesian probabilities. But if you want to make a decision that will be affected by the outcome of this election, then you might find making an estimate of the odds to be useful. You might even take account of the odds expressed by others, if you think they are likely to have some skill in judging these matters... Real-world evidence suggests that many people are very interested in assessing the odds of future uncertain events, even if you are not.

Your comment about imagining an ensemble of similar events seems to imply a personal threshold based on your powers of imagination - is that a fair comment? Can you give a clear explanation of why you can imagine an ensemble of "similar" coin tosses but not (I assume) an ensemble of "similar" elections?

If all you are saying is that you want to dress up a Bayesian probability as a sample from a hypothetical ensemble, then that's ok - but the contents of this hypothetical ensemble are not something that can be objectively determined based on data. At least, not in a consistent and coherent manner. The ensemble necessarily and fundamentally depends on the prior judgement of the researcher, before he sees the data. That's simply the way the laws of probability work, and saying that you don't like or understand it does not change that, any more than my ignorance of QCD is a threat to fundamental physics!

Maybe you have heard of Bertrand's paradox. This concerns a repeatable experiment. What is the correct answer for the probability?

CapitalistImperialistPig said...

James,

It's at least possible that my discomfort is mainly a matter of terminology. I'm uncomfortable with the seeming subjectivity of "belief." If you have a reproducible algorithm that generates your "degree of belief" I might well be happy with that.

Future events like the outcome of a football match can be predicted using a model that takes into account data about previous results, player health, etc., and that seems to me, at least, to have a straightforward frequentist (plus model) interpretation.

Bertrand's paradox, if I understand it correctly, is paradoxical only if one assumes all methods of randomizing chord picking are equal. Since they aren't, it's quite unsurprising that distributions that are "flat" in one variable, aren't in another.

Some problems, like predicting whether Hillary will become President or whether String Theory will be proved correct are hard, I think, because we don't have plausible ways for choosing an ensemble that will model the circumstance.

but the contents of this hypothetical ensemble is not something that can be objectively determined based on data.

Yes, I do understand that point. My confusion is on how you assign your degree of belief.

Anonymous said...

I think people object to the subjective nature of Bayesian probability. It seems like there ought to be some meaning to objective probability.

One form of objective probability comes from quantum mechanics. Quantum measurements are inherently uncertain and you can mathematically calculate the probability of various results. With a full theory of physics and precise knowledge of every aspect of the physical world at the present moment, we could in principle precisely calculate the quantum probabilities for an election held several years hence. This is an example of a way to think about objective probability.

James Annan said...

CIP,

The mathematical part of the algorithm (Bayes Theorem) is certainly reproducible, but it necessarily relies on subjective inputs - in particular, the prior (in practical applications, the likelihood is also likely to have some subjective component). Dressing up the problem as a frequentist thought experiment doesn't solve this in any way - you still have to assume some prior distribution over your unknown parameters. It is not yet clear to me what you think "probability" means in the case of a single nonrepeated event in a deterministic world. Do you accept that other people with access to the same data may invent hypothetical thought experiments in which the frequency of outcomes differs from yours? If so, then clearly you must accept that such probabilities are subjective. If not, you'll have to describe your universal rules in much more detail than you already have, in order that we don't all get it wrong! It might be instructive if you were to outline a worked example of what you mean.

In the special case where the space of outcomes is discrete (say coin tosses or win/lose an election), it might be arguable that a uniform distribution represents some special "ignorant" prior. But this simply doesn't work for continuous variables, since a nonlinear transformation kills any possible symmetries.

James Annan said...

Hal,

I'm not keen on bringing QM into it, since then you generally end up sidetracked into arguments about free will and determinism...in any case, it doesn't help in practice since we have no way of performing the required calculations - and probability is precisely a way of encapsulating our level of ignorance!

I think everyone agrees that with an arbitrarily large amount of knowledge of the system behaviour plus its initial conditions, the "probability" of rain tomorrow would always be expressed as essentially 0 or 100% - quantum uncertainty doesn't come into it. Indeed one of the important measures of forecast performance is how far the probabilities are pushed towards both ends of the scale, while remaining well calibrated against the observed outcomes. Simply issuing the climatological probability of rain tomorrow is certainly valid, but it is also skill-free!
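To illustrate that last point, here is a minimal sketch (all numbers invented) comparing a sharp but reliable forecast against the climatological one under a standard verification measure, the Brier score (lower is better):

    # A sharp, well-calibrated forecast beats the climatological forecast
    # on the Brier score, even though both are "valid". The event
    # frequencies below are invented purely for illustration.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    clim = 0.3                                               # climatological P(rain)
    true_p = rng.choice([0.05, 0.8], size=n, p=[2/3, 1/3])   # averages to 0.3
    rain = rng.random(n) < true_p

    brier_clim = np.mean((clim - rain)**2)     # ~0.21
    brier_sharp = np.mean((true_p - rain)**2)  # ~0.09: sharper, still reliable
    print(brier_clim, brier_sharp)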

CapitalistImperialistPig said...

A famous problem is trying to calculate the probability of life in, say, the Galaxy (usually intelligent life, but I will skip that part). The Drake equation breaks this into parts: the number of Sun-like stars, the probability that such a star will have planets, the probability that such a system with planets will have an Earth-like planet, and the probability that life will arise on an Earth-like planet.

The first we know, the second we can now estimate roughly (from detected extrasolar planets), and the rest we don't know. If you have a somewhat trustworthy theory of planet formation, you can come up with a guess for the third, though we don't yet have good enough theory or data to know how trustworthy our planet-formation theories are - and a good guess is: not terribly. We have one data point for the final link in the chain, and that is the hardest of all. Does the fact that life arose very quickly on Earth show that life is very probable, or were we just extremely lucky in that the conditions were just right for some brief instant so that life evolved?

So I say that we can't calculate the probability with any confidence, because we need to make a wild guess at two of the components of the calculation.

Progress can come in two forms: as our theories get better we may understand better how life arose and how the solar system arose. Even better would be experimental data - an Earth-mass planet around a Sun-like star with a lot of oxygen in the atmosphere is more or less hard evidence of life. We may be able to make that kind of measurement in twenty years or so.
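To put some (entirely invented) numbers on that difficulty, here is a rough Monte Carlo sketch of the Drake-style product: when two of the factors are order-of-magnitude guesses, the answer spans orders of magnitude too.

    # Propagating invented order-of-magnitude guesses through the
    # Drake-style product. Every range below is made up for illustration -
    # which is exactly the point being made.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000

    n_sunlike = 1e10                           # Sun-like stars in the Galaxy (rough)
    f_planets = rng.uniform(0.2, 0.8, n)       # fraction with planets (crudely estimable)
    f_earthlike = 10**rng.uniform(-3, 0, n)    # fraction with an Earth-like planet (guess)
    f_life = 10**rng.uniform(-6, 0, n)         # P(life arises there) (wild guess)

    n_with_life = n_sunlike * f_planets * f_earthlike * f_life
    print(np.percentile(n_with_life, [5, 50, 95]))  # spans many orders of magnitude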

James Annan said...

CIP,

It's a little unclear from what you wrote, but I guess that what you are thinking of comes down to the notion of whether the data "dominates" the prior. A formal Bayesian analysis just makes that explicit and quantifiable - but note that the influence of the prior never completely vanishes, unless and until the data actually indicates a probability of 0 or 100%.

CapitalistImperialistPig said...

I think what I am saying is that statistical reasoning doesn't tell you anything in the absence of a theoretical framework for its interpretation. I doubt that that has much to do with data dominating a prior, but I'm not terribly confident of that.

One of my problems is that I don't know what theoretical framework you are working in, so I don't know how to interpret statements like "degree of belief."

I tend to believe that the Sun will come up tomorrow, but that belief is only indirectly related to reports that it has been regularly doing that for some time. More important is the fact that that idea fits into a cosmology that seems to explain a lot of things.

James Annan said...

One of my problems is that I don't know what theoretical framework you are working in, so I don't know how to interpret statements like "degree of belief."

It's basically something I would be prepared to bet on, which is (roughly speaking) the standard Bayesian view. Sure, there are issues of precision, decidability and risk aversion/risk seeking but the fundamental idea seems sound enough. If I say that X happens with probability p, I will equally happily place a stake of p which pays 1 iff X happens, or a stake of 1-p which pays 1 iff X does not happen.

If you are happy to talk about "the probability of Hillary Clinton becoming the next president" or "the probability of rain tomorrow" then you have little choice but to accept the subjective view of probability. That is, unless and until you or someone else proposes a workable alternative (which would involve a major revolution in probability theory, so I'm not holding my breath).

I tend to believe that the Sun will come up tomorrow

So why not try (as I suggested some time ago) a nontrivial example which involves actual calculation rather than just vague verbiage? Say you find a bent coin on the ground - how would you attempt to determine how biased it was? Can you attempt to answer such questions as "what is the probability that the coin is strongly biased (p>75%) in favour of heads" or is this outside the realm of the sort of question you consider probability can properly address? Please don't just vaguely waffle about how you could work it out given an infinite number of coin tosses, but explain what you would actually do given limited time.
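For concreteness, here is a minimal sketch of the sort of calculation I have in mind - note that the prior (here a uniform Beta(1,1) on p, applied to some hypothetical tosses) is an explicit input, which is exactly the point at issue:

    # Bayesian answer to "what is P(p > 0.75)?" for a bent coin, assuming
    # a uniform Beta(1,1) prior and some hypothetical toss data.
    from scipy.stats import beta

    heads, tails = 7, 3                     # hypothetical observations
    posterior = beta(1 + heads, 1 + tails)  # Beta is conjugate to the binomial

    print(posterior.sf(0.75))               # P(p > 0.75 | data) ~ 0.29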

Anonymous said...

Seems a strange example for James to choose. It seems CIP is trying to say he is happy to talk about the probability of heads for a bent coin when he is allowed a sample. The larger the sample, the greater the confidence (or the narrower the probability range). I assume that if the election candidates are known and a poll is allowed, the same would apply.

What if the candidates are not known and the method of deciding on candidates is either unknown or depends on future events in some unknown way? It is much harder to see how confidence in the probability could be calculated.

Does this make a difference? I suspect James would say despite the difficulty you just have to have a go at making your best estimate for the odds of the person being a candidate if you are going to try to estimate the probability of that person winning.

I don't really see anything wrong with that. However I might be inclined to try some sort of sanity check by comparing my odds to what other people think. Then things get complicated again. Do I give more weight to experts? And who counts as an expert? That makes calculating the probability rather complex and difficult.

Also a consensus could be wrong. Presenting the 3 choices in the first part of Bertrand's paradox with answers 1/2, 1/3, and 1/4, people might tend to congregate around the middle answer of 1/3. Does that mean it is more likely than the others?

E. T. Jaynes reckons that a common-sense interpretation of "chosen at random" comes up with the answer 1/2.
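For reference, the three classical chord-picking recipes (and the three answers above) can be checked with a minimal Monte Carlo sketch - an illustration only:

    # Bertrand's paradox: P(a random chord of the unit circle is longer
    # than sqrt(3), the side of the inscribed equilateral triangle)
    # depends on what "random chord" means.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 1_000_000
    side = np.sqrt(3.0)

    # 1. Random endpoints on the circumference -> 1/3
    t1, t2 = rng.uniform(0, 2*np.pi, n), rng.uniform(0, 2*np.pi, n)
    len1 = 2*np.abs(np.sin((t1 - t2)/2))

    # 2. Midpoint at a uniform distance from the centre -> 1/2
    d = rng.uniform(0, 1, n)
    len2 = 2*np.sqrt(1 - d**2)

    # 3. Midpoint uniform over the area of the disc -> 1/4
    r = np.sqrt(rng.uniform(0, 1, n))  # sqrt gives an area-uniform radius
    len3 = 2*np.sqrt(1 - r**2)

    for chord in (len1, len2, len3):
        print((chord > side).mean())   # ~0.333, ~0.500, ~0.250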

Dealing with such things is difficult, but I don't see any alternative to giving it your best shot. There will always be the possibility that others have seen further and can see why the prior needs to be biased in a particular way.

I wonder if Frame or Allen have something in mind.

crandles

CapitalistImperialistPig said...

About that bent coin. This seems like a classic sampling/hypothesis-testing situation. [Next I will reveal my vast ignorance of statistics] Assuming that the coin was not so bent that it seemed likely to wind up on edge, I would assume a classic Bernoulli distribution, independent of time, with probability p of heads and 1-p of not heads. Hypothesis: p > 0.75

After n throws, having gotten q heads and n-q not heads, I could (if I could remember the arithmetic) calculate the confidence with which we could say p > 0.75.

How is this similar to the Hillary problem, or am I missing something big?

James Annan said...

CIP,

I think you are confusing (frequentist) confidence with (bayesian) probability. I specifically asked you for the (your) probability that p>0.75, not (eg) the probability that if p>0.75, you would observe at least x heads in n trials. A frequentist has no way of addressing the question I posed - it is simply a category error as the coin itself is not a random sample from a repeatable experiment. However, if you want to place a bet on the next coin toss, the question I posed is precisely the sort of question you have to make a judgement about! Suppose you had seen 7 heads out of 10 coin tosses - what odds would you consider fair on the next toss? That is to say, at what odds would you be indifferent to taking either side - a 70/30 split on the stakes perhaps?

Perhaps coin tosses were a slightly confusing idea due to their common use in frequentist examples. It was just the first thing that came to mind. I'm simply trying to get you to put some actual numbers down, because once you start the process of actually working things out rather than just talking about them, it all becomes rather a lot clearer. At least, that is what I have found.
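For what it's worth, working through my own example under a uniform prior: after 7 heads in 10 tosses the posterior over p is Beta(8, 4), and the fair odds on the next toss come straight from its mean (this is just Laplace's rule of succession), landing close to the 70/30 split floated above:

    # Posterior predictive probability of heads after 7/10 tosses,
    # assuming a uniform Beta(1,1) prior on p.
    heads, tails = 7, 3
    a, b = 1 + heads, 1 + tails   # posterior is Beta(8, 4)
    p_next = a / (a + b)          # predictive P(heads) = 8/12

    print(p_next)                 # ~0.667, i.e. fair odds of about 2:1 on heads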

James Annan said...

By the way, CIP, for some reason your comments do not get automatically emailed to me - I have no idea why this might be the case; presumably blogger doesn't like you! So if I don't reply to something you say, it might be cos I haven't noticed it...

James Annan said...

Chris,

Also a consensus could be wrong.

Bayesian probability isn't about being right or wrong - it's about being consistent. If you don't use Bayes Theorem and the Kolmogorov axioms to update your probabilities consistently, then you are vulnerable to a Dutch Book and therefore cannot reasonably claim to be rational.

Eg, if I say P(A)=70% and also P(not A) = 60%, then I could be tempted to pay a stake of up to 0.70 on a bet which pays 1 if A happens, and also stake 0.6 on a bet which pays 1 if A does not happen. Whether A happens or not, I get 1 back but I've paid out 1.3!

Note that even if the probabilities are consistent (say P(A)=70% and P(not A)=30%) then I may still lose if I place a single bet of 0.7 on A and it doesn't happen - in that sense, the probabilities are "wrong", but this is a pretty vacuous complaint. In fact in a deterministic world, probabilities - consensus or not - are always going to be wrong in that sense, because the truth of A is either 0 or 1, merely currently unknown to us! Nevertheless, it can still be useful to determine a rational expression of our current beliefs - rational meaning that there is no sequence of bets which guarantees a loss irrespective of the truth of A.
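That guaranteed loss is trivial to verify; a minimal sketch of the example above:

    # Dutch Book check: with the incoherent beliefs P(A)=0.7 and
    # P(not A)=0.6, taking both "fair" bets loses 0.3 whatever happens.
    stakes = 0.7 + 0.6                  # paid up front on the two bets
    for A_happens in (True, False):
        payout = 1                      # exactly one of the two bets pays 1
        print(A_happens, payout - stakes)   # always -0.3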

In the case of a finite space, there is indeed a natural "ignorant" prior which assigns equal probability to each event. However, in the coin-tossing problem I'm talking about, the space is not the 2 outcomes head/tail but instead a continuous 0-1 range for p = probability of heads on a single toss. There is still a natural uniform prior here, but it's not so obvious that it is the only plausible choice.

Equally, I can't outlaw a uniform prior on 0-20C for climate sensitivity, but I can point out that it represents an extraordinary level of belief in high climate sensitivity (P(S>6C) = 14/20 = 70%, or "S is likely greater than 6"), and if instead we were to start out from only a moderate belief in such an outcome (say P=15% or so), then the data rules against it very convincingly indeed. It remains to be seen whether or not such statements can find a home in the peer-reviewed literature...

CapitalistImperialistPig said...

James ...presumably blogger doesn't like you!

Oh yeah? Well it can just get in line! ;)

I might see a bit of your point - if your point was that 7/10 might still leave me with a bias in favor of more even odds. The belief part still bothers me though. I would prefer a cleaner separation of subjective and objective components.

James Annan said...

I might see a bit of your point - if your point was that 7/10 might still leave me with a bias in favor of more even odds.

The point is that once you have stated odds after 10 tosses (or even 100 or 1000), you can work out what your odds must have been after 6 tosses in order for you to be rational...and so on back to your prior, by an inversion of Bayes Theorem!

(although this does require a full pdf over p to be stated, not just the odds on a single toss)
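That backwards run of Bayes Theorem is easy to sketch on a grid - assuming, for illustration, that the stated pdf after 7 heads and 3 tails is Beta(8, 4):

    # Running Bayes backwards: the implied prior is proportional to
    # posterior / likelihood, evaluated here on a grid over p.
    import numpy as np
    from scipy.stats import beta

    p = np.linspace(0.001, 0.999, 999)
    heads, tails = 7, 3

    posterior = beta(1 + heads, 1 + tails).pdf(p)  # the stated pdf after 10 tosses
    likelihood = p**heads * (1 - p)**tails
    prior = posterior / likelihood                 # invert the Bayes update
    dp = p[1] - p[0]
    prior /= prior.sum() * dp                      # renormalise to a density

    print(prior.min(), prior.max())                # both ~1.0: implied prior is uniform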

The belief part still bothers me though. I would prefer a cleaner separation of subjective and objective components.

It is absolutely explicitly separated into a subjective prior and updating via the data-based likelihood (which in simple cases can be completely objective)! Just do the sums FFS! Or do you actually not want to make any attempt to understand it?

Once again: the data cannot provide you with probabilities, they can only update a prior. That's how the axioms of probability work - if you don't like this, then invent some new axioms and explain to us how they are more useful...

As I said to Chris, it's not about getting the "objectively correct" probabilities (which don't exist in general), but in making sure that your ducks are all in a row and you are not opening yourself up to a Dutch Book. Even if we don't know the true answer, we can still try to act rationally and improve our estimates in the light of new evidence.

C W Magee said...

This exchange makes me appreciate how lucky I am to have a job where all I do is count ions or decays without having to think about the philosophy of them.

I'll disagree with the suggestion that Drake's equation is useful, though. If anything, predictive planetary science suffers from an imagination deficit, not an excess. So equations that serve to confine analytical thought only increase this disconnect.

Anonymous said...

Best case scenarios (2% to 15% probability)
Average case scenarios (70% to 95% confidence)
Worst case scenarios (2% to 15% probability)