Tuesday, January 24, 2006

Probability, prediction and verification IV: More on Bayesian vs frequentist uncertainty

Having received some correspondence relating to this post, I think it might be worth exploring the issues in a little more detail.

The question I was considering was: what does "70% chance of rain tomorrow" actually mean? Most people would probably expect that if this forecast was issued 100 times, rain would follow about 70 times. And indeed this is what the forecaster thinks (hopes). But on any particular such day, another forecaster might give a different prediction (say "90% chance of rain") and their forecasts might also work out to be accurate on average. Were they both right? What is the "correct" probability of rain?
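To make the point concrete, here is a minimal sketch in Python. The data are entirely made up for illustration: the 100 days, their outcomes and both forecasters' numbers are assumptions, not real forecasts. It checks, for each probability a forecaster issued, how often rain actually followed.

```python
from collections import defaultdict

# Hypothetical record of 100 days: True means rain fell (70 wet days in all).
# Days 0-49: 45 wet, 5 dry. Days 50-99: 25 wet, 25 dry.
outcomes = [True] * 45 + [False] * 5 + [True] * 25 + [False] * 25

# Forecaster A says 70% every day.
forecast_a = [0.7] * 100
# Forecaster B says 90% on the first 50 days and 50% on the rest.
forecast_b = [0.9] * 50 + [0.5] * 50


def reliability(forecasts, outcomes):
    """Map each issued probability to the observed frequency of rain."""
    counts = defaultdict(lambda: [0, 0])  # probability -> [wet days, all days]
    for p, rained in zip(forecasts, outcomes):
        counts[p][0] += int(rained)
        counts[p][1] += 1
    return {p: wet / total for p, (wet, total) in counts.items()}


print(reliability(forecast_a, outcomes))  # {0.7: 0.7}
print(reliability(forecast_b, outcomes))  # {0.9: 0.9, 0.5: 0.5}
```

On any of the first 50 days, A said 70% and B said 90%, yet both sets of forecasts verify perfectly in the frequency sense.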

An analogy with number theory may be helpful. It has been shown that the number of primes less than x is approximately given by x/ln(x), where ln is the natural logarithm. Using this formula, we find there are about 390,000,000 primes between 10^9 and 10^10 (i.e. 10-digit numbers, of which there are 9×10^9). In other words, if we pick a 10-digit number uniformly at random, there's a 4.3% probability that it is prime. That's a perfectly good frequentist statement. If we exclude those numbers which are divisible by 2, 3 or 5 (for which there are trivial tests) the probability rises to 16.1%. But what about 1,234,567,897? Does it make sense to talk about this number being prime with probability 16.1%? I suspect that some, perhaps most, number theorists would be uneasy about subscribing to that statement. Any particular number is either prime, or not. This fact may be currently unknown to me and you, but it is not random in a frequentist sense. Testing a number will always give the same result, whether it be "prime" or "not prime" (I'll ignore tests which are themselves probabilistic here).
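The arithmetic behind those percentages can be reproduced with a short Python sketch. The function and variable names are my own, and the x/ln(x) formula is only an approximation, so the count is an estimate rather than the true number of 10-digit primes.

```python
import math


def approx_prime_count(x):
    """Prime number theorem estimate of the number of primes below x."""
    return x / math.log(x)


lo, hi = 10**9, 10**10
n_ten_digit = hi - lo  # 9 x 10^9 ten-digit numbers

approx_primes = approx_prime_count(hi) - approx_prime_count(lo)
p_any = approx_primes / n_ten_digit

# Fraction of integers with no factor of 2, 3 or 5: (1/2)(2/3)(4/5) = 4/15.
# No 10-digit prime is divisible by 2, 3 or 5, so the sieve removes no primes.
coprime_fraction = (1 - 1 / 2) * (1 - 1 / 3) * (1 - 1 / 5)
p_sieved = p_any / coprime_fraction

print(f"{approx_primes:,.0f}")  # roughly 386 million, "about 390,000,000"
print(f"{p_any:.1%}")           # 4.3%
print(f"{p_sieved:.1%}")        # 16.1%
```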

But does it make sense for someone to accept the validity of a probabilistic weather forecast, while rejecting the appropriateness of a probabilistic assessment about a particular number being prime? It should be clear that the answer to this is a very definite no. Both statements describe a fact which can be determined by a deterministic calculation (by digital computer or analogue atmosphere), but which is currently unknown to me. Granted, we don't know how to perform the atmosphere's calculation, but we don't need to, as it is going to do the job for us anyway. I will find out tomorrow whether rain fell or not, and I can work out whether 1,234,567,897 is prime. In fact, if the primality test takes a day to run, the analogy is a very close one indeed. It should also be clear that "70% chance of rain on Jan 25 2006" and "70% chance of rain on Jan 25 2003" are both in principle equally valid statements. I won't know whether rain fell on the 25th Jan 2006 for a couple of days, but it would probably take even longer to find out about 25th Jan 2003 (even assuming someone has kept a record for the location in question). All of these statements are Bayesian estimates of our (someone's) confidence in a particular proposition, and have no direct frequentist interpretation.
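For what it's worth, the deterministic calculation on the number-theory side needs nothing fancy. The sketch below uses plain trial division, which is my choice for illustration (any deterministic primality test would make the same point), and the function names are hypothetical. I have deliberately not printed the answer for 1,234,567,897.

```python
import math


def smallest_factor(n):
    """Return the smallest prime factor of n (n itself if n is prime)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if n % 2 == 0:
        return 2
    # Only odd candidates up to sqrt(n) need checking.
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return d
    return n


def is_prime(n):
    """Deterministic primality test by trial division."""
    return n >= 2 and smallest_factor(n) == n


n = 1_234_567_897
# However many times the test is run, the answer cannot change.
assert is_prime(n) == is_prime(n)
```

Until someone actually runs `is_prime(n)` and looks at the result, the answer is fixed but unknown, which is exactly the situation with tomorrow's rain.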

That hasn't helped us pin down what is the "correct" probability. In fact I hope that it has helped to show that ultimately there is no such thing. Just as a different forecaster might give a different probability of rain, so a different mathematician might argue that since 1,234,567,897 is at the low end of the range, a better estimate of the local density of primes is 1/ln(1,234,567,897) = 4.8%, or 18% when multiples of 2, 3, and 5 are excluded. Someone else might know how to check for divisibility by 11 (it isn't), increasing the probability still further. These more sophisticated methods will generate more skillful estimates for 10-digit numbers, in a way that can be quantified. However, someone who assigns a probability of 4.3% to all randomly-chosen 10-digit numbers in a frequentist experiment would also turn out right in the long run. That's a skill-free forecast, but a valid one nevertheless (a future post will expand on that). Essentially, it is "climatology" for 10-digit integers.
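The competing estimates can be sketched in a few lines of Python. The helper names are mine, and the 1/ln(x) density is again only the prime number theorem approximation; the point is simply that each extra piece of information shifts the stated probability.

```python
import math

n = 1_234_567_897


def local_prime_density(x):
    """Prime number theorem estimate of the density of primes near x."""
    return 1 / math.log(x)


def sieved_probability(x, small_primes):
    """Estimated chance that a number near x is prime, given that it has
    no factor among small_primes."""
    surviving = math.prod(1 - 1 / p for p in small_primes)
    return local_prime_density(x) / surviving


print(f"{local_prime_density(n):.1%}")                # 4.8%
print(f"{sieved_probability(n, [2, 3, 5]):.1%}")      # 17.9%, i.e. about 18%
print(f"{sieved_probability(n, [2, 3, 5, 11]):.1%}")  # 19.7%
```

None of these numbers is "the" probability; they are successively better-informed estimates of the same fixed fact.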

Given that we are so used to making probabilistic statements which can only make sense with a Bayesian interpretation, it seems a little strange that people often find it difficult to accept and understand that they are doing this, instead appealing to some quasi-frequentist thought experiment. Almost every time that anyone uses an estimate of anything in the real world, it's a Bayesian one, whether it be the distance to the Sun, sensitivity of globally averaged surface temperature to a doubling of CO2, or the number of eggs in my fridge. The purely frequentist approach to probability dominates in all teaching of elementary theory, but it hardly exists in the real world.

26 comments:

Anonymous said...

Often the storms are much more localized than the forecast. So often when it rains it doesn't rain everywhere. Do the forecasts take that into account in the probabilities?

James Annan said...

I think that a 70% rain forecast can in principle mean either that showers will hit about 70% of the area, or that a whole area of solid rain cloud is approaching but will only arrive (fall) with 70% probability.

From the individual end-user's perspective (and simple analyses of rainfall/forecast statistics etc), they are pretty much the same, although there could be some differences especially for those interested in larger areas. Crop failure in 5% of the world would be a very different proposition from a 5% chance of complete failure across the whole world, for example!

Anonymous said...

A slightly different question from "what do they mean by 70% chance of rain?" would be "how do they argue for a 70% chance of rain?". A Bayesian interpretation gets you past the meaning question, but I don't think it helps much with the justification. Do forecasters use statistical databases, or what? Going along with this, there's a slightly different way to interpret the question "what's the right probability?", which is "how does the community of weather forecasters decide if a forecast is good or bad?" I mean, before whatever weather is going to happen actually happens, just looking at the weather forecast, what criteria are there by which one can distinguish a "good" one from a "bad" one?

Anonymous said...

Alan Murphy published a couple of studies on this --

On the misinterpretation of precipitation probability forecasts. Bulletin of the American Meteorological Society, 58, 1297-1299, 1977.

Misinterpretations of precipitation probability forecasts. Bulletin of the American Meteorological Society, 61, 695-701, 1980. (A.H. Murphy, S. Lichtenstein, B. Fischhoff, and R.L. Winkler)

James Annan said...

Anon,

With modern ensemble-based methods, the basic principles of making a probabilistic forecast are easy enough to explain. Eg, a forecast centre can run 10 forecasts with a numerical atmosphere model, each one starting off from a slightly different representation of today's atmospheric conditions (which are not known with certainty anyway). If 7 of the 10 produce rain on the following day at the location in question, then they say "70% chance of rain tomorrow". In practice it is a bit more complicated than that. Some comments on forecast verification (your 2nd comment) may be coming to a blog near you soon...
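As a toy illustration of the counting step only, here is a Python sketch. Everything in it is an assumption made for the example: the "model" is a chaotic logistic map standing in for a real numerical atmosphere model, and the threshold that defines "rain", the size of the perturbations and the function names are all invented.

```python
import random


def toy_model(state, steps=50):
    """Stand-in for a numerical atmosphere model: a chaotic logistic map.
    It is fully deterministic: the same start always gives the same end."""
    for _ in range(steps):
        state = 3.9 * state * (1.0 - state)
    return state


def ensemble_rain_probability(analysis, n_members=10, spread=0.01,
                              rain_threshold=0.5, seed=0):
    """Fraction of ensemble members that end up in the 'rain' state."""
    rng = random.Random(seed)
    rainy = 0
    for _ in range(n_members):
        # Perturb the best-guess initial state to reflect imperfect knowledge.
        start = min(max(analysis + rng.uniform(-spread, spread), 0.0), 1.0)
        if toy_model(start) > rain_threshold:
            rainy += 1
    return rainy / n_members


print(ensemble_rain_probability(analysis=0.3))
```

Each member is a deterministic run; the probability comes entirely from not knowing which initial state is the true one.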

One of Roger's references can be found here, by the way ("Print version" gives the full pdf).

Anonymous said...

In my opinion, the only way to interpret a statement like the "70% chance of rain tomorrow" is:

We have 100 different scenarios (or simulations), all compatible with our current knowledge of the atmosphere, and 70 of those show rain for tomorrow.

In my opinion, counting the number of scenarios (or simulations, forecasts etc.) is the correct way to understand Bayesian probability as a frequentist.

James Annan said...

Wolfgang,

How would you apply this sort of reasoning to estimating the primality of 1,234,567,897 - "we have 100 different versions of 1,234,567,897, and according to our current knowledge, 4 of them are prime and 96 are composite"?

Anonymous said...

> How would you apply this sort of reasoning to estimating the primality of 1,234,567,897 - "we have 100 different versions of 1,234,567,897, and according to our current knowledge, 4 of them are prime and 96 are composite"?

Almost. Given our state of knowledge - a process that is randomly picking numbers with a uniform probability in a particular range - 4 out of every 100 picked are predicted to be prime.

James Annan said...

But I didn't pick that number randomly, and we do know its value, so it is not a random input. It doesn't seem to me that your proposed interpretation of the situation really brings any insight, and it is rather tortured in situations where there isn't a sensible frequentist "wrapper" to dress the question up in. There are a number of different priors that could have produced that number, which have strongly different probabilities of generating a prime in a frequentist experiment. Maybe I selected it from the 10-digit composites, for instance! Or the 10-digit primes plus 1,234,567,897.

Anonymous said...

> How would you apply this sort of reasoning to estimating the primality of 1,234,567,897

I would not. There can only be one scenario in this case. Mathematical statements are either true or false.

James Annan said...

In the same way, "it will rain tomorrow" is either true or false. The atmosphere is a deterministic system!

Anonymous said...

James,

> The atmosphere is a deterministic
yes, but we do not know the initial conditions exactly.
Thus, different scenarios are compatible with our knowledge of the current state of the atmosphere.

Anonymous said...

Perhaps I should explain myself a bit better.

In physics, statements about probability make sense in two cases:
i) quantum theory
ii) systems where we do not know the initial state exactly.

In all cases we must be confident about the theory (kinematics and dynamics), otherwise the use of probability is misleading.

Probability is not a good replacement for missing or wrong theories. Your example of the prime number illustrates this very well.

In the more interesting case of climate change:
Bayesian probabilities cannot and should not be used to replace a true understanding of how the various components and forcings etc. interact, IMHO.

Anonymous said...

You didn't pick a number randomly; but the predictor didn't know that. You're trying to cram two different situations into one. In the case of the weather (assuming it is deterministic) and in the case of "pick a number, is it prime?", the probabilistic prediction arises out of ignorance. When you dispel the ignorance, by providing more information, by actually being able to do the computation, or by actually witnessing the weather event, the probabilities change.

This is no more surprising than a predictor which said the chances of getting a 5 on a throw of a dice is 1 in 6, and then you throw the dice and now it is something definite; and to say that the probability of that particular throw being a 5 is 1/6 when a 3 is staring at you in the face seems absurd. Likewise, having picked a composite number, to give it a probability of being a prime seems absurd.

Ultimately, if there is more information to be had (e.g., instead of picking numbers randomly, you're always picking a composite number), the failure of the predictor (in a frequentist sense) will be what prompts you to look for a better predictor.

Anonymous said...

> It should also be clear that "70% chance of rain on Jan 25 2006" and "70% chance of rain on Jan 25 2003" are both in principle equally valid statements. I won't know whether rain fell on the 25th Jan 2006 for a couple of days, but it would probably take even longer to find out about 25th Jan 2003 (even assuming someone has kept a record for the location in question).

If the newspaper published a random ten digit number each day, and you wanted to know whether the one published on Jan 25, 2003 is prime or not, the same type of estimation holds good, even though we're talking about a definite number.

Do explain how one would be led to seek a better predictor in the Bayesian interpretation of probabilities.

Anonymous said...

The original, with some modifications:

An analogy with number theory may be helpful. The probability of throwing an odd number with a fair dice with spots of 1,2,3,4,5,6 is half. That's a perfectly good frequentist statement. {We threw the dice and got 4. } ....But what about 4? Does it make sense to talk about this number being odd with probability half? I suspect that some, perhaps most, number theorists would be uneasy about subscribing to that statement. Any particular number is either odd, or not. This fact may be currently unknown to me and you, but it is not random in a frequentist sense. Testing a number will always give the same result, whether it be "odd" or "even".

I think one can see that the prime number stuff is merely a distraction.

CapitalistImperialistPig said...

James - I have a post commenting on yours here.

James Annan said...

Last anon comment:

No, your modifications are not valid. Try this for size:

An analogy with number theory may be helpful. The probability of throwing an odd number with a fair die with spots of 1,2,3,4,5,6 is half. That's a perfectly good frequentist statement. {We threw the die but it is concealed from view and we do not yet know the answer} ....But what about the roll of this die? Does it make sense to talk about this number being odd with probability half? I suspect that some, perhaps most, number theorists would be uneasy about subscribing to that statement. Any particular number is either odd, or not. This fact may be currently unknown to me and you, but it is not random in a frequentist sense. However many times you look at this die you will always get the same result, whether it be "odd" or "even".

It should, I think, be immediately apparent that the only difference between my version and yours is the ignorance of the observer as to the state of the die.

I can see I'm going to have to go over this one more time...

Anonymous said...

We threw the die but it is concealed from view and we do not yet know the answer. However, we took high speed video footage of the first quarter second of the throw. This was provided to two teams along with similar footage of 1000 similar dice rolls where the answer is known and provided to all teams.

The first team gains some knowledge from the video footage and testing on the 1000 rolls allows them to estimate a 25% probability of a six. The second team is more skilled and their testing enables them to predict a 50% probability of a six.

You and I are members of a third team. We don't get the video footage to analyse but we do get to analyse the results of the other two teams for the 1000 rolls and the one roll.

It is possible that all the information gained by the first team is fully utilised by the second team and we will be able to do no better than copy the second team's prediction.

It is also possible however that some information gained by the first team is not fully utilised by the second team, so our approach may enable us to do better than the second team. In this case, where the first and second teams both predict an increased chance of a six, we may be able to predict a six with a 51% probability.

So what is the correct probability of that one particular dice throw being a 6? Is it 1 in 6? or is it 25%? or is it 50%? or is it 51%?

It seems to me that the answer depends on how much information you have access to. Team 3 should always win the competition or at least jointly win at getting the best prediction (though that prediction could turn out wrong).

Is the approach that I am suggesting for our third team unscientific? I see no reason to think so.

crandles

CapitalistImperialistPig said...

Sorry James, but I don't buy it. The Earth's atmosphere is not a deterministic system in any meaningful sense. Even if all initial conditions were nown perfectly, quantum effects induce an essential indeterminacy at the micro level which propagates up to every scale via the "butterfly effect." More immediately, it isn't possible to know the initial conditions very accurately at all. The atmosphere is much better modelled as a semi-deterministic system with large random elements.

Back to your ten digit composite integer = 17x73x994817. All the arguments about primality were based on frequency considerations. It seems to me that all such estimates are either frequentist in origin or pure prejudices. A further rant on the subject is Bayes Bogus? here.

CapitalistImperialistPig said...

Er, "nown" should be "known."

James Annan said...

Sorry CIP, but you are completely wrong about the atmosphere. For the purposes of tomorrow's forecast, it is definitely appropriate to consider it as a deterministic (but imperfectly known) system. There's no way that any quantum indeterminacy can have a measurable effect on this time scale. We do occasionally put in randomness as a proxy for our ignorance, but this is very clearly a Bayesian approach. No-one actually thinks that the Navier-Stokes equations are really random!

If, when faced with epistemic uncertainty, you are prepared to invent hypothetical nonexistent frequentist thought experiments with a specified (how?) prior distribution from which reality is considered to be a sample, then you are merely a Bayesian in disguise. And if you are neither an honest Bayesian nor a disguised one, but instead a strict frequentist, then you'll have trouble ever making any decisions at all, rational or not.

CapitalistImperialistPig said...

James - It's possible that I don't really understand what you mean by Bayesian or frequentist, but I'm pretty sure that I understand what I mean when I say meaningful probabilities come from a theory that gives frequencies.

I didn't think I was restricted to tomorrow's rainfall forecast, but even conceding that Navier-Stokes are deterministic, nobody thinks they are a fundamental description of the dynamics of a real fluid - they are just a large scale approximation, even without quantum effects. In any case, even if the equations were exact, they are only one piece of the description of the atmosphere - the other piece is the initial and boundary conditions, which can never be known exactly.

Finally, when a 70% rain forecast is made around here anyway, it often means that rain will fall on about 70% of the local area - theory predicts raincloud density in a very frequentist manner.

James Annan said...

> It's possible that I don't really understand what you mean by Bayesian or frequentist

I suspect that many people fall into this category. They are taught via the frequentist paradigm, and instinctively try to reject the "subjective" nature of Bayesianism, but in fact they apply this concept widely in everyday life (including scientific research). It's a sort of interesting doublethink that many people are hardly aware of and reluctant to acknowledge. A longer post will be forthcoming when I have the time.

CapitalistImperialistPig said...

One more on subjective certainty, and then I will shut up.

I heard a girl on National Public Radio (US) say that she knew she was right in disbelieving evolution because she had her "Lord Jesus in her heart." No way can I match her degree of subjective certainty in my belief that Darwin is right. Jesus has never even broached the subject with me, so all I have is some facts and a rather long chain of inference. So from a Bayesian standpoint, it looks to me like Darwin is losing - which is why the Bayesian viewpoint doesn't impress me.

Greg said...

Take a statistics class. If you don't know whether a number is prime or not, there is still a knowable probability. It's better than nothing, that's why statisticians get paid. They add value.

As for weather, I was taught that 70% meant that for the whole area, 70% of the area would definitely get rain at least once. Your mileage may vary. But since you are in meteorology, you ought to have a definition at least as precise as that.