Friday, August 12, 2011

How many of Roger's findings about probability manage to be wrong? Answer: he's more inventive than you might expect.

Roger Pielke has a new post up asserting that 28% of the IPCC's findings are incorrect. Although it's obviously a rather implausible figure, I was expecting this claim to be backed up with some sort of evidence of errors, or at least sloppiness, or something, so I had a look at the paper he cites to justify the claim.

It turns out that 100% - 28% = 72% is merely the average (lower bound) probability level associated with the statements they made, such as "It is very likely that hot extremes, heat waves and heavy precipitation events will continue to become more frequent." Here "very likely" means greater than 90%. So, given 10 such statements, the IPCC is saying that they would expect the "very likely" outcome to occur about 9 times, and not occur about once. And similarly for "likely" (greater than 66%). Averaging over all the probabilistic statements, it should be expected that in about 28% of cases, the (probabilistically) preferred outcome will not actually happen.
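
To make the averaging concrete, here is a minimal sketch in Python. The mix of statement counts is invented purely for illustration (the paper's actual tally differs); only the arithmetic matters:

```python
# Hypothetical mix of calibrated statements: (lower-bound probability, count).
# The counts are made up for illustration; the IPCC's actual mix differs.
statements = [(0.90, 120),   # "very likely"
              (0.66, 200),   # "likely"
              (0.50, 40)]    # "more likely than not"

total = sum(n for _, n in statements)
mean_p = sum(p * n for p, n in statements) / total

print(f"average confidence level: {mean_p:.0%}")             # ~72%
print(f"expected fraction not occurring: {1 - mean_p:.0%}")  # ~28%
```

That average is all the 28% figure is: the IPCC's own stated expectation of how often the preferred outcomes won't happen, not a count of errors.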

And in Roger-world, this means that 28% of the statements are "incorrect". Note, however, that he does not make this silly claim in the paper itself, but only in his blog post.

To see why this interpretation is nonsensical, consider a single roll of a fair die. I state (accurately) that it is "likely" to lie in the range 1-5. If I roll a 6, then in Roger-world, my statement was incorrect. However, it was not incorrect, and Roger is simply wrong to claim so.
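
For anyone who prefers simulation to argument, here's a quick sketch (standard-library Python; the trial count is arbitrary) showing that the occasional 6 is exactly what a well-calibrated "likely" statement requires:

```python
import random

random.seed(0)
trials = 100_000  # arbitrary; any large number will do

# The (true) statement: a fair die is "likely" (p = 5/6) to land in 1-5.
failures = sum(random.randint(1, 6) == 6 for _ in range(trials))

# A calibrated statement at p = 5/6 SHOULD fail about 1/6 of the time;
# a failure rate far from 1/6 would indicate miscalibration, not one "wrong" roll.
print(f"failure rate: {failures / trials:.3f} (calibrated target: {1/6:.3f})")
```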

As you can see from the comments, I challenged Roger on this, and his response (entirely in character) is to duck and weave. In his comment #5, for example, he shamelessly misrepresents what I said, and brings up the red herring of a definitive prediction (when in fact I had clearly made a probabilistic one, and the distinction is of course absolutely fundamental to the point). The obvious elephant in the room that Roger cannot bring himself to acknowledge is that the statement is correct irrespective of the outcome of the roll. "Correctness" of a single statement simply isn't something that can be directly validated (or invalidated) by the outcome, and the accurate calibration of a probabilistic prediction system actually relies on having the appropriate number of "failures" for each level of probability.

I realise of course that having done some rather boring textual analysis that in his own words amounts to "Nothing too interesting, really", Roger is just rabble-rousing on his blog. I'm confident that any competent scientist will see straight through it, but that's hardly his target audience.

As for what the 72%/28% average actually does mean: it tells us nothing except that the IPCC makes a lot of statements about things that it is only (by its own admission) moderately confident about. It might in principle be interesting to see how the confidence level changes over time, but only if the set of statements were held fixed from one assessment to the next. People have looked at climate sensitivity estimates (hardly changed) and detection and attribution (increased markedly in confidence) but not a lot else AIUI. I suppose we can anticipate Roger claiming that the next report is either more correct, or less, depending on what mix of statements they happen to include :-)

Incidentally, and although it's a minor point, it is perhaps telling about his overall level of competence that Roger is also wrong when he claims that if the statements are not independent, then the proportion of "incorrect" ones will be higher than 28%. Actually, if the statements are not independent (while still being correctly calibrated), then the proportion that do not come to pass would still be 28% in expectation, just with higher variance, meaning that either a larger or smaller proportion would not be surprising. Unlike the simple misinterpretation in his blog post title, this elementary error is actually made in the paper itself.
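
A small simulation illustrates this. It's a sketch under invented assumptions: 360 statements at a uniform 72% confidence, with dependence introduced crudely by stating each of 36 underlying events ten times:

```python
import random
import statistics

random.seed(1)
P = 0.72  # uniform confidence level, chosen to match the 72% average

def failure_fraction(n_events, copies):
    # n_events independent events at probability P, each stated `copies` times,
    # so the total number of statements is always n_events * copies (here 360).
    outcomes = [random.random() < P for _ in range(n_events)]
    return sum(copies for o in outcomes if not o) / (n_events * copies)

for label, n_events, copies in [("independent", 360, 1),
                                ("dependent", 36, 10)]:
    fracs = [failure_fraction(n_events, copies) for _ in range(10_000)]
    print(f"{label:12s} mean = {statistics.mean(fracs):.3f}  "
          f"sd = {statistics.stdev(fracs):.3f}")
```

Both configurations average about 28% failures; only the spread around that average differs, which is exactly the point.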

39 comments:

Joel said...

"If the IPCC is 100% correct, 28% of what they say is wrong."

This is fun!

Anonymous said...

I saw Roger's post late last night around midnight EST, and immediately had the same incredulous thought (re: dice). I'm glad someone pointed this out.

Roger Pielke, Jr. said...

Hi Jaems, Thanks for keeping the snark over here ;-)

"The obvious elephant in the room that Roger cannot bring himself to acknowledge is that the statement is correct irrespective of the outcome of the roll."

In other words, by equating IPCC predictions with rolls of a die, you are implying that whatever happens in the real world, the IPCC is correct, about everything.

Infallibility doesn't even work for the Pope.

Here is how I responded to your question about the die:

On your question about a die, it is a bad analogy.

A die has a known PDF. So if you say something like "a die has 6 sides with each side having equal probability of turning up", that is by definition a correct statement. If you instead say "I'll give you 5 to 1 odds that the next roll is not a six", then that is a statement that can be verified by experience, and money will change hands.

Of course, with respect to the IPCC, PDFs are not known, and the report is written about the climate future, of which we will have only one.

Following your logic the IPCC can never be evaluated empirically about anything expressed probabilistically. This is silly. Probabilistic statements are made to add information to decision making, not to subtract from accountability.

If a weather forecaster offers an 80% chance of rain the next day and does so for 360 days, then I can evaluate that forecast by converting it into a deterministic prediction that should verify 80% of the time, if it is well calibrated. Similarly, if the IPCC offers up 360 statements expressed with various certainties, then (if its judgments are well-calibrated) we can expect that the scientific community will collectively judge 28% of those statements to be incorrect in the future.

I am happy to continue a conversation over at my blog. BTW, I like your new blog format.

Roger Pielke, Jr. said...

Sorry about the typo in your name ... on the "elementary error" (?) if there are 360 statements of which 100 are in error, and it turns out that instead of being independent 10 of them are combinations of the others (expressed at a higher level of likelihood), then removing those 10 statements means that we would only have 350 independent statements with a collectively lower likelihood, which will increase that 28%.

Snark away! ;-)

A.Grinsted said...

How can one have been so many years in science and yet be so clueless about statistics?

If the IPCC had used unrealistically small uncertainties then according to Pielke they would have been more right. Ludicrous!

andrew adams said...

Roger,

"In other words, by equating IPCC predictions with rolls of a die, you are implying that whatever happens in the real world, the IPCC is correct, about everything."

No, whatever happens in the real world we will be able to compare it against the projections made by the IPCC and make a judgement about the extent to which they got it right. If few of their "very likely" scenarios come to pass then they will be judged to have got it wrong.

EliRabett said...

BTW, congrats on the snarky subtleness over at RP's, something he didn't understand either.

Jesús R. said...

I think this sentence you wrote in his blog is especially clear:

"If 10% of their very likely statements turn out to not occur, this would simply mean their assessment was well calibrated."

But then, trying to state this correctly: in other words, this means that, for the IPCC to be correct, ~28% of their projections should turn out not to occur? I mean, the IPCC PDFs are telling us that around 28% of the projections will (probably) turn out not to happen? Is that it?

Anonymous said...

Don't you remember the story of "undergrad Megan", where Roger explained, in condescending fashion, just how the standard IPCC scientists were all wrong about statistics?

Regarding picky details about the post and the paper:

1) "very likely" should be 90 to 99% - or, on average, about 94.5% - does Roger assign it 94.5% or (as somehow I suspect is more likely), 90%?

2) "very unlikely" is 1 to 10%: Does a very unlikely finding coming to pass mean that the IPCC is correct, and if it doesn't happen, the IPCC is incorrect? That just seems like a bizarre interpretation, and yet, for 9 "very unlikely" or worse statements, that's basically 2.5% "incorrect" for the IPCC right from the start.

3) What the heck is up with Fig 1 of the paper? I can't for the life of me reconcile the colors in the legend with the colors in the bars in the Figure... did he leave out the column which had "likely", "extremely unlikely", and "low agreement, low evidence"?

-M

ps. Did you really have to switch to this OpenID thing?

David B. Benson said...

RPJr --- Breathtaking.

Go actually study some elementary probability text.

Anonymous said...

Also, Pielke states "The IPCC SRES went to some length to explain that its scenarios were considered equal probability."

I thought Roger was casting himself as some kind of expert on the IPCC? But, the IPCC SRES actually went to great lengths to state that their scenarios were "equally valid with no assigned probabilities of occurrence." That is pretty much explicitly stating that "equally valid" does NOT mean "equally likely".

More evidence: AR4: "The most controversial assumption in the Wigley and Raper (2001) probabilistic assessment was the assumption that each SRES scenario was equally likely. The Special Report on Emissions Scenarios (Nakićenović and Swart, 2000) states that ‘No judgment is offered in this report as to the preference for any of the scenarios and they are not assigned probabilities of occurrence, neither must they be interpreted as policy recommendations.’ "

E.g., the IPCC explicitly stated that the assumption that the SRES scenarios are equally likely is "controversial". That's pretty much the exact opposite of going to "great lengths to explain" how they are equally likely.

Except, of course, in Roger world.

(to give Roger credit, I do think that the IPCC could do a better job with thinking about probabilities and their meanings, but... it isn't an easy subject to handle. Just try having a discussion with a weather forecaster on what a 20% chance of showers means, and how that's misinterpreted, and that's a number we pay attention to dozens of times a year...)

-M

manuel moe g said...

Concerning JA's last paragraph: Question: what is worse - having somebody tell you your fly is down, zipping it up, then berating that helpful somebody for "snark"? Or, doing the same, except then stomping away petulantly with your fly still wide open?

Answer: I don't know, but today I learned that in the intersection of impenetrable high self-esteem, low competence, and a low standard for argumentation, there exists a hero.

JA is correct that RPJr will count this as a success, of sorts. But we couldn't pick two better foils than RPJr and Curry. Anyone who could fall for their "arguments" about probability is not in a position to make correct change, much less do maths. RPJr and Curry cannot keep consistency between two adjacent statements in their arguments, and their arguments, if successful, would immediately imply that the whole of the published literature on probability and statistics, all the way back to Pascal, the Bernoullis, ..., is absolutely without *any* intellectual content. Tall order.

Anyone convinced by such arguments, betrays himself as not really being in the business of "understanding".

EliRabett said...

The humor of this is that there is a way of correctly saying:

The IPCC assigns a 28% probability to some of their estimated outcomes not occurring . . . .

The joke is that Roger and his wingman Richard Tol probably won't admit to understanding the difference. Eli assigns a 100% probability to Roger not understanding the difference and a 90% probability to Richard understanding the difference but not admitting it.

Other expert priors differ

David B. Benson said...

I suppose Eli has the last word.

James Annan said...

Jesús,

The IPCC makes statements of the type:

S_i="We assign probability P_i to event E_i"

If their assessment is well-calibrated, then roughly 28% of the events E_i will not occur. To talk of the "correctness" of a single statement S is not meaningful. However, contrary to Roger's rather desperate claim "Following your logic the IPCC can never be evaluated empirically about anything expressed probabilistically", it is of course entirely possible and meaningful (and indeed routine in the field of probabilistic prediction, including weather prediction) to assess performance over a sufficiently large set of statements. If substantially more or less than 72% of the events E_i do occur, then the judgement of the IPCC will have been shown to be poorly calibrated.
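
To illustrate the mechanics, here is a toy sketch in Python (the probabilities and outcomes are invented; only the comparison matters):

```python
# Toy assessment set: (assigned probability P_i, did event E_i occur?).
# All values are invented for illustration.
assessments = [(0.90, True), (0.90, True), (0.90, False), (0.90, True),
               (0.66, True), (0.66, False), (0.66, True), (0.66, True),
               (0.50, True), (0.50, False)]

mean_p = sum(p for p, _ in assessments) / len(assessments)
hit_rate = sum(occurred for _, occurred in assessments) / len(assessments)

# Well-calibrated judgements have hit_rate close to mean_p over a large enough
# set; no single (P_i, E_i) pair is scored "right" or "wrong" on its own.
print(f"mean assigned probability: {mean_p:.2f}, observed frequency: {hit_rate:.2f}")
```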

(FWIW I expect events will show them to have been conservative in terms of their level of understanding, in that more than 72% of the events will occur.)

I gave a simple example of this in assessing Corbyn's forecasts a few years ago, for anyone who wants to see it in practice...

"46671558-c515-11e0-b866-000bcdcb5194", I think any google account should work fine. I do get a huge amount of annoying spam when I try making comments entirely open. In fact I get a lot of spam even so...

Dikran Marsupial said...

Roger's analysis is nonsense. The whole point of explicitly stating the uncertainty in the projections is that the IPCC don't expect every one of the (binary) projections to pan out even if the underlying theory is correct.

For example, the IPCC might say that increasing CO2 radiative forcing due to fossil fuel emissions means that, all things being otherwise equal, temperatures will rise. Thus, considering the unforced variation in the climate, it is LIKELY that temperatures will rise over the next decade. According to Roger's criterion, if we didn't observe a warming decadal trend, the IPCC would be incorrect to claim that CO2 radiative forcing has a warming effect on the climate. However, that is obviously incorrect: the observation does not falsify the claim unless a statistically significant cooling trend were observed, which is a much more stringent requirement than merely not observing warming. Note that the hypothesis can still be falsified, just not by Roger's test; hence the bogus 28% headline is sheer nonsense.

I am appalled (but sadly not surprised) that such nonsense gets published.

Jesús R. said...

Thanks for the clarification, James. I absolutely go along with this:

"I expect events will show them to have been conservative in terms of their level of understanding, in that more than 72% of the events will occur."

Dikran Marsupial said...

Having now had a look at the paper I am amused at how much Roger has attempted to claim from so little in the actual paper. IMHO this goes well beyond "provocative". Surely Roger knows that his blog post will be interpreted as saying 28% of the IPCC science is incorrect?

James Annan said...

DM, yes I'm sure that's precisely his goal - getting quoted by the Peisers and Moranos and their ilk.

Carl C said...

never trust a guy who agrees so much with his daddy...

David B. Benson said...

42

Steve Bloom said...

Yes, Carl, the similarity in argumentation styles is really something. The only difference seems to be that Jr. has learned to camouflage his obtuseness somewhat.

Robert Grumbine said...

The nutshell version?

* Scientists should be forthcoming about their uncertainty.

But:

* To be less than 100% certain is to be wrong (by the amount of the difference).

Hokay. A little more subtle than 'heads I win, tails you lose', so I suppose points for creativity.

Steve Bloom said...

And as Gavin noted recently over at RC, RP Jr. is *never* wrong.

Hank Roberts said...

> *never* wrong
But being misunderstood somehow *always* comes up.
http://earthobservatory.nasa.gov/Features/DeepFreeze/

William M. Connolley said...

RP isn't doing well, but you fail to jump on Tol's head too.

Incidentally, RP's paper that he is blogging looks like the sort of worthless thing written only to bump citation counts.

James Annan said...

I don't think there is much need to drag Tol into it: his criticisms seem to be aimed specifically at the language in WG2 rather than WG1 (and appear to be valid, though I haven't looked at WG2 in any detail myself).

Chip Knappenberger said...

James,

In the Summary for Policymakers of the AR4, the IPCC provides Table SPM.3, which lays out their “best estimate” of the future temperature rise as well as the “likely range” for a few of their SRES scenarios, among which they assign no probability of preference (as "46671558-c515-11e0-b866-000bcdcb5194" is quick to point out).

Is the information in this Table at all useful to “policymakers” or anyone else for that matter? It represents a collection of what-ifs with unverifiable real-world outcomes (which would be the case even if the time period in the Table had already been completed).

You seem to argue that the IPCC's predictions are only testable in bulk. But what if you have specific interests? Should you turn elsewhere?

Unverifiable guidance is no guidance at all.

-Chip

Steve Scolnik said...

The clearest explanation of this problem that I've seen is from an analysis of probabilistic weather forecasting by severe storms experts Charles Doswell and Harold Brooks. In case there's a remote probability that RPJr happens to know a meteorologist, perhaps he should inquire about it instead of continuing to engage in semantic games.

"An important property of probability forecasts is that single forecasts using probability have no clear sense of "right" and "wrong." That is, if it rains on a 10 percent PoP forecast, is that forecast right or wrong? Intuitively, one suspects that having it rain on a 90 percent PoP is in some sense "more right" than having it rain on a 10 percent forecast. However, this aspect of probability forecasting is only one aspect of the assessment of the performance of the forecasts. In fact, the use of probabilities precludes such a simple assessment of performance as the notion of "right vs. wrong" implies. This is a price we pay for the added flexibility and information content of using probability forecasts. Thus, the fact that on any given forecast day, two forecasters arrive at different subjective probabilities from the same data doesn't mean that one is right and the other wrong! It simply means that one is more certain of the event than the other. All this does is quantify the differences between the forecasters."

http://cimms.ou.edu/~doswell/probability/Probability.html

andrewt said...

To go with Roger's latest example: if the IPCC says it's very likely that an AFC team will win the Super Bowl, Roger equates that to the IPCC having a 90% chance of being "correct".

But if the IPCC says it's very unlikely that an NFC team will win the Super Bowl, Roger equates that to the IPCC having a 10% chance of being "correct".

But these are equivalent statements, or very close to equivalent (I guess the Super Bowl could be cancelled).

And it gets worse for the IPCC: suppose WG1 says an AFC win is very likely and WGII says an NFC win is very unlikely. Roger says the expected number of these findings that will be correct is less than 50% because the two findings aren't independent.

I could imagine referees thinking a shallow compilation of the form of IPCC findings was vaguely interesting, and I can certainly see how the simple maths error about independence could slip past a referee because it was in a footnote, but surely they should have noticed where < 28% became 28% in the paper.

James Annan said...

Steve,

Thanks for posting that on Roger's latest. Yes, it's exactly what I was saying, only better put :-)

Given RP's dishonest misrepresentation of what I said (and "accidental" absence of link to my post), I suppose I should write another article to explain (again), though I think everyone who is capable of understanding probably does get the point by now!

Andrew, yes I had thought about the inconsistency at the "unlikely" end of things. While it demonstrates the cluelessness of his approach, it isn't actually that important for the calculation, as there are not many of these statements.

James Annan said...

Chip,

I disagree with your characterisation of unverifiable guidance as no guidance. If you judge (say, based on a number of weather forecasts) that it is "likely" to rain tomorrow, then I suspect you will *use* that judgement even though the accuracy of your probability for that one day is never going to be verifiable. You can only calibrate your judgements over a large enough set of cases.

Philosophers may debate the meaning of expectation when there really is only a single event of interest...but note that in the case under discussion, Roger himself assembled a few hundred probabilistic statements into an ensemble, even though he now seems to deny that it can be treated as such.

Chip Knappenberger said...

James,

So the information in IPCC TAR Table SPM.3 is not useful in and of itself, but can only be judged collectively with all the other IPCC predictions?

What if I want to know about how good the IPCC guidance is on future global mean temperature trends and don't care about hurricane strength, et al.?

-Chip

Chip Knappenberger said...

James,

Here is a rundown of the IPCC's projections for the rate of temperature rise, taken from the Summary for Policymakers of each of their first 4 Assessment Reports:

First Assessment Report (1990):

“Under the IPCC Business-as-Usual (Scenario A) emissions of greenhouse gases, the average rate of increase of global mean temperature during the next century is estimated to be about 0.3°C per decade (with an uncertainty range of 0.2°C to 0.5°C). This will result in a likely increase in global mean temperature of about 1°C above present value (about 2°C above that in the pre-industrial period) by 2025 and 3°C above today’s (about 4°C above pre-industrial) before the end of the next century.”

Second Assessment Report (1995):

“For the mid-range IPCC scenario, IS92a, assuming the “best estimate” value of climate sensitivity and including the effects of future increases in aerosol, models project an increase in global mean surface air temperature relative to 1990 of about 2°C by 2100. This estimate is approximately one-third lower than the “best estimate” in 1990. This is due primarily to lower emission scenarios (particularly for CO2 and the CFCs), the inclusion of the cooling effect of sulphate aerosols, and improvements in the treatment of the carbon cycle. Combining the lowest IPCC emissions scenario (IS92c) with a “low” value of climate sensitivity and including the effects of future changes in aerosol concentrations leads to a projected increase of about 1°C by 2100. The corresponding projection for the highest IPCC scenario (IS92e) combined with a “high” value of climate sensitivity gives a warming of about 3.5°C.”

Third Assessment Report (2001):

“The globally averaged surface temperature is projected to increase by 1.4 to 5.8°C over the period 1990 to 2100. These results are for the full range of 35 SRES scenarios, based on a number of climate models.

“Temperature increases are projected to be greater than those in the SAR, which were 1.0 to 3.5°C based on the six IS92 scenarios. The higher projected temperatures and the wider range are due primarily to the lower projected sulphur dioxide emissions in the SRES scenarios relative to the IS92 scenarios.”

Fourth Assessment Report (2007):

“For the next two decades, a warming of about 0.2°C per decade is projected for a range of SRES emissions scenarios…

Since IPCC’s first report in 1990, assessed projections have suggested global average temperature increases between about 0.15°C and 0.3°C per decade for 1990 to 2005. This can now be compared with observed values of about 0.2°C per decade, strengthening confidence in near-term projections.”


So, what to make of all this?

Personally, I find the last statement from the AR4 laughable. Of the 4 projections, I think only the one from the SAR is correct. Funnily, the SAR explained why the FAR was wrong, and then the TAR explained why the SAR was wrong, and then the AR4 explained why all of them were right. Ha.

The IPCC seems to think that observations from 1990 to 2005 can be used to “strengthen confidence” in their near-term projections. By the same token then, how do the observations from 1997-2011 impact confidence in their projections? Since the observed trend is far beneath 0.2°C per decade, I would think it would weaken confidence in their projections, but, on the other hand, since even a trend of 0.0°C per decade (over 15 years) probably falls somewhere within their pdf (even if it is outside the 2.5th percentile), perhaps the observed trend further strengthens their confidence?

Can grossly missing the mark still count as getting it right?

-Chip

James Annan said...

Chip,

To recapitulate what I previously said, I reject your characterisation of "not useful", and I don't think you have justified it. It is (probably) not going to be possible to evaluate whether the probability level of "likely" was appropriate for the temperature ranges in that table (assuming some sort of Gaussian, we might reasonably reject if the temp change is wildly different). Just as tomorrow's observations will not determine whether today's prediction of a 70% chance of rain was a "correct" forecast. However, I bet you routinely use weather forecasts (and all sorts of other uncertain estimates of your own), which implies that you consider them "useful" nevertheless. Anyone who cares about temperature 20 years from now ought to care about what the current expert opinion is, even if they think they have good reasons for disagreeing (as I do, a little).

James Annan said...

That last comment of mine crossed with yours. To briefly address your later comment:

I believe the IPCC near term predictions were quite deliberately chosen to be wider than the model range. I happen to think that some of the work on which this is based has flaws, but that's the way it is.

I agree that a short interval of recent data does not do much to bolster their predictions, but I don't agree with your characterisation of it as "grossly missing the mark" either. I'd still like to see our joint work published somewhere...

Chip Knappenberger said...

James,

I'll send you some up-to-date figures...

-Chip

winston said...

"So, given 10 such statements, the IPCC is saying that they would expect the "very likely" outcome to occur about 9 times, and not occur about once."

You're sounding a bit frequentist there. I could have sworn you were a Bayesian.

James Annan said...

Winston, even Bayesian predictions have frequentist properties when you collect a bunch of them together :-)

(Also may be worth pointing out again that each day's weather forecast is actually Bayesian in nature.)