I'm not going to give a broad overview, as there is already a plethora of rather boring bloggorhea (sorry guys, but it is :-) ) on the subject. Go and read RC if you want the "consensus" view. Or just read the document itself, it's simple enough. I'm just going to pick out a few bits that are particularly interesting to me.


1. First, they predict continued warming for the next 20 years of "about 0.2C per decade", up from a "likely" range of 0.1-0.2C in the TAR. That change is not really surprising - it had been clear for some time that the actual warming rate was closer to the upper than the lower end of the TAR range. I don't know the history of the TAR but I guess that their selection of endpoints for their range owed as much to rounding as to a deliberate selection of 0.15C as a central estimate. Even back then the recent trend was above 0.15C and forecast to increase over time, especially under the implicit assumption of no volcanic eruptions (a big one could knock as much as 0.1C off a decadal average temperature). This new estimate is still a long way short of the probabilistic predictions that have been published, though (at least, the 2 such papers that I recently re-read).

Apparently there is a new Science paper which talks of a recent trend of >0.2C per decade. Every time I've looked at GISTEMP (eg here) it shows just a bit under 0.2C to me, so I'll have to check exactly what this new paper did. Anyway, "about 0.2C per decade" is fine by me.

2. On climate sensitivity, there is the much-leaked change from 1.5-4.5C (TAR) to 2-4.5C (AR4) at the same "likely" level. I think the change at the lower end may be as much due to increasing recognition that 1.5C is a firm limit as to stronger confidence that the value of S is actually greater than 2C. Forster and Gregory's recent estimate was 1.7C and there are several others with a strong likelihood close to the lower end of the range. Anyway, depending on how "likely" is interpreted, this phrase still acknowledges perhaps as much as 15% probability of S sneaking below the 2C threshold, but it cannot do so by much.

What they have said about the upper end of the range is more...interesting. They have added the phrase:

"Values substantially higher than 4.5°C cannot be excluded, ..."

A literal interpretation of this is completely vacuous (we can never assign a probability of precisely zero), so I'm not at all sure what they mean by including it. Note how it carefully avoids using the calibrated probabilistic language that has been adopted (likely, very likely etc). I can't help but be amused by RC's comment:

"the governments (for whom the report is being written) are perfectly entitled to insist that the language be modified so that the conclusions are correctly understood by them and the scientists. [...] The advantage of this process is that everyone involved is absolutely clear what is meant by each sentence. Recall after the National Academies report on surface temperature reconstructions there was much discussion about the definition of 'plausible'. That kind of thing shouldn't happen with AR4."

I predict discussion about the definition of "cannot be excluded". I will be discussing it, at least! I complained about this ambiguous phrasing (which appeared in similar form in a couple of chapters) in my review of the last draft, and explicitly asked the authors to explain more clearly what they meant by it. I've also asked a couple of authors who used similar phrasing in their papers but have not got a reply out of them. I find it hard to avoid the conclusion that this "cannot be excluded" phrase was deliberately chosen specifically for its meaninglessness, in order to be able to present a "consensus" rather than a strong disagreement about the credibility of such high values. I'm sure that those who assign a probability of 5% or even more to S greater than 6C will consider that this phrase supports them, even though Stoat parses it as "they do go on to diss > 4.5 oC a bit", presumably due to the "but agreement of models with observations is not as good for those values" which completes the sentence I partially quoted above. "Not as good" also has no probabilistic interpretation, of course.

[Update

Based on this first-hand report, the phrase was indeed chosen specifically for its ambiguity.]

3. There is more probabilistic confusion in the discussion of attribution of past climate changes (I've written about this before). This is perhaps most clearly demonstrated in

"It is very unlikely that climate changes of at least the seven centuries prior to 1950 were due to variability generated within the climate system alone."

One thing they might have said (and perhaps thought they were saying) is that an unforced system is very unlikely to exhibit the observed level of climate changes. That is an essentially frequentist statement about ensembles of model runs. But what they have actually said appears to be the Bayesian statement that they believe that there was external forcing in the real world. No shit, Sherlock! The cavalier way in which the detection and attribution community freely switches between frequentist and Bayesian approaches to probability, without any clear explanation, gives every impression that they do not understand the difference (or even perhaps realise that there might be a difference). Their writing about the recent warming is similarly clumsy:

"it is extremely unlikely that global climate change of the past fifty years can be explained without external forcing, and very likely that it is not due to known natural causes alone."

The first statement again is essentially frequentist, the second Bayesian. The existence of anthropogenic forcing as a contribution to the recent climate changes is not merely "very likely", it is at least "virtually certain"! (I don't believe there is a single working climate scientist who would argue that the anthropogenic forcing has been precisely zero. There is of course debate over its magnitude - some legitimate, some specious.)

I should point out that this criticism doesn't invalidate (or even weaken) the broad thrust of the report. I'm grumbling about the D&A stuff primarily because it forms the basis of the confusion in the climate sensitivity debate, rather than actually mattering in itself. They could have written things in a clear and correct manner without substantively affecting the overall message.

## 25 comments:

James, perhaps you could explain to the ignorant (me) why it is significant that the language switches between frequentist and Bayesian statements: what difference does it make to the meaning (apparent or literal) of what is being presented? Alternatively, I suppose I could re-read the Wiki entries, but this won't explain why you see this as problematic.

Regards,

Fergus.

It matters because it is wrong. It is wrong because "the probability of data D given hypothesis H" and "the probability of hypothesis H given data D" are not the same. Equating the two is known as the Prosecutor's Fallacy (I've blogged about it before.)

Simple example: The probability of a roll of two fair dice turning up double-6 is 1 in 36. This is a trivial mathematical statement of frequentist probability (repeat the dice roll a large number of times and the long-run proportion of double sixes will tend to 1/36). If a single roll of a specific pair of dice turns up double-6, what is the probability that the dice are fair? There is no objective answer to this question - and in particular, 1 in 36 is almost certainly wrong. We can legitimately say "it is very unlikely that fair dice will turn up double-6". We cannot legitimately say that "it is very unlikely that this observed double-6 was generated by fair dice" or equivalently "it is very likely that this double-6 was generated by biased dice".
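[To make the dice example concrete, here is a minimal sketch of Bayes' theorem applied to it. The "loaded" alternative (double-6 with probability 1/4) and the 50/50 prior are invented purely for illustration - any other choices would give a different posterior, which is exactly the point.]

```python
from fractions import Fraction

# P(double-6 | fair dice): the frequentist statement.
p_d6_given_fair = Fraction(1, 36)

# To get P(fair | double-6) we must invent an alternative hypothesis
# and a prior. Here: a hypothetical loaded pair showing double-6 with
# probability 1/4, and a 50/50 prior between fair and loaded.
p_d6_given_loaded = Fraction(1, 4)
prior_fair = Fraction(1, 2)

# Bayes' theorem: P(fair | D) = P(D | fair) P(fair) / P(D)
evidence = p_d6_given_fair * prior_fair + p_d6_given_loaded * (1 - prior_fair)
posterior_fair = p_d6_given_fair * prior_fair / evidence

print(posterior_fair)  # 1/10 - nothing like 1/36
```

Swapping in a different prior or a different loaded alternative changes the answer, so the posterior is inescapably subjective even though P(D|H) is not.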

As I said, the actual consequences in terms of the scientific understanding of the statements I highlighted in the SPM are quite small, but it certainly creates the impression of a bunch of people who don't understand very clearly what they are doing, and it has much more serious consequences for estimation of climate sensitivity. This was written and reviewed by a large number of eminent climate scientists!

(I should make it clear that I do not claim a high level of expertise myself, but I can see clearly enough where they are going wrong.)

I agree with the general point that confusion between frequentist and Bayesian probability interpretations is a big problem, and it's not surprising that it shows up in the IPCC report, since published empirical work is predominantly presented in terms of a frequentist hypothesis testing model, while theoretical analysis is at least implicitly Bayesian in most cases.

But looking at the statement "it is extremely unlikely that global climate change of the past fifty years can be explained without external forcing, and very likely that it is not due to known natural causes alone", I can't see any semantic or grammatical difference between the two clauses. Both seem Bayesian to me - the only difference is that natural external forcing is excluded in the first statement and included in the second. Can you clarify this?

John Q

If I roll a pair of dice 400 times, and get 11 sixes, can I then make a statement about the fairness? If so, what is the proper way to qualify that statement? Does my initial assumption of fairness, or lack thereof, in the dice before I start rolling have any effect?

LL,

You can make a statement, but any (honest and mathematically consistent) statement that you make will necessarily depend on your prior belief about the fairness of the dice, and therefore contain a subjective element. For some prior beliefs, you would still think that the dice were biased - eg consider that they are hand-made and imperfect. In that case (in fact in any real case) the prior probability of them being exactly unbiased is zero, and your posterior belief will remain zero after any finite number of throws. However, for a large range of plausible priors you would end up being more confident that the bias is at least fairly small (which may in practice be the question you are really interested in). Here is an example about coin tossing.
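[A sketch of how the prior enters, reading LL's 11 hits in 400 rolls as double-sixes and putting a conjugate Beta prior on p = P(double-6). Both priors below are invented for illustration; nothing here singles them out as correct.]

```python
# Posterior for p = P(double-6) after k hits in n rolls of a pair of
# dice, under two illustrative Beta priors. For a Beta(a, b) prior the
# posterior is Beta(a + k, b + n - k) by conjugacy.
k, n = 11, 400

def posterior_mean(a, b):
    return (a + k) / (a + b + n)

flat = posterior_mean(1, 1)          # uniform "know nothing" prior on p
confident = posterior_mean(10, 350)  # prior centred near the fair value 1/36

# Both land near 1/36 (~0.0278), but not at the same place: the data
# alone do not determine the answer.
print(round(flat, 4), round(confident, 4))
```

With 400 throws the two posteriors are already fairly close, which illustrates James's later point that plenty of accurate data makes the choice of prior matter much less.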

John,

It is interesting that you interpret both statements as Bayesian, as I'm sure that some climate scientists would call them frequentist :-)

However I see your point and I think it boils down to how one parses "can be explained" versus "is (not) due to". The first IMO (and in the literature) is referring to the probability of something happening in a repeatable random ensemble of unforced model experiments whereas the latter appears to be making some statement of beliefs about the real world itself.

The wording is a sort of ambiguous half-way-house but it is certainly clear from the underlying literature that this question has almost universally been addressed in purely frequentist terms. The reason the wording seems Bayesian is that of course everyone wants the Bayesian answer (what we believe about the real world), but they didn't want to do nasty subjective Bayesian probability :-)

Thanks for the explanation, James.

The Prosecutor's Fallacy, as you call it, has a simple logical form which shows the problem clearly.

Just for fun, I reworded the external forcing/natural causes statement to give the two sides approximate equivalence and ended up with: '...it is extremely unlikely that...can be explained without external forcing, and very unlikely that it can be explained by natural forces alone...'

I feel a Homer Simpson moment coming...

Regards, Fergus

This entire conversation shows the limitations of observation-driven decisions, which is where models and lab experiments come in (know something about the center of mass, determine the weight distribution of the dice and you have an unequivocal answer).

>But looking at the statement "it is extremely unlikely that global climate change of the past fifty years can be explained without external forcing, and very likely that it is not due to known natural causes alone", I can't see any semantic or grammatical difference between the two clauses. Both seem Bayesian to me - the only difference is that natural external forcing is excluded in the first statement and included in the second.

It seems like there are at least two differences to me. As you point out there are the natural external forcings. There are also the unknown natural variations, and while these are believed to be smaller than the natural external forcings, I suggest the unknowns are more important to understanding what has been written.

James seems to think the first part is frequentist so it should be possible to reword it more clearly as being about ensembles of runs. Perhaps:

Having done a large ensemble of runs without any external forcings, it is extremely unlikely that such a run could reproduce the global climate change of the last fifty years.

I don’t think this is the same as what has been written in the first part because an ensemble of runs only includes the known or modeled natural variations and doesn’t include the unknown/unmodelled natural variations. Therefore what has been written in the first part appears to me to include at least an element of belief about the unknown natural variations being very small.

OTOH following this reasoning, the second part is only about *known* natural causes (forcings plus variations). Therefore I suggest that this part could be reworded as:

Having done a large ensemble of runs with only known natural forcings and variations, it is very unlikely that such a run could reproduce the global climate change of the last fifty years.

So I am inclined to say the first part is Bayesian and the second part frequentist. However, I am probably parsing it wrongly and would appreciate any help in understanding it.

So that is three people with three different views and James thinks there will be other climate scientists going for the 4th possibility. Aren’t we doing well ;)

crandles
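[The frequentist "ensemble of unforced runs" reading above can be sketched as a toy significance test. The AR(1) red-noise model, its parameters, and the "observed" trend are all invented for illustration and bear no relation to any real climate model or dataset.]

```python
import random

random.seed(0)

def trend(series):
    # Ordinary least-squares slope per time step.
    n = len(series)
    xbar = (n - 1) / 2
    ybar = sum(series) / n
    num = sum((x - xbar) * (y - ybar) for x, y in enumerate(series))
    den = sum((x - xbar) ** 2 for x in range(n))
    return num / den

def unforced_run(n=50, phi=0.6, noise=0.1):
    # Toy "internal variability only" model: AR(1) red noise.
    t, out = 0.0, []
    for _ in range(n):
        t = phi * t + random.gauss(0, noise)
        out.append(t)
    return out

observed_trend = 0.015  # ~0.15C/decade in C/yr - a made-up stand-in
runs = [trend(unforced_run()) for _ in range(2000)]
frac = sum(r >= observed_trend for r in runs) / len(runs)

# frac is the frequentist answer: the fraction of unforced runs that
# reproduce a trend at least as large as the observed one.
print(frac)
```

A tiny `frac` licenses "an unforced ensemble is extremely unlikely to produce this trend" - but, per the Prosecutor's Fallacy discussion above, not "it is extremely unlikely that the real trend was unforced" without a prior.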

Chris, isn't it the case that model runs have been done using, e.g., higher amounts of solar forcing than have been measured in the last 50 years, and that the results basically went off the rails? If so that would seem to be an additional nuance.

I just think this is a terrific article. I got lucky and followed a name from another thread. I just got done saying it is darn hard to find anyone who knows their stuff to give a non-consensus view and then you hit it nicely.

If you'd like to broaden your audience, we have a community science site in beta and we'll have about 300,000-500,000 readers a month once we get done and I think the few thousand we have reading it even now would get a lot of value out of this.

Eli,

You'll still have uncertainty on your measurements, and if you want to do all the sums carefully you'll have to choose a prior. You may end up choosing some sort of uniform prior by default if you don't think about it too hard - which in practice may be ok (especially given accurate observations), but still raises the issue of: uniform in what?

Standard question: if you measure the side of a perfect cube to be 10.2cm, with an uncertainty of 5mm (gaussian), then what is the probability that the volume of the cube is greater than 1 litre?

(Sorry, this is getting dangerously close to Frame and "the role of prior assumptions" again. I thought I had given that up...)

"So that is three people with three different views and James thinks there will be other climate scientists going for the 4th possibility. Aren't we doing well ;)"

As RC puts it, "everyone involved is absolutely clear what is meant by each sentence" :-)

There is an IPCC coordinating lead author visiting next week. I'll try to find out what he thinks it means.

"Standard question: if you measure the side of a perfect cube to be 10.2cm, with an uncertainty of 5mm (gaussian), then what is the probability that the volume of the cube is greater than 1 litre?"

OK, I'll bite. Given that it's a perfect cube the probability is the same as that the side is more than 10cm, which is a little bit over 50 per cent.

JQ

John,

You've not shown your working or provided precise numbers. However, in order to deduce that the length of the side is probably greater than 10cm, you had to assume (perhaps implicitly) some prior distribution of possible sizes. The obvious assumption that the cube side has a 34% chance of being less than 10cm (tail of Gaussian beyond 0.4 sd, as per my numbers) is equivalent to assuming a prior which is uniform in the length of the side.

If you assumed instead that the cube was created by someone taking a random volume of metal and squashing it into shape, you might have started from a prior which is uniform in volume rather than the length of a side. Follow the sums through and you will get a different answer. If you find an inscription saying "98mm-sided cube from Mitsubishi cube-making factory" then you might get yet another different answer!

Fundamentally, an observation with Gaussian measurement error on a variable does not mean that the variable's (Bayesian) estimate is given by the Gaussian of the appropriate width centered on the observed value. That's obvious enough with a non-negative value such as mass or length.
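[The two priors James describes can be sketched numerically. Uniform-in-volume corresponds to p(L) ∝ L², since dV = 3L² dL; the integration grid and its endpoints are arbitrary choices for this illustration.]

```python
import math

mu, sigma = 10.2, 0.5  # measured side (cm) and gaussian error

# Uniform-in-length prior: the posterior is just the gaussian itself,
# so P(V > 1 litre) = P(L > 10cm) = Phi((mu - 10) / sigma).
p_uniform_length = 0.5 * (1 + math.erf((mu - 10.0) / (sigma * math.sqrt(2))))

# Uniform-in-volume prior: reweight the likelihood by L^2 and
# renormalise (crude numerical integration over a grid wide enough
# that the truncated tails are negligible).
def post(L):
    return L**2 * math.exp(-((L - mu) ** 2) / (2 * sigma**2))

dx = 0.001
grid = [8.0 + i * dx for i in range(int(4 / dx))]
total = sum(post(L) for L in grid)
above = sum(post(L) for L in grid if L > 10.0)
p_uniform_volume = above / total

# ~0.655 for the first, a few percent higher for the second: the
# reweighting by L^2 drags the posterior towards larger cubes.
print(round(p_uniform_length, 3), round(p_uniform_volume, 3))
```

Neither answer is "the" answer; the point is that the gaussian measurement error alone does not determine the posterior.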

Ah, but my question is what if you measure the side of a cube with a digital vernier caliper accurate to 0.01 cm, traceable to NIST and get a value of 10.2 cm. There is an uncertainty of 0.01 in the measurement, but there is NO probability that you are off by 0.02 (you have to know something about calipers and you told me it was a perfect cube).

Eli,

Of course I agree that if you can make more accurate measurements (or repeats with independent errors) then you are fine, and this probably helps to explain why the strictly correct approaches have not really caught on. However, it is not easy to go back in time and re-measure historical climatic changes using modern technology :-)

I take your point, James. I'd be interested to see you relate it to my concerns about your Bayesian treatment of climate sensitivity. It seems to me that there are lots of possible priors here, giving rise to different posterior distributions for sensitivity.

JQ

John,

I agree that there is no one unique answer and different people could come to different conclusions.

However, based on the widespread "success" of the Prosecutor's Fallacy in D&A (which is basically equivalent to assuming a uniform prior, roughly speaking), some researchers have adopted this as an axiom for probabilistic estimation in general. They therefore use a uniform prior for sensitivity - and claim this is "ignorant" when in fact it assigns extraordinarily high prior probability to extremely high sensitivity, and very low prior probability to values close to 3C. After updating with a small amount of observational evidence (typically the temperature rise in the 20th century, which is only one small part of our total evidence) there is still an alarmingly high posterior probability of high sensitivity.

As I've argued in a few places, any reasonable approach that either starts from a plausible prior (IMO there are good theoretical arguments for a peak around 2-3C and long but thin tail to high values) and/or takes a comprehensive look at the wide range of observational data available, will inevitably result in a much stronger focus around the preferred value of about 3C (personally I think 2.5 is better for various reasons) with a very low probability of exceeding about 4 or 4.5C. I think the contortions that seem to be necessary to support high sensitivities go well beyond the reasonable level. I would happily update my belief in the presence of more careful calculations than we have performed, but in the 18 months since I started to circulate these thoughts on the matter no-one (to my knowledge) has produced any new work that takes account of what we said.

Some prominent scientists don't like what I have to say and are doing their best to deny, downplay or obfuscate their way around it. I wouldn't like to speculate as to whether this is because they are embarrassed at their mistakes, too invincibly ignorant to consider that they might be wrong, or just don't like the political consequences of settling for a significant but not extraordinary value for climate sensitivity. I'm more worried about the political consequences for climate science that such clearly wrong work could have gained such traction in the first place.

Hi, true and true, but wrt climate, as Drew Shindell pointed out at an AGU talk I attended, we are probably already at the point where the models are better than the data pre-1850 and are rapidly pushing that forward.

The second point is a point about measurement. When you have a scale, the interpolation (the uniform prior) only exists between the two lines where the measurement falls on the ruler; the probability of the result being outside of those lines is zero.

Thanks for persisting with this explanation, James, and for the 'prosecutor's fallacy' post.

(BTW, I am not the "Hank" who's been posting invites to a website.)

"(BTW, I am not the "Hank" who's been posting invites to a website.)"

Yes, I realised that. I was flattered to be told how wonderful my blog was, until I noticed he'd said much the same to Stoat :-)

I'm probably a bit dim, but I can't match that sentence with the prosecutor's fallacy (I can match the examples given though).

Anyway, this link may help shed some light:

http://www.iisd.ca/vol12/enb12319e.html

I'm wondering if the sentence:

"Following comments from Saudi Arabia and Austria, text stating that "warming of the climate system has been detected and attributed to anthropogenic forcing" was modified, with language separating the detection and attribution components into two separate sentences."

may be of relevance. The following paragraph may be of interest, too.

Adam,

Well, the PF is basically swapping P(D|H) for P(H|D). The experiments generally investigate the probability of the observed data under the null hypothesis of no anthropogenic influence. What we want is an estimate of the anthropogenic influence given the data, and you have to apply Bayes' Theorem to get that!

Although it's possible that the politicians confused the language slightly in the SPM, the same confusion is clearly identifiable in the primary literature.

Okay, thanks... statistics was never my strong point.
