Several people have emailed me about this article. I don't have anything particularly novel or interesting to say, so I'll just repeat an email that I sent regarding it...

From my point of view, the problem is not particularly in the treatment of the Forster and Gregory result - the authors had already in that paper pointed to the choice of prior as an important factor in the specific results they generated. Rather, the error was in the IPCC's endorsement of, and rigid adherence to, the use of a uniform prior, despite the existence of very straightforward arguments that this approach is simply not tenable:

http://www.springerlink.com/content/7np5t35mq27p3q24/

(also here:

http://www.jamstec.go.jp/frsgc/research/d5/jdannan/probrevised.pdf )

These arguments (which as you saw I made during the IPCC review process [here here here]) were basically brushed aside. The IPCC authors exclusively relied on and highlighted the results that had been generated using uniform priors, and downplayed alternative results, which already existed in the literature, that had been generated with different priors.

However, with the passage of time I believe my arguments have now become more widely (if grudgingly) accepted, so I look forward with some interest to see how the IPCC authors deal with the subject this time.

I should also add that I'm not at all convinced by the author's claims that a prior which is uniform in feedback (1/sensitivity) is "correct". Rather, it is something that people have to think about, and on which they may reasonably disagree. Such is life. It is theoretically possible that someone could even present a plausible argument for a uniform prior in sensitivity, but I've not yet seen one...
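As a rough illustration of why this choice matters: a prior that is uniform in feedback (λ = 1/S) is not uniform in sensitivity S, because the change of variables gives a density proportional to 1/S². The sketch below compares how much prior probability each choice puts on high sensitivity. The function name and the bounds of 0.5 and 20 are assumptions made purely for illustration and are not taken from the IPCC report or from any of the papers discussed.

```python
def prior_mass_above(threshold, kind, s_min=0.5, s_max=20.0):
    """Prior probability that S exceeds `threshold`, with S restricted to [s_min, s_max].

    The bounds are hypothetical; they only serve to make both priors proper.
    """
    if kind == "uniform_in_sensitivity":
        return (s_max - threshold) / (s_max - s_min)
    if kind == "uniform_in_feedback":
        # lambda = 1/S is uniform on [1/s_max, 1/s_min];
        # S > threshold is the same event as lambda < 1/threshold.
        lam_lo, lam_hi = 1.0 / s_max, 1.0 / s_min
        return (1.0 / threshold - lam_lo) / (lam_hi - lam_lo)
    raise ValueError(f"unknown prior kind: {kind}")


for kind in ("uniform_in_sensitivity", "uniform_in_feedback"):
    print(kind, round(prior_mass_above(6.0, kind), 3))
```

With these illustrative bounds, the uniform-in-sensitivity prior assigns roughly 0.72 probability to S above 6 before any data are considered, against roughly 0.06 for the uniform-in-feedback prior. Neither number is a claim about the climate; the point is only that "uniform" is not a neutral choice.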

## 20 comments:

Well, it seems I was always told it didn't matter in the end what your choice of Bayesian prior was. Different people will legitimately have different ones anyway - Bayesianism is inherently subjective and "psychological" as its old frequentist critics always said. What should happen is, as more data comes in, the real world itself constrains our expectations so even people with far-out priors, like our skeptic friends who think it's highly unlikely for sensitivity to be greater than 1, have to in the end agree on what the real-world value is.

The problem is, we don't have enough data yet.

Now one thing I find odd about the IPCC graph being discussed is - shouldn't all the different lines of evidence *combine* to form a single new Bayesian estimate for the likelihood of sensitivity? I.e. why can't you combine radiative observations and paleo-climate analyses and volcano responses etc. into a single collection of evidence, rather than treating each separately? Has anybody tried this?

I'm still a bit confused. I can now see that the debate over priors was already well known, so what is new in Lewis' article? That no one had realised the graph had been changed to accord with the IPCC view?

Seems a bit of a storm in a tea cup to me.

Arthur, I think you want this post (and associated paper) :-)

Steve, yes it looks to me that Lewis is basically rediscovering something that was well known, and widely accepted as appropriate, at least according to the majority view at the time. I'm not sure they would do the same thing now.

To repeat a point that has been made before, if you don't want to rely on subjective judgement when you have multiple strands of evidence, pick one as the prior, and see how the others modify it. You could even pick each line of evidence in turn as the prior. Priors are not ignorant, but a uniform prior is.
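The suggestion above can be sketched numerically. The example below treats one line of evidence as the starting prior and updates it with the others on a grid of sensitivity values, then repeats the exercise starting from a different line. All three likelihoods are made-up Gaussians with hypothetical means and widths, and the sketch assumes the lines of evidence are independent, which real ones may not be.

```python
import numpy as np

# Grid of sensitivity values in deg C (illustrative range).
S = np.linspace(0.1, 20.0, 2000)


def gaussian_likelihood(mean, sd):
    """Made-up Gaussian likelihood over the grid S."""
    return np.exp(-0.5 * ((S - mean) / sd) ** 2)


def normalise(weights):
    return weights / weights.sum()


def update(prior, likelihood):
    """One Bayesian update on the grid: posterior is proportional to prior * likelihood."""
    return normalise(prior * likelihood)


# Three hypothetical, independent lines of evidence.
evidence_a = gaussian_likelihood(3.0, 1.5)
evidence_b = gaussian_likelihood(2.5, 2.0)
evidence_c = gaussian_likelihood(3.5, 2.5)

# Use A as the prior and update with B then C ...
post_a_first = update(update(normalise(evidence_a), evidence_b), evidence_c)
# ... or use C as the prior and update with A then B.
post_c_first = update(update(normalise(evidence_c), evidence_a), evidence_b)

print(np.allclose(post_a_first, post_c_first))
```

Because each update is just a multiplication followed by normalisation, the final distribution does not depend on which line of evidence is taken as the prior, provided the independence assumption holds.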

Would it be possible to send out a survey/questionnaire to a few dozen experts to extract their views on what would be a reasonable subjective estimate of the probability distribution of climate sensitivity as at 1980? Would getting a dozen answers allow a reasonable calculation of an average and a 95% confidence interval?

Would anyone think that the Cauchy distribution James and Jules suggested (not as being reasonable, but as being immune to accusations of overconfidence) would end up inside the 95% confidence interval?

Would recommending use of the average and both ends of the 95% confidence interval as 3 different priors to be used be a reasonable way of suggesting a standardised approach allowing reasonable comparisons between different studies? (If researcher(s) wanted to make their particular views of an appropriate prior known this would not prevent them presenting the results arising from 4 priors.)

It seems to me that presenting information from the two ends of a 95% confidence interval is what is needed to see how good the data is at reducing the uncertainty.

If that is possible and what is needed, why isn't someone getting on and doing it?

Reading through the lengthy comments on NicL's thread, what stood out (other than the comments from a few naive people claiming that researchers are nice people, or that we go out of our way to treat each other with impeccable courtesy) was this comment from Tom Gray:

"If the IPCC is supposed to be an assessment of the peer reviewed literature then why is it allowed for the writing team to create and insert new work. This new work may be valid and useful but it has not been peer reviewed for publication or answered in the literature. Why is it given weight and substance by appearing in an IPCC AR?"

This comment does seem to nail the real problem, which was the insertion of original research into what was supposed to be (in my understanding) a summary document of the "state of the art" at the time of its publication.

At the minimum you would need a neutral group of editors who were not associated with the authors of the new material. But what we seemed to have here is people generating new material then acting as editors of their own work in fielding comments on it, with predictable results.

I don't think it's a matter of opinion as to whether what they did was wrong here. It was just wrong, at least if one is applying Bayesian statistics to it, rather than some new statistical framework invented just for this exercise. James' comment nails this:

"Basically, you have thrown away the standard axioms and interpretation of Bayesian probability, and you have not explained what you have put in their place."

Is this only a tempest in a teapot? (That would imply there was nothing substantively of interest in NicL's comments, which there obviously is.)

Carrick:

Yes, it is an interesting situation. The IPCC authors aren't supposed to do any science when writing the report, but this is unavoidable when they are aggregating the results from several papers. I think that they just didn't realise at first that by adopting uniform priors and averaging the results, they were implicitly adopting a certain (wrong) mathematical framework. Typical example of a bunch of physicists not realising that mathematics lies beneath all that they do. So we try to help out and they act all offended and resistant. That was the really weird bit - we actually were naive enough to think they'd just thank us for our correction and that would be the end of it!

Carrick, I think that may be conflating different issues. I think that the scientists working in this area were dead wrong to promote a uniform prior so monomaniacally; I think the IPCC authors emphasised and reinforced this likewise; however, given that context, the specific reinterpretation of FG's result was not IMO particularly contentious, especially as FG had themselves alluded to this in the paper (and G was a co-author on the Frame et al paper in which this uniform prior obsession was explicitly codified).

Also: what jules said, though the averaging thing hasn't been part of this particular kerfuffle.

jules, thanks for the comment. Regarding " Typical example of a bunch of physicists not realising that mathematics lies beneath all that they do." I admit I had a good laugh at that point because it is so true.

I recognize that aggregating other people's work is a necessary (and challenging) part of a report like this.

I believe there needs to be a predetermined workflow associated with certifying a particular result, and any overlap between people generating the aggregated results and those certifying it needs to be carefully managed, because there certainly seems to be a substantive, and likely unavoidable, conflict of interest in this case.

I'm of course not shocked by their intense and really emotional reaction to your criticism. People get very passionate about their own work, which is why it is so important that they not be the ones making editorial decisions about it.

James, I do agree that there has been a conflation of certain issues here. I don't see anything devious or underhanded in how the IPCC authors/editors handled this particular issue.

The averaging thing still shows up by the way. As you probably are aware, Isaac Held has a blog now and has recently posted on using model ensembles to reduce uncertainty.

Ah, climate science, whenever I ask "has anybody tried this" the answer always seems to be yes!

And Annan and Hargreaves (2006) was even cited by AR4 WG1 in discussion of sensitivity estimates (Ch. 10 at least where I looked) - though in a little bit of a wishy-washy fashion.

But I'll ask another "has anybody tried" question just for fun - can you do the same sort of analysis while considering alternate priors (uniform in sensitivity, uniform in feedback, weighted to low sensitivity) and compare what the multiple lines we have now do to the final distribution based on alternate priors? I think that would be valuable in seeing whether we really have enough data yet that people should be converging on an estimate, or not...
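The kind of comparison asked about here can be sketched on a grid. The example below applies the three kinds of prior mentioned (uniform in sensitivity, uniform in feedback, weighted to low sensitivity) to the same likelihood and reports the resulting 95% upper bound on S. Everything numerical in it is an assumption for illustration: the grid bounds, the single made-up likelihood (Gaussian in feedback 1/S), and the exponential form chosen for the low-sensitivity prior. It is not the analysis from any of the papers discussed.

```python
import numpy as np

# Grid of sensitivity values S in deg C; the bounds are illustrative only.
S = np.linspace(0.5, 20.0, 4000)


def upper_95(prior, likelihood):
    """95th percentile of the grid posterior proportional to prior * likelihood."""
    post = prior * likelihood
    cdf = np.cumsum(post) / post.sum()
    return float(S[np.searchsorted(cdf, 0.95)])


# Made-up likelihood, Gaussian in feedback 1/S (hypothetical mean and width).
likelihood = np.exp(-0.5 * ((1.0 / S - 0.35) / 0.15) ** 2)

priors = {
    "uniform in sensitivity": np.ones_like(S),
    "uniform in feedback": 1.0 / S**2,  # Jacobian of lambda = 1/S
    "weighted to low sensitivity": np.exp(-S / 3.0),  # hypothetical choice
}

upper_bounds = {name: upper_95(p, likelihood) for name, p in priors.items()}
for name, bound in upper_bounds.items():
    print(f"{name}: 95% upper bound ~ {bound:.1f} C")
```

To bring in several lines of evidence, the `likelihood` array would be replaced by the product of the individual likelihoods (assuming they are independent). If the upper bounds from the different priors then lie close together, the data are doing most of the work; if they remain far apart, the prior still matters.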

Arthur for that I think you want

http://julesandjames.blogspot.com/2009/09/uniform-prior-dead-at-last.html

and the linked paper :-)

crandles - thanks, of course James would have done my second "has anybody tried" one as well :) Though it's not quite what I was looking for, and has raised an alarming further question for me, as these things often do...

That is - suppose you pick a prior in feedback that has a finite probability for 0 (implying climate instability, S = infinity). What distribution does that leave you with in S? Has anybody tried looking into that from a Bayesian perspective? I.e. do observations actually significantly constrain the likelihood of real instability, or not?

Quote from abstract:

"When instead reasonable assumptions are made, much greater confidence in a moderate value for S is easily justified, with an upper 95% probability limit for S easily shown to lie close to 4°C, and certainly well below 6°C."

Quote from paper:

"We truncate this prior at 0°C and 100°C for numerical convenience but, in contrast to the uniform priors previously discussed, the influence of the upper bound on the results presented here is negligible."

I would hope that, in a prior with finite probability of feedback = 0 and sensitivity infinite, that probability would be extremely small; smaller than in the Cauchy distribution in sensitivity being discussed here, if it is to be considered a plausible prior.

If that finite probability is higher and not a plausible prior then who cares? That would merely be making a scary outcome by using a non believable prior. (No-one in climate science would go for such scare-mongering would they? ;op)

The reason the upper bound is negligible in this case is that the data is good enough. Surely the paper is saying that the data is not only good enough to constrain the likelihood of real instability but also good enough to make sensitivity of more than 6C (if not 4C) unlikely.

The last paragraph should have started:

The reason the influence of the upper bound on the results is negligible is....

It depends what the reasoning behind the "prior" is, here. Supposedly we're trying to use observations to constrain the likelihood of particular values of the equilibrium sensitivity. Therefore, observational evidence should not factor into the "prior", rather it should be based purely on a priori theoretical considerations (or complete lack of knowledge if you don't believe the theory).

Now, the theory I believe says equilibrium sensitivity is the bare Planck response (1.2 C within 10%) divided by 1 - f_wv+lr - f_cloud+other. The water vapor plus lapse rate term is about 0.5, within maybe 20%. The cloud term, as James' paper points out, is believed at least slightly positive but where most of the uncertainty lies. A priori, there is no strong reason (apart from observational evidence) to constrain f_cloud+other to less than 0.5. Therefore I would think the most logical prior from a theory standpoint would have the 1-f feedback term peaked around 0.4 with a range from say -0.2 to 1.0 or thereabouts.
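The arithmetic in this comment can be written out as a small sketch. It uses the figures quoted above (a Planck response of 1.2 C and a water vapour plus lapse rate feedback of 0.5) and the linear feedback formula as the comment states it. The function name is hypothetical, and returning infinity when the denominator is zero or negative is just a convention of this sketch to flag that there is no finite equilibrium in this simple picture.

```python
PLANCK_RESPONSE_C = 1.2  # bare Planck response in deg C, as quoted in the comment
F_WV_LR = 0.5            # water vapour + lapse rate feedback, as quoted in the comment


def equilibrium_sensitivity(f_cloud_other, planck=PLANCK_RESPONSE_C, f_wv_lr=F_WV_LR):
    """Sensitivity = Planck response / (1 - f_wv+lr - f_cloud+other).

    Returns infinity when the denominator is zero or negative, which in this
    linear picture means the feedbacks are strong enough for instability.
    """
    denominator = 1.0 - f_wv_lr - f_cloud_other
    if denominator <= 0.0:
        return float("inf")
    return planck / denominator


for f_cloud in (0.0, 0.1, 0.3, 0.45, 0.5):
    print(f_cloud, equilibrium_sensitivity(f_cloud))
```

A cloud term of 0.1 gives a denominator of 0.4 and a sensitivity of about 3 C, while values approaching 0.5 send the sensitivity towards infinity, which is why a prior that extends the 1-f term down to zero or below necessarily includes instability.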

That is, a priori, there is good theoretical reason to include instability. If you do, then how well do observations constrain it? It's not clear to me James' paper quite addresses that...

You have the prior based purely on a priori theoretical considerations whereas the paper is quite clear the prior in use is knowledge known as at 1979 (though it doesn't have to be done on a strict time basis).

Would we be here if CO2 concentrations have varied and the increases seen would be expected to boil away the oceans? I don't think that line of reasoning comes into quantitative observational evidence that tries to measure sensitivity. Whether you want to call this theoretical or observational does not matter. If the prior is knowledge at 1979 then this is included in the prior.

You can argue that you can never get anywhere with Bayes' theorem because you need a prior and if that prior comes from some evidence or line of reasoning, then you need a prior from before you knew that information. That is just silly and ignores the fact that you can draw a line somewhere and accept there is subjective uncertainty about the prior. You can try different plausible priors at that stage and see if it makes much difference.

I suggest it is sensible to draw the line so that our existence is part of the prior, not part of the observational evidence.

If we agree on including our existence in the prior then I suggest the 1-f feedback term might not rule out numbers in the -0.2 to 0.01 range, but I suggest they need to be very unlikely possibilities.

crandles - if we're including observations prior to 1979 in determining the prior then we're double-counting observations in the statistical analysis, are we not?

Anyway, instability is not the same as boiling away the oceans. For one thing the temperature increase to equilibrium is slow precisely because of the slowness of ocean response. The problem with direct observational constraints known at any point in time (by 1979 if you like) is they are limited in time by human records, and cannot include the full equilibrium response. If the controlling parameters vary up and down sufficiently quickly we may have an unstable system that nevertheless remains bounded because the perturbations pushing up are countered by the perturbations pushing down - until now at least.

Second, as a nonlinear system the current value of Earth's climate sensitivity is not some fixed number but is a function of the current state of the system. The current state may possibly be unstable but that only means it is susceptible to moving relatively fast to a new stable state, which could well be just a few degrees warmer than now, rather than into Venus conditions. The rather abrupt transitions between glaciated and de-glaciated conditions on Earth suggest to me a certain instability associated with that transition between two more stable states.

One of those two stable states is the relatively flat Holocene warmth we've had for the last few thousand years. Do observations actually clearly constrain sensitivity to be finite and relatively during that period? And how much can observations constrain sensitivity as temperature goes up from that stable state?

>"if we're including observations prior to 1979 in determining the prior then we're double-counting observations in the statistical analysis, are we not?"

I hope we agree that all information should be included either in the prior or in the observational evidence. For the practical purpose of avoiding double counting it may be necessary to leave some information out of both the prior and the observational evidence, but that isn't ideal, as it leads to an assessment that is known to be too uncertain.

"Knowledge as at 1979" was intended to exclude measurements made after 1979 relating to times before 1979.

If we are updating our prior with observational evidence of F&G and perhaps also some paleoclimate simulations done after 1979 and temperature response to Pinatubo, all other evidence should be in the prior. So I am rather unclear what observations you are suggesting are being double counted.

>"For one thing the temperature increase to equilibrium is slow precisely because of the slowness of ocean response."

True, but we really don't know how rapidly CO2 was changing a long time ago but we do know there are long periods of time when the CO2 level is significantly higher.

>"climate sensitivity is not some fixed number"

I agree it isn't, but presumably the short term sensitivity including only fast feedback is normally less than the long term sensitivity including slower feedbacks like ice sheet response.

I suggest that even at 1979 we would know that over long periods of time the sensitivity is almost certainly going to be less than 50C and normally over shorter periods of time the sensitivity is going to be less. That doesn't rule out short periods of instability but the very nature of arguing what is normal allows you to see that instability is abnormal and hence a low probability should be assigned in the prior.

What you appear to really want to know is whether the observational data makes instability unlikely.

I still think James and Jules' paper does answer this, saying the data does show instability is unlikely. Of course you can specify what 'unlikely' is, and for any level set you can create that level of probability by working back to what the prior needs to show to create that. However, you would find that the prior has to make instability absurdly probable (as in a silly 99.9% chance) to get a reasonable chance of this still being the case after the observational evidence is used.

That seems to be the nature of the situation and is as close as you could reasonably hope to get to your request for "Do observations actually clearly constrain sensitivity to be finite".

At least, that is the impression I have gained from the paper.

>"And how much can observations constrain sensitivity as temperature goes up from that stable state?"

That seems a much trickier question. Perhaps I will wait for James to help out here ;o)

I think Chris has given quite a detailed answer in my "absence". As for what we will learn in the future (and how quickly), there have been a few papers which look at that. Mostly, they only consider what we will learn from a gradually increasing temperature, and there might be other ways to learn too (e.g. F&G).
