Friday, August 06, 2010

Wiley Interdisciplinary Reviews: Climate Change

A new journal has sprung up recently. I'm not entirely sure why or how, but it seems to be open access for now (though not indefinitely) and has some interesting papers, so some of you might like to take a look. Called "Wiley Interdisciplinary Reviews: Climate Change", it seems to be a cross between an interdisciplinary journal and a collection of encyclopaedic articles on climate change. There are a number of other WIREs journals on unrelated topics, such as Computational Statistics, and Nanomedicine and Nanobiotechnology.

The Editors seem to be a slightly unconventional bunch, a little removed from the mainstream IPCC stalwarts, though eminent enough and with some IPCC links: Hulme, Pielke, von Storch, Nicholls and Yohe are names that many will be familiar with. The others are probably all famous too, but I'm too ignorant to recognise them. I'm sure the journal is not intended as a direct rival to the IPCC, but it may turn out to provide an interesting and slightly alternative perspective.

The articles to date include a mix of authoritative reviews from leading experts, such as Parker on the urban heat island and Stott on detection and attribution, interspersed with perhaps more personal and less authoritative articles. I can safely say that without risk of criticism, because one of them is mine: a review of Bayesian approaches to detection and attribution. This article had a rather difficult genesis. I was initially dubious about my suitability for the task, and indeed the value of the article, but after declining once (and proposing another author, who also declined) I changed my mind and had a go. My basic difficulty with the concept is that D&A has always seemed to me a rather limited and awkward approach to the question of estimating the effects of anthropogenic and natural forcing, one that is tortured into a frequentist framework where it doesn't really fit. For example, no sane person believes these forcings have zero effect, so what exactly is the purpose of a null hypothesis significance test in the first place? However, conventional D&A has such a stranglehold on the scientific consciousness that most Bayesian approaches have actually mimicked this frequentist calculation, rather than simply providing the Bayesian estimate that you really wanted in the first place. It all seems a bit tortured and long-winded to me.

Anyway, I eventually found some things to say which hopefully aren't entirely stupid, and which help to show how a Bayesian approach might actually be useful in answering the questions that (sensible) people want answered, rather than the relatively useless questions that frequentist methods can answer, which are then inevitably misinterpreted as answers to the questions that people wanted to ask in the first place (as I argue and document in the article).
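To make the contrast above concrete, here is a toy illustration, entirely separate from the article's actual analysis: a made-up "fingerprint" regression in which the frequentist route tests whether the scaling factor is zero, while the Bayesian route simply reports the estimate and its uncertainty. The data, noise level and fingerprint shape are all invented for the sketch.

```python
# Toy illustration (not the article's method): regress synthetic "observations"
# on a model-derived response pattern. Frequentist D&A tests beta = 0; a
# Bayesian treatment just reports the posterior for beta directly.
import numpy as np

rng = np.random.default_rng(0)
n = 50
fingerprint = np.linspace(0.0, 1.0, n)            # assumed forced response
obs = 0.8 * fingerprint + rng.normal(0, 0.2, n)   # synthetic observations

# Frequentist: least-squares scaling factor and a test against beta = 0
beta_hat = fingerprint @ obs / (fingerprint @ fingerprint)
resid = obs - beta_hat * fingerprint
se = np.sqrt(resid @ resid / (n - 1) / (fingerprint @ fingerprint))
t_stat = beta_hat / se    # large => "detection" of a nonzero response

# Bayesian: with a flat prior and Gaussian noise, the posterior for beta is
# approximately Normal(beta_hat, se^2) - the estimate people actually want.
post_mean, post_sd = beta_hat, se
print(f"beta_hat = {beta_hat:.2f}, 95% credible interval "
      f"({post_mean - 1.96 * post_sd:.2f}, {post_mean + 1.96 * post_sd:.2f})")
```

The point of the sketch is that both routes compute essentially the same quantities; the difference is whether you stop at a yes/no detection statement or report the interval for the scaling factor itself.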

Another of the personal and argumentative articles was contributed by Jules, who was invited to say something about skill and uncertainty in climate models. This was actually the article that sparked off our "Reliability" paper, as our discussions kept coming back to the odd inconsistency between the flat rank histogram evaluation that is standard in most ensemble prediction, and the binomial ("Pascal's triangle") distribution that a truth-centred ensemble would generate (i.e., if each model is independently and equiprobably greater than or less than the observations, then the observations should generally lie very close to the ensemble median). Of course, this problem didn't take long to solve once we had set out the issue clearly enough to recognise that there really were two incompatible paradigms in play, and Jules even ended up citing the GRL paper, which overtook her WIREs one in the review process.
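The incompatibility of the two paradigms is easy to demonstrate with a small simulation; this is a toy sketch with invented numbers, not anything from the papers. Under exchangeability the observation's rank within the ensemble is uniform, while under the truth-centred assumption the number of models below the observation is Binomial(m, 0.5), so the observation hugs the ensemble median.

```python
# Toy comparison of the two ensemble paradigms (synthetic numbers):
# exchangeable -> flat rank histogram; truth-centred -> binomial ranks.
import numpy as np

rng = np.random.default_rng(1)
m, trials = 10, 20000

# Exchangeable case: the obs is just another draw from the same distribution
draws = rng.normal(size=(trials, m + 1))
ranks_exch = (draws[:, 1:] < draws[:, :1]).sum(axis=1)  # rank of column 0

# Truth-centred case: models scatter independently around the obs (here 0)
models = rng.normal(size=(trials, m))
ranks_tc = (models < 0.0).sum(axis=1)

flat = np.bincount(ranks_exch, minlength=m + 1) / trials
peaked = np.bincount(ranks_tc, minlength=m + 1) / trials
print("exchangeable ranks: ", np.round(flat, 3))    # roughly uniform
print("truth-centred ranks:", np.round(peaked, 3))  # peaked at the median
```

With 10 models, the truth-centred assumption puts roughly a quarter of the probability on the central rank and almost none on the extremes, which is obviously inconsistent with a flat histogram.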

Perhaps of more widespread interest to other readers is a simple analysis of the skill of the forecast Hansen presented back in 1988 to the US Congress. We'd actually had lengthy discussions with several people (listed in the acknowledgements) a year or two ago, trying to resurrect the old model code that was used for this prediction, in order to re-run it and analyse its outputs in more detail. But this proved to be impossible. (The code exists but has been updated and gives substantially different results. If only the code had been published in GMD!) Therefore we were left with nothing more than the single printed plot of global mean temperature to look at. This didn't seem much to base a proper peer-reviewed paper on, so the idea died a death. When this WIREs invitation came along, it seemed like a good opportunity to publish the one usable result we had obtained, as an example of what skill means. The headline result is that, under any reasonable definition of skill, the Hansen prediction was skillful. While no great surprise, I don't think it has been presented in quite those terms before. It's a shame that we weren't able to generate a more comprehensive set of outputs, though, which might have given a more robust result than this single statistic.


The null hypothesis of persistence (no change in temperature) was found to give the best performance over the historical interval, compared to extrapolating a trend, so this is the appropriate zero-skill baseline for evaluating the forecast. Nowadays, with the AGW trend well established, most would probably argue that a continuation of that trend is a good bet, though that still leaves open the question of how long a historical interval to fit the trend over. Anyway, the model forecast is clearly drifting on the high side by now, most likely due to some combination of high sensitivity, low thermal inertia and lack of tropospheric aerosols, but it is still far closer to the observations than we would have achieved by assuming no change. Furthermore, the observed warming is also very close to the top end of the distribution of historical 20-year trends, meaning that the observed outcome would be very unlikely if the climate were merely following some sort of random walk. This evidence for the power of climate models is obviously limited by the lack of detailed outputs for validation, but what there is is clearly very strongly supportive.
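The skill calculation itself is simple enough to sketch. This is a toy version with made-up numbers, not the paper's data or results: the usual mean-squared-error skill score is S = 1 - MSE(forecast)/MSE(baseline), so a forecast that runs warm can still be strongly skillful against a no-change baseline.

```python
# Toy skill-score calculation (synthetic stand-in numbers, not the paper's):
# S = 1 - MSE_forecast / MSE_baseline; S > 0 means the forecast beats the
# zero-skill "no change" persistence baseline.
import numpy as np

def skill(obs, forecast, baseline):
    """MSE skill score: positive means better than the baseline."""
    mse_f = np.mean((forecast - obs) ** 2)
    mse_b = np.mean((baseline - obs) ** 2)
    return 1.0 - mse_f / mse_b

years = np.arange(1988, 2010)
obs = 0.018 * (years - 1988)          # invented observed warming (deg C)
model = 0.027 * (years - 1988)        # invented forecast that runs warm
persistence = np.zeros_like(obs)      # "no change" baseline

s = skill(obs, model, persistence)
print(f"skill vs persistence: {s:.2f}")
```

Even though the invented forecast here warms half again as fast as the invented observations, its error is far smaller than that of assuming no change, which is the qualitative shape of the headline result above.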

14 comments:

Steve Bloom said...

Hmm, yet another pay-to-view journal. Weren't there open-access ones you could have gone to instead?

Anyway, the combination of Hulme, von S. and RP Jr. causes me to think the obvious Dark Thoughts. We shall see.

James Annan said...

Hey, it's free for now, so don't complain. And we are hardly the sort of people to turn down invitations to present our opinions (well, as I said, I almost did). Actually, there is not a good open access option for most of my work, as I have grumbled before, what with the EGU climate journal focussing specifically on paleoclimate.

Tom C said...

James -

This is not persuasive. Why was the null hypothesis closer to measurements from 1950 to 1980?

Steve Bloom said...

And as it turns out we saw very quickly indeed. How bold of Esper and Frank to approve of Esper & Frank 2009's assessment of work on the MWP so definitively.

I do think Zorita's background makes him perfectly suited to be one of the new Proxy Cops.

James Annan said...

Tom, we checked which of the two obvious baseline forecasts would have performed better on average over the whole historical record. Trying to fit a trend does worse than just assuming no change. Previous apparent trends (e.g. 1880-1910, 1910-1940) had all ended abruptly, and there was no reason to assume things would be different in 1988 - unless you accept the model's prediction, of course... which is precisely the point.

I see now there is a typo in the graph - 1990 should be 1900 of course.

James Annan said...

Steve, that is clearly labelled as an opinion piece, and I don't see much to complain about there. (I may be missing something though - I'm not really a hockey player.)

Steve Bloom said...

There certainly are things to complain about, including the two I mentioned, but I have no worries about the article getting any serious attention. I mentioned it because I think it is grounds for worrying about the future direction of the journal. But for now, I'm not arguing that there's a reason to not submit papers to it.

Anonymous said...

"Trying to fit a trend does worse than just assuming no change. "

How do you do this? Do you go through the record testing the various predictions? E.g., did you test whether the next 22 years of the record was better predicted by the trend of the previous N years, for N between 1 and 80? (80 is chosen because we have about 100 years of temperature record, and we have to at least test a prediction on 1966-1988 in order to test a model which will be used to predict 1988 to 2010.)

I believe you that predicting "no trend" might beat any trend prediction regardless of N, but I'd still be interested in knowing whether you did the full calculation...

-M

James Annan said...

Well, the focus was on a 20y forecast, since that is the amount of data we now have for validation, but yes, we tried a range of different hindcast periods, both for fitting the trend and for averaging in the case of persistence - up to 30y, IIRC. For a very long hindcast interval, you can't fit many instances into the data (and they all overlap, so are not independent).
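The kind of sliding-window comparison described here can be sketched as follows. This is a rough illustration on a synthetic random-walk series, not the actual temperature record or the paper's exact windows: for each start point, fit a trend to the previous N years and extrapolate 20 years ahead, versus persisting the N-year mean.

```python
# Rough sketch of a trend-vs-persistence hindcast comparison (synthetic
# random-walk series, assumed window lengths - not the paper's calculation).
import numpy as np

rng = np.random.default_rng(2)
T = np.cumsum(rng.normal(0, 0.1, 120))   # stand-in "temperature" record

def errors(series, n_fit, horizon=20):
    """Mean squared forecast errors for trend extrapolation vs persistence."""
    trend_err, persist_err = [], []
    for t in range(n_fit, len(series) - horizon):
        past = series[t - n_fit:t]
        x = np.arange(n_fit)
        slope, intercept = np.polyfit(x, past, 1)   # fit linear trend
        future = series[t:t + horizon]
        xf = np.arange(n_fit, n_fit + horizon)
        trend_err.append(np.mean((slope * xf + intercept - future) ** 2))
        persist_err.append(np.mean((past.mean() - future) ** 2))
    return np.mean(trend_err), np.mean(persist_err)

for n in (10, 20, 30):
    te, pe = errors(T, n)
    print(f"N={n:2d}: trend MSE {te:.3f} vs persistence MSE {pe:.3f}")
```

On a pure random walk, extrapolating a fitted trend tends to amplify noise over a 20-step horizon, which is the intuition behind persistence winning as the baseline; the overlapping-windows caveat in the comment above applies to this sketch too.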

Tom C said...

James -

I don't think you realize the problem that the 1950-1980 period represents. What combination of inputs led to the flat GTA during this period? If there is no good answer to this the models can't be skillful. Saying that the curve goes up from the late 1900s on and models predict a continued increase is pretty lame.

EliRabett said...

As Hansen himself said, any half rational GCM has to get global temperature anomaly ~right for 20 years. The constraints on the sensitivity and the forcings (absent the sun going out or fossil fuel burning stopping) are not strong enough to move the prediction much.

Global temperature is much too easy to be a real test over less than a century.

James Annan said...

Tom,

Based on your comment, I can only surmise that you don't know what "skillful" means in this context. It is a standard term of art, which is defined in the paper; you could try reading it.

Tom C said...

OK, I will read the paper. However if you could indulge me for one last question: what if Hansen had the models and methods of 1988 available to him in 1950? What would his prediction have looked like and what would Jules' analysis have concluded 30 years on?

EliRabett said...

Tom, he would also need 30 or so years of prior forcings (solar, GHG, etc.) to even start. The model had to be validated against past observations. In 1950, the Mauna Loa record did not even exist, and the method had not yet been invented.

This is not to be snarky, but to point out that there is an entire edifice that Hansen had to stand on (see the banner for JeB).