Friday, December 19, 2014 Project ICAD and UKCP09

One of the more interesting talks for us at the Paris conference mentioned previously here was James Porter talking about UKCP09. It turns out there has been a social sciences project ("Project ICAD") part of which involved looking at the UKCP09 project (and they are based at Leeds University, perhaps an additional reason for a visit there some time?). We were in Japan over the entirety of the interval in which UKCP09 took place, and only had limited contact with the relevant parties, but perhaps know enough about the issues for our perspective to have some relevance. The speaker had spent some time embedded with the Hadley Centre and had talked to a lot of people involved in the production and review of the UKCP09 project.

A significant part of Porter’s talk looked at the question of how the probabilistic predictions were made, and in particular the UKCP choice of basing their probabilities primarily on their ensemble of HadCM3 simulations with different parameter values (perturbed parameter ensemble or PPE), rather than basing their results on the CMIP3 ensemble of different models (multi-model ensemble, MME). I was surprised to see this presented as such a major decision, as my recollection is that most of the critics at the time were really complaining about the willingness of UKMO to generate probabilistic predictions at all since (the critics argued) there was not really a sound basis for assigning numerical values by any method.

The main UKCP09 proposal was (according to their web page) funded in 2004, and at that time it seemed quite widely accepted that PPEs were a better foundation for probabilistic prediction than the MME. In fact this era was very much the heyday of PPEs, with the Hadley Centre's QUMP group and our own rather smaller ensemble research activity all making rapid progress. The UKCP09 approach was externally reviewed back in 2008/9. The full review doesn't seem to be available (anyone know where it is?) but I don't see any evidence in either the summary or response that the question of MME vs PPE was seriously raised by anyone even at that later time.

I believe (though I could be wrong and would welcome references) that we were actually the first to argue the contrary. The roots of our argument can be found in this Yokohata et al paper (which although published in 2010 was submitted back in 2008), which pointed to substantial inconsistencies between two PPEs based on our two different GCMs (MIROC and HadCM3). However it was actually our series of papers on ensemble analysis starting in 2010 (eg here, here, here and here) that most clearly argued not only that PPEs had serious problems, but also that the MME was much better than previously believed. So while I’m encouraged to see that this question is now high on the agenda, I really don’t think it was on the table at the outset of UKCP09 and it doesn’t really seem fair to use it as evidence of insularity or reflexive dismissal of outside ideas, which seemed to be the speaker’s point. Given the work they had already done by 2010 or so, the UKCP09 researchers actually made quite substantial and constructive efforts to account for the (by then) emerging failings of the PPE approach by effectively adding on the MME’s uncertainty to their results. While this may satisfy neither the resolutely anti-Bayesian nor the most purist pro-Bayesian, in my view it certainly improved the credibility of their results.

Some of the interviewees gave excuses for their apparent reluctance to air their doubts openly at the time. According to Porter, some of them said they were scared of being labelled sceptics! What a feeble excuse. Perhaps more plausible is the additional argument that the incestuous and cliquey nature of climate science in the UK made it a bit of a career risk in terms of future funding. But in any case, I certainly recall some people making their criticisms very plain. In particular, Lenny Smith argued eloquently about the risks of assigning probabilities where there was not really a sound basis for them. If the next model generates different results (which is entirely plausible) then someone is going to end up looking rather silly.

So I’m not going to stick the knife into the Hadley Centre for proposing in 2004 to base their probabilistic predictions on a PPE methodology that they had already started to work on. You would have had to be unusually prescient to anticipate our research by several years, although I’m encouraged to see it is now obviously high on people’s minds. On the other hand, the Hadley Centre’s apparent continuing preference for PPEs is hard to defend, now that they have a chance to regroup. To that extent, perhaps this Project ICAD analysis contains a truth that is deeper than the actual story they purported to tell :-)


crandles said...

At the July 2004 open day, I gained the impression that they really wanted other models to be done because a MME would better capture the range of uncertainty than their PPE.

Managed to go back to July 2004 presentation and found a slide saying

-What are we doing, physically?
-Ideally, we’d take an unbiased sample of all viable climate models, but we can’t do that
-Best we can do is take this scatter-gun approach
-Repeat with other models

Of course a presentation is not really arguing the point in the peer reviewed literature.

Don't know if you want to correct double words like "(the critics argued argued)" and "actually our our series".

James Annan said...

Thanks, I think it's a little different to hope for other groups to do similar PPEs and compare results, versus my claim that the ensemble of (tuned) CMIP models is already pretty good. From what I remember (which may be a bit partial and biased of course, it was 10y ago) CPDN members were generally at the forefront of claims that the CMIP models were over-tuned and too similar...

Tim Osborn said...


though not directly relevant to the PPE/MME choice, it's also interesting to look a little further back, to UKCIP98 and UKCIP02.

These were each based on single models (HadCM2 for UKCIP98, and a HadCM3/HadAM3H/HadRM3 linkup for UKCIP02).

Though they both used an MME to give some context for the single-model projections (e.g. figures 36, 38, 48 of UKCIP98 compared HadCM2 with CGCM1, ECHAM4 and GFDL-R15; a larger MME was available by the time of UKCIP02, whose figure 28 compared HadCM3 with eight other GCMs), nevertheless the main projections were based on those single models.

And this nicely highlighted the problem with using single models (though arguably we couldn't have done much else at the time). Annual precipitation totals were projected to increase in UKCIP98 but that changed to a decrease in UKCIP02. Across this period there was a change in emphasis, shifting from concern with future flooding to concern with future drought, and there were shifts in funding that reflected this.

I don't argue that these changes in emphasis/concern on flooding to drought were driven entirely by the UKCIP98/UKCIP02 projection differences (and in fact the differences aren't as large as they appear, since they both projected wetter winters and drier summers, and the sign of the annual precipitation change arises from relatively smaller differences in which of these competing changes dominated and whether spring/autumn changed much). But nevertheless there was an influence and when I discussed the projections with various users at the time, they asked what fundamental change in scientific understanding between 1998 and 2002 had led to a change from projecting a wetter future for the UK to a drier one.

Of course, nothing fundamental had changed. This is just what can arise when you sample single models from a range of possible models, if that range spans zero change. A point that I, and others, made when reviewing UKCIP02 and making recommendations for what was needed from UKCP08 (as it was originally, before becoming UKCP09). The gist was that whatever concept the new scenarios would adopt, they should reflect the gradually changing nature of modelling/understanding -- when a new model (like HadCM3) came along, it should subtly nudge the distribution of UK climate projections rather than cause a flip from one thing to another.

By ~2004 it was both possible and necessary to do better at characterising projection uncertainty and including this in the main projections rather than as somewhat separate comparison.

As well as the scientific issues that you mention, James, about whether a predominantly PPE or MME approach is preferred, there was also a practical one: if new simulations were needed then it would be more difficult to commission them from multiple climate modelling centres than from just one. I suppose this practical advantage favoured a Hadley Centre PPE approach.


crandles said...

Your reply and further review of the 2004 open day has convinced me that my earlier comment was badly conceived and worded and that my memory was also faulty (certainly regarding date and probably more than just date).

'Scatter-gun approach' certainly does sound like a response to belief in over-tuned and too similar models.

I think I would like to know more of the context to

“A major strength of the Met Office Hadley Centre’s reputation is its unified model and the inclusion of other models can raise awkward questions about why don’t we pay more attention to the American, Canadian, French, or Swiss models?” (Met Office Climate Scientist 12 – Interview).

as that seems a little strange to me.

Also, what gives rise to "the Hadley Centre’s apparent continuing preference for PPEs"?

James Annan said...

Yes, I also don't get the point of that comment about paying attention to other models. As for PPEs, tweets such as this one