Monday, September 03, 2012

Jolly Hockey Sticks

Our brief foray into the last millennium was published recently in Climate of the Past. I didn't want to tread on too many toes, so I stayed well clear of any attempt to generate a reconstruction of past climate, instead focussing purely on more methodological issues. I'm interested in ensemble-based data assimilation methods and have been following the pioneering work of Hugues Goosse in applying these ideas to climate reconstruction. The two main questions the work tried to address were: (1) how well is it possible to reconstruct the climate based on a handful of sparse and imprecise observations, and (2) are ensemble-based methods a viable approach for this? Rather than using real proxy data, we ran a purely synthetic experiment in which pseudoproxy data are taken from a model simulation. This makes it easy to check the performance of the algorithm, since we know what the real answer is supposed to be.

A pessimistic interpretation of our results would be that there is rather little that can be learnt from the scarce data that are available before about 1500AD, although the performance was significantly better in the presence of external forcing (with its associated large-scale response) than when we just considered a control run with internal variability alone (which places much greater emphasis on regional variability). With a global change, even sparsely distributed data can give the overall picture pretty well, but internal variability of the climate system generates patterns on sufficiently small scales that you need local and accurate data to have much idea of what is going on.

These results aren't due to any peculiarity or limitation of the particular method we used, but are fundamental constraints due to the low information content of a handful of proxies. One thing I hadn't really thought through before is the implications of the limited accuracy of the proxies. A typical "signal to noise ratio" of 40% (using the paleoclimate convention for this term) means that the uncertainty is 2.5 times larger than the signal, so it takes quite a lot of proxies to average together to reduce the error to a useful level. We didn't even consider the realistic possibility that the errors might be correlated across different proxies (eg due to a large scale precipitation anomaly, or even calibration issues).
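Just to make the arithmetic concrete, here is a back-of-the-envelope sketch (illustrative numbers of my own, not output from the paper) in which each pseudoproxy records the common signal plus independent noise at an SNR of 0.4; the error of the proxy average only shrinks as 1/sqrt(n):

```python
import numpy as np

rng = np.random.default_rng(0)

snr = 0.4                    # "signal to noise ratio", paleo convention
signal_sd = 1.0              # standardised amplitude of the climate signal
noise_sd = signal_sd / snr   # noise std is 2.5 times the signal std

true_signal = rng.normal(0.0, signal_sd)

for n in (1, 5, 25, 100):
    # each pseudoproxy is the common signal plus independent noise
    proxies = true_signal + rng.normal(0.0, noise_sd, size=n)
    stderr = noise_sd / np.sqrt(n)   # error of the average shrinks as 1/sqrt(n)
    print(f"n={n:4d}  proxy mean={proxies.mean():+.2f}  "
          f"truth={true_signal:+.2f}  std error={stderr:.2f}")
```

Even with 25 proxies of this quality at a location, the standard error of their average is still half the size of the signal itself.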

When there are lots of proxies (as in the last few hundred years), the ensemble method technically failed, in that the ensemble collapsed. This is a well-known phenomenon with this type of method, but in fact the mean estimate was still quite good; it was just that the (predicted) uncertainty was rubbish. So the method's practical value may exceed its theoretical performance in some cases. That was a surprisingly positive result, to me at least.
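For anyone curious what the collapse looks like, here is a toy importance-weighting example (a generic particle-filter-style update over independent sites, purely my own illustration and not the exact scheme in the paper): as the number of proxy sites grows, the weights pile onto one or two ensemble members, and the implied spread becomes far smaller than it should be.

```python
import numpy as np

rng = np.random.default_rng(1)

n_members = 100      # ensemble size
obs_sd = 2.5         # proxy noise for an SNR of 0.4, as above

for n_sites in (1, 10, 100):
    # truth, prior ensemble and one noisy pseudoproxy per site
    truth = rng.normal(0.0, 1.0, n_sites)
    ensemble = rng.normal(0.0, 1.0, (n_members, n_sites))
    obs = truth + rng.normal(0.0, obs_sd, n_sites)

    # importance weights: product of Gaussian likelihoods over all sites
    logw = -0.5 * (((obs - ensemble) / obs_sd) ** 2).sum(axis=1)
    w = np.exp(logw - logw.max())
    w /= w.sum()

    ess = 1.0 / np.sum(w ** 2)                    # effective ensemble size
    mean = w @ ensemble                           # weighted posterior mean
    spread = np.sqrt(w @ (ensemble - mean) ** 2)  # weighted posterior spread
    # the exact Gaussian posterior sd per site is ~0.93 here, so a much
    # smaller spread means the ensemble has become spuriously overconfident
    print(f"sites={n_sites:4d}  eff. size={ess:6.1f}  "
          f"mean spread={spread.mean():.2f}")
```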

Although it was interesting doing the work, I don't expect to do much more along these lines in the near future. There is simply too much other higher-priority work to be done. It does perhaps help to provide some perspective on the hype surrounding this paper. There is simply no way that a local proxy can provide a meaningful estimate of the hemispheric, let alone global, temperature.

The manuscript managed to get quite high up on the "most commented papers" page. This was not due to any particular notoriety or controversy, but simply due to the fact that three reviewers and another commenter all made useful contributions, to which I responded individually. The open review system seems to be working pretty well, I'd say. It's a shame that the AGU hasn't taken the opportunity to do something more radical with the recent reorganisation of its publications. Hooking up with a conventional profiteering toll-access publisher (and one with some fairly unsavoury activity in its recent history) is, I suppose, the easy option for a bunch of conservative greybeards, but I can't help but think of it as a missed opportunity.

13 comments:

PeteB said...

..there is rather little that can be learnt from the scarce data that are available before about 1500AD,...

Does that mean it is a bit of a waste of time trying to reconstruct global or hemispherical temperatures prior to that point with the proxies currently available? Surely that would be a significant result.

ob said...

Thanks for that nice little comment on the graybeards.

James Annan said...

Pete, well you could say that, though to be pedantic this research did consider just one (well-used) set of proxies, and there may be a little more data out there. Also, an estimate with wide uncertainty bounds is still an estimate...

Paul S said...

'Also, an estimate with wide uncertainty bounds is still an estimate...'

Dare to put a +/- figure on uncertainty prior to 1500, for a Hemispherical reconstruction?

Hank Roberts said...

It'd be interesting to see that kind of estimate done decade by decade, considering how new proxies (whole new technology, not just new data sets) come along fairly often.

James Annan said...

Paul, well we were using a Bayesian approach, so perhaps a more relevant way of looking at things is to ask how much change there is from the prior. And the answer is...not a lot. Figs 3 and 5 of the paper (at the link) summarise the state of play.

crandles said...

So looking at Fig 3 for 1000 - 1099, the actual variation is at least twice the variation the reconstruction shows. For a really large sharp peak of a 0.4C anomaly the reconstruction might only pick up a quarter of that.

The obvious question would seem to be: How should this be interpreted? Was the Medieval warm period likely to be twice as warm as reconstructions show? Twice as warm and also likely to miss a short sharp .2C anomaly? Something else?

Does this matter?

Tom C said...

"One thing I hadn't really thought through before is the implications of the limited accuracy of the proxies."

James, I have commented in a snarky way on your blog before and the result is a "boy who cried wolf" loss of credibility when I want to be constructive. But I'm not being snarky here. I just can't fathom a person with sophisticated mathematical understanding not thinking about this from the very start. I'm going to propose that we engineers, who live and die by understanding real-world systematic error, were much more attuned to this than academic mathematics types. What do you think?

EliRabett said...

There is another way of looking at this, which is, how many more proxy series at which locations are needed to extend the measurements further back.

James Annan said...

Tom, the thing you might be missing is that I haven't been at all involved in millennial reconstructions until recently.

Besides, is it really obvious to you that with a single observation of an unknown value (such as temperature at a particular point in space and time) which has a "signal to noise ratio" of 40%, the posterior uncertainty of that value is a whopping 93% of the prior uncertainty? That is, the observation only reduces the uncertainty by 7%. It's a trivial calculation, but not one I had actually done before, even in my head.
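For anyone who wants to check that arithmetic, here it is as a one-liner, assuming a Gaussian prior and Gaussian observational noise with the noise 2.5 times the prior spread:

```python
import numpy as np

prior_sd = 1.0              # prior uncertainty (standardised)
obs_sd = prior_sd / 0.4     # SNR of 40% => noise 2.5x the prior spread

# Gaussian prior times Gaussian likelihood gives a Gaussian posterior
post_sd = np.sqrt(1.0 / (1.0 / prior_sd**2 + 1.0 / obs_sd**2))
print(post_sd / prior_sd)   # ~0.93, i.e. only a 7% reduction in uncertainty
```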

James Annan said...

Chris, frequentist approaches - which are rather more common - may behave somewhat differently. I haven't looked at them, so wouldn't like to give a general recommendation based on my results.

However, variance loss is a well-known problem and there has been some research into so-called "variance preserving" methods. Shrinkage towards the prior mean is an unavoidable fact of the sort of approach we took. Other methods sacrifice the aim of producing the "best" (ie error-minimising) estimate in order to retain more variance.
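To put a rough number on the shrinkage, a scalar Kalman-type update (my own illustration, again at an SNR of 0.4) keeps only about 14% of an observed anomaly in the posterior mean, with the rest pulled back towards the prior:

```python
prior_mean, prior_sd = 0.0, 1.0
obs_sd = 2.5 * prior_sd        # SNR of 0.4 again

# scalar Kalman-type update for a single observation y
gain = prior_sd**2 / (prior_sd**2 + obs_sd**2)   # ~0.14
y = 1.0                                          # an example observed anomaly
post_mean = prior_mean + gain * (y - prior_mean)
print(gain, post_mean)   # ~86% of the anomaly is shrunk back to the prior mean
```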

Tom C said...

James - Think of it this way. What you are looking for is a 4 cm golf ball (the signal) sitting in 10 cm high grass (the noise). You would need a lot of observations (additional proxies in the same area) to find the golf ball. Probably 100 or so to get ~90% confidence.

Magnus said...

Speaking of which, Ljungqvist et al are proceeding with their new proxies... think there will be any "discussion" about it? http://geography.cz/sbornik/wp-content/uploads/2011/06/g11-2-1ljungqvist.pdf