I've been meaning to write about decadal prediction for some time, but kept on procrastinating, and the number of relevant papers has increased beyond the scope of a single post. So I'll do a number of short posts instead, until I run out of things to say, or interest, or readers...
First off the bat is this new paper from Geert Jan van Oldenborgh et al. I remember meeting Geert Jan way back when I was first learning about data assimilation (with the help of Google, I believe it was here) and he was scarily clever back then, but his primary focus has been on shorter-term prediction, so our paths only cross occasionally. Some readers may know of him through his Climate Explorer web portal thingy. In this paper he looked at the reliability of trends in the CMIP5 ensemble over the last 60 years. We actually did something similar in a rather perfunctory way as part of this paper for CMIP3 (eg Fig 1) and also more recently here for CMIP5 (see Fig 2), but those papers were looking more at equilibrium climatologies, and at single (perturbed parameter) versus multi-model ensembles. When looking at the raw temperature trends from the models, this new paper gets a similar result to ours: the rank histogram is near enough flat to be called reliable, though there's a moderate tendency towards the models warming too much:
(The red line shows the histogram of the rank of the observed trend at each grid point, within the CMIP5 ensemble spread - ideally, it would be flat, and the slope up to the left means that there are relatively more obs in the low end of the model range than at the top end.)
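Roughly speaking, the calculation is simple enough to sketch in a few lines of numpy. This is just my own illustration of the rank histogram idea, not their code, and all the names and toy numbers below are made up:

```python
import numpy as np

def rank_histogram(obs_trend, model_trends):
    """Histogram of the rank of the observed trend within the model ensemble.

    obs_trend    : (npoints,)           observed trend at each grid point
    model_trends : (nmodels, npoints)   one trend per model per grid point

    A flat histogram means the obs behave like just another ensemble member,
    i.e. the ensemble is reliable in this respect.
    """
    nmodels = model_trends.shape[0]
    ranks = (model_trends < obs_trend).sum(axis=0)   # members lying below the obs
    return np.bincount(ranks, minlength=nmodels + 1)

# toy example: 30 "models", 1000 "grid points", all drawn from the same distribution
rng = np.random.default_rng(0)
models = rng.normal(0.0, 1.0, size=(30, 1000))
obs = rng.normal(0.0, 1.0, size=1000)
print(rank_histogram(obs, models))   # roughly flat, as it should be here
```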
But they then tried an additional step, to look at the regional trend relative to the global mean temperature for each model and obs. And in this case, the obs frequently lie outside the model range (ie, the big bins at each end of the red histogram below):
The implication of this different result is that the models have insufficient spatial variability in their regional trends (which is consistent with what others have shown too) - broadly speaking, the models that matched the highest observed regional trends in the previous analysis did so by warming a lot globally, and those that matched the lowest observed trends warmed only a little everywhere. So, to the extent that the models did get the right results overall (in the first analysis), they did it by having a wide range in global responses which made up for their unrealistically low spatial variability. The precipitation trends, however, are poor even without this extra normalisation step. There are a number of possible explanations for this behaviour, but the result is, as they say, "This implies that for near-term local climate forecasts the CMIP5 ensemble cannot simply be used as a reliable probabilistic forecast."
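In code terms, the extra step amounts to expressing every local trend as a departure from that model's (or the obs') own global-mean trend before ranking, so a model can no longer "match" strong regional warming simply by warming a lot everywhere. A sketch, along the lines of the toy setup above; I've used a simple subtraction here, and the paper's exact normalisation may well differ:

```python
import numpy as np

def normalised_rank_histogram(obs_trend, model_trends, obs_global, model_global):
    """Rank histogram of regional trends relative to the global-mean trend.

    obs_trend    : (npoints,)           observed local trends
    model_trends : (nmodels, npoints)   modelled local trends
    obs_global   : scalar               observed global-mean trend
    model_global : (nmodels,)           each model's global-mean trend
    """
    obs_rel = obs_trend - obs_global
    mod_rel = model_trends - model_global[:, None]
    ranks = (mod_rel < obs_rel).sum(axis=0)
    return np.bincount(ranks, minlength=model_trends.shape[0] + 1)
```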
One thing I'd have liked to see is an investigation of the robustness of their result with respect to observational errors, which they don't seem to account for. There are some places where the observed trend seems to vary wildly between adjacent grid boxes, which seems physically unlikely. If not corrected for, obs errors will tend to inflate the counts in the end bins of the histogram. It would be unreasonably optimistic to expect the ensemble to be perfectly reliable in all respects, so I don't doubt the overall conclusion. But it would be interesting to see.
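For what it's worth, one fairly standard way to fold observational uncertainty into a rank histogram is to add obs-error noise to each ensemble member before ranking, so that model and obs are compared on an equal footing. A sketch, with the error magnitudes entirely made up:

```python
import numpy as np

def rank_histogram_with_obs_error(obs_trend, model_trends, obs_err_std, seed=0):
    """Rank histogram with an assumed observational error folded in.

    obs_err_std : scalar or (npoints,)  assumed standard error of the observed
                  trend at each grid point - invented here, but in practice it
                  would come from the observational dataset's own uncertainty
                  estimates.
    """
    rng = np.random.default_rng(seed)
    noisy = model_trends + rng.normal(0.0, obs_err_std, size=model_trends.shape)
    ranks = (noisy < obs_trend).sum(axis=0)
    return np.bincount(ranks, minlength=model_trends.shape[0] + 1)
```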
5 comments:
Yes, what this seems to me to show is exactly what one would expect from too coarse a grid or too much dissipation: the spatial variation is "damped." Think back to your fluid dynamics days, James. Remember what the coarse grid results looked like compared to the finer grid results vs the data. And remember how it was impossible to see the "details" in a simulation with 1st order upwinding instead of 2nd order.
Well, I have to give a definite maybe to that. I see the point, but if you are right, then you'd expect to see a substantial change in characteristics as resolution changes. I don't think this is seen (though to be honest, I haven't looked into this much). The models do reproduce dynamical structures that look pretty good in terms of synoptic weather - hurricanes are about the limit, but that is well below the scale of structures relevant here.
Also, it should be straightforward to show such differences using much simpler dynamics-only models (without the full physics that costs so much time and effort). People here certainly worked on numerical schemes while developing high resolution models for the Earth Simulator, but I don't know what came of it.
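Something along these lines would do as a starting point: a toy 1-D linear advection problem comparing first-order upwinding against a second-order scheme, showing how the more dissipative scheme smears out a sharp feature. Purely an illustrative sketch, nothing to do with any real model code:

```python
import numpy as np

def advect(u0, c=1.0, dx=0.01, dt=0.005, nsteps=200, scheme="upwind1"):
    """Periodic 1-D linear advection, u_t + c u_x = 0 with c > 0.

    'upwind1'      : first-order upwind (very dissipative)
    'lax-wendroff' : second-order (much less dissipative, some dispersion)
    """
    u = u0.copy()
    nu = c * dt / dx                      # Courant number, must be <= 1
    for _ in range(nsteps):
        um = np.roll(u, 1)                # u_{i-1}
        up = np.roll(u, -1)               # u_{i+1}
        if scheme == "upwind1":
            u = u - nu * (u - um)
        else:                             # Lax-Wendroff
            u = u - 0.5 * nu * (up - um) + 0.5 * nu**2 * (up - 2*u + um)
    return u

x = np.linspace(0.0, 1.0, 100, endpoint=False)
u0 = np.where(np.abs(x - 0.3) < 0.05, 1.0, 0.0)   # a sharp "feature"
print(advect(u0, scheme="upwind1").max())          # noticeably smeared peak
print(advect(u0, scheme="lax-wendroff").max())     # much better preserved
```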
I realise I'm caricaturing the "efficient market" economist joke, but if numerical methods alone made such an important difference, I reckon someone else would have done it by now :-)
James, a definite maybe is fine. Definitive evidence would be provided, as you say, by doing grid refinement studies, though even that is not definitive. An instructive example of this is the NASA drag prediction workshop, in which such studies were done in about 20 or 30 CFD codes. Initially, the results had a much bigger spread than expected, due to the striking positive-results bias in the literature. So the problem was made easier for further workshops. Finally, it was a trivial case, and with SPECIFIED grids and a fixed turbulence model there was reasonable consistency. The short summary: it was much worse than we thought. There is one figure in the long paper I sent you on this and some discussion.
My problem here in climate science is that it looks a lot like CFD in 1996 before people really got beyond the "results look good" and "the codes are useful" syndrome and actually looked at quantitative measures of these things.
Web site for drag prediction workshops: http://aaac.larc.nasa.gov/tasb/cfdlarc/aiaa-dpw/
It is somewhat humorous to watch the evolution of the "story" from DPW1 to DPW5. Bottom line, small effects (and some large ones too) are not robustly predicted by RANS codes and the large knob of gridding is rather close to "dial a drag". Of course, drag is a "small" effect with tremendous implications, rather like the CO2 effect in climate.
Oh look, David thinks he's discovered another nail.
Steve, I gave an example of a subproblem of the climate problem where we have participated in showing rather conclusively that failure to improve numerics led to confusing and inconsistent results. I've given you references that are readily available. But not the ones that are in press. Those are available to responsible scientists on a case by case basis. If you have a technical comment of substance I'd be happy to entertain it. Non-specific derogatory comments are not helpful in this context.