So, I'm left wondering... is this result actually new for a much broader sphere of ensemble methods than the small niche of climate science that I currently reside in? (And perhaps I should also ask for confirmation: is it really true? It would be embarrassing to find some oversight that invalidates the whole thing.)

When I first wrote it down - in reply to a reviewer comment on a previous paper - I assumed it to be a well-known result, and I think I must have seen the basic idea somewhere before, because it popped into my head so easily. But so far no commenter has found prior evidence of this equation - more generic demonstrations that the mean is better than a typical ensemble member are known in some (not all!) fields, but not the exact formula that I presented.

So, the hunt is on for a direct reference... your prize is my eternal gratitude. Or a blog post on the topic of your choosing, so long as it's ENSO and annual temperatures :-)

## 12 comments:

I don't know of a reference, but it would be interesting to sniff around the wisdom-of-crowds literature. There are many attributions there about convergence to truth. It makes sense in cases where there are many iterations with rapid outcome feedback (as in a commodity market), but that's nothing like climate.

One particular example would be Delphi methods, which iterate to obtain convergence in expert judgment. That seems like a pretty explicit application of the truth-centered paradigm. Results in energy forecasting are typically disastrous.

Now that I think about it, this is a lot like the variance decomposition in Theil's inequality statistics.

Perhaps a little elaboration of the decomposition would better illustrate Belette's comment on the previous post, that the common claim is that the ensemble mean tends to be better than most of the ensemble members.

http://models.metasd.com/2010/03/theil-statistics/

Is it really true?

The RMS 'error' is reduced even in the case where you have an ensemble of random time series. It suggests to me that having a reduced RMS 'error' is not necessarily the same thing as being a better model.
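A quick way to see this point is to build an ensemble out of pure noise. The sketch below is only an illustration: the "truth" and every "model" are unrelated white-noise series, and the sizes, seed and helper names (`rmse`, `ensemble_mean`) are my own assumptions, not anything from the post.

```python
import math
import random


def rmse(series, truth):
    """Root-mean-square difference between two equal-length series."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(series, truth)) / len(truth))


def ensemble_mean(members):
    """Point-by-point average across ensemble members."""
    return [sum(values) / len(values) for values in zip(*members)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n_members, n_times = 10, 200  # illustrative sizes, not taken from the post

# "Truth" and every "model" are unrelated white-noise series.
truth = [rng.gauss(0.0, 1.0) for _ in range(n_times)]
members = [[rng.gauss(0.0, 1.0) for _ in range(n_times)] for _ in range(n_members)]

member_rmses = [rmse(m, truth) for m in members]
typical_member_rmse = sum(member_rmses) / n_members
mean_rmse = rmse(ensemble_mean(members), truth)

print(f"average member RMSE: {typical_member_rmse:.3f}")
print(f"ensemble-mean RMSE:  {mean_rmse:.3f}")
```

The ensemble mean comes out with the smaller RMSE even though no member contains any information about the "truth", which is the sense in which a reduced RMS 'error' need not mean a better model.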

Nebuchadnezzar

>"It suggests to me that having a reduced RMS 'error' is not necessarily the same thing as being a better model."

To state the two obvious points: first, if you have a utility function that is linear in absolute error distance rather than in squared error, then the mean isn't better; it is only the same.

Secondly, a model with about the right level of internal variability representing weather noise might be more useful for many purposes than a model with the same error distances but far too little internal variability to represent weather noise.

So yes, RMSE is reduced (as long as models have some differences) but this might well be viewed as 'not better'.
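The absolute-error point can be checked at a single location with a toy example. This is a sketch under my own assumptions (made-up member values, a hypothetical helper `compare_abs_error`): it suggests the mean is exactly "the same" when all members err on the same side of the truth, and comes out ahead only when they straddle it.

```python
import math


def mean(values):
    return sum(values) / len(values)


def compare_abs_error(members, truth):
    """Return (average absolute error of members, absolute error of their mean)."""
    mean_of_abs_errors = mean([abs(m - truth) for m in members])
    abs_error_of_mean = abs(mean(members) - truth)
    return mean_of_abs_errors, abs_error_of_mean


truth = 0.0  # made-up scalar "truth"

# All members on the same side of the truth: the two measures coincide.
same_side = compare_abs_error([2.0, 3.0, 4.0], truth)

# Members straddling the truth: errors partly cancel in the mean.
straddling = compare_abs_error([-1.0, 1.0, 3.0], truth)

print("same side: ", same_side)
print("straddling:", straddling)
```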

Better comparisons are:

1. If you want the weather noise don't compare means that destroy the weather noise as these are worse on measures that are different from RMSE.

2. If you don't want the weather noise, then compare the MMM to an initial condition ensemble mean for each of the models. The MMM will still have a reduced RMSE compared to the expected value for a random model initial condition ensemble. However, I believe it becomes far more likely that the best will be a model ic ensemble mean rather than the MMM.

More or less what crandles said. The sum of the squares of the differences between points and the true value will be dominated by the points farthest from the true value.

It is trivial to realise that the average of the points will lie closer to the true value than those points which are furthest away.

You don't even need any equations.

Yes, I think that line of thought is basically the same as the Schwartz inequality proof that Dikran Marsupial pointed to in the other thread. But "my" equation has the advantage of precisely quantifying the effect.

And in answer to Nebuchadnezzar, what Chris said. The proof does depend on the specific definition of "better": an RMS-minimising estimate is not necessarily a realistic one in many respects, but it has widespread use in many fields for a number of practical and theoretical reasons. For a long term forecast, the climatological mean is RMS-minimising, but we *never* see a weather map that looks anything like this.

By "my" are you perchance trying to take credit for the Annan equation?

It is often said that the best way to get a question answered on the intertubes is not to ask a question but to make an incorrect assertion.

So yes, it's mine, all mine, and I dare anyone to prove otherwise!

"And in answer to Nebuchadnezzar, what Chris said."

I understood that. I was charmed by the fact that your RMS deviation can get smaller for two reasons:

1. you are getting closer to the truth

2. that's just what happens when you average together a bunch of random tat.

The ideal situation and the worst case scenario look identical!

Nebuchadnezzar

Neb, that's not quite right. Adding a bad model can make the RMSE of the mean worse than it was before, but it will still be better than the typical RMSE of the members of the new ensemble.
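A toy scalar case makes the distinction concrete. The numbers and helper names below are invented for illustration, and "typical member" is taken here to mean the RMS of the member errors, which is my reading rather than something stated in the thread.

```python
import math


def mean(values):
    return sum(values) / len(values)


def error_of_mean(members, truth):
    """Absolute error of the ensemble mean at a single point."""
    return abs(mean(members) - truth)


def rms_member_error(members, truth):
    """RMS of the individual member errors at a single point."""
    return math.sqrt(mean([(m - truth) ** 2 for m in members]))


truth = 0.0  # made-up scalar "truth"
good = [-1.0, 1.0]  # two members whose errors cancel
with_bad = good + [10.0]  # the same ensemble plus one poor model

before = error_of_mean(good, truth)
after = error_of_mean(with_bad, truth)
typical_after = rms_member_error(with_bad, truth)

print(f"error of mean before: {before:.2f}")
print(f"error of mean after:  {after:.2f}")
print(f"RMS member error after: {typical_after:.2f}")
```

Here the extra model degrades the mean relative to the old ensemble, yet the new mean still sits below the RMS error of the new ensemble's members.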

James, I'm still looking. Well, still thinking about having a good look when time allows. So far it looks like it's indeed a folklore result, so good luck with the reviews.

Well it seems that calling it folklore may be overstating it - I've not yet found anyone who admits to prior knowledge of the equation. I might have to big it up a bit in the paper!

ac: point taken. I'm not saying it's always the case. I think that we are both using the word 'can' because both situations 'can' apply in different circumstances.

Nebuchadnezzar
