## Tuesday, January 17, 2006

### Probability, prediction and verification III: A short note on verification criteria

Forecast verification is the act of checking the forecast against the observed reality, to see how good it was. The basic question we attempt to answer is "Was the forecast valid?" Note that this is a distinct question from "How skillful is the forecast?" I'm going to write a more general post on the verification issue, particularly in relation to climate prediction, but I'll start with a minor digression which has arisen from Roger Pielke Snr's recent comments on his blog.

Firstly, it turns out that when he talked about the models' "skill" (or lack thereof), he wasn't actually using the term in its standard form (a comparison against a null hypothesis, such as a naive reference forecast). In fact, what he was talking about seems more akin to the (related but distinct) validation issue. The questions he was addressing are along the lines of "does the model output lie within the observational uncertainty of the data?" The purpose of this note is to show why this is an invalid approach.

I'll denote the truth by t, the observations o and the model m = t+d where d is some (unknown, hopefully small) difference between model and reality. The mean square difference between model and observations is given by

E((m-o)²) = E((t+d-o)²) = E((t-o)²) + 2E((t-o)d) + E(d²) = E((t-o)²) + E(d²)

where E() is the expectation operator (the average over many samples). The cross term 2E((t-o)d) is zero because the measurement error is unbiased (zero mean) and uncorrelated with the model error d, so it drops out of the final expression.

Now, the mean squared observational error is equal to E((t-o)²) by definition. But E(d²) in the above equation can never be negative. So we have shown that the RMS difference between model and observations is necessarily greater than or equal to the RMS error on the observations, with equality holding if and only if the model is perfect (d=0, m=t). In the real world, that means this "test" automatically rejects every model! That may be convenient for a sceptic, but it is hardly scientifically interesting. By this definition, "skill-free" is simply a pejorative synonym for "model". And this does not just apply to climate models, but to any model of any real system, anywhere, no matter how skillful or accurate it has proved itself to be in the conventional sense.
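To see this concretely, here's a minimal Monte Carlo sketch (all the numbers are made up for illustration, and bear no relation to any real model or dataset). Note that the model's error here is actually *smaller* than the observational error, yet the model-vs-observation RMS difference still necessarily exceeds the pure observational error, so the flawed "test" rejects it anyway:

```python
import math
import random

random.seed(0)
N = 100_000

# Synthetic world: truth t, unbiased observations o = t + noise,
# and an imperfect but decent model m = t + d.
truth = [random.gauss(15.0, 3.0) for _ in range(N)]
obs = [t + random.gauss(0.0, 0.5) for t in truth]    # obs error: sd 0.5
model = [t + random.gauss(0.2, 0.4) for t in truth]  # model error d: bias 0.2, sd 0.4

def rms(xs, ys):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Since E((m-o)²) = E((t-o)²) + E(d²), the model-vs-obs RMS difference
# should come out near sqrt(0.5² + 0.2² + 0.4²) = sqrt(0.45) ≈ 0.67,
# larger than the observational RMS error of about 0.5.
print("model vs obs:", rms(model, obs))
print("truth vs obs:", rms(truth, obs))
```

The model's errors (bias 0.2, spread 0.4) are smaller than the observational noise (0.5), yet the inequality holds regardless, as the derivation above guarantees.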

In fact, the correct test is not whether the model output lies within the uncertainty range of the observations, but whether the observations lie within the uncertainty of the model forecast. E.g., given a temperature prediction of 12±2°C, an observed value of 10.5°C validates the forecast, and that conclusion doesn't depend on whether your thermometer reads to a precision of 0.5°C or 0.1°C or anything else.
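The check just described can be sketched as a one-liner. Treating the ± figure as a hard validation interval is a simplification of what would properly be a probabilistic comparison, but it captures the essential point that the forecast's uncertainty, not the thermometer's, defines the test:

```python
def validates(forecast_mean, forecast_uncertainty, observed):
    """True if the observation falls inside the forecast's uncertainty range.

    Illustrative only: the +/- figure is treated as a hard interval.
    """
    return abs(observed - forecast_mean) <= forecast_uncertainty

# Forecast of 12 +/- 2 C, observed value 10.5 C: |10.5 - 12| = 1.5 <= 2,
# so the forecast validates, whatever the thermometer's precision.
print(validates(12.0, 2.0, 10.5))  # True
```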

[Note to pedants: observational error can play a role in verification, if it is large enough compared to the forecast uncertainty. E.g. given a forecast of 12±1°C, an observation of 10°C with an uncertainty of 2°C does not invalidate the forecast, because the true temperature might well have been greater than 11°C. But that's probably an unnecessary level of detail here.]
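For completeness, that pedants' case can be handled by comparing the forecast-observation discrepancy against the combined uncertainty. This sketch assumes both ± figures can be treated as standard deviations of independent Gaussians, which is an idealisation rather than anything the note above prescribes:

```python
import math

def consistent(fc_mean, fc_sd, obs_mean, obs_sd, n_sigma=2.0):
    """True if forecast and observation are mutually consistent.

    Assumes independent Gaussian errors, so the difference of the two
    has standard deviation sqrt(fc_sd**2 + obs_sd**2).
    """
    combined_sd = math.hypot(fc_sd, obs_sd)
    return abs(fc_mean - obs_mean) <= n_sigma * combined_sd

# Forecast 12 +/- 1 C, observation 10 +/- 2 C: the 2 C discrepancy is well
# within twice the combined sd of sqrt(1 + 4) ~ 2.24 C, so no invalidation.
print(consistent(12.0, 1.0, 10.0, 2.0))  # True
```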