So, I was googling for some IDL code for statistical tests (why reinvent the wheel) and I came across this ugly documentation of a Kolmogorov-Smirnov test:

It's not just climate scientists, though - essentially the same error is contained in this equivalent routine from a German university astronomy and astrophysics department (a high google hit):

In case it's not clear enough, the error in both sets of documentation is that the K-S test actually reports the probability that a particular test statistic would be exceeded if the two distributions were the same, in other words, a frequentist P(data=D|hypothesis=H) statement. The alternative P(H|D), which the documentation claims that the routine outputs, is an entirely different beast which demands a Bayesian treatment - in particular, it depends critically on a prior P(H), for which there is (in general) no default, "objective", or "correct" choice.

Both these routines claim to be based directly on Numerical Recipes. It is notable that the text of NR (at least my edition) avoids making this particular error. However, the same (latter) routine, with the same error in the documentation, turns up all over the place, with apparently no-one in NASA, Princeton, Washington Uni etc (to name but three) ever noticing...

I suppose I could be more sympathetic to the climate scientists who have apparently been seduced by such ubiquitous and intuitively appealing language, and who have as a result tried to reshape probability theory so as to make such statements actually valid. OTOH, I still think they should still be prepared to accept that their theories are wrong when I point out the problems in words of one syllable together with elementary examples :-)

; OUTPUTS: Probability two populations are drawn from sameThat's from the documentation of the UKMO IDL library (which I don't actually have, although it seems freely available). Of course, it is dead wrong. A K-S test does not calculate this at all!

; underlying distribution.

It's not just climate scientists, though - essentially the same error is contained in this equivalent routine from a German university astronomy and astrophysics department (a high google hit):

PROB gives the probability that the null hypothesis (DATA came from aThis one is particularly amusing, because the immediately previous two lines are

Gaussian distribution with unit variance) is correct.

IDL> data = randomn(seed, 50) ;create data array to be testedthus indicating beyond any shadow of a doubt that in this example the data did come from a Gaussian distribution, irrespective of the value of "prob" that results from a single application of the test (ignoring quibbles about pseudorandom versus "truly" random).

IDL> ksone, abs(data), 'gauss_cdf', D, prob, /PLOT ;Use K-S test

In case it's not clear enough, the error in both sets of documentation is that the K-S test actually reports the probability that a particular test statistic would be exceeded if the two distributions were the same, in other words, a frequentist P(data=D|hypothesis=H) statement. The alternative P(H|D), which the documentation claims that the routine outputs, is an entirely different beast which demands a Bayesian treatment - in particular, it depends critically on a prior P(H), for which there is (in general) no default, "objective", or "correct" choice.

Both these routines claim to be based directly on Numerical Recipes. It is notable that the text of NR (at least my edition) avoids making this particular error. However, the same (latter) routine, with the same error in the documentation, turns up all over the place, with apparently no-one in NASA, Princeton, Washington Uni etc (to name but three) ever noticing...

I suppose I could be more sympathetic to the climate scientists who have apparently been seduced by such ubiquitous and intuitively appealing language, and who have as a result tried to reshape probability theory so as to make such statements actually valid. OTOH, I still think they should still be prepared to accept that their theories are wrong when I point out the problems in words of one syllable together with elementary examples :-)

## 13 comments:

Hi

So are you going to be writing to the producers of this errant documentation? Will you suggest improvements to the documentation that specifically mention that:

The probability of data given a hypothesis (P(D|H)) should not be confused with probability of hypothesis given the data P(H|D). Also that Bayesian analysis is almost certainly required for P(H|D) together with its need for a prior distribution.

It may take a long time for that sort of documentation to be noticed and help your cause, but it surely cannot hinder. (I cannot imagine that it would be a worse use of your time than repeating the same thing here all the time.)

crandles

Chris,

I don't think it's worth going as far as trying to include a Bayesian primer in each set of documentation, but I did write to the responsible parties and it looks like both sets of documentation will get corrected, so at least they won't actively mislead people in future.

As for "worse use of your time", well that depends on whether I'm writing for my personal entertainment, to entertain my reader, as a handy notebook for future reference, or to change the world :-)

Your paper

http://www.jamstec.go.jp/frsgc/research/d5/jdannan/philtrans_final.pdf

sounds interesting. Whether I am up to understanding it, I am not very sure.

See http://www.climateprediction.net/board/viewtopic.php?t=7453

crandles

To be honest it was thrown together in a hurry and designed to be rather bland and uncontroversial - at the time I had had enough of rejections!

It then turned out that none of the papers in that issue got properly reviewed (as you may guess from the contents of some...) so I could have written pretty much anything...ho hum.

Anyway, it looks like you got the gist of it clearly enough.

Thank you for the reply.

Does their use of transfer functions al la Piani et al tell you anything about their sampling strategy or rather how you think it should develop?

Is the Piani et al method of arriving at the same pdf for sensitivity whether you estimate it directly or via lamda reasonable or does this still have a significant problem from using a linear prior?

Regards Chris

Um, that's a bit cryptic. "their use of transfer functions" refers to another paper in the issue? My hard copy of the issue is doing the rounds at the lab and I've not actually read it properly.

But regardless, the whole scene seems pretty farcical in the way people keep footling around with cod philosophy rather than actually dealing with the arguments raised in our GRL (multiple constraints) and arxived (priors) papers.

Sorry. 'their' probably was a bit vague.

I think I was just talking about Piani et al and CPDN's use of the technique in that paper:

http://www.climateprediction.net/science/pubs/Piani_GRL.pdf

though to be honest in writing the comment, I didn't reread the paper (which I don't think I fully understand). I was just reviewing

http://www.climateprediction.net/science/pubs/cpdn_for_oucs.pdf

(pages 41 to 44)

Anyway your reply seems to indicate you think it is a backward step and your paper is the last word on the subject (even though you have indicated otherwise elsewhere).

Might an alternate view might be that we need to properly work out the probability implications from one source before we try to combine pdfs using different sources?

crandles

"your paper is the last word on the subject"

Certainly not, although the fact that (AFAIK) no-one has attempted to improve, update or challenge what we have said in the 18 months since the GRL paper was published may give pause for thought...

Of course estimates can be expected to improve as we learn more about the climate system. What I do claim about our papers is that they present fundamental principles that any future estimate must take into account if it is to be considered credible (or even meaningful).

Well yes but your priors paper has been discussed enough here and maybe not everything revolves around your priors paper.

You have a new paper which seemed like a round up of various sampling strategies and their efficiency for gaining probability information. This didn't seem to mention Piani et al (2005)'s use of transfer functions. Is it reasonable for me to be asking why there was no mention of it and whether this technique has implications for sampling strategy?

crandles

Oh, now I see what you re getting at. Not an unreasonable question, but I don't think their work is really relevant to what we were talking about - that's not intended as a snipe, just that they are analysing a large ensemble that already exists, rather than trying to efficiently generate a smaller ensemble thst samples the posterior (of an application of Bayes' Theorem), which is what I've been trying to do.

Thank you for the reply.

My reaction is that you used 'large ensemble' and 'generate a smaller ensemble' and I would point out that large and small are relative terms. If CPDN is ongoing (and more CPDN slab models have recently been released) then perhaps you have a large ensemble that is going to get larger. The issue of how to expand the sample size could still involve working out how to 'generate a smaller ensemble' that provides better information couldn't it?

So I struggle to see why you conclude it is not relevant. Different scale to what you were thinking maybe.

crandles

I don't think Piani et al directly proposed any method for sampling parameters. Of course it's not out of the question that I missed something...are cpdn actually using it to design their sampling strategy?

I doubt you missed anything because I didn't see anything either.

Are they using it? Who knows? I happen to be interested in CPDN sampling strategy and from time to time have a go at trying to get more information made available to participants

http://www.climateprediction.net/board/viewtopic.php?p=67053&highlight=#67053

I seem to end up with general replies about what would be a 'good feature' but isn't actually in use. A reasonable conclusion might be that the strategy is still fairly dumb/random/non-optimised and maybe they will think about clever techniques sometime.

Post a comment