Tuesday, August 10, 2010

How not to compare models to data part eleventy-nine...

Not to beat the old dark smear in the road where the horse used to be, but...

A commenter pointed me towards this, which has apparently been accepted for publication in ASL. It's the same sorry old tale of someone comparing an ensemble of models to data, but doing so by checking whether the observations match the ensemble mean.

Well, duh. Of course the obs don't match the ensemble mean. Even the models don't match the ensemble mean - and this difference will frequently be statistically significant (depending on how much data you use). Is anyone seriously going to argue on the basis of this that the models don't predict their own behaviour? If not, why on Earth should it be considered a meaningful test of how well the models simulate reality?
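To make the "even the models fail this test" point concrete, here is a toy calculation (Python; the trend mean and spread are invented numbers, not taken from any paper): draw an ensemble of model trends, then test each run against the ensemble mean plus or minus two standard errors of the mean.

```python
import numpy as np

rng = np.random.default_rng(0)

n_models = 23
spread = 0.08  # assumed inter-model spread of trends (deg/decade)
trends = 0.25 + spread * rng.standard_normal(n_models)  # toy 'model' trends

mean = trends.mean()
se_mean = trends.std(ddof=1) / np.sqrt(n_models)  # s.e. of the ensemble mean

# The flawed test: is each run within 2 s.e. of its own ensemble mean?
inside = np.abs(trends - mean) < 2 * se_mean
print(inside.sum(), "of", n_models, "runs 'consistent' with their own ensemble mean")
```

With 23 runs the two-standard-error band is only about 0.4 of the inter-model spread, so most runs "fail" a test of consistency with their own ensemble mean.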

Of course the IPCC Experts did effectively endorse this type of analysis in their recent "expert guidance" note, where they remark (entirely uncritically) that statistical methods may assume that "each ensemble member is sampled from a distribution centered around the truth". But it's utterly bogus nevertheless, as there is no plausible situation in which that can occur, for any ensemble prediction system, ever.

Having said that, IMO a correct comparison of the models with these obs does show the consistency to be somewhat tenuous, as we demonstrated in that (in)famous Heartland presentation. It is quite possible that they will diverge more conclusively in the future. Or they may not. They haven't yet.

92 comments:

Steve Bloom said...

OT, James, but wouldn't this be a good time to send a note to your Russian astronomer friends inquiring after their well-being? :)

PolyisTCOandbanned said...

Do these fellows just lack "physical intuition"? Do they not think about what they are trying to understand, but just crank some formula that is on the shelf? Like a B student on a test, plugging and chugging? And then not looking at their answer and trying to see if it makes sense by some comparison tests?

Please, James, help my skeptics up their game. It embarrasses me when my best champions screw up so badly.

James Annan said...

Well I don't think Santer et al get a free pass on this...it's all very well dicking around with fancy details on statistical tests (ok, maybe they were some rather fundamental details) but if the whole basis of the test itself is completely inappropriate, then it's just a huge waste of everyone's time.

I would call it mathematical rather than physical intuition, and all too frequently it seems to be missing on all sides. Of course we all make mistakes, the real test is how quickly people accept this and change their spots.

Anonymous said...

1) Didn't Santer address this point? e.g. "application of an inappropriate statistical ‘consistency test’." Perhaps you're right that by adding all the extra bits to the paper, they made it so that an idiot might not realize the elementary nature of the most important error, and we need to keep in mind that there are many idiots out there, but...

2) I'd love to see Figure 1 with a proper model spread instead of just the mean.

3) I'm surprised that the trends of RSS, UAH, and the model mean are so far apart in Figures 2 and 3, though maybe that's a lesson about the difficulty of eyeballing trends (just looking at Figure 1, the three trends look like they should be pretty similar, but that might be biased by the fact that they have overlapping start and end points).

-M

James Annan said...

I haven't got Santer to hand (and am about to go travelling, so am not going to go looking for it) so I will take your word for it. In which case this new paper is pretty ridiculous. Well, it's ridiculous anyway.

Martin said...

Actually M, in this paper it is clearly stated that on the Santer vs. Douglass issue, Santer was right and Douglass wrong. But we knew that...

What James refers to is a different issue. It's the assumption that the model runs are centered on the "true value", like the one realization of reality we have; that is, that the stochastic uncertainty of an individual model run, i.e., the modelled natural variability, is all there is in the way of model-run-to-reality disagreement. If I understand James correctly, there is also a contribution due to the difference between our "best knowledge" (to which the ensemble mean tends) and "reality" (to which the mean of real climate on multiple Earths would tend if we had those).

It is this structural error contribution (and hey, you know it really exists; no way the physics in the models is perfect) which is missing from both Santer and McKitrick.

BTW James, what about doing this one right? Nothing to lose but your illusions ;-)

Anonymous said...

Aha - here is the money quote from Santer:

"DCPS07’s use of σSE is incorrect. While σSE is an appropriate measure of how well the multi-model mean trend can be estimated from a finite sample of model results, it is not an appropriate measure for deciding whether this trend is consistent with a single observed trend."

This is directly relevant to James' "old dark smear", and therefore I thought the subject of this post. I can't swear that the McKitrick paper uses this sigma-SE, where they divide the standard deviation by the square root of the number of model runs, but given their tiny error bounds on the model uncertainty, it sure looks that way (they don't seem to specify in the text). McKitrick does reference Santer, but only in reference to the methodology of determining AR(1) trends, I think.
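For anyone unsure what the sigma-SE mistake does in practice, here is a sketch (Python; all numbers are invented for illustration): an observed trend that sits comfortably inside the inter-model spread can nevertheless lie several standard errors from the ensemble mean, because the standard error shrinks with the number of runs.

```python
import numpy as np

# All numbers invented for illustration; nothing here is from the paper.
trends = np.array([0.25 + 0.08 * z for z in
                   (-1.2, -0.8, -0.3, 0.0, 0.1, 0.4, 0.9, 1.3, -0.4, 0.0)])
obs = 0.15  # hypothetical observed trend (deg/decade)

mean = trends.mean()
sd = trends.std(ddof=1)          # inter-model spread
se = sd / np.sqrt(len(trends))   # sigma-SE: s.e. of the ensemble mean

print((obs - mean) / sd)   # well under 2 spreads: unremarkable
print((obs - mean) / se)   # over 5 sigma-SE: a spurious 'rejection'
```

Same observation, same models; only the choice of denominator turns "unremarkable" into "significant".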

With regards to Martin's comment: there may be more subtle issues on which Santer, Douglass, and McKitrick are _all_ wrong, but I believe that the error that James was highlighting in this post was the smack-your-forehead-dumb error that Santer did point out (buried in a paper that addressed a whole bunch of issues).

-M

Anonymous said...

Oh, come on. It's just the only remaining method to counter that Santer garbage signed onto by something like 17 names.

Most (not some) of the individual models are running way too hot, as you can see from the attached table and the work done at treesfortheforest blog. While you can accuse the Santer seventeen of dicking around, the individual models don't get a pass either.

According to this paper, very few models produce trends which are even close.

A model histogram from Nick Stokes

http://www.moyhu.blogspot.com/

How many of those are above the RSS or UAH trends - 0.1 ish? Sure you can argue some pass statistical certainty tests, but many don't make that test either and nearly all are over the measured trend.

If it's only a matter of Santer's work stinking, the blocking of rebuttals shouldn't have occurred.

It WAS vigorous blocking.

Fix the models or fix the data, your choice IMO.

Jeff Id

Steve Bloom said...

I'd say fix the commenter, but that's probably illegal in Illinois and anyway it wouldn't shut him up.

Anonymous said...

JeffId: There are a few questions that can be addressed:

1) Do Figures 2 and 3 have appropriate uncertainty limits for the modeled temperature trends?

Answer: Almost certainly not, given Table 1. E.g., this uncertainty limit would reject many individual model runs, which is clearly wrong (if you are trying to show that observations are not consistent with models). This seems to be the point of this post, and given that these Figures are likely to be the high-profile part of the paper, it would be good if they were right.

2) Does the main body of the paper use the same inappropriate uncertainty limits?

Answer: I don't know. Maybe the multi-panel regression magic in the paper doesn't depend on the standard error mistake from 1. This would be a good question for someone to answer.

3) Are there actually discrepancies between models and reality?

Trivial Answer: Of course. No model can possibly capture the full complexity of the climate system.
Less trivial answer: In terms of tropical tropospheric temperatures, maybe... more specifically, is the tropical troposphere warming more slowly than model predictions, or is the tropical-troposphere-to-surface temperature ratio smaller than model predictions, or both, or neither?

4) What are the implications of any discrepancies? Some imperfections can be shrugged off, some are important, and if it is the latter, that would be good to know. Are there existing research avenues that might address this issue? (eg, for models, Solomon et al. and stratospheric ozone. For observations, trying to fix possible radiosonde biases).

Answer: As far as the forward progression of science goes, this is the important question, which you do address in your post, though your rudeness somewhat detracts from the interesting question you raise. And I don't know the answer, but I think people are looking at this, despite what you might think about the "govsci" conspiracy.

-M

Harry said...

James,

I'm sure you are aware of the essential point that the authors of this most recent paper are trying to make:

Santer 08 used old data to obtain a result that is not replicable using more recently updated data.

This paper is not the Mc's first attempt at this.

Using a well-established methodology from another field is a distinguished, albeit error-prone, approach.

The simple fact of the matter is that the result from Santer '08 is being used uncritically by a great many people. If that result is not robust when applied to the best presently available data, it should not be used.

How would you recommend rectifying this?

scientist (PolyisTCOandbanned) said...

Here is what McI should have done:

-shown BOTH methods, if there was a disagreement.

-disaggregated the issue of different methods from different (more) data.

------

The fellow has a history of exactly this sort of error (Huybers comment) and always in a direction that overstates his case.

P.s. McIntyre and his authors are responsible for THEIR paper. Their names are on it. Whining about the reviewer or pointing to other papers is just failure to take responsibility for what happens on your watch, on your ship.

James Annan said...

OK, I have had a quick look at Santer and they do refer to this problem, albeit not quite as clearly as I would have done. It blows apart the whole thing irrespective of details of autocorrelation, so should IMO have been their main or even only point.

No, re-doing the test with new data is also irrelevant if it's the wrong test. I see that one of the original reviewers seemed to note this (according to the McIntyre post) and McI himself seems very coy about whether the results actually have any value. Which I take as an admission that he's playing a silly "gotcha" game by showing that the results of the stats test change with new data without addressing the fundamental point that the test is wrong anyway.

It is clear that MMH were using the standard error on the multi-model mean; this is provided in a table and tallies with the plot (assuming error bars are ±2× this).

Martin said...

James, yep. Santer get their model uncertainty from inter-model spread, MMH from intra-model wiggle. That alone gives the game away.

Anonymous said...

James, do try to be honest. It is quite clear that WHEN USING THE IDENTICAL METHODS AND TESTS TO SANTER, THE MODELS FAIL WHEN USING UP-TO-DATE DATA. What part of "WHEN USING IDENTICAL METHODS AND TESTS" do you or your clique of brainless prats like Mr Bloom not understand? You do know, don't you, that the original comment that was submitted on Santer did precisely that: same method, same tests, up-to-date data; result: failure. Comment rejected?

Santer's test was rubbish, and only works when using an obsolete subset of the data. I can't see why for scientific reasons one would choose to defend it; politically it does of course make sense, but why pretend it is scientific?

Gavin said...

It is worth pointing out that it was the reviewers of Santer et al (2008) that insisted that we stick to the 1979-1999 period in the main paper. The results with data up to the end of 2006 (which was what was available at the time of writing in 2007) were reported in the supporting material (exp. "SENS2"):

http://onlinelibrary.wiley.com/store/10.1002/joc.1756/asset/supinfo/joc_1756_sm_santer.pdf?v=1&s=8871a96896654bd04cec3f7c0187fa3e236d60a3

Quote: "The use of longer yo(t) time series in SENS2 yields larger effective sample sizes and larger values of the denominator in equation (4). Values of s{bo} are therefore smaller (by a factor of ca. 1.8) than in BASE. Because of this marked reduction in the size of the observed standard errors, smaller differences between bm and bo are deemed to be statistically significant, and rejection rates are consistently higher than in BASE or SENS1 (see Table 7). Even with longer records, however, no more than 23% of the tests performed lead to rejection of hypothesis H1 at the nominal 5% significance level."

Chip Knappenberger said...

Gavin,

Isn’t the discussion here about tests using the model ensemble (i.e. d*-type tests) instead of tests of individual models (which is how I read the sensitivity test in your Supporting Material)?

I apologize if I missed something.

Would you be opposed to a publication which used your modified d* statistic to test the consistency between models and observations?

Thanks,

-Chip

Jim Crimmins said...

Would your method of ensemble diagnostics rule out agreement with observation for extremely rudimentary forecasting models, say, averaging the past 1-48 monthly anomalies as an ensemble of 48 "models"? If not, how seriously should this diagnostic test be taken?

scientist (PolyisTCOandbanned) said...

P.s. Gavin and his authors are responsible for THEIR paper. Their names are on it. Whining about the reviewer or pointing to other papers is just failure to take responsibility for what happens on your watch, on your ship.

Gavin said...

Chip, pairwise testing is I think the most appropriate (which is what we tested in the SI). The ensemble mean is an estimate of the forced trend, and if there is a substantial component of expected unforced noise, you do not expect it to match any single realisation. James' point about seeing whether the observations are consistent with a distribution derived from the individual model simulations seems like a good way to go. It is also perhaps time for people to stop trying to reject 'models' in general, and instead try and be specific.

McKitrick's plot of the uncertainty in the estimation of the model mean as if it was the appropriate uncertainty range for the model/data comparison is clearly incorrect for the same reasons that Douglass et al (2007) were incorrect.

I have no objection to anyone using the tests discussed in Santer et al (as long as you aren't using the Douglass et al test of course). But see my first point.
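A minimal sketch of the distribution-based comparison Gavin describes (Python; the trends and the observation are hypothetical, not taken from any paper): ask where the observation falls within the ensemble of individual-run trends, rather than how far it sits from the ensemble mean in units of that mean's standard error.

```python
import numpy as np

# Hypothetical model trends and observation (deg/decade), for illustration only.
model_trends = np.array([0.14, 0.18, 0.21, 0.24, 0.25,
                         0.27, 0.30, 0.33, 0.38, 0.44])
obs = 0.16

# Where does the observation fall within the ensemble distribution?
rank = (model_trends < obs).mean()
print(f"obs at the {100 * rank:.0f}th percentile of the ensemble")
```

Here the observation is low but inside the spread, so a spread-based test has no grounds to reject, whereas a sigma-SE version of the same comparison would put it several standard errors out.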

Chip Knappenberger said...

Thanks, Gavin,

It sounds like we (you, James, and I) are on the same page when it comes to what we think is the most appropriate way to compare models and observations--something pretty close to that described in my Heartland presentation (http://julesandjames.blogspot.com/2010/05/assessing-consistency-between-short.html).

As for using the Santer et al. modified d*, it seems that when other folks try to use it, it is often suggested (e.g., by James above and others as well) that it is not a good way to go.

I am not sensing this sentiment from you (i.e. that using it is a deal breaker), although perhaps there are preferable alternatives (as noted above).

-Chip

Anonymous said...

If we look at individual models:

http://treesfortheforest.wordpress.com/2009/09/26/ar4-model-hypothesis-tests/

The plots show a mismatch between modeled and observed since 2000, as some have written.

I understand that Santer's model mean makes little sense; however, when the "mean" differs from the observations by more than the significance bounds, that is not an insignificant fact.

In reality, there is a bit more foot on the model gas than is measured. It shouldn't be a big deal in most sciences, but I'm very curious how you guys will react to this.

Saying Santer's work is bunk is a good first step, but the second step is realizing that the individual model story needs to be grasped as well.


The link I left in my previous comment is quite telling. Even though not all models exceed the significance bounds, too many exceed the measured trend.

Jeff Id

Steve McIntyre said...

Contrary to Gavin Schmidt's claim, the SI to Santer et al 2008 did not - I repeat not - report results for their H2 hypothesis (difference between model ensemble mean trend and observations).

Their SI contains results for their H1 hypothesis, but not for their H2 hypothesis - a lacuna that we noted in a comment rejected by IJC. In that comment, we reported that for a number of key H2 results where S08 had reported no statistically significant differences, use of up-to-date data resulted in statistically significant differences, rebutting some S08 claims.

None of the referees indicated that our calculations were incorrect.

Our calculations used data that was more up-to-date than that available to the S08 authors. However, had they done their H2 tests using the most up-to-date data then available, they would also have observed that some of their H2 claims did not hold up.

PolyisTCOandbanned said...

Steve McIntyre: That's your paper. Not "the reviewers'". Those dastardly people that kept it bottled up so long!

You drew those super tight whiskers on "the models" trends. You defend your paper, dude.

Magnus Westerstrand said...

A bit OT but not completely...

http://www.newscientist.com/article/mg20727725.700-cosmologys-not-broken-so-why-try-to-fix-it.html

"In a recent paper, we have argued that ruling out the entire cosmological model on the basis of a 0.05 per cent probability is similarly ill-advised (Physical Review D, DOI: 10.1103/PhysRevD.81.103008). In cosmology and elsewhere, Bayes tells us it is justifiable to be conservative in the face of statistical anomalies."

Dikran Marsupial said...

Re: Steve McIntyre's comments about the outcome of tests on an extended period.

It seems to me that there is an element of multiple hypothesis testing here, which means that the more tests that are performed, the greater the probability of a type-I or a type-II error.

This possibly introduces a risk of inadvertent cherry picking; if the models had "passed" the test with the extended data, would Steve have submitted a comment on that? Will Steve be revisiting the tests again to see if the models "pass" with an even more up-to-date data set (possibly the upswing of ENSO might bring the observed trends back into consistency with the models)? Will Santer? Will it mean anything, or have we by then devalued the test by looking at too many variations?

Frequentist significance testing has always seemed a bit of a minefield to me, but if there is going to be a stream of papers performing similar tests again and again, it seems to me that is an issue requiring a bit of an audit. Perhaps I am just being picky, but I think that skepticism of the results of significance tests is no bad thing, especially when they tell you what you want to hear.

I think the Bayesian approach is better as it doesn't make a binary distinction between "significant" and "non-significant", which can flip back and forth when you are on the cusp. Presumably the Bayesian test gives a broadly similar posterior probability for the alternative hypothesis regardless of the exact start and end dates, which arguably gives a better picture of the actual situation?

Martin said...

OK, couldn't resist some quick calculations.

What I did was to test, using the mid-troposphere results from Tables 1 and 2, whether the individual models were compatible with the ensemble mean, taking as the standard error of the difference the square root of the sum of squares of the s.e. of the ensemble mean and the s.e. of that model.

For the ensemble I have 0.253 +/- 0.012 degrees / decade; for the individual models, see Table 1 column MT Trend. This is BTW a test that could have been done in the paper.


% ensemble mean MT trend and its quoted standard error (deg/decade)
models = 0.253;
models_sd = 0.012;

% individual model MT trends and their standard errors (Table 1, column MT Trend)
m = [0.211,0.380,0.444,0.326,0.30,0.288,0.225,0.193,0.123,0.261,0.230,0.259,0.270,0.186,0.202,0.102,0.284,0.232,0.224,0.285,0.142,0.186,0.270];
sd =[0.053,0.020,0.039,0.111,0.083,0.109,0.104,0.126,0.095,0.043,0.043,0.028,0.028,0.081,0.082,0.084,0.039,0.057,0.045,0.044,0.023,0.063,0.056];

d = m - models;                              % difference from ensemble mean
d_sd = sqrt(sd.*sd + models_sd.*models_sd);  % s.e. of each difference
sigma = d./d_sd;                             % difference in sigmas

sigma'


Output (number of sigmas difference):


-0.77289
5.44508
4.68087
0.65385
0.56044
0.31917
-0.26746
-0.47405
-1.35763
0.17920
-0.51520
0.19696
0.55805
-0.81823
-0.61540
-1.77955
0.75972
-0.36052
-0.62268
0.70165
-4.27874
-1.04471
0.29683


So, three out of 23 models exceed four sigma. I'm not quite sure how to interpret this, but... this is James's standard test for showing the inappropriateness of the "truth-centred" paradigm. You cannot just average away modelling uncertainty like you can random error!

Methinks everyone (and that means everyone!) should be taking a hard look at the "statistically indistinguishable" approach.

Anonymous said...

I don't believe that is a real McIntyre comment without a lame pun being made on Michael Mann's name.

Nick Stokes said...

Martin,
I tried to do something here with a histogram. But I think models_sd is not the se of the ensemble mean - it's too small. Instead it averages the variation that you've called sd. So your d_sd counts this twice.

But the idea is right - the error assumed is so low that models lie outside reasonable bounds. But it seems that they aren't concerned about model variation - they take the model set as given and fixed, so there is no se of the ensemble mean due to model scatter. The error is only due to trend error.

The consequence is that any conclusion applies to this choice of models only. If you want to say something general about models, then you have to allow for their observed scatter.

Anonymous said...

Annan,

I took a quick look at the Knappenberger presentation you linked to, and as far as I can tell, the conclusions broadly coincide with those of MM10. In summary: the models have a tendency to overestimate the warming trend.

Having said that, I’m not impressed with the ‘statistical power’ of the setup employed there. Then again, it’s not really a formally derived test, so speaking about power/size doesn’t make much sense in this context. With all due respect, I don’t understand why everybody got so ‘excited’ about it.

The nice thing about the MMH paper, which *is* a formal test, is that it allows for correlations between the distribution of the trend estimators. This greatly improves the power of the test employed, which is a *must* given the enormous cross-dependencies and short time-interval available.

And given the reactions here I feel the need to point out the obvious: this method for a large part deals with the problem of inter-series dependency. Note that the naive ‘hypothesis test’ presented by Knappenberger in essence fails to do this. Also note that failing to account for the *positive* correlation between the trend point estimates *will* widen the confidence interval for parameter difference, leading to severe statistical power reductions in your test.

People should brush up on their stats, and then proceed to thank MMH for addressing this issue.

Jeez.

Best, VS

PS. I resent your ‘dicking around’ comment on statistical hypothesis testing, as should any positivist scientist. You guys would really benefit from some gravity, especially given that your methods fail to predict even the direction of temperature change correctly.

Martin said...


I tried to do something here with a histogram. But I think models_sd is not the se of the ensemble mean - it's too small. Instead it averages the variation that you've called sd.


Nick, I think you're precisely right about that: the se of all models is indeed calculated by squared-averaging the se's of the participating models obtained from the trend fit. Not the inter-model 'spread'.

So your d_sd counts this twice

Hmmm, strictly speaking yes... there is also the correlation between each individual model trend and the multi-model trend which is not considered here. Rough and ready.

My point is indeed that inter-model spread is not considered. The good news is that MMH now acknowledge the egregious Douglass et al. error of not even considering natural variability of the observations. But they still ignore the inter-model spread issue.

Santer et al. OTOH do try to account for inter-model spread, but in an unsatisfactory way, by dividing by the sqrt of the number of models. But you cannot do that for modelling errors as James Annan has credibly argued: they don't go to zero by ensemble averaging, as random errors like natural variability do. This is the "truth-centred" paradigm, or I would say fallacy.

Note that not accounting (satisfactorily) for inter-model spread is something you can get away with, when your data period is short, as then natural variability dominates the statistics. But extend the data period and it will rise to the surface.
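Martin's point about structural error not averaging away can be illustrated with a toy simulation (Python; the "truth", bias, and noise values are invented): if every model shares a common bias, the error of the ensemble mean converges to that bias, not to zero, no matter how many models you add.

```python
import numpy as np

rng = np.random.default_rng(1)
truth = 0.20   # hypothetical true forced trend (deg/decade)
bias = 0.05    # hypothetical structural error shared by all models
noise = 0.08   # hypothetical independent per-model variability

for n in (10, 100, 10000):
    models = truth + bias + noise * rng.standard_normal(n)
    # the random part shrinks like 1/sqrt(n); the shared bias does not
    print(n, round(models.mean() - truth, 3))
```

Dividing the spread by the square root of the number of models, as in a truth-centred test, implicitly assumes the shared bias is zero.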

Ron Cram said...

Gavin writes "It is also perhaps time for people to stop trying to reject 'models' in general, and instead try and be specific."

People are not trying to reject models in general. It has already been done. Generally speaking commenters are bringing up points already published in Orrin Pilkey's book "Useless Arithmetic: Why Environmental Scientists Can't Predict the Future."

Nature is simply too chaotic to be predicted by mathematical formulas, no matter how sophisticated the software or powerful the hardware. None of the models relied on by the IPCC have been validated. It is fair to say the models are non-validated, non-physical and non-sensical. Perhaps it is time to quit pretending otherwise.

Jim Crimmins said...

Let's say we have a perfect measurement of ground or surface temperature. The SD of this is zero. No model will exactly match this point distribution. So the lack of accounting for model variance point is obvious and well taken.

However, at some length of observation time, the trend of our zero SD measurement must allow us to impute precisely the CO2 forcing if the concept of CO2 forcing is to have any meaning at all as natural variability averages out.

The modelers have got themselves into a bind by this one-factor CO2 explanation for long term trends (only possible cause of late 20th century warming - stated over and over at RC). Either the trends of the observations and the models match over a long time scale, or the assumed forcings are wrong. It's really that simple.

lucia said...

Gavin--
>>The ensemble mean is an estimate of the forced trend, and if there is a substantial component of expected unforced noise, you do not expect it to match any single realisation.

Of course you don't expect the model mean to match any single realization. The d* test in equation 12 of Santer doesn't expect the forced component to match a single realization. It tests whether the mismatch is too large to be consistent, given the properties of the data and the models.

>>I have no objection to anyone using the tests discussed in Santer et al (as long as you aren't using the Douglass et al test of course). But see my first point.

Can you give an unambiguous answer? Do you object to anyone doing the test of the multi-model mean that involves d*, computing it as defined in equation 12 of Santer? Equation (12) is not the Douglass test, but rather the method Santer specifically advocated as the correction to Douglass. Santer used (12) in their results.

It reads as if you have no objection to using (12) in Santer-- but I can read your response either way.

Steve McIntyre said...

Interestingly, Wigley disagreed with Gavin's realclimate post on this matter in a Climategate email as follows:

"932. 1225579812.txt
From: Tom Wigley
To: Ben Santer, Phil Jones
Subject: [Fwd: Re: Possible error in recent IJC paper]
Date: Sat, 01 Nov 2008 18:50:12 -0600
Hi Ben & Phil, No need to push this further, and you probably realize this anyhow, but the RealClimate criticism of Doug et al. is simply wrong. Ho hum. Tom.
"

EliRabett said...

"Nature is simply too chaotic to be predicted by mathematical formulas,"

C'mon, no fruit should be allowed to hang THAT low.

Anonymous said...

"932. 1225579812.txt
From: Tom Wigley
To: Ben Santer, Phil Jones
Subject: [Fwd: Re: Possible error in recent IJC paper]
Date: Sat, 01 Nov 2008 18:50:12 -0600
Hi Ben & Phil, No need to push this further, and you probably realize this anyhow, but the RealClimate criticism of Doug et al. is simply wrong. Ho hum. Tom.
"


Ho hum, indeed. Just can't help yourself, can you?

Ron Broberg said...

2) I'd love to see Figure 1 with a proper model spread instead of just the mean.

Something like this?
http://rhinohide.wordpress.com/2010/08/11/mmh10-the-charts-i-wanted-to-see/

Chris Colose said...

Jeff ID remarked,

//"Fix the models or fix the data, your choice IMO."//

This is ridiculous. The point is to get both the data and the models closer to reality; the observational specialists and the model groups work quite hard to do this. One of the biggest difficulties with the tropical lapse rate issue is the fact that the radiosonde network is relatively sparse in this region.

As a large number of recent papers have been pointing out (the Santer paper does not stand on its own), there is no *obvious* discrepancy between model-and-obs trends in the upper tropical troposphere. There might be, but it remains to be shown, and it will take more time to do this given the climatic variability and observational issues. And let's not criticize Santer et al too much for ending in 1999...this was, in large part, a paper following the Douglass approach and the time of maximum overlap between obs. and AR4 model simulations.

It is still worth speculating on what might happen if we wait for some time and it turns out that the tropics become decoupled from the moist adiabat, just on the off chance this turns out to be correct. Obviously the models would be 'incorrect', but as Gavin notes, this does not inherently mean much in practice. Given the substantial importance this issue has for moist stability (and in particular, for the hurricane community) it would be worth investigating this possibility. Virtually all of the hurricane prediction literature in recent years will not hold up if the temperature profile departs significantly from the moist adiabat. Furthermore, from first principles, if the models overestimate upper-tropospheric amplification they must underestimate climate sensitivity, since all layers within the real atmospheric column would radiate to space at a temperature colder than in the model. To my knowledge, however, no one has investigated this scenario in any detail, since there is so much 'weather' confirmation that temperature profiles closely follow the moist adiabat (even as air is undersaturated, in regions that do not experience convection, in responses to ENSO, etc.), so there's no a priori reason why CO2 should be any different.

dhogaza said...

Ron Cram:

"Nature is simply too chaotic to be predicted by mathematical formulas"

You wouldn't even be aware of the word "chaotic" if it weren't for chaos theory, which of course was developed to explain the circumstances under which your statement is false (mathematically false).

Anonymous said...

dhogaza said...

"You wouldn't even be aware of the word "Chaotic" if it weren't for chaos theory,..."

Yes, Chaos theory was created in the 15th century, bringing us the term chaotic.

Kan

Loquor said...

Dr. Annan,

a couple of somewhat naive questions upon reading this post and your interesting presentation with Knappenberger:

a) Have you any opinion on the Keenlyside, Latif et al. study on the PDO slowing the warming temporarily down?

b) Have you any opinion on the possible role of the solar cycle and the cosmic theory of Svensmark slowing the warming temporarily (or maybe more permanently) down?

c) What would be the consequences of this apparent slowdown in the warming rate, if any, for the estimations of the climate sensitivity? Wouldn't the equilibrium CO2 sensitivity still be around 3C, even if the models turn out to have underestimated the thermal inertia or the solar forcing?

Loquor said...

......and another question: Isn't it true that the RSS series are above the surface measurements, as seen on the Wikipedia graph:

http://en.wikipedia.org/wiki/File:Satellite_Temperatures.png

It appears from your graphs as if all models perform worse re the satellite measurements as compared to all surface measurement series. Are there any differences between your data and those on Wikipedia, or is it just me making some naive eyeballing error?

Martin said...

Steve, in the cut & paste department, Wigley wrote some other things too.

"From: Tom Wigley
To: Ben Santer
Subject: Re: [Fwd: [Fwd: FW: Press Release from The Science & Environmental Policy Project]]
Date: Mon, 10 Dec 2007 17:17:14 -0700
Cc: ...

Dear all,

I think the scientific fraud committed by Douglass needs to be exposed. His co-authors may be innocent bystanders, but I doubt it.

In normal circumstances, what Douglass has done would cause him to lose his job -- a parallel is the South Korean cloning fraud case.

I have suggested that someone like Chris Mooney should be told about this.

Tom.
"

Anonymous said...

McIntyre has spent a year or more poring over those emails with a fine tooth comb. Must have been an accident that he missed that one. He will have to put on his 'concern troll' tone of voice and admonish himself. I would suggest a heading along the lines of "Wacky Macky is very Slacky"

fred said...

I believe McIntyre refers to Schmidt's RC post, not to Douglass itself?

And yeah, headings like "Wacky Macky is very Slacky" are illustrative of this obscure blog.

Chip Knappenberger said...

Loquor,

Our analysis was a comparison of observed trends with trends projected with models run with the A1B SRES scenario. This limited how far back we could use observations (the A1B model runs only begin in ~2001) for a fair comparison. Thus, our longest trends begin in 1995. The wiki chart you linked to looks like the trends are calculated since the beginning of the common timeframe...the late 1970s. This is why they differ from our calculations.

I hope this helps.

-Chip Knappenberger

Loquor said...

Dr. Knappenberger,

thank you very much indeed. Yes, that does help with respect to the Wikipedia satellite chart.

I am sure you are a busy man, but in Dr. Annan's apparent absence, I would very much appreciate any comments from your side regarding the three other questions I asked (Keenlyside/Latif's PDO + solar influences on the apparent slowdown and the long-term sensitivity in the light of the apparent warming slowdown). I find this very interesting.

Kind regards, L

Frank said...

There may be only two choices: A) The ensemble is useful for establishing a "distribution centered around 'truth'", and the 'truth' is that the models are 'wrong'. B) The ensemble is NOT useful for establishing a "distribution centered around 'truth'", and the uncertainty inherent in model projections makes them of little value. How useful are projections about 2100 when we have this much trouble with 1979-2009 at the site that should show the greatest warming due to increasing GHGs? If the consensus prediction from models is a climate sensitivity of 3, then the discrepancy between models and observations in the upper tropical troposphere might be analogous to a climate sensitivity of 2 (RSS) or 1 (UAH). Reasons to be optimistic about this issue would be valuable.

The IPCC's job is to inform (for the cynical, scare) policymakers about future climate change, especially by making regional projections relevant to their countries. From the IPCC's perspective, A) appears to be the obvious choice because the ensemble performs better than any model and is the only viable solution to the problem of conflicting regional projections. If necessary, the models will eventually be tweaked to eliminate the discrepancy with RSS and everyone can pretend the change was trivial.
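Frank's choice A is exactly the framing the post objects to; a toy numerical sketch (all numbers below are invented for illustration) shows why a test against the ensemble mean rejects even the ensemble's own members once the standard error of an individual trend estimate is small:

```python
import numpy as np

# Toy ensemble of model trends in C/decade (invented numbers), mean 0.20:
model_trends = np.array([0.12, 0.15, 0.18, 0.20, 0.22, 0.25, 0.28])
ensemble_mean = model_trends.mean()

# Treat each model in turn as "the observations", with its trend estimated
# to a small standard error (i.e., from a long, well-observed record):
se = 0.01
z = (model_trends - ensemble_mean) / se

# Two-sided 5% test of "does this trend equal the ensemble mean?"
rejected = np.abs(z) > 1.96
print(int(rejected.sum()), "of", len(model_trends), "models rejected")  # 6 of 7
```

Only the member that happens to sit exactly on the mean survives; by this criterion the models "fail" to predict their own behaviour, which is the point of the post.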

Anonymous said...

Martin, were you hoping that nobody would notice that the date of the email you posted is almost one year before the date of the email that McIntyre posted?

Ron Broberg said...

I encourage readers to drop by Easterbrook's post on model uncertainty.

http://www.easterbrook.ca/steve/?p=1758

Chip Knappenberger said...

Loquor,

Certainly the things you listed, as well as other factors, such as variability in stratospheric water vapor (as reported by Solomon et al.), may impact the comparison between observed trends and model projections. If climate models do not accurately capture the scales of the variability or even some of the processes themselves, they will have difficulty containing the observations. What impact this may have on climate sensitivity is speculative at this point, but if the models continue to over-represent the observed trend, then the topic can't be dismissed.

But, our analysis was not geared towards determining the precise cause of any model/observation mismatch, but rather whether or not a mismatch exists.

-Chip

Jesús said...

I guess I would need a "dummies guide", because I get completely lost... :(

Could someone explain to me whether Santer et al. 08 and MMH have both used the same estimation of error for the models? It doesn't seem so. What's the difference? If MMH is supposed to "update" the Santer et al. paper, shouldn't MMH have used the same error estimation? Is James' criticism (about comparing to the ensemble mean instead of the model spread) also applicable to Santer et al. 08?

I guess the answer is inherent in some comments; I'm sorry I cannot follow them...

Thanks!

Martin said...

Anonymous (what a nice name!), you mean I forgot to remove the date from the mail?

My point is simply that if you want to use Wigley as an authority for anything by quote mining, I can play that game too.

Anonymous said...

Or maybe the point is that Wigley learned something in the intervening year.

Martin said...

Hardly, my nameless friend. I suggest you look up the full email that McIntyre is referring to. Did you notice it was truncated? Now why do you think Steve did that? It is here.

As you see, Wigley's remark is a comment on a forwarded correspondence between one Gavin Cawley and Ben Santer, where the former confesses to getting his (mis-)conception of Santer et al. from the first RC blog post on the matter. Which, having read it, I remember indeed to have been somewhat confusing.

As to Wigley changing his mind on a paper that was so egregiously wrong as to be, to him, evidently fraudulent -- i.e., Douglass put in things that he damn well knew to be wrong in the hope of slipping it past reviewers --, as he himself put it in yet another mail:

"Yes -- I had this in an earlier version, but I did not want to overwhelm people with the myriad errors in the D et al. paper."

...and being an author on Santer et al. himself, sounds like a bit of a stretch, don't you think?

David B. Benson said...

chaos ---
Date: 15th century
2 b : the inherent unpredictability in the behavior of a complex natural system (as the atmosphere, boiling water, or the beating heart)
from Merriam-Webster Online Dictionary.

Chris Colose said...

Loquor,
here is my response if it means anything...

It's hard to say if Keenlyside is right at this point. They do not have very good skill in their hindcast compared to observations, although it is better than the null hypothesis of persistence. I am working with Shu Wu and Zhengyu Liu at the University of Wisconsin-Madison; they have developed a statistical model which outperforms Keenlyside in the hindcast, and we believe it is a useful model for prediction. It will still be some time before we can submit the paper to a journal, but the prediction is for a warm North Atlantic anomaly and a cold South anomaly relative to the climatology (it's not clear to me what it is relative to the present day). We don't look at the global scale, and right now we don't attribute it to anything in particular, although the dominant modes over the next decade appear to be the forced trend and something that resembles AMOC.

On the cosmic ray idea, it's pretty dicey stuff that doesn't have any explanatory or predictive power, and no one has proposed a coherent quantitative and physical explanation for how cosmic rays are supposed to alter the climate. There have been several papers which show no linkage between temperatures and cosmic rays (e.g., Sloan and Wolfendale, 2008; Pierce and Adams, 2009). Further, if you look at the 10Be record in Greenland some 40,000 years ago, you see a big change in cosmic rays without a corresponding temperature change, so there's really nothing in the past either that should give us any reason to believe cosmic rays are important. Regardless, there's been no secular trend in any seemingly important solar-related variable over the last half a century or so.

As for climate sensitivity, it's really not possible to use the 20th century record to constrain the sensitivity very well (too much uncertainty in the forcing, aerosols especially, and a lot of potentially big forcings have a non-linear dependence to the equilibrium climate sensitivity so for example if you turn up the eq. sensitivity the peak cooling of a volcanic eruption changes only slightly)... also 10 years of flatlining temperatures don't change the argument.

Hope that helps
Chris

Loquor said...

Dr. Knappenberger

thanks again. I realised that your presentation simply showed a mismatch (though, so far, a statistically insignificant inconsistency).

An often heard claim is that the apparent 10y slowdown indicates that a sensitivity of 3C is likely overestimated, but clearly, this would very much depend on what the reasons are. Even if Keenlyside et al. are right, they themselves have claimed that their results have no implications for the equilibrium sensitivity. It only means that temperatures wouldn't start to pick up again before around 2015.
As far as I understand, the same applies to possible solar cycle effects - only should the temperatures begin to pick up again very soon with the waking up of the solar cycle.

However, if there is a cosmic ray effect, then the sensitivity is likely to be lower - at least, this is what Shaviv claimed in his JGR 2005 paper. This would also cast doubt upon the consensus about GHGs being the major forcing of the 20th century temperature increase.

These are the most often heard disputes in the arguments about the temperatures of the last 10y as I see them. I may well have missed or misunderstood something.

Therefore, I would be very interested in hearing any thoughts you may have (as an honest, moderate sceptic/lukewarmer)
about the reasons and implications for the mismatch you and Annan showed. I know there are all sorts of caveats and uncertainties, so an enlightened guess will do perfectly - everything you say will not be used against you.....

Kind regards, and please forgive me for assigning you to a highly simplified and conveniently labeled group - as you surely know, this is a trick often employed by small minds to ease their understanding. :)

Loquor said...

Dr. Colose

thank you. I am aware of the papers you cite about the cosmic rays (and of Damon & Laut 2004, Harrison et al. 2005, Lockwood & Fröhlich 2007, Overholt et al. 2009, Calogovic et al. 2009 and others). Taken together, "dicey" is surely a charitable characterisation of the cosmic theory, and it appears to me that this theory urgently needs support from firm evidence outside the tribe of Shaviv and Svensmark and their diehard faithfuls if it is to retain any scientific credibility. However, as you surely know, Shaviv did make a quantitative estimate about 20th century warming and equilibrium sensitivity in the light of the cosmic theory in his aforementioned JGR 2005 paper, so I would be interested in hearing whether anything related to the apparent slowdown could or should be taken as indirect evidence for any cosmic link (I have no doubts that it will by the fervent believers).

I know that the 20th century is a poor constraint for estimating equilibrium sensitivity (I have followed Stephen Schwartz's follies on this blog, among others), and that a 10y slowdown per se does not change that. However, I have understood that some of the suggested reasons for this slowdown could have implications for the sensitivity, and I would therefore like to hear some more thoughts about this from you and other pros.
(I'm just an ecological geneticist with an amateur interest, so I have followed the debate and I'm quite familiar with the general scientific foundation, but I have put my judgement on permanent suspense when it comes to specifics about modelling and space/ocean physics).

As you certainly must have experienced, just saying that "10y warming slowdown changes nothing with respect to AGW" will not convince any "sceptics" in a debate - they will immediately ask why, and what, then, would change anything (before descending into the usual boring "AGW-is-a-religion" stuff). If one acknowledges the slowdown and immediately proceeds to say that this changes nothing, you make it far too easy for sceptics/deniers to frame you as a "believer" in public debates. So I'm looking for more qualified and subtle answers.

I'll look forward to reading your "better than Keenlyside/Latif" paper on ocean oscillation when it comes out! Kudos, L

Chris Colose said...

(Mr.) Colose (and Chris much preferred)...just an Atmospheric Science student posting on a blog right now, but I try to stick with stuff that I can post about with confidence or provide citation. Sorry to disappoint :-)

It seems you already know most of what I am saying, and I don't like to speak for *the experts* like Gavin or James Annan, but I do not feel their responses will substantially differ from mine with respect to your questions here, but they can always post if this is an unfair statement. There's certainly no need to re-phrase everything to address more skeptics.

There's also no need to wait for another few decades of temperature data to talk about sensitivity with some constraint, since we have the paleoclimate record. The fact is that estimates based on the instrumental record are all over the place (and it is much more difficult to obtain sensitivities below 1.5°C from the observational data than it is to get values above 4.5°C; see Fig. 3 in the Knutti and Hegerl 2008 Nature Geoscience paper on this), but the uncertainty in forcing and response precludes confident estimates of sensitivity (this is discussed in Wigley et al., 1997, PNAS); further, a considerable fraction of the annual to decadal scale variability must be due to internally generated 'noise', which we can fully expect (in obs. and models) to offset anthropogenic forcings on timescales of one or two decades (as Easterling and Wehner make explicit in their 2009 paper). Note that for a very large forcing like that of a volcanic eruption, models with very low sensitivity cannot work, but sensitivities higher than reality could be made to work.

Anonymous said...

Anonymous said...

Or maybe the point is that Wigley learned something in the intervening year.


Or maybe that Wigley thinks that Douglass is obviously wrong, but that Gavin's reasoning is incorrect.

Anonymous said...

I'm willing to concede that Wigley learned nothing, as long as we agree that Gavin is wrong.

Ron Cram said...

To Eli Rabett,

I suggest you read Orrin Pilkey's book before you claim he is wrong. As a professor at Duke University, he is considered the leading environmental scientist on coastlines. The opportunity for model validation is much greater with coastlines than with climate when the time horizon is 100 years out. Short-term and intermediate-term climate models are no better than tossing a coin. What makes you think long-term models will be better? It is a ridiculous notion.

Even better, see the boxplot on Steve McIntyre's post at http://climateaudit.org/2010/08/11/within-group-and-between-group-variance/

That should quickly clear up for you any misunderstandings you may have about climate models being physical. There is absolutely no internal climate variability. We all know the arctic sea ice melted and the Northwest Passage opened up in 1944 and the sea ice came roaring back in the following years. That was repeated in 2007. The current models don't even come close to showing that kind of variability.

bluegrue said...

There's something I don't understand. In MMH the UAH/RSS LT trends are listed

as 0.070/0.157 °C/decade in table 1
as 0.079/0.159 °C/decade in table 2

However, a simple linear regression of monthly data yields 0.138/0.162 °C/decade. I don't have a beef with the last digits, but why is the UAH trend only about half as high as its linear trend?
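For anyone wanting to reproduce the kind of check bluegrue describes, a simple linear trend in °C/decade is just an ordinary least-squares slope on the monthly series; a minimal sketch on synthetic data (the trend and noise below are invented, not the actual UAH or RSS records):

```python
import numpy as np

def trend_per_decade(anomalies):
    """OLS slope of a monthly anomaly series, in degrees C per decade."""
    t = np.arange(len(anomalies)) / 12.0  # time in years
    slope_per_year = np.polyfit(t, anomalies, 1)[0]
    return slope_per_year * 10.0

# Synthetic example: a 0.15 C/decade trend plus noise over 1979-2009 (372 months)
rng = np.random.default_rng(0)
months = 372
series = 0.015 * (np.arange(months) / 12.0) + rng.normal(0, 0.1, months)
print(round(trend_per_decade(series), 3))  # close to 0.15
```

Running this on the published monthly data would be the quickest way to see whether the table values are plain linear trends or something else.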

Chip Knappenberger said...

Loquor,

Natural and anthropogenic processes, modeled, unmodeled, and poorly modeled, have acted to cause the recent slowdown in the rate of global warming. It is my (current) opinion, that the underlying temperature increase driven by enhanced atmospheric greenhouse gas concentrations as projected by the multi-model ensemble mean is too high. I don't profess to know the precise reasons why.

-Chip

Ron Cram said...

Steve McIntyre has an new guest blog post by Ross McKitrick this morning. See http://climateaudit.org/2010/08/13/ross-on-panel-regressions/

I asked the simple question "Do you think James Annan is smart enough to understand this point?"

Evidently Steve McIntyre is too nice a guy to call you out on this because he moderated out my comment. So I will do it. What do you say, James?

SteveF said...

I for one am sure that insulting our blog host will prove conducive to productive dialogue.

Steve Bloom said...

I would say, Ron, that McI would want to stay rather distant from smartness comparisons with James.

"We all know the arctic sea ice melted and the Northwest Passage opened up in 1944 and the sea ice came roaring back in the following years."

Denialist mantras of this sort are a dead giveaway as to your own lack of smartness, BTW.

nono said...

@Ron "Even better, see the boxplot on Steve McIntyre's post at http://climateaudit.org/2010/08/11/within-group-and-between-group-variance/

That should quickly clear up for you any misunderstandings you may have about climate models being physical. There is absolutely no internal climate variability. We all know the arctic sea ice melted and the Northwest Passage opened up in 1944 and the sea ice came roaring back in the following years. That was repeated in 2007. The current models don't even come close to showing that kind of variability."

Talking about a self-ownage...

The intra-variances Steve calculated were ridiculously small, and he had to retract himself. Now the thread is closed.

Anonymous said...

The intra-variances Steve calculated were ridiculously small, and he had to retract himself. Now the thread is closed.

Closed it? I want to see the emails.

EliRabett said...

Well Ron, want to tell Eli how many years in the past decade the NW Passage and the Northern Route have been open? A few more than in the 1940s eh?

Rattus Norvegicus said...

Well, at least he admitted that he made a major mistake. Awaiting a correction.

Anonymous said...

Well, at least he admitted that he made a major mistake. Awaiting a correction.

I think people should spend the next ten years analysing and re-analysing that mistake and wondering how he could be so incompetent that the question of fraud must be hinted at, but not mentioned specifically.

Rattus Norvegicus said...

Now, now, it's only been 7 years...

Carrick said...

Eli: Well Ron, want to tell Eli how many years in the past decade the NW Passage and the Northern Route have been open? A few more than in the 1940s eh?

It's not like people have been tracking how often it was open between then and now.

Do you have numbers by decade? I doubt they exist.

Steve Bloom said...

Just so folks are clear, the NWP wasn't open in the 1940s. IIRC a Canadian boat did get through in 1944, but painfully. Picking through the floes while taking months to get through isn't "open." Also, what's been opening recently is the northerly deep-water route suitable for navigation by large vessels, in contrast to the southern route that's useless for such purposes.

Denialists are such idiots.

Ron Cram said...

nono, your comment was not exactly forthcoming. I just checked and Steve did agree he made an error, but he has promised to repost after the error is corrected. Until the error is corrected, there really isn't anything more to say on the matter. It will be interesting to see how much change there will be after the correction.

Anonymous said...

The comment at 12/8/10 9:17 PM is rankly deceptive and ought to be deep-sixed.

Ron Cram said...

Steve Bloom,
Actually McIntyre has been recognized for his mathematical brilliance since his days at Oxford. I have spoken to climate scientists who think James is smart but not nearly as smart as he thinks himself to be. We shall see if James has grasped the point Ross has made or not.

Ron Cram said...

Steve Bloom,
You are doing your best to misinform people. A number of boats have made it through the Northwest Passage in the 20th century. In 1944, the captain left a tad early but he sailed the last 2,000 miles in just a few days.

In 1937, Scotty Gall sailed through the Northwest Passage in a wooden boat. You can see picture of it here.
http://tiny.cc/3uvk7

SteveF said...

My favourite blogger is smarter than your favourite blogger so ha!

PolyisTCOandbanned said...

The whole thing is silly. The Denialists (and since I am one, I can use the term) are using the wrong method for showing variability in "model space". When challenged, they say, "but the models differ a lot from each other, in some cases more than within model run-to-run". Duh. Then they have this whole "we'll admit X, but then you need to admit Y" thing going on. It's the same crap that was going on wrt Douglass in the blogs.

First, they needed to put their logic for their too tight whiskers into their paper (just so others could discount or debate it). This is the Feynmanian ideal of showing where you "might be wrong". But instead they hid it, larded on a bunch of matrices and algebra to confuse people. And shopped around for 2 years to try to slide their paper through.

Second, they're wrong. We don't even USE models in the way that the MMH method would assume. When we think about the century-long temp rise, we don't report a super-tight point estimate. We recognize the structural uncertainty.

And yes, granted, a wide spread of models or adding nonsense models can make it easier to pass a more open or range-based view of the ensemble...but...see the paragraph preceding!

Add onto that, the paper is just poorly done. Look at table one: they list the average trend and standard deviation of the models. But they call it standard deviation in one spot, standard error in another. They list a standard deviation for single run models! (must be some time series thing, not a classic sampling statistic, but is NOT EXPLAINED in the table).

Now they are flailing around trying to do damage control in the blogs. And messing those up as well. What a mess.

2 years spent on this crap.

It's also funny how close they kept this thing to the vest before putting it out. McI often justifies his blog as a "lab notebook" when challenged on either its usefulness or its accuracy. But it really seems like a PR organ. They didn't share the in-review copies of MMH, for instance.

James Annan said...

working backwards...

PolyisTCOandbanned, yes you are right on the money on this.

SteveF and Bloom and Ron Cram, actually I am indeed as clever as I think I am. However this may well not be quite as clever as some people think I think I am :-) I am well aware of many limitations in my understanding of various things, but on the subject of ensembles, it is clear that in a matter of months Jules and I have recently developed some insights that have eluded all the climate scientists - and indeed a much broader set of researchers - who have been pondering over these questions for several years. It's all about bringing a fresh perspective.

As for "Ross on panel regressions", well he seems to have put his foot firmly in his mouth by talking about testing models-obs equivalence, since we *know* the models are not equivalent, therefore they cannot possibly all agree with the obs. Talk about a nil hypothesis... Also, we hardly have to work to "build in" any idea of model heterogeneity, it is apparent at first glance just from looking at them. Any test that pretends otherwise is plainly silly.

I think I agree fairly closely with Chip's assessment as stated on 14/8/10 12:12 AM, though if we put numbers to these opinions there would probably be a bit of daylight. Even so, I am probably closer to his opinion than I am to the MIT crowd who still claim that (with high probability) the models substantially underestimate the underlying rate of forced response, using methods that I regard as rather dodgy. Every extra year that does not break the 1998 HadCRUT record will be a small piece of evidence towards a slightly lower sensitivity/transient response, but this process is a slow drift in my beliefs and there is nothing very conclusive yet IMO.

Have I missed anything else substantive?

jules said...

> Actually McIntyre has been recognized for his mathematical brilliance since his days at Oxford.

But Wikipedia says that McIntyre studied PPE at Ox, so presumably that's the opinion of some economists, rather than actual mathematicians!

nono said...

@Ron

nono, your comment was not exactly forthcoming. I just checked and Steve did agree he made an error, but he has promised to repost after the error is corrected. Until the error is corrected, there really isn't anything more to say on the matter. It will be interesting to see how much change there will be after the correction.
_____________________________________

Well Steve acknowledged that the intra-variances were ridiculously small -- something that was obvious to me and some other commenters at first sight.

However, he retracted himself on the basis that he had performed the calculation up to 2099 instead of 2009, which tells me that he still hasn't grasped the deeper flaws of his analysis.

Steve seems to be convinced that if you ask an expert five times for his forecast for GDP growth AND the uncertainty he attaches to each forecast, then it's okay to throw away the uncertainties and replace them with the standard deviation of the set of 5 forecasts. Since one expert is unlikely to give 5 tremendously different answers, this gives a ridiculously small final error bar.

I mean, several of us asked Steve what was becoming of the individual uncertainties in the intra-variances, and his answers were rather elusive to say the least.
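nono's GDP analogy can be made concrete with a hypothetical numerical sketch (the five forecasts and their stated uncertainties below are invented): keeping only the spread of the point forecasts, as the criticized calculation does, collapses the error bar relative to the spread obtained by also keeping each forecast's own uncertainty.

```python
import numpy as np

# Five hypothetical point forecasts from the same expert, each with a
# self-reported 1-sigma uncertainty (all numbers invented):
forecasts = np.array([2.0, 2.1, 1.9, 2.05, 1.95])
stated_sigma = np.array([0.5, 0.5, 0.5, 0.5, 0.5])

# Spread of the point forecasts alone (what the criticized approach keeps):
between = forecasts.std(ddof=1)

# Total spread, keeping each forecast's own uncertainty
# (law of total variance: within-variance + between-variance):
total = np.sqrt(np.mean(stated_sigma**2) + forecasts.var(ddof=1))

print(round(between, 3), round(total, 3))  # between is far smaller than total
```

The between-forecast term is of order 0.08 here while the honest combined spread is above 0.5, which is exactly the "ridiculously small final error bar" effect described above.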

Tom C said...

You know, this "who's smarter than whom" game is pretty ridiculous. That said, it is quite amazing that Team partisans would try to disparage SM's intelligence. Maybe he is wrong about paleo reconstructions, but he is quite clearly a very smart, well-read guy who has had a successful career and knows mathematics very well. His math credentials are superior to those of his main antagonists. One might really lose credibility by trying to claim otherwise.

James Annan said...

Well I agree that credentials aren't everything, but when I hear of someone being described as good at maths for a PPE graduate, my assumption is that they are being damned with faint praise (whether accidentally or not). And since despite your protestations you are still making claims about his "maths credentials" then you should realise that at least one of his antagonists (perhaps not a "main" one) has a first class maths degree and DPhil from Oxford, on top of various school and national maths prizes as a schoolboy. But I am happy to agree that McI obviously knows a fair bit of linear algebra which was never a particularly strong point of mine. The important question here of course is whether his arguments stack up, and on the MMH paper, they obviously don't.

Jesús said...

James Annan 15/8/10 1:13 AM said:
"Every extra year that does not break the 1998 HadCRUT record will be a small piece of evidence towards a slightly lower sensitivity/transient response"

What about too little natural variability in climate models? Wouldn't recent papers about ocean heat content point to that direction? (the warming trend may have slowed down at the surface but not in the climate system as a whole (oceans)). I think this was also the view of Keenlyside et al or Swanson & Tsonis...