Wednesday, March 21, 2012

HadCRUT4: 1998 and all that

So the long-awaited HadCRUT4 paper is now published, and the UKMO web site has a press release, though the full data set does not seem to be available yet. I'm sure that won't be long.

As was exclusively revealed on this blog some time ago, according to this updated analysis, 1998 is no longer regarded as the warmest year on record, having been overtaken first by 2005 and then again in 2010. Of course, this was not really controversial in scientific circles, as the small cold bias of HadCRUT3 (due to data voids in the most rapidly warming areas) had been well documented. Competing analyses (NCDC and GISTEMP) that smooth over the gaps already showed these results.

Some of the major differences between the data sets are illustrated by this figure from the paper:

I've added two green ovals to each map to highlight some differences - firstly, filling in the big gap in Siberia, where it was clear in HadCRUT3 that there had been lots of warming even though there were gaps in the coverage. The addition of more observations has merely confirmed what everyone knew, and the previous data has not been changed in any significant way. Moreover, the additional data in the relatively cool area of the south Indian ocean shows that they didn't just try to collect data in the hottest regions. So whatever desperate attempts the sceptics make to discredit this update, they simply don't have a leg to stand on.

The problem really is in the presentation of the data average as representing "global mean temperature" in the first place, when it doesn't, as the data are not missing at random. One good way to deal with missing data in this sort of situation is to fill the gaps with some sort of smoothing (there is a wide range of options of varying sophistication) to generate a global field before averaging, as NCDC and GISTEMP have always done. However, this doesn't matter in the context of the sort of detailed model-data comparisons that underpin detection and attribution, since they generally work at the level of the gridded data and voids are ignored. The mid-century changes to ocean data may however have a modest impact here.
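To make the "not missing at random" point concrete, here is a minimal sketch of how an average over observed areas only can come out cooler than the true mean when the void sits over the fastest-warming region. The latitude bands, the anomaly values and the nearest-band infill are all made up for illustration; none of this is the method used by HadCRUT, NCDC or GISTEMP.

```python
import math

def area_weighted_mean(lat_centres_deg, anomalies):
    """Cosine-latitude weighted mean of band anomalies, skipping None (no data)."""
    num = 0.0
    den = 0.0
    for lat, value in zip(lat_centres_deg, anomalies):
        if value is None:
            continue  # a void contributes nothing: the mean covers observed area only
        weight = math.cos(math.radians(lat))
        num += weight * value
        den += weight
    return num / den

# Hypothetical zonal-band anomalies (degC), warming fastest at high northern latitudes.
lats = [-75, -45, -15, 15, 45, 75]
full = [0.2, 0.3, 0.4, 0.4, 0.6, 1.5]
masked = [0.2, 0.3, 0.4, 0.4, 0.6, None]  # the fastest-warming band is unobserved

# Crude stand-in for smoothing: copy the nearest observed band into the void.
infilled = list(masked)
infilled[5] = masked[4]

true_mean = area_weighted_mean(lats, full)
masked_mean = area_weighted_mean(lats, masked)
infilled_mean = area_weighted_mean(lats, infilled)

print(f"true {true_mean:.3f}  masked {masked_mean:.3f}  infilled {infilled_mean:.3f}")
```

With these invented numbers the masked average is the coolest of the three, and even a very crude infill recovers part of the difference.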

David Whitehouse (or someone impersonating him) has been quick off the mark to bluster about how he would have really won the bet anyway. Unfortunately his comments are highly misleading.

Firstly, we never clearly specified HadCRUT3. I did try to get David to confirm exact details via email, but he refused, and I believe the precise phrase used by Tim Harford was "the Hadley Centre analysis". Maybe his behaviour should have been a red flag that he would resort to rewriting history whenever possible, but I thought it was unlikely to be ambiguous. Of course I didn't know the HC were planning to change their analysis, even though it is inevitable that this sort of change does take place over time (hence the "3" in HadCRUT3). For how a bet of this type is more clearly specified, see here for example: "in the data set HadCRUTv. or successor data set. Successor data set is the data set used by the Hadley Centre to compare hottest years to the media at that time." Incidentally, Gabi has not won that bet yet, as it specifically refers to the setting of a new record, and 2010 was not.

Secondly, he claims that because 2010 is no hotter than 2005 in the new analysis, he wins anyway. This is more than a bit ridiculous, as the entire bet was predicated on the unusually long interval after 1998 in which the record had (apparently) not been broken.

(Of course it will only be a matter of weeks before we start seeing sceptical arguments along the lines of "no global warming since 2005").


PeteB said...

I was wondering about DW's claim that 2010 is no hotter than 2005 - is that really true? Usually the HadCRUT global temperature anomaly in the data files is given to 3 decimal places, but the global temperature anomaly data file isn't available yet, although the press release states they are the same to 2 decimal places. I know this is dwarfed by the uncertainty, but for the terms of the bet, if the 2010 anomaly is higher than 2005 (even to 3 dp), then I think you win!

James Annan said...

Based on the pic I blogged earlier, it looks to me like 2010 just loses out to 2005 on the next digit.

EliRabett said...

So, and this is probably a job for Nick Stokes, what if you masked the GISSTemp/RSS/UAH/NOAA data with the HADCRUT3/4 sampling mask? Would there be any significant differences? You would expect some with GISS, which weights by stations up to 1000 km out, but how about the others?

fff said...

there is already an image with the four reconstructions masked to the minimum coverage, hadcrut4 1979-2010 trend is slightly higher:
Morice 2012 preprint is here:

Mark said...

The bet you're talking about is one that was never agreed upon? And one of the parties is saying that he would have won it? I say, that's rather poor form, isn't it?

Paul Matthews said...

It's not David Whitehouse who is misleading. It's the Met Office press release and your post. What you fail to mention is that the margin by which 2010 beats 1998 in HADCRUT4 is one tenth of the uncertainty error! So any talk of ranking of years is meaningless.
This is quite rightly pointed out by Dr Whitehouse.

Paul Matthews said...

Note also that the meaningless ranking of years given here and in the Met Office press release is not mentioned in the HADCRUT4 paper.
The authors were presumably aware that such nonsense would not get through peer review.

James Annan said...

If Whitehouse is now going to claim that "any talk of ranking of years is meaningless" (your words, but you claim they represent his view) then that would appear to make him seem somewhat duplicitous in arranging the bet in the first place, don't you think?

Francis Turner said...

The really good news is that they are going to provide the data and source code so we can see how they do it, and can presumably see whether the new Arctic adjustment makes sense or whether it's smearing some poor data over rather too much of the globe.

Note: The key here is that I don't know which it is, which is why I'm so glad to see the open data announcement. I do know that some of the Russian/Soviet weather reports looked somewhat iffy a few years back - one IIRC looked like a fairly clear case of UHI, which clearly should not be allowed to count for thousands of adjacent tundra. On the other hand it is also clear that the Arctic has warmed up more than more temperate latitudes, so it is undoubtedly good to have more coverage of stuff up there.

Carrick said...

I've said pretty much all of this before, but I guess my big problem with looking at individual dates (beyond for the sport of it) is that these numbers have measurement uncertainty and are strongly influenced by short-period natural variability. More generally, people ignore the fact that these are measurements (that have bounds): treating central values without considering their uncertainty is really kind of pointless (again, as I said, except for the sport of it).

Whether HadCRUT3 was microscopically warmer in 1998 for a few months than HadCRUT4 and GISTEMP is absolutely meaningless in terms of its impact on our assessment of the amount of warming of climate...

It's much more meaningful, for me at least, to discuss comparisons in the context of the uncertainty bounds. For example in the question "has it been warming from 2005 to now", natural climate variability adds a roughly ±0.3°C/decade variability (95% CL) to any trend estimate, so all you can really say in the absence of a substantial warming (> 0.3°C/decade) is that the measurement interval is too short to resolve global warming on that scale.

That's the correct way to respond to denialists IMO. Engaging in whether the central values have tended up without discussing their uncertainty bounds is missing a teaching moment.

Seriously, if you can get it finally through their heads that the interval is too short maybe they will just STFU every time we have a 7-year "lull" in warming: This is guaranteed to happen about 15% of the time (zero or negative trends on a seven year interval), assuming climate variability remains constant.

Of course if the unforced variability increases over time as Hansen has been suggesting, the frequency of zero/negative temperature slopes for a seven year period will increase too, assuming the same 0.2°C/decade long term trend.
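As a rough check on the scale of these numbers, here is a minimal sketch that treats a short-interval trend estimate as Gaussian around the long-term trend, with the ±0.3°C/decade figure taken as a 95% half-width. The Gaussian model and the function name are assumptions made for illustration, not how the figure above was derived.

```python
import math

def prob_nonpositive_trend(true_trend, half_width_95):
    """P(estimated trend <= 0) if the estimate is Gaussian around true_trend.

    Both arguments are in degC/decade. half_width_95 is the 95% half-width
    of the trend estimate, so the standard deviation is half_width_95 / 1.96.
    """
    sigma = half_width_95 / 1.96
    z = (0.0 - true_trend) / sigma
    # Standard normal CDF written in terms of the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p_warming = prob_nonpositive_trend(0.2, 0.3)  # long-term trend of 0.2 degC/decade
p_flat = prob_nonpositive_trend(0.0, 0.3)     # no underlying trend at all

print(f"P(trend <= 0 | 0.2 degC/decade) = {p_warming:.2f}")
print(f"P(trend <= 0 | no trend)        = {p_flat:.2f}")
```

Under this simple model the chance of a flat or negative short-interval trend comes out at roughly one in ten, the same ballpark as the figure quoted above; the exact value depends on the noise model and on how the interval is chosen. With no underlying trend it is one half, and a larger half-width pushes the warming case towards one half as well.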

(And before anybody accuses me of not being a fun person to have around at cocktail parties because I keep bringing up the uncertainty in the measurement every time people talk about the central values, I'll simply point out that anybody who even mentions the acronym HADCrut at a cocktail party is likely to pull wife aggro, and suffer a subsequent beheading in private).

James Annan said...

I agree of course that the individual year values are of limited scientific interest. OTOH the spacing of record years is more than purely frivolous, as we expect basically equal spacing with a warming trend, versus increasingly rare for no trend.
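A small simulation sketch of that point follows. It assumes independent Gaussian year-to-year noise (real temperatures are autocorrelated) and uses illustrative values for the trend and the noise level; the function names are hypothetical. Without a trend, year n is a record with probability 1/n, so the expected number of records in n years is the harmonic number H_n and new records become steadily rarer. With a trend, they keep arriving at a roughly steady rate.

```python
import random

def count_records(series):
    """Number of values that exceed every earlier value (the first always counts)."""
    best = float("-inf")
    records = 0
    for value in series:
        if value > best:
            best = value
            records += 1
    return records

def mean_record_count(n_years, trend_per_year, noise_sd, trials, seed):
    """Average record count over many simulated series of trend plus white noise."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    total = 0
    for _ in range(trials):
        series = [trend_per_year * t + rng.gauss(0.0, noise_sd)
                  for t in range(n_years)]
        total += count_records(series)
    return total / trials

def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n, the expected record count with no trend."""
    return sum(1.0 / k for k in range(1, n + 1))

# Illustrative assumptions: 100 years, 0.1 degC noise, 0.02 degC/yr trend.
flat = mean_record_count(100, 0.0, 0.1, trials=2000, seed=1)
warming = mean_record_count(100, 0.02, 0.1, trials=2000, seed=1)

print(f"no trend: {flat:.2f} records on average (theory {harmonic(100):.2f})")
print(f"warming:  {warming:.2f} records on average")
```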

Carrick said...

James: OTOH the spacing of record years is more than purely frivolous, as we expect basically equal spacing with a warming trend, versus increasingly rare for no trend


And the fact that you expect to get a zero or negative trend in a seven year span roughly 15% of the time is a statement that there is a net underlying warming too. (It should be 50% with no warming).

The same guys who crow about periods of "no warming" are the ones who pass off any warming we've seen as "natural fluctuation". Strange they don't see any contradictions...

Carrick said...

The fallacy of looking at the warmest year on record (for a given site) sometimes shows up in the lay media.

For example, here's a nice example where they looked at the top 10 warmest winters in Bismarck, North Dakota. (Many US sites, last I checked anyway, still have their warmest year as being in the 1930s).

Number one is 1931, number two is 1878. Most of the rest of the warmest winters are post 1980, but nobody finds that remarkable.

In fact that is the only part that is significant in any meaningful sense: "fluke" weather doesn't signify a trend. A period of 10 very warm years weighs much more heavily than a couple of isolated outliers.

By the way, if you look at the figures you posted, there are still large missing swaths of land in Africa and the mainland of Antarctica. It'd be interesting to speculate on whether a HADCRUT product that included these regions would still have 2010 as the warmest year on record...

James Annan said...

Yes there are still plenty of gaps. However NCDC and GISS both smooth over the gaps (with rather simplistic methods, but never mind) and all methods seem to agree pretty well.

Carrick said...

In the case of Africa, I'm thinking you'd have to do something along the lines of what Steig did in S09 to infill correctly. Inland temperatures are probably warming faster than what you'd infer from coastal measurements. (OK, if this speculation is correct, it would have the opposite effect of what I said previously: It would make the Africa-corrected CRUTEMP series warm even faster.)

Scientifically, I'd predict what you'd find wouldn't be very interesting, though it might be nice to "nail things down" a bit further, if only to what the reactions of certain people who seem to feed off the uncertainty. >.>

My bigger concern remains the large gaps in the global data set prior to 1950.

You made a point earlier that the 1905-1945ish trends weren't as large as the 1975-2010ish ones.

I think if you reflect on this, you might admit this is somewhat self-referential.... how do you know the true trend for that period is smaller until you've accurately measured it?

One can assume it's small because a reconstruction with known large systematic errors is showing small trends, but we don't really know what happens to the trend of the data after the systematic errors are corrected for until after the corrections have been applied... the true trend might be much larger (though color me skeptical that this will happen).

Carrick said...

Grr... make that

"if only to *watch* the reactions".