I've got lots of bits and pieces to write about, so this will probably turn out to be a fairly incoherent blog post as I don't have time to write a concise and structured one. We are still struggling to obtain a proper broadband connection, though the end (one way or the other) might be in sight with a BT engineer visit planned for next week.
We had a brief visit to Bristol last week - sorry to those we had promised to contact, but it was a chance to see two different people on the same rather busy day, so we jumped at it. We are now both officially "visiting collaborators" of some sort there, which is nice; the main practical benefit is library access (including remotely), and perhaps also the right to a cup of coffee in the staff common room? So hopefully we'll be back on an occasional basis in the future. This is all a follow-up to Paul Valdes' sabbatical visit to Japan last year, from which joint work is still ongoing.
The "pause" discussion continues (see RC for a summary of recent coverage), which seems a bit silly to me, because it isn't really a "pause" at all, just a continued anthropogenically-forced warming with some other (anthropogenic and natural) forcings and internal variability added on, such that the trend is a little lower than most expected. Of course idiots will continue to play the "down the up escalator" game indefinitely, but I don't feel the need to play with them. I'm usually happy to let the "communicators" duke it out on the most politically correct way to present the science, but perhaps they could start by not using a term that's factually wrong.
There are many possible causes for the model-data discrepancy: the forcings might have been more negative than anticipated, or perhaps natural variability has been a bit more negative recently, and just possibly the forced response is a little lower than (most) models predicted. I'm a little surprised to see people like Gavin apparently nailing their colours to the mast of the models being right. For one thing, his calculations (which may be mildly optimistic) only explain "most" of the model-data discrepancy, and it is worth noting that, since the natural forcings and internal variability are relatively transient and short-term in nature, this view implies a substantial near-term acceleration in order for the world to catch up with where the models say we should be. All model simulations show only a very gradual rise in the underlying trend, and it is worth mentioning (to those who think that climate scientists have been slow to discuss this) that back in about 2006 I was pointing out to the authors of the IPCC AR4 drafts that the model trends were already starting to look a little high relative to recent observations. This acceleration has been promised as "just around the corner" for a long time now. I'm happy to give people like Hansen a bit of a pass on his 1984 work because it was so groundbreaking (and substantially correct), but it's now starting to feel like people are scrabbling around trying to find excuses. I even think I saw some wag in a recent paper (sorry I forgot where) arguing that there were so many excuses for a lack of warming, that the logical conclusion from the model-data discrepancy was that sensitivity was actually higher than the models say!
By the way, one point that is sometimes ignored in these recent energy balance type of calculations is that some of the analyses (ie, those based on D&A techniques) aim to specifically separate out the different forcings through the different warming patterns they generate. So it is not enough to claim that there are additional negative forcings, but these forcings actually have to generate a spatial warming pattern that negates the well-known pattern of GHG response. Or else, the model patterns of response to the different forcings have to be wrong in a way that leads to a large systematic underestimation of the GHG impact. It's not impossible, but at some point Occam's razor has to kick in.
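To make that "separating out via patterns" point a little more concrete, here is a minimal toy sketch of the multi-fingerprint regression idea (not any particular published method, and with entirely invented numbers): observations are regressed onto model-simulated response patterns for each forcing, and the fitted scaling factors indicate how much of each pattern appears to be present. The more nearly one pattern is just the negative of another, the harder this separation becomes, which is exactly the point at issue.

```python
import numpy as np

# Minimal sketch of multi-fingerprint regression (illustrative only; real D&A
# studies use optimal fingerprinting with a noise covariance estimated from
# control runs). All arrays here are made-up toy data.
rng = np.random.default_rng(0)
n_cells = 500  # hypothetical number of spatial grid cells

ghg_pattern = rng.normal(1.0, 0.3, n_cells)                        # modelled GHG warming pattern
aer_pattern = -0.6 * ghg_pattern + rng.normal(0.0, 0.1, n_cells)   # aerosol pattern, partly anti-correlated
noise = rng.normal(0.0, 0.2, n_cells)                              # "internal variability"

# Synthetic "observations": one unit of each response plus noise
obs = ghg_pattern + aer_pattern + noise

# Ordinary least-squares estimate of the scaling factor on each pattern
X = np.column_stack([ghg_pattern, aer_pattern])
betas, *_ = np.linalg.lstsq(X, obs, rcond=None)
print("estimated scalings (GHG, aerosol):", betas.round(2))
```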
Oh yes, the GWPF thing has also been published. Disclaimer: those who actually read the thing (which seems to be a small minority, judging from my in-box) will see that I'm acknowledged, which is due to having acted as a reviewer. I haven't carefully checked the final version, but on the whole I saw it as a slightly optimistic but basically defensible interpretation of the evidence. There have certainly been worse papers published on climate sensitivity! I haven't seen any very convincing rebuttals, but am of course open to further discussion on that score. I'm sort of assuming it's been discussed on the blogosphere but haven't had much time (or internet) to look. Some have found it notable that the GWPF is explicitly acknowledging a significant future warming (albeit at the low end of IPCC projections) thanks to ongoing emissions. I'm not sure how much of a narrowing of disagreement that represents, especially when others are doubling down on a high sensitivity.
BTW, for those who accuse me of being in the pay of the fossil fuel industry - yes, the trip to Bristol was funded by a major oil company :-)
Having strayed into tl;dr territory (some time ago!) I'd better stop here.
36 comments:
What often also gets shoved under the rug is that some of the observations are chancy. Probably not so much with temperatures (except for coverage issues; see Cowtan and Way), but, as a recent note by Trenberth et al. pointed out, precipitation records are in need of a major combing out and reconciliation.
Well, I think these days most people are aware of the temps issues and factor them into the comparison. At least, they should. Cowtan and Way is probably good enough that residual issues are unimportant. Precip and other things, I agree it's a bit more vague.
Generally enjoy your posts (even though all the lovely pictures load very slowly). But I'm curious about your presentation here.
"just possibly the forced response is a little lower than (most) models predicted" You used the word "little" quite a few times here, which gives me the impression that the changes don't matter much. But given that Lewis' numbers are about a third lower, and the fat right-hand tail of the distribution is enormously lower - why wouldn't that matter a tremendous lot?
First of all, losing that right-hand tail (I know you've written about this before) is a big deal - if we could nail this down (97% consensus?) we could all stop worrying about the really disastrous outcomes, and settle down to discussing nice economist future values stuff. I would think that it would be neat if more mainstream climate scientists would spend as much time telling fools on one side that the sea level likely won't rise _very_ far, and that extreme weather likely won't increase _very_ much, as they do attacking fools on the other side who think that they won't rise at all.
Secondly, according to some economists, renewables will become decisively cheaper than fossil fuels by any standard around mid-century. Maybe electric cars will be common by then too. Most nations may stop emitting much CO2 on their own around then. A lower sensitivity may bring the total damage way down.
Are you sure this doesn't make a big difference?
but these forcings actually have to generate a spatial warming pattern that negates the well-known pattern of GHG response.
I've found that for sufficiently large and wide-scale forcings the response patterns are almost the same, at least for surface temperatures. If you look at Figure 2 in Gillett et al. 2012, the aerosol and GHG trend maps are largely negations of the same pattern - the differences are subtle, and easy to confuse with effects of internal variability in any real-world test.
On a related note, I've just noticed there is a large discontinuity in the CanESM2 organic aerosol input from 2000 to 2001, such that the global average loading drops abruptly to a level which is 60% of the 1850-2000 increase. Looking at Figure 1 in Gillett et al. 2012, there is an abrupt warming of about 0.1ºC at that time - which perhaps explains some of the post-2000 discrepancy in this model.
Interestingly, on that note, the Gillett et al. sensitivity tests indicate that the choice to include post-2000 data tends to significantly reduce the "Other" scaling coefficient.
Also interesting in the sensitivity test array is one using only global mean temperatures, which produces nearly identical results to the full test involving spatial patterns. It could be that I don't understand exactly what such a test involves, but it seems to me the similarity of these results proves what I'm saying about the redundancy of forcing response patterns. Of course, it could be a completely different story in other models.
Hmm, Shindell's new paper seems pertinent.
Well, the response to different forcings can be similar, but D&A has generally managed to at least partially distinguish them, and it's a well-known problem with geoengineering that we cannot completely cancel out CO2 with other things. I would be very interested to see how the Shindell paper reconciles itself with the recent D&A results. It seems to me they are coming to quite different conclusions from the same evidence. Maybe PaulS has some thoughts?
Nic Lewis has a comment up on CA on Shindell.
Paul S, your link didn't work for me (extraneous chars at end) but this does.
http://www.newscientist.com/article/dn25187-how-much-hotter-is-the-planet-going-to-get.html
"It's a complicated story. So New Scientist has broken it down.
What is this climate sensitivity business about anyway?
If you kicked your best friend in the teeth, how would they react? Would they shrug it off, burst into tears, or stalk off to plot your murder?
Climate sensitivity is a measure of how strongly the planet will react to the kicking we are giving it. It is how much surface warming we can expect if we double the amount of carbon dioxide in the atmosphere.
Is that as simple as it seems?
If only. ..."
James,
I was mainly responding to the idea of finding spatial patterns related to forcing discrepancies over the past 20 years. Surely internal variability is too strong over such short periods?
I think you're right that subtle differences will be distinguishable to some extent over longer timescales, but van Oldenborgh et al. 2013 suggests all models show too little natural spatial variability. I understand D&A studies tend to be informed about internal variability by the models themselves, so aren't the results necessarily overconfident?
On the Shindell paper,
Can't access the main paper, so only the SI to go on. Looks a bit of a mess, trying to combine forcing estimates from different methods (regression versus fixed-SST) and in some cases different models in order to derive TCR from linear formulae.
The 1%/yr 2xCO2 TCR used for a couple of models doesn't match figures elsewhere. HadGEM2-ES is reported in AR5 as 2.5ºC, whereas here it is 2.1ºC. Using the more accepted number changes the TCR enhancement factor from 1.48 to 1.24.
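As a rough consistency check (this assumes the enhancement factor simply scales inversely with the model's own 1%/yr TCR, which is my reading of how the numbers combine rather than anything confirmed from the paper itself):

```python
# Rough consistency check -- assumes the enhancement factor scales inversely
# with the model's 1%/yr 2xCO2 TCR (my reading, not confirmed from the paper).
factor_reported = 1.48   # enhancement factor quoted for HadGEM2-ES
tcr_used = 2.1           # 1%/yr 2xCO2 TCR apparently used in the paper (K)
tcr_ar5 = 2.5            # value reported in AR5 (K)

factor_revised = factor_reported * tcr_used / tcr_ar5
print(round(factor_revised, 2))  # -> 1.24
```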
Despite these issues there is still an apparent correlation, though the small sample size, and the uncertainties related to how forcing factors are combined, mean it would be difficult to argue against the idea that it may be spurious.
From Gavin's paper:
"Here we argue that a combination of factors, by coincidence, conspired to dampen warming trends in the real world after about 1992."
Whew! Good thing that the "dampening" "conspired" by "coincidence" or we would have had to call Lewandowsky.
Very revealing choice of words.
I think 'slightly optimistic' is quite generous. I thought the GWPF report was badly biased, personally. It sought an excuse to dismiss all moderate to high sensitivity estimates - some excuses valid but exaggerated, others wrong - and didn't even consider the shortcomings of the method that yields the lower sensitivity estimate. And then Shindell's paper rather demolishes the GWPF report as well. In any case, my take is here.
I even think I saw some wag in a recent paper (sorry I forgot where) arguing that there were so many excuses for a lack of warming, that the logical conclusion from the model-data discrepancy was that sensitivity was actually higher than the models say!
1. Perhaps the Economist?
2. Why would that be a priori out of the realm of possibility? Not that the models aren't sensitive enough, but when the combination of unforced variability, coverage bias, and forcing errors is fully accounted for, is it really out of the realm of possibility that the adjusted trends would have the obs running a little above the multimodel mean?
Aside, Paul S's original link failed with invisible control characters tacked on; I 'oogled them because I saw the same thing happen to one of my links at RC a few days ago. It's a known, er, something that WP does.
Dano,
I think it's a bit early to say that Shindell's paper "demolishes" anything. There is a whole lot of literature, authored by a whole lot of prominent people, that argues for a rather different conclusion to this one new paper by Shindell. That doesn't necessarily mean he's wrong, but trumpeting a single paper because you like it is exactly what the sceptics do when they sneak some crap into a journal. It will take a while for people to work out whether they like it, and it seems to me there are some difficulties with it.
TB, yes, it may have been the Economist - at least, that would explain why I can't find it in my recently read papers. I don't think it's completely impossible that the models are too insensitive, but taking a step back from the minutiae, there's been a whole bunch of motivated reasoning going on, and this slow-motion divergence has been going on for some time now... I know which side my money would be on.
Paul,
Well, it's not black and white, but more shades of grey. I don't believe that the van Oldenborgh paper accounted correctly for observational uncertainty, which should in principle weaken their result (though I don't really doubt the overall conclusion). I think the D&A people claim that the models do a pretty respectable job on internal variability, and that their results are generally robust to reasonable uncertainty regarding this.
but taking a step back from the minutiae, there's been a whole bunch of motivated reasoning going on
Sure. But there was also "a whole bunch of motivated reasoning going on" to explain why the satellites were showing cooling instead of warming. The motivation was the expectation based on our understanding of the system at the time. Now, I am not saying that these two examples are entirely equivalent, but merely suggesting that "motivated reasoning" to find out why obs aren't confirming expectations is not in and of itself foolish.
I can assure you, having seen the project grow and the results emerge in more or less real time, that the coverage bias issue was not suspected to be quite as large, nor quite as tied to the 1998-present window, as it turned out to be. Although I have less personal familiarity with the project, I've been keeping up with the literature as it comes out on the issue of strat aerosols, and this too seems to be somewhat of a coincidence between something that would have been noticed and published on eventually and the desire to look for explanations for the model-obs discrepancy.
There have of course been some bad/wrong explanations which aren't based on obs or physics but on mere curve-fitting; those are pretty obviously rubbish and are being ignored or refuted in the literature.
thingsbreak: I can assure you, having seen the project grow and the results emerge in more or less real time, that the coverage bias issue was not suspected to be quite as large, nor quite as tied to the 1998-present window, as it turned out to be.
Actually it's quoted for 1997-2012 (or 2013 for the in-press paper) for Cowtan and Way. But in fact the effect suggested by Cowtan and Way is pretty small, even for this period.
If you plot the difference as zonal trends, you see only a slight change due to the improved coverage relative to HadCRUT4.
There are substantial regions where you have negative trends. So the factor of two arises because of cancellations of the contributions from lower latitudes, making the relatively small adjustments at high Arctic regions relatively more important.
In long term averages, the effect from C&W is relatively small. I did a comparison of trends for the 80-90N and global values for a number of different data products and intervals.
For 1959-2012, C&W kriging (the C&W satellite hybrid is not available for this interval) is actually slightly smaller than HadCRUT4.
The real take-home is: yes, you want to include the missing regions, and Cowtan and Way is probably an improvement over the existing HadCRUT4 product, but no, it's not an explanation of the roughly factor-of-two discrepancy between the long-term trends seen in the observational record and in GCMs.
The picture is essentially unchanged by the addition of the improved coverage.
Carrick, this is silly. I didn't claim that CW13 explained all of the discrepancy. In fact, I explicitly discussed another factor in the discrepancy (unaccounted-for strat aerosols), and these are by no means the only two. That there are multiple factors involved is the entire point of the discussion I was having with James.
And CW13 certainly does have a large impact on the 1998-2013 trends: it doubles the decadal rate.
You appear to be arguing with strawmen of your own construction- something I have no interest in joining.
Carrick,
The C&W 1997-2012 trend is about 0.11K/Decade for both methods, HadCRUT4 trend from the published median time series is about 0.05K/Decade.
I'm not sure what you think the latitudinal plot shows in terms of an argument that the difference is small? One of the main points made by C&W was about how the standard practice of treating missing grid cells affects the results when producing global averages: they are effectively infilled with the global average rather than local information. The latitudinal averages for HadCRUT4 closely match C&W (as you'd hope) but there are only a small number of filled grid cells in the highest latitudes. In producing the HadCRUT4 global time series the remaining empty cells are effectively infilled with the global average, not the latitudinal average.
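A toy illustration of that infilling point (invented numbers, area weighting ignored): averaging only the observed cells is numerically the same as filling every missing cell with the mean of the observed cells, so an unobserved fast-warming region is effectively replaced by the global average.

```python
import numpy as np

# Toy example (invented numbers): leaving cells out of the average is
# equivalent to infilling them with the mean of the observed cells.
anomalies = np.array([0.2, 0.3, 0.1, 0.9])     # last cell is a fast-warming "Arctic" cell
observed  = np.array([True, True, True, False])

mean_observed_only = anomalies[observed].mean()                       # HadCRUT-style average
infilled = np.where(observed, anomalies, anomalies[observed].mean())  # fill the gap with the global mean
print(mean_observed_only, infilled.mean())     # both ~0.2 -- identical up to rounding
print(anomalies.mean())                        # the true mean is 0.375
```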
thingsbreak: And CW13 certainly does have a large impact on the 1998-2013 trends: it doubles the decadal rate.
The point is that the multiplicative difference is meaningless, because the numbers aren't positive definite. It gives a false indication of the magnitude of the difference, which is 0.13 - 0.06, or about 0.07°C/decade.
That number is tiny compared to the uncertainties involved.
You appear to be arguing with strawmen of your own construction- something I have no interest in joining.
Strawman? What nonsense.
PaulS: The C&W 1997-2012 trend is about 0.11K/Decade for both methods, HadCRUT4 trend from the published median time series is about 0.05K/Decade.
Which, as I pointed out to TB, amounts to a difference of 0.07 K/decade. Compare this to the expected fluctuation (95%) in a 15-year OLS trend of about 0.16 K/decade.
That's the sense in which it is "tiny".
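For what it's worth, here is a back-of-envelope sketch of where a number of that order comes from (assuming roughly 0.1 K of independent interannual noise; autocorrelation, which would inflate the figure toward the quoted 0.16, is ignored):

```python
import numpy as np

# Back-of-envelope: 2-sigma sampling fluctuation of a 15-year OLS trend,
# assuming ~0.1 K of independent year-to-year noise (autocorrelation ignored).
sigma = 0.1                                # K, assumed interannual noise
years = np.arange(15)
sxx = np.sum((years - years.mean()) ** 2)  # = 280 for 15 annual points
se_slope = sigma / np.sqrt(sxx)            # standard error of the trend, K/yr
print(round(2 * se_slope * 10, 2))         # ~0.12 K/decade at 2 sigma
```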
I'm not sure what you think the latitudinal plot shows in terms of an argument that the difference is small?
That's not the argument I was making. The trend has both signs over the full range from -90 to +90. The small trends seen are partly due to cancellation between larger quantities.
Because the quantities aren't positive definite, the ratio of trends isn't particularly meaningful.
So suppose it was 0.06 °C/decade C&W versus 0.0 °C/decade for HadCRUT. C&W would be infinitely better.
Suppose instead it was 0.00°C/decade C&W versus -0.06/decade for HadCRUT. C&W would be infinitely worse.
You express the coverage bias as a ratio, but this is not a scaling error. The bias itself can have either sign, so you can't argue it's a scaling error; rather, it's an offset error.
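A trivial numerical illustration of why the ratio is so unstable when the baseline trend sits near zero (toy numbers, not taken from either dataset):

```python
# Toy numbers only: the same 0.06 K/decade offset gives wildly different
# ratios depending on where the baseline trend sits relative to zero.
offset = 0.06
for baseline in (0.05, 0.01, -0.03):
    corrected = baseline + offset
    print(f"baseline {baseline:+.2f}  corrected {corrected:+.2f}  "
          f"ratio {corrected / baseline:+6.1f}  difference {offset:+.2f}")
```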
Look at my table… C&W kriging versus HadCRUT4 global, as an example, for 1959-2012: 0.130 versus 0.132 for HadCRUT4. HadCRUT is biased high for earlier periods due to missing coverage in the southern part of the globe.
In producing the HadCRUT4 global time series the remaining empty cells are effectively infilled with the global average, not the latitudinal average
I agree with that part, but this difference is due primarily to what I view as an error in the HadCRUT algorithm. It should be using a better estimate of temperature for the missing cells than the global average.
But of course that point was well known before C&W.
I like what C&W have done over all, but I do think they deserve a bit of criticism for quoting the effect of their results as a multiplicative factor, which is misleading because the actual magnitude of the correction is indisputably small, and for quite brazenly cherry picking an extreme interval to compare the difference between their result and the original HadCRUT algorithm.
I suspect had Chris Monckton done exactly this same thing, there'd be people all over him. I'm just not a fan of "free passes" because we "like the results".
Eli refers James, Carrick, and The Thing to the first comment. Flyspecking rates in spotty data over short periods and small areas is inherently futile. Mom Rabett taught Eli to avoid taking derivatives of noisy data.
Eli, in general I agree.
That said, I don't see any harm in including the 1997-2012/2013 interval in your analysis, especially since people like Chris Monckton have been cherry-picking it for years, but I think you need to use other intervals too. 1997-2012/2013 would be a much better interval for your "flagship result".
This interval is especially odd in that C&W themselves seem aware that the El Niño event is distorting their results:
Trends starting in 1997 or 1998 are particularly biased with respect to the global trend. The issue is exacerbated by the strong El Niño event of 1997–1998, which also tends to suppress trends starting during those years.
Hopefully Eli's Mom also taught Eli to not pick intervals that contain an outlier near one end point. This is criticized strongly when the skeptics do it, and there shouldn't be two sets of standards for when a particular thing is appropriate to do.
That was a dumb typo!!!
I meant to say
1979-2012/2013 would be a much better interval for your "flagship result".
Carrick,
CW13 indeed looked at 1979-2012/13. What you seem to be bothered by is not at all the paper itself, but rather the way the paper has been framed. Which would I guess be a fair point if not for the fact that the framing you have a problem with only exists because of an initial meme that you agree was in need of refutation.
If there was no "warming stopped in 1998" meme, CW13 would simply be known as a superior (to HadCRUT) method of dealing with the missing coverage. And believe me, the authors would have been perfectly happy to leave it at that. Interestingly, the HadCRUT bias turned out to be most pronounced from 1998 on. In the absence of the denialist meme, this would have been a 'fun fact' or what have you. That is not the world we live in, however.
CW13 did not set out to 'prove' anything one way or another (though admittedly, common sense and speculation by many others pointed to HadCRUT underestimating the global mean). It turned out that, by virtue of the data and by virtue of denialists making such a huge deal over the post El Niño warming rate, CW13 directly refuted an incredibly common misconception.
If you believe that in a perfect world, the 1998-present aspect of CW13 should take a back seat to the 1979-present (and even further back with newer iterations) results, and that no one would ever have made a big deal about the 1998-present "no warming" meme in the first place, CW would wholeheartedly agree.
thingsbreak, thank you for your comments.
I do generally agree with what you have said here regarding C&W. And I am sure I come across as more negative than I really am about their paper. [Other than what amount to presentation issues, my only other open question is how they've done the cross-validation. That's too O/T here.]
To be clear, I'm not arguing that C&W 13 isn't likely superior to HadCRUT4. Assuming they've done the infilling more accurately than HadCRUT4, which is likely, it certainly is. Also, the comparison between HadCRUT4, C&W and other series suggests that C&W's result is not outlandish, and in fact is bracketed between ERA-Interim and HadCRUT4. This gives some reason to be confident about the result (had their polar values been insanely high, that would signal a problem, but the values they give are totally plausible, given the other methods for estimation of zonal trends).
Correcting the problems with HadCRUT4 is definitely an issue that needs addressing. Indeed, I've made comments on James' blog on this very issue in the past, and I think C&W did a very nice job of addressing it and deserve praise for their hard work.
Nor do I have a problem with addressing the 1997-2012 interval. In fact I think it would be a mistake to ignore it, for the reasons you've explicated. But I think it needed to be bracketed by more representative results that don't have the issue of a major ENSO event near one boundary.
What I do have a problem with is discussing the effect of the coverage bias as a ratio, because it is misleading and even wrong to discuss a result this way, when you have quantities that can either be positive, negative, or zero.
The right way to look at the comparison is to look at the difference in trends. This is how I discussed the comparison above, and in fact Cowtan and Way do just that: see Table 4 of Cowtan and Way, 2013.
Cowtan and Way's bias estimate, from their own table, never reaches one standard deviation (the largest-magnitude difference is 0.7 sigma). This again is the sense in which I meant the correction is "small".
Ultimately this paper is just not the game changer that some have portrayed it as. The question we had before about why there is a "hiatus" is still unanswered.
So unless the USAF was flying an MSU before 1979 it is futile to ask for a blended record (v1) from C&W, but they are producing new separate land (v2)/sea ones, and it looks like this has indeed been a very hot decade (rank in the various records):
Rank HadCRUT4 C&Wv2 C&Wv1
1 2010 2010 2010
2 2005 2005 2005
3 1998 2007 2009
4 2003 2009 2007
5 2006 2013 2013
6 2009 2006 2002
7 2002 1998 2006
8 2013 2002 2012
9 2007 2003 2003
10 2012 2012 1998
Eli: So unless the USAF was flying an MSU before 1979 it is futile to ask for a blended record (v1) from C&W, but they are producing new separate land (v2)/sea ones, and it looks like this has indeed been a very hot decade (rank in the various records)
They are also doing kriging in their new series, which gets them back to circa 1810. (The C&W kriging I reported on is from their new paper.)
Robert Way has reservations about using the reanalysis products for computing trends. To the extent these concerns are valid, you could still use these instead of satellite to infill the missing regions. I actually think this would be better than satellite, since satellite is just a distant third cousin to 1-m elevation surface air temperature.
James, you and Annan 2012, along with Schmittner et al. 2011, and bracketed by Olson et al. 2012 ("A climate sensitivity estimate using Bayesian fusion of instrumental observations and an Earth System model", which estimates modern sensitivity with much the same method as the Schmittner paper), make a case for the LGM showing a lower bound for sensitivity. However, your sensitivity result is largely based on a high estimate for the LGM temperature, and so the opposite conclusion may also apply.
Tripati et al. 2014, "Modern and glacial tropical snowlines controlled by sea surface temperature and atmospheric mixing", are not quite convinced of your temperature and say: "Furthermore, if all glacial tropical temperatures were cooler than previously estimated, it would imply a higher equilibrium climate sensitivity than included in present models."
Telford et al. 2013, "Mismatch between the depth habitat of planktonic foraminifera and the calibration depth of SST transfer functions may bias reconstructions", say:
"A sensitivity study undertaken by Schmittner
et al. (2011) finds that a global 0.5°C bias in LGM ocean
anomalies gives a 1°C change in climate sensitivity. Hargreaves
et al. (2012) find that there is a 1 : 1.2 relationship between
modelled tropical LGM temperatures anomalies and
climate sensitivity in an ensemble of PMIP2 models, and use
this relationship to estimate climate sensitivity from LGM
proxy data; if the LGM cooling has been underestimated by1°C, their estimate of climate sensitivity would be 1.2°C too low."
So a colder LGM temperature reconstruction based on oxygen isotopes, per Hansen and others, and a consequently higher sensitivity, may yet be correct as far as I can see. Do you agree?
Pete Dunkelberg
Pete,
Yes certainly our LGM analysis was based on what we considered to be the best proxy analyses available at that time - although even then, there was some debate. If the LGM was colder, that would point to a higher sensitivity. However, I don't think we should be too quick to overthrow what was a quite widely held position (eg in the IPCC) based on one new paper. There were also factors pointing to a lower sensitivity than our headline figure (due to biases in the LGM simulation protocol) and a number of other possible problems too.
As for the pause and C&W etc, I think it's clear that the adjustments they made have little broader scientific impact, as they are only correcting what everyone realised was a bit of a limitation of HadCRUT (that is, if one wants a true global mean). Debates over whether the warming has really "paused", and over the statistical significance of a trend over a short interval, are basically just debating points.
TB (back on the 18th), yes, I have myself remarked in the past on how some model-data discrepancies have been resolved substantially in favour of the models. However, I still think that the most reasonable interpretation of all the evidence is that reality (at least as far as transient global mean temp change is concerned) is somewhere around the lower end of the model range. I'm not saying outside the range...
It's worth noting here the exchange between Myles Allen and Nic Lewis at Bishop Hill.