We suspected there might be some problems with her result, but weren't sure a priori how much they mattered, so that was an extra motivation to revisit our own work.
It took a while, mostly because I started out trying to incrementally improve our previous method (multivariate pattern scaling) and was slow to realise that what I really wanted was an Ensemble Kalman Filter, which is what Tierney et al (TEA) had already used. However, they used an ensemble made by sampling internal variability of a single model (CESM1-2) under a few different sets of boundary conditions (18ka and 21ka for the LGM, 0 and 3ka for the pre-industrial), whereas I'm using the PMIP meta-ensemble of PMIP2, PMIP3, and PMIP4 models.
OK, being honest, that was part of the reason, the other part was general procrastination and laziness. Once I could see where it was going, tidying up the details for publication was a bit boring. But it got done, and the paper is currently in review at CPD. Our new headline result is -4.5±1.7C, so slightly colder and much more uncertain than our previous result, but nowhere near as cold as TEA.
I submitted an abstract for the EGU meeting, which is on again right now. It's a fully blended in-person and online meeting now, which is a fabulous step forwards that I've been agitating for from the sidelines for a while. They used to say it was impossible, but covid forced their hand somewhat with two years of virtual meetings, and now they have worked out how to blend it. There have been a few teething niggles, but it's working pretty well, at least for us as virtual attendees. Talks are very short, so rather than go over the whole reconstruction again (I've presented early versions previously) I focussed on just one question: why is our result so different from Tierney et al? While I hadn't set out specifically to critique that work, the reviewers seemed keen for us to explore the comparison, so I've recently done a bit more digging into our result. My presentation can be found via this link, I think.
One might assume a major reason would be that the new TEA proxy data set is substantially colder than what went before, but we didn't find that to be the case. In fact many of the gridded data points coincide spatially with the MARGO SST data set which we had previously used, and the average value over these locations was only 0.3C colder in TEA than in MARGO (though there was a substantial RMS difference between the points, which is interesting in itself as it suggests that these temperature estimates may still be rather uncertain). A modest cooling of 0.3C in the mean for these SST points might be expected to translate to about 0.5C or so for surface air temperature globally, nowhere near the 2.1C difference between our 2013 result and their 2020 paper. Also, our results are very similar whether we use MARGO, TEA, or both together. So, we don't believe the new TEA data are substantially different from what went before.
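For concreteness, that comparison is just a mean and root-mean-square difference over the co-located points, plus a rough scaling from SST to global surface air temperature. Here is a minimal sketch in Python; the arrays are invented stand-ins, and the ~1.7 amplification factor is an illustrative assumption consistent with the 0.3C-to-0.5C translation mentioned above, not a value taken from either paper:

import numpy as np

# Invented stand-ins for co-located LGM SST anomalies (C) at the points
# shared by the TEA and MARGO data sets.
tea_sst = np.array([-2.1, -3.4, -1.8, -2.9, -2.5])
margo_sst = np.array([-1.9, -2.8, -2.4, -2.6, -2.0])

diff = tea_sst - margo_sst
mean_diff = diff.mean()                  # small in the mean (the real comparison gave ~-0.3C)
rms_diff = np.sqrt((diff ** 2).mean())   # but substantial point-by-point scatter

# Rough SST-to-global-SAT amplification (assumed illustrative factor).
sat_diff = 1.7 * mean_diff
print(f"mean {mean_diff:.2f}C, rms {rms_diff:.2f}C, implied global SAT {sat_diff:.2f}C")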
What is really different between TEA and our new work is the priors we used.
Here is a figure summarising our main analysis, which follows the Ensemble Kalman Filter approach: we have a prior ensemble of model simulations (lower blue dots, summarised in the blue Gaussian curve above), each of which is updated by nudging towards the observations, generating the posterior ensemble of upper red dots and red curve. I've highlighted one model in green, which is CESM1-2. Under this plot I have pasted bits of a figure from Tierney et al which shows their prior and posterior 95% ranges. I lined up the scales carefully. You can see that the middles of their ensembles, which are entirely based on CESM1-2, are really quite close to what we get with the CESM1-2 model (the big dots in their ranges are the medians of their distributions, which obviously aren't quite Gaussian). Their calculation isn't identical to what we get with CESM1-2: it's a different model simulation with different forcing, we are using different data, and there are various other differences in the details of the calculation. But it's close.
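For those who want the mechanics, the EnKF update is a linear nudge of each prior member towards the observations, weighted by the Kalman gain estimated from the ensemble itself. Below is a minimal stochastic (perturbed-observations) sketch; the dimensions and data are invented, and the actual calculations in both papers differ in their observation operators, error models and other details:

import numpy as np

rng = np.random.default_rng(0)
n_ens, n_state, n_obs = 20, 100, 30                # invented sizes, for illustration

X = rng.normal(-4.0, 2.0, size=(n_ens, n_state))   # prior ensemble (rows = members)
obs_idx = rng.choice(n_state, size=n_obs, replace=False)
H = np.zeros((n_obs, n_state))
H[np.arange(n_obs), obs_idx] = 1.0                 # obs operator: sample state at proxy sites
y = rng.normal(-5.0, 1.0, size=n_obs)              # toy proxy observations
R = np.eye(n_obs)                                  # obs error covariance, (1C)^2

# Ensemble estimates of the covariances needed for the Kalman gain
Xm = X - X.mean(axis=0)
HX = X @ H.T
HXm = HX - HX.mean(axis=0)
P_xy = Xm.T @ HXm / (n_ens - 1)                    # state-obs covariance
P_yy = HXm.T @ HXm / (n_ens - 1) + R               # obs-space covariance plus obs error
K = P_xy @ np.linalg.inv(P_yy)                     # Kalman gain

# Perturbed-observation update: nudge each member towards the data
Y = y + rng.multivariate_normal(np.zeros(n_obs), R, size=n_ens)
X_post = X + (Y - HX) @ K.T                        # posterior ensemble

print(X.mean(), X_post.mean())                     # ensemble mean before and after the nudge

The key point for this post is the X in the first line: if every row of the prior comes from one model, the posterior can only ever be a lightly nudged version of that model.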
Here is a terrible animated gif. It isn't that fuzzy in the full presentation. What it shows is the latitudinal temperature anomalies (relative to pre-industrial) of our posterior ensemble of reconstructions (thin black lines, with the thick line showing the mean), with the CESM-derived member highlighted in green and Tierney et al's mean estimate added in purple. The structural similarity between those two lines is striking.
A simple calculation also shows that the global temperature field of our CESM-derived sample is closer to their mean, in the RMS difference sense, than any of our other ensemble members. Clearly, there's a strong imprint of the underlying model even after the nudge towards the data.
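That check is just an argmin over RMS distances between each posterior member and their mean field. A quick sketch, again with made-up arrays standing in for the real gridded fields:

import numpy as np

rng = np.random.default_rng(1)
ensemble = rng.normal(-4.5, 1.7, size=(20, 1000))  # toy posterior members (rows = members)
tea_mean = rng.normal(-6.1, 1.0, size=1000)        # stand-in for the TEA mean field

rms = np.sqrt(np.mean((ensemble - tea_mean) ** 2, axis=1))
closest = np.argmin(rms)     # in our actual calculation this picks out the CESM-derived member
print(closest, rms[closest])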
So, this is why we think their result is largely down to their choice of prior. While we have a solution that looks like their mean estimate, this lies close to the edge of our range. The reason they don’t have any solutions that look like the bulk of our results is simply that they excluded them a priori. It’s nothing to do with their new data or their analysis method.
We've been warning against the use of single-model ensembles to represent uncertainty in climate change for a full decade now; it's disappointing that the message doesn't seem to have got through.
4 comments:
What is the definition of "LGM temperature anomaly?" I.e., to what date's average temperature is "-4.5±1.7°C" or "-6.1±0.4°C" compared? Is it the 2000-2020 average? Is it 1950 (like the "BP" baseline)? Is it HCO peak? Is it late LIA "pre-industrial?"
Tierney says, "Our assimilated product provides a constraint on global mean LGM cooling of −6.1 degrees Celsius (95 per cent confidence interval: −6.5 to −5.7 degrees Celsius)." Calling it "cooling" means that she's comparing with a prior temperature, i.e., the Eemian. But that seems very odd.
My guess is that she doesn't really mean "cooling," she means "6.1°C cooler than _____", and calling it "cooling" is just another example of the paleoclimate community's fuzzy jargon. But what date should fill in the blank is unmentioned.
{RANT} I find annoying the paleoclimate community's bad habit of using jargon which redefines and contradicts plain English. E.g., why say "BP," which stands for "Before Present," when they actually mean "before 1950"? That's pretty much guaranteed to mislead readers who aren't part of the "in crowd," and don't know the secret code. Why can't they use normal English, which means what it says, or at the very least, invent NEW terms, which don't redefine established meanings? E.g., why not call it "B1950"? {/RANT}
The baseline is basically "pre-industrial" in both cases, and the definition is a little imprecise and variable between authors, but if you think of 1850-1900 you won't be far wrong. Tierney et al actually used the top of their cores (and called it "Late Holocene" rather than "pre-industrial"), and the date for these won't be the same for all data points; it depends what is retrieved and analysed. However, we tried a few sensitivity tests with different baselines and the differences were only minor. Not that much happened until the last 50 years or so, really, at least compared to an ice age.
Bear in mind that when writing research papers, they are very much aimed at the in-crowd and not a wider readership.
JA,
You know this I am sure ...
Objectively combining climate sensitivity evidence
Nicholas Lewis
Critiquing Sherwood et al. (2020).
I will patiently await anything you all have to say in response.
Thanks in advance.
Sadly yes I'm vaguely aware of it. If you want to read and provide a summary, have fun :-)