James' Empty Blog: July 2020

Thursday, July 30, 2020

The price of freedom

The Govt changed the lockdown rules substantially from the 4th July, with pubs and restaurants reopening and a new “1m plus” rule to replace the previous 2m distancing requirement. Predictably, the tabloids announced a new free-for-all which they labelled “Independence day”.

Up to this time, the R number had been fairly stable at around 0.8, meaning that each infected person would pass the disease on to fewer than one person on average, and the rate of illness (and death) was dropping fairly steadily at about 20% per week.
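As a rough check that those two numbers hang together, here is a minimal sketch converting R into a weekly change in infections. It assumes simple exponential behaviour and a mean generation interval of about 6.5 days; that interval is an illustrative assumption and not a value taken from the model discussed here.

```python
def weekly_change(r_number, generation_days=6.5):
    """Fractional change in daily infections over 7 days.

    Assumes each generation multiplies infections by R and that one
    generation lasts generation_days (an assumed, illustrative value).
    """
    return r_number ** (7.0 / generation_days) - 1.0


change = weekly_change(0.8)
print(f"R = 0.8 gives about {change:.0%} per week")
```

Under these assumptions R = 0.8 corresponds to a fall of roughly a fifth per week, in line with the figure quoted above. A different generation interval would shift the answer somewhat.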

Below is how my model fits to the data up to 4th July (red circles in both plots). You can click on the plots for bigger and clearer versions. The left hand plot is daily reported cases, and the right hand plot is daily deaths. The green plume shows the model fit to each of these, with a few lines from the ensemble drawn on (dark blue) and the median prediction in magenta. The thin blue plume is the total modelled number of new infections each day, which is much higher. There is also a red plume on this plot representing the “case ascertainment factor”, ie the proportion of infections that is actually observed. This uses the scale on the right hand side of the plot, and so rises from about 1% at the start of the epidemic, to around 10% now. The blue circles represent data that had not been observed by the 4th July, and you can see in the LH plot that they tend to drift above the model forecast.

On the right hand plot, the red plume is the R number (which again uses the axis on the right hand side of that plot). It starts off around 3ish, then drops sharply when the lockdown controls were imposed, and wobbles around a bit after that point. The “current” number quoted there (mean and range) is the estimate as of the 4th July. The data observed subsequent to that date agree better with the model than was the case in the LH plot, but still look to be more above than below the forecast.

Redoing the analysis with data up to yesterday (i.e. including all data points in the estimation), we get the following:

Now the rise in cases is reflected in the LH graph, and the corresponding rise in R is shown at the bottom of the RH plot. R is probably greater than 1, meaning that the epidemic is starting to take off again. It seems that something happened around the 4th of July to increase the rate of infection. I wonder what that could have been?

So, emboldened by these results and Peter's comment below we can try adding in a step change on the 4th July - this is just a high variance step in the prior, I'm not imposing a rise specifically, just allowing a large change. This generates the result below and it looks like a rather better fit especially to cases. However I'm not really that confident about what is going to happen and especially wouldn't be surprised if there is a bit of a decoupling between cases and deaths due to differences in the age range of people infected (eg mostly younger working age with a much lower fatality rate).
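For anyone wondering what "a high variance step in the prior" means in practice, the sketch below shows the general idea. It is not the model's actual code: the log random walk form, the function name `sample_r_path` and every numerical value are assumptions chosen purely for illustration.

```python
import math
import random
import statistics


def sample_r_path(n_days, step_day, r0=0.8, daily_sd=0.02, step_sd=0.5, rng=None):
    """Draw one prior sample of R as a Gaussian random walk in log space.

    Every day gets a small zero-mean increment. On step_day the increment
    is drawn with a much larger standard deviation, so a big jump in
    either direction is allowed but not imposed.
    """
    rng = rng or random.Random()
    log_r = math.log(r0)
    path = []
    for day in range(n_days):
        sd = step_sd if day == step_day else daily_sd
        log_r += rng.gauss(0.0, sd)
        path.append(math.exp(log_r))
    return path


rng = random.Random(1)
paths = [sample_r_path(30, step_day=15, rng=rng) for _ in range(2000)]

# Log change in R across the step day versus across an ordinary day.
jumps = [math.log(p[15] / p[14]) for p in paths]
ordinary = [math.log(p[10] / p[9]) for p in paths]

print("mean jump on step day:", round(statistics.mean(jumps), 3))
print("sd on step day:", round(statistics.stdev(jumps), 3))
print("sd on ordinary day:", round(statistics.stdev(ordinary), 3))
```

The point of the design is that the prior mean of the jump stays at zero, so any rise in R after the step has to come from the data rather than from the prior.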

Thursday, July 23, 2020

BlueSkiesResearch.org.uk: Back to the future

Way back in the mists of time (ie, 2006), jules and I saw what was going on with people estimating climate sensitivity, and in particular how this literature was interpreted by the authors of the IPCC AR4. And we didn’t like it. We thought that any reasonable synthesis should consider the multiple lines of evidence in a coherent fashion in order to form a credible overall view. This resulted in the paper "Using multiple observationally‐based constraints to estimate climate sensitivity" described in this blog post (paper here), which people unfamiliar with the story might like to glance at before progressing further…

It’s fair to say that our intervention was not met by universal approval at the time, with the established researchers mostly finding excuses as to why our result might not be entirely trustworthy. Fine, do your own calculations, we said. And they didn’t.

Time passed, and a new generation of people with different backgrounds became interested in estimating climate sensitivity. The World Climate Research Program (WCRP) made it a central theme in one of their Grand Challenges in climate science. There were a couple of meetings in Ringberg that jules and then I attended sequentially.

In 2016, several of the leaders of this WCRP steering group wrote a paper which kicked off a project to perform a new synthesis of the evidence on climate sensitivity. Their idea was to form an overall synthesis of the multiple lines of evidence, roughly along the lines that we had originally proposed, but in a far more comprehensive and thorough fashion. This is something that the IPCC isn’t really equipped to do, as it just assesses and summarises the literature. The project leaders considered three main strands of evidence: that arising from process studies (ie the behaviour of clouds, including simulations from GCMs), the transient warming over the historical record, and paleoclimate. Jules was one of the lead authors for the paleo chapter, but I wasn’t involved at the outset. However when invited to join the group I was of course happy to contribute to it, having thought about the problem off and on for the past decade.

Writing it was a lengthy and at times frustrating process, due to the huge range of ideas, topics, backgrounds and knowledge of the author team. That is also what gives this review its strength, of course, as we have genuine experts in multiple areas of modelling and data analysis, covering a huge range of time scales and techniques, and the different perspectives meant we gave each other quite a workout in testing the robustness of our approaches and ideas. During the 4 year process we had regular videoconferences, typically 9pm UK time, being 6am for Japan, 10am in Australia and afternoon for the continental USA. Luckily we had an 8-9h gap in the global spread so no-one actually had to get up in the middle of the night each time! We also had a single major writing meeting in Edinburgh in summer 2018 which almost all the main authors were able to attend in person, and a handful of "meet-ups of opportunity" when subsets happened to go to other conferences. In all, it was good practice for the new normal that we are enjoying due to COVID.

The peer review was probably the most extensive I’ve experienced, with something like 10 sets of comments – this was something we were all keen on, as we suspected it would be beyond the compass of just the usual 2-3 people. Comments were basically encouraging but gave us quite a lot to work on and in fact we reorganised the paper substantially for the better resulting in the 2nd set of reviews being very positive. Finally got it done a couple of months ago and it was accepted subject to very minor corrections (which were mostly things we had spotted ourselves, in fact).

The new paper has now been published. Actually, I’m not entirely sure it is up yet (minor snafu on the embargo timing), but anyone who needs an urgent look can find it here. I may write more on the details if pressed, but for now here is a quick peek at the main results:

The "baseline" calculation is what we get from putting together all the evidence, with a resulting 2.6-3.9C "likely" range. The coloured curves are various sensitivity tests, with the purple line at the top defined as the range from the lowest 17th percentile, and the highest 83rd percentile, across these tests. This isn’t really a probability range and doesn’t correspond to any particular calculation.
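To make the construction of that top line concrete, here is a small sketch of taking the lowest 17th percentile and the highest 83rd percentile across a set of tests. The samples are synthetic stand-ins generated on the spot, not values from the paper, and the test names are hypothetical.

```python
import random
import statistics


def likely_range(samples):
    """Return the 17th and 83rd percentiles (a 66% central range)."""
    cuts = statistics.quantiles(samples, n=100)  # 99 cut points
    return cuts[16], cuts[82]


rng = random.Random(0)

# Synthetic stand-ins for sensitivity tests: NOT values from the paper.
tests = {
    "test_a": [rng.gauss(3.2, 0.6) for _ in range(5000)],
    "test_b": [rng.gauss(3.0, 0.8) for _ in range(5000)],
    "test_c": [rng.gauss(3.5, 0.7) for _ in range(5000)],
}

ranges = {name: likely_range(samples) for name, samples in tests.items()}

# Envelope: lowest lower bound and highest upper bound across all tests.
envelope = (
    min(low for low, _ in ranges.values()),
    max(high for _, high in ranges.values()),
)
print(ranges)
print(envelope)
```

In this toy case the two ends of the envelope come from different tests, which is why such a range does not correspond to any single calculation or probability statement.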

Tuesday, July 21, 2020

That Russian Report, in full, in brief

We hear no evidence of Russian interference. We discuss no evidence of Russian interference. We see no evidence of Russian interference.

Sunday, July 19, 2020

Patrick Vallance's faulty memory

On reflection, perhaps it shouldn't be surprising. We expect the Chief Scientist to be a genius with a brain the size of a planet who is perpetually on top of their game, but in fact they are a human frequently operating under great stress, and fallible like the rest of us. Nevertheless, his first responsibility - and ours - is to the truth, and it is therefore my task to explain that he unfortunately misled the House of Commons Science and Technology Committee when he appeared before it on Thursday 16th July.

The topic under consideration is SAGE's recommendations around mid-March, when the various restrictions were being introduced - some have argued (and I'm among them) that this happened rather too late, with the result that the country suffered many more deaths, and far greater economic damage, than would have been the case with prompt action.

Most of the interesting action during his appearance was under questioning from Graham Stringer MP, from about 50 minutes into the video, or Q1041 on the transcript. Stringer is pressing him on the promptness (or otherwise) of introducing the lockdown, and particularly the speed of response to the data showing more rapid doubling than they had originally assumed:

Q1041 Graham Stringer: As a scientist, I was always taught to forget hypotheses, theories and ideas and look at the data, because having preconceived ideas can distort the way you look at things. When we went into this, scientists in this country were looking at data from China that showed a doubling of the infection every six or seven days. When you looked at our data closely, the infection death rates were doubling every 30 to 36 hours. Why didn’t you and SAGE advise the Government to change their attitude because, if you had looked at that and given that advice, the lockdown might have happened earlier?

To start with, to avoid the usual tedious ducking and weaving from the usual tedious suspects, it's important to be clear about the terms. When Stringer and Vallance are talking about “lockdown”, they mean the strict policies from the 23rd March onwards, when we were told to stay at home, all non-essential shopping and travel was forbidden, etc. As Vallance puts it:

there was a series of steps in the run-up to lockdown, which started with the isolation of people who had come from China, but the main ones were: case isolation; household isolation; and recommendations not to go to pubs, theatres and so on.

So, “lockdown” here means policies of the 23rd March, as also confirmed by Hancock in Hansard:

the level of daily deaths is lower than at any time since lockdown began on 23 March.

Sorry for this tedious pedantry, but experience has shown some people will, having lost the argument about timing, duck and weave about what "lockdown" means in the first place.

So, back to the timing. Vallance's main claim, which I will argue is incorrect, is contained in the following sentences:

When the SAGE sub-group on modelling, SPI-M, saw that the doubling time had gone down to three days, which was in the middle of March, that was when the advice SAGE issued was that the remainder of the measures should be introduced as soon as possible. I think that advice was given on 16 or 18 March, and that was when those data became available.

Note how clear he is that this advice to introduce the remainder of the measures - ie implementation of the full lockdown - was based on the realisation that the doubling time was as short as 3 days. I'll let him off with his use of “had gone down to” - in reality the doubling time had not changed at all, it was just SAGE's realisation that had gone down, but I will be generous and attribute this to sloppy language. He emphasises this reliance on the new data repeatedly:

Sir Patrick Vallance: Knowledge of the three-day doubling rate became evident during the week before.

Q1042 Graham Stringer: Did it immediately affect the recommendations on what to do?

Sir Patrick Vallance: It absolutely affected the recommendations on what to do, which was that the remaining measures should be implemented as soon as possible. I think that was the advice given.

and again:

Sir Patrick Vallance: The advice changed because the doubling rate of the epidemic was seen to be down to three days instead of six or seven days. We did not explicitly say how many weeks we were behind Italy as a reason to change; it was the doubling time, and the realisation that, on the basis of the data, we were further ahead in the epidemic than had been thought by the modelling groups up until that time.

So he is absolutely certain that the advice to proceed full steam ahead on the lockdown was predicated on the new 3 day doubling time.

However, he also claimed that this advice was given “on 16 or 18 March.” This is the critical error in his statements, that prompted this blog. Some people have jumped on this claim (and to be fair to Vallance, he was obviously unsure of the exact date in his response) to argue that the Govt was slow to react to SAGE's recommendation, and that this was the cause of the late lockdown and large death toll.

Unfortunately, Vallance was mistaken with his dates. In fact, SAGE actually still thought the doubling time was 5-6 days on the 16th March (minutes):

UK cases may be doubling in number every 5-6 days.

and by the 18th March their estimate was even slightly longer (minutes):

Assuming a doubling time of around 5-7 days continues to be reasonable.

It is therefore not at all surprising that the minutes of these two meetings do not contain any recommendation, or even a hint of a suggestion of a recommendation, that we should proceed with haste to a full lockdown. In fact the minutes of the 18th March make the very specific and detailed recommendation that schools should be shut, with the clear statement that further action would only be necessary “if compliance rates are low” (NB compliance with all measures has been consistently higher than in the modelling assumptions):

2. SAGE advises that available evidence now supports implementing school closures on a national level as soon as practicable to prevent NHS intensive care capacity being exceeded.
3. SAGE advises that the measures already announced should have a significant effect, provided compliance rates are good and in line with the assumptions. Additional measures will be needed if compliance rates are low.

Incidentally, this is why we have to be precise about what “lockdown” means, so that certain people don't pivot to “Aha! They said we should shut something! Vallance was right all along!” SAGE here is not recommending “lockdown” in the sense used by Vallance, Stringer, Hancock, or anyone else. They are only recommending school closures, which the Govt did implement promptly at that time.

Now let's go back to this from Vallance:

When the SAGE sub-group on modelling, SPI-M, saw that the doubling time had gone down to three days, which was in the middle of March, that was when the advice SAGE issued was that the remainder of the measures should be introduced as soon as possible.

The relevant SPI-M meeting at which they reduced their estimate of doubling time was actually on the 20th March (minutes). At this meeting, they abruptly realised:

Nowcasting and forecasting of the current situation in the UK suggests that the doubling time of cases accruing in ICU is short, ranging from 3 to 5 days.

[...]

The observed rapid increase in ICU admissions is consistent with a higher reproduction number than 2.4 previously estimated and modelled; we cannot rule out it being higher than 3.

All well and good, but a week late.

The next SAGE meeting was on the 23rd (21st-22nd was a weekend) and at this point they conclude (minutes):

The accumulation of cases over the previous two weeks suggests the reproduction number is slightly higher than previously reported. The science suggests this is now around 2.6-2.8. The doubling time for ICU patients is estimated to be 3-4 days.

(NB doubling time is in principle the same for all measures of the outbreak, ignoring transient effects as the epidemic gets established. That's why it is such a useful concept and measure.)
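To see why the difference between a 3 day and a 5-7 day doubling time matters so much, here is a minimal sketch of how far an outbreak grows in one week under pure exponential growth. The function name is made up for illustration, and real epidemics will not follow a clean exponential for long.

```python
def growth_over(days, doubling_days):
    """Multiplicative growth over `days` for a given doubling time."""
    return 2.0 ** (days / doubling_days)


week_fast = growth_over(7, 3)  # doubling every 3 days
week_slow = growth_over(7, 6)  # doubling every 6 days

print(f"3 day doubling: x{week_fast:.1f} in a week")
print(f"6 day doubling: x{week_slow:.1f} in a week")
```

With a 3 day doubling time the outbreak grows roughly fivefold in a week, against a little over twofold with a 6 day doubling time.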

SAGE also state at this meeting on the 23rd:

Case numbers in London could exceed NHS capacity within the next 10 days on the current trajectory.

They don't explicitly minute the need for a tight lockdown, but certainly provide statements that point in that direction, such as:

High rates of compliance for social distancing will be needed to bring the reproduction number below one and to bring cases within NHS capacity.

It seems reasonable to conclude that the message taken from this meeting was that London at least was on the verge of exceeding capacity and that strong measures needed to be urgently taken to slow transmission. As Vallance had put it:

the remaining measures should be implemented as soon as possible.

So it seems that Vallance has described the narrative arc precisely as the minutes of all the meetings around this time describe, but for the important fact that he got the date of this final recommendation wrong. He appears to have created a false memory of a world where the heroes of SAGE worked it all out in the nick of time, and told the government... who then sat on this information and delayed lockdown for a week. It's a nice story, but it's not actually what happened. The data were certainly clear to many by mid-March (ie the 14th, prior to the famously uncalibrated runs of the Imperial College model) but SAGE resolutely ignored and rejected this evidence for a further week, and this delay caused huge unnecessary harm to the country.

This would be a minor tale of a small slip of memory, were it not for the unfortunate fact that various factions have glommed onto Vallance's statement as proof that the scientists were blameless and the Govt guilty. Most egregiously, SAGE member Jeremy Farrar tweeted:

To be clear SAGE advice came the SAGE Meeting on Friday 13th March. When infections are doubling every 2-3 days, those days matter.
— Jeremy Farrar (@JeremyFarrar) July 16, 2020

To make the mistake that Vallance did, under pressure of live questioning, is forgivable. To double down on the error from the comfort of your own computer, when the documentation is freely available, is not. The minutes prove that SAGE did not accept the evidence of the short doubling time on the 16th and 18th March. It is quite possible that some SAGE members - perhaps including Farrar - had tried to sound the alarm about the rapid doubling at an earlier time. However, they did not carry the day and I find no evidence that they spoke up in public either. SAGE did not recommend lockdown prior to the 23rd March, however much it suits various people's agendas to claim so.

Saturday, July 18, 2020

Mountains and molehills

A couple of weeks ago, I heard about an issue with the way COVID deaths are counted in England. It seems that PHE are going through the lists of people who had died every day, and checking to see if they had previously had a positive COVID test. If so, they are added to the total number of COVID deaths for the day, even if they had long since made a full recovery and were run over by a bus (or died of some other illness).

Clearly this is wrong, and will tend to overstate the number of people killed by the disease. Equally clearly, there aren't many deaths falling into this situation. Take the total number of 300k positive tests, assume this means 300k people (which it doesn't, as many people are tested more than once) and that they have an average remaining life span of 40 years. Then we'd expect to see 20 of them die every day from all causes, implying about this many of the daily "COVID deaths" in the PHE stats are bogus. That took me under 5 mins to work out, so I shrugged and ignored the issue. My number might not be quite right, as the 40 remaining years of life assumption will depend on the precise age/gender distribution of those who have tested positive, but it's hard to see it being too far wrong. In the face of 100 deaths per day, about 20 of them being erroneously counted is not a huge issue, though it would become more of a problem as/when the daily death toll shrinks further. It certainly has little bearing on any retrospective analysis of the size of the outbreak so far.
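For anyone who wants to check the arithmetic, the back-of-envelope calculation is sketched below. It assumes deaths are spread evenly over the remaining life span, which is a simplification, and the variable names are just for illustration.

```python
positive_tests = 300_000   # treated as 300k distinct people (an overestimate)
remaining_years = 40       # assumed average remaining life span

# If deaths are spread evenly over the remaining years, this many
# of the people who tested positive die each day from all causes.
deaths_per_day = positive_tests / (remaining_years * 365)

print(f"about {deaths_per_day:.0f} deaths per day")
```

Halving or doubling the assumed remaining life span moves the answer to about 40 or 10 per day, so the order of magnitude is not sensitive to the details.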

Two weeks later, and Loke (who I now note is who I first learnt about this issue from) and Heneghan write an article covering this issue, and promote it all round the press. I'm sure it is just an unfortunate accident that they make it sound like it's a really big issue that is a major factor in explaining why the death toll in England has remained so high, as they are surely competent enough to have reproduced the calculation I presented above. Unsurprisingly, it's been picked up by the denialist wing of the media which is desperately trying to pretend that the response in England has been anything other than absolutely terrible. It's probably worth mentioning that Heneghan has form for minimising the dangers of COVID: in this piece he argues that the fatality rate is down around 0.2% which is far below all credible estimates I've seen and implies that a very large proportion of the UK population has had the disease, which is robustly refuted by a hefty pile of evidence.

Now Hancock has called for an "urgent inquiry" into this and is using it as an excuse to halt publication of the daily statistics. Even though he's a bit dim, it's hard to believe he doesn't have any numerate advisors who could tell him why it's not that big a deal. Indeed PHE quickly put out a rebuttal which supports my analysis - they pointed out that 90% of the COVID deaths occurred within 28 days of diagnosis, and of the remaining 4000, half of them were directly attributed to COVID on the death certificate anyway. Leaving perhaps 2000 bogus deaths which should not have been added. Over a ~100 day period that's pretty much the same as the 20 per day figure I came up with above.
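The same cross-check using the PHE figures quoted above can be written out as a short sketch. The 100 day period is the rough figure used in the text, and the variable names are hypothetical.

```python
deaths_after_28_days = 4000   # the 10% not within 28 days of diagnosis
share_within_28_days = 0.9

# Total implied by "4000 is the remaining 10%".
implied_total = round(deaths_after_28_days / (1 - share_within_28_days))

# Half of the late deaths had COVID on the death certificate anyway.
possibly_bogus = deaths_after_28_days / 2

period_days = 100             # rough length of the period, as in the text
bogus_per_day = possibly_bogus / period_days

print(implied_total, possibly_bogus, bogus_per_day)
```

This lands on about 20 per day, the same ballpark as the life span estimate, from an entirely different starting point.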

Compare and contrast with the known under-reporting which is clear from the total death statistics and perhaps most stark in care homes, where the total "non-COVID" deaths have a massive bump coincident with the epidemic. We know that patients were pushed out of hospitals into care homes, without any testing or facilities for safe care and treatment, and it's clear that many thousands of these people died without being counted in the COVID statistics. See the huge yellow bump in the official ONS statistics below:

This miscoding of unrelated deaths is small beer in comparison.

One way of getting the "correct" answer would be to use excess deaths, but that involves a certain amount of statistical work (excess over what, and how is this calculated?) and is not so quick and easy to come by. So I don't know what they will come up with as a solution. Using a cut-off date might be a reasonable solution, perhaps in conjunction with the death certificate where it didn't specify COVID as the cause. Ie, cut out those 2k deaths where they both (a) took more than 28d from diagnosis to death and (b) were not directly attributed to COVID by a doctor. That would seem to minimise any errors in a straightforward manner, so probably they will do something more complicated...
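The rule suggested above is simple enough to sketch in a few lines. The `Death` record, its field names and the example records are all hypothetical, made up to show the logic rather than taken from any real PHE data format.

```python
from dataclasses import dataclass


@dataclass
class Death:
    days_since_positive_test: int
    covid_on_certificate: bool


def counts_as_covid(death, cutoff_days=28):
    """Exclude a death only if it is BOTH late AND not attributed to COVID."""
    late = death.days_since_positive_test > cutoff_days
    return not (late and not death.covid_on_certificate)


# Made-up example records, purely to exercise the rule.
records = [
    Death(5, False),    # early: kept
    Death(40, True),    # late but on the certificate: kept
    Death(90, False),   # late and not on the certificate: dropped
    Death(28, False),   # exactly at the cut-off: kept
]

kept = [d for d in records if counts_as_covid(d)]
print(len(kept), "of", len(records), "counted")
```

Only the third record is dropped, since it is the only one that fails both conditions.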

Friday, July 03, 2020

Ho hum

Haven't posted for a while, so how about a few minutes of James O'Brien to pass the time.
