A lot of different estimates of the growth rate (R) of the epidemic have come out in the last couple of days, so here's a summary of which ones are wrong (and why) and which ones you can believe. And who am I to do this, you might reasonably ask? While not an epidemiologist, my professional expertise is in fitting models to data, which is precisely what this question demands. And the available evidence suggests I'm rather better at it than many epidemiologists appear to be.
As you may recall, a month ago I posted an argument that R really couldn't be under 1 any longer, and the epidemic was starting to grow again. At the time, the "experts" of SAGE were still insisting that R was less than 1, and they kept on claiming that for a while, despite the very clear rise in reported case numbers. The rise has continued and indeed accelerated a bit, other than for a very brief hiatus in the middle of last month. Part of this steady rise might have been due to a bit more testing, but for a while now it's been pretty implausible that testing accounts for all of it. I'll come back to SAGE's ongoing incompetence later.
I'll start with my own estimate which currently comes out at R= ~1.3. This is based on fitting a very simple model to both case and death data, which the model struggles to reconcile due to its simplicity. The average death rate (as a percentage of infected people) has dropped in recent weeks, thanks to mostly younger people being infected recently, and perhaps also helped by some improvements in treatment. I could try to account for this in the model but haven't got round to it. So it consistently undershoots the case numbers and overshoots deaths a bit, but I don't think this biases the estimate of R enough to really matter (precisely because the biases are fairly constant). Incidentally, the method I'm using for the estimation is an iterative version of an ensemble Kalman smoother, which is a technique I developed about 15 years ago for a different purpose. It's rather effective for this problem and clearly superior to anything that the epidemiologists are aware of. Ho hum.
Here are my plots of the fit to cases (top) and deaths (bottom) along with the R number.
Another relevant weekly analysis that came out recently is the infection pilot survey from ONS. Up to now it's been pretty flat and inconclusive, with estimates that have wobbled about a little but with no clear signal. This all changed with their latest result, in which the previous estimate of 27,100 cases (uncertainty range 19,300 - 36,700) in the week of 19 - 25 Aug increased to 39,700 (29,300 - 52,700) in the week 30 Aug - 5 Sept. That is a rise of 46% in 11 days or about 3.5% per day. R is roughly the 5-day growth rate (for this disease), so that corresponds to an R value of 1.2, but note that their analysis doesn't extend over the past week when the cases have increased more sharply.
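For anyone who wants to check the arithmetic, here is a minimal Python sketch of that conversion. It assumes, as stated above, that R is roughly the growth factor over 5 days. The function name `r_from_ratio` and the `generation_days` parameter are made up for this illustration and are not taken from the ONS analysis.

```python
def r_from_ratio(ratio, days, generation_days=5.0):
    """Rescale a growth factor observed over `days` to one generation interval.

    Assumes steady exponential growth and R ~ growth factor over
    `generation_days` (5 days here, the rule of thumb used in the text).
    """
    return ratio ** (generation_days / days)


# ONS central estimates quoted in the text
previous_cases = 27_100   # week of 19 - 25 Aug
latest_cases = 39_700     # week of 30 Aug - 5 Sept
days_between = 11         # spacing between the two weeks, as used in the text

ratio = latest_cases / previous_cases
daily_growth = ratio ** (1 / days_between) - 1
r_estimate = r_from_ratio(ratio, days_between)

print(f"rise: {ratio - 1:.0%}")
print(f"daily growth: {daily_growth:.1%}")
print(f"R estimate: {r_estimate:.2f}")
```

This only uses the central estimates. Pushing the ends of the ONS uncertainty ranges through the same function would give a much wider spread for R.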
Actually, I don't really think the ONS modelling is particularly good - it's a rather arbitrary curve-fitting exercise - but when the data are clear enough it doesn't matter too much. Just looking at the raw data that they kindly make available, they had almost 1 positive test result per 1000 participants over the fortnight 23 Aug - 5 Sept (55 cases in 59k people) which was 65% up on the rate for the previous fortnight of 26 cases in 46k people. Again, that works out at R=1.2.
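The same check can be run on the raw counts. This is only a rough sketch: it uses the rounded participant numbers quoted above (59k and 46k), and it assumes the two fortnights are consecutive, so that their midpoints are 14 days apart. The variable names are illustrative.

```python
# Raw ONS counts quoted in the text (participant numbers are rounded)
positives_recent, people_recent = 55, 59_000      # fortnight 23 Aug - 5 Sept
positives_previous, people_previous = 26, 46_000  # previous fortnight

rate_recent = positives_recent / people_recent
rate_previous = positives_previous / people_previous
ratio = rate_recent / rate_previous

days_between = 14        # assumption: midpoints of consecutive fortnights
generation_days = 5.0    # R taken as roughly the 5-day growth factor

r_estimate = ratio ** (generation_days / days_between)

print(f"recent rate per 1000: {rate_recent * 1000:.2f}")
print(f"increase on previous fortnight: {ratio - 1:.0%}")
print(f"R estimate: {r_estimate:.2f}")
```

With only 26 and 55 positives the sampling noise is large, so this is a sanity check on the direction and rough size of the change and not a precise estimate.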
A rather worse perspective was provided by SAGE, who continue to baffle me with their inability to apply a bit of common sense and update their methods when they repeatedly give results so strikingly at odds with reality. They have finally noted the growth in the epidemic and managed to come up with an estimate marginally greater than 1, but only to the level of R=1.1 with a range of 1-1.2. And even this is a rounding-up of their estimate of daily growth rate of 1 ± 2% per day (which equates more closely to R=1.05 with range of 0.95-1.15). Yes, they really did say that the epidemic might be shrinking by 1% per day, even as cases are soaring and hospital admissions are rising. I do understand how they've managed to generate this answer - some of the estimates that feed into their calculation only use death data, and this is still very flat - but it's such obvious nonsense that they really ought to have pulled their heads out of their arses by now. I sometimes think my work is a bit artificial and removed from practical issues but their unwillingness to bend to reality gives ivory tower academics a bad name.
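To see where the R=1.05 figure comes from, here is a small Python sketch that turns a daily growth rate into R, again assuming R is roughly the 5-day growth factor. The helper name `r_from_daily_growth` is hypothetical, and this is not how SAGE derive their published R range.

```python
def r_from_daily_growth(daily_growth, generation_days=5.0):
    """Compound a daily growth rate (e.g. 0.01 for 1% per day) over one generation."""
    return (1 + daily_growth) ** generation_days


central = 0.01      # SAGE daily growth estimate: 1% per day
half_width = 0.02   # quoted uncertainty: plus or minus 2% per day

r_low, r_mid, r_high = [
    r_from_daily_growth(g)
    for g in (central - half_width, central, central + half_width)
]

print(f"R range: {r_low:.2f} to {r_high:.2f}, central {r_mid:.2f}")
```

Compounding puts the top of the range at about 1.16. The simpler linear approximation, R ≈ 1 + 5 × daily growth, gives the 1.15 quoted above. Either way, the range sits well below the other estimates.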
At the other extreme, a paper claiming R=1.7 was puffed in the press yesterday. It's a large survey from Imperial College, that bastion of incompetent modelling from the start of the epidemic. The 1.7 number comes from the bottom right hand panel in the below plot where they have fitted an exponential through this short subset of the full time series of data. There is of course a lot of uncertainty there. More importantly, it doesn't line up at all with the exponential fitted through the immediately preceding data set, starting at a lower level than the previous curve finishes. While R might not have been constant over this entire time frame, the epidemic has certainly progressed in a continuous manner, which would imply the gap is filled by something like the purple line I've added by hand.
It's obviously stupid to pretend that R was higher than 1 in both of the recent intervals where they made observations, and just happened to briefly drop below 1 exactly in the week where they didn't observe. The sad thing about the way they presented this work to the media is that they've actually done a rather more sensible analysis where they fit the 3rd and 4th intervals simultaneously, which is shown as the green results in the 3rd and 4th panels on the top row of the plots (the green in the 3rd panel is largely overlain by blue which is the fit to 2nd and 3rd intervals, but you can see if you click for a larger view). Which gives.....R=1.3. Who'd have thought it?
Of course R=1.7 is much more headline-grabbing. And it is possible that R has increased towards the end of their experimental period. Rather than fitting simple exponentials (ie fixed R values) to intervals of data, perhaps a more intelligent thing to do would have been to fit an epidemiological model where R is allowed to vary through time. Like I have been doing, for example. I'm available to help and my consultancy rates are very reasonable.
In conclusion, R=1.3(ish) but this is a significant rise on the value it took previously and it might well be heading higher.
8 comments:
Can you remind me what everything is on your regular charts? What's the right-hand axis on the first chart?
And can you confirm my recollection that you fit some other coefficients (daily ones named with greek letters, IIRC) and then you derive R from the fitted coefficients using some assumptions not in your model? (Such as a five-day generation time, or something like that)?
Have been thinking it needs a key, something like:
First graph
Red line & range = C = Case Ascertainment = fraction of infections that become confirmed cases (right hand scale)
Blue higher smooth lines = total modelled number of new infections
Red circles = data
Green= modelled range
Lower dark blue less smooth lines = sample ensemble member
Magenta = ensemble average
Second graph
Red line & range = R (right hand scale)
Red circles = data
Green= modelled range
Blue = sample ensemble member
Pink = ensemble average
or maybe could reference https://julesandjames.blogspot.com/2020/07/the-price-of-freedom.html
Yeah thanks Chris. I have edited post and will also try to fit some text on the plots in future as people keep asking :-)
3rd para says R = - 1.3
Wonder what a negative value means!!
Each person with the virus cures 1.3 people!
UJ.
Nooooooo! That's a tilde. Approx. I was too lazy to find out how to type ≅ :-)
Good summary - thanks! I am wondering what all this means in policy terms. The Covid Alert Level is currently at Level 3 (general circulation). Level 4 is "transmission is high or rising exponentially", so even on SAGE's estimate we should be moving to 4 (although I admit my understanding of 'rising exponentially' is fuzzy)? https://www.gov.uk/government/news/update-from-the-uk-chief-medical-officers-on-the-uk-alert-level
Since the beginning of July, essentially every red circle (case count) falls above the magenta line (model median). Is that reasonable, or a bug? It feels as if your model is running cold and not correcting adequately for some reason. Similarly on the deaths chart, almost every data point is below the model median.
Nick, that's because the 0.75% death rate in the model is too high currently. Especially with the change to 28d deaths (which undercounts the true number!), there simply aren't enough of them for the cases we are getting. So the model is compromising between the two. I have thought of putting in a death rate that reduces over time but haven't got round to it. It would involve more subjective choices, degrees of freedom...and work :-)
Warren, well it certainly seems quite high (though not overwhelming) and rising exponentially to me....though the numbers in hospital are still moderate so I guess that is their excuse for sitting on their hands.