Wednesday, August 05, 2020

Could R still be less than 1?

It's been suggested that things might all be fine, maybe the increase in case numbers is just due to more/better testing. There certainly could be a grain of truth in the idea, as the number of tests undertaken has risen a little and the proportion of tests that have been positive has actually kept fairly steady over recent weeks at around 1%. On the other hand, you might reasonably expect the proportion of positives to drop with rising test numbers even if the number of ill people was constant, let alone falling as SAGE claim - consider at the extreme, 65 million tests couldn't find 650,000 positives if only a few tens of thousands are actually ill at any one time. Also, the ONS pilot survey is solid independent evidence for a slight increase in cases, albeit not entirely conclusive. But let's ignore that inconvenient result (as the BBC journalist did), and consider the plausibility of R not having increased in recent weeks. 
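The "at the extreme" argument is just arithmetic, and can be sketched in a few lines. The function name and the figure of 50,000 infected are illustrative assumptions (standing in for "a few tens of thousands"), and the sketch supposes a perfect test with nobody tested twice.

```python
def max_positivity(n_tests: int, n_infected: int) -> float:
    """Upper bound on the positive fraction: you cannot find more cases than exist."""
    return min(n_tests, n_infected) / n_tests


population = 65_000_000
infected = 50_000  # assumed, purely for illustration

for n_tests in (100_000, 1_000_000, 10_000_000, population):
    ceiling = max_positivity(n_tests, infected)
    print(f"{n_tests:>11,} tests: positivity can be at most {ceiling:.3%}")
```

Holding 1% positivity all the way up to 65 million tests would need 650,000 positives, which is far above the ceiling once the number of tests exceeds the number of people who are actually ill.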

This is fairly easy to test with my data assimilation system. I can just stop R from varying at some point in time (by setting the prior variance on the daily step to a negligible size). For the first experiment, I replaced the large jump I had allowed on 4th July with fixing the value of R from that point on. Note however that the estimation is still using data subsequent to that date, i.e. it is finding the (probabilistic) best fit for the full time series, under the constraint that R cannot change past 4 July. I've also got a time-varying case ascertainment factor which I'll call C, which can continue to vary throughout the full interval.
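For anyone wondering what "setting the prior variance on the daily step to a negligible size" amounts to, here is a minimal sketch of a random-walk prior on R whose step size collapses after a chosen day. It only draws from the prior and is not the assimilation system itself; the function name, the step sizes and the day index are all assumptions for illustration.

```python
import random


def sample_r_path(r0, n_days, step_sd, freeze_day, frozen_sd=1e-6, seed=0):
    """Draw one prior sample of R as a daily random walk.

    Steps into days after `freeze_day` use a negligible standard
    deviation, so R is effectively held fixed from that day onwards.
    """
    rng = random.Random(seed)
    r = r0
    path = [r]
    for day in range(1, n_days):
        sd = step_sd if day <= freeze_day else frozen_sd
        r = max(0.0, r + rng.gauss(0.0, sd))  # keep R non-negative
        path.append(r)
    return path


# Hypothetical numbers: 120 days, R free to wander until day 60, then frozen.
path = sample_r_path(r0=0.8, n_days=120, step_sd=0.03, freeze_day=60)
print(f"R on day 60: {path[60]:.3f}, R on day 119: {path[-1]:.3f}")
```

In the real system the data then pick out which of these constrained paths fit best, which is why R can still be pulled around in the days just before the freeze.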

Here are the results, which are not quite what I expected. Sure, R doesn't vary past the 4th of July, but in order to fit the data, it shoots up to 1 in the few days preceding that date (red plume on 2nd plot). The fit to the death data in the bottom plot looks pretty decent (the scatter of the data is very large, due to artefacts in the counting methodology) and also the case numbers in the top plot are reasonable. See what has happened to the C factor though (red plume on top graph). After being fairly stable through May and most of June, it takes a brief nose-dive to compensate for R rising at the end of June, and then has to bounce back up in July to explain the rise in case numbers.

While this isn't impossible, it looks a bit contrived, and also note that even so, we still have R=1, firmly outside the SAGE range of 0.8-0.9. Which isn't exactly great news with schools opening widely expected to raise this value by 0.2-0.5 (link1, link2).

So, how about fixing R to a more optimistic level, somewhere below 1? My code isn't actually set up very well for that specific experiment, so instead of holding R down directly, I just put back the date at which R stops varying. In the simulations below it can't change past the 1st June. It still climbs up just prior to that date, but only to 0.9 this time, right at the edge of the range of SAGE values. The fit to the death data is similar, but this time the swoop down for C on the upper plot is a bit more pronounced (because R is higher through June) and then it has to really ramp up suddenly in July to match the rise in case numbers. You can see that it starts to underestimate the case numbers towards the present day too, so C would have to keep on ramping up even more to match that properly.

So R being in the SAGE range isn't completely impossible, but requires some rather contrived behaviour from the rest of the model which doesn't look reasonable to me. I don't believe it and think that unfortunately there is a much simpler explanation for (some of) the rise in case numbers.

More what-ifs

It was pointed out to me that my previous scenarios were roughly comparable to those produced by some experts, specifically this BBC article referring to this report. And then yesterday another analysis which focussed on schools opening.

The experts, using more sophisticated models, generated these scenarios (the BBC image is simplified and the full report has uncertainties attached):

and for the schools opening report:

The tick marks are not labelled on my screenshot but they are at 3 month intervals with the peaks being Dec on the left hand and March on the right hand panel.

While these are broadly compatible with my analyses, the second peak for both of them is significantly later than my modelling generates. I think one important reason for this is that my model has R a little greater than 1 already at the start of July, whereas they are assuming ongoing suppression right through August until schools reopen. So they are starting from a lower baseline of infection. The reports themselves are mutually inconsistent too, with the first report having a 2nd peak (in the worst case) that is barely any higher than the first peak, and the second report having a markedly worse 2nd peak, despite having a substantially lower R number over the future period that only briefly exceeds 1.5. It's a bit strange that they differ so significantly, now I think about it...I'm probably missing something obvious in the modelling.

Of course in reality policy will react to observations, so all scenarios are liable to being falsified by events one way or another.

Sunday, August 02, 2020

What if?

It's a while since I did any real forecasting, the current system just runs on a bit into the future with the R values gradually spreading out due to the daily random perturbations, and the end result is pretty obvious. Now with the effective R value probably just above 1, and various further relaxation planned (e.g. end of furlough, schools returning) jules thought it would be interesting to see what might possibly happen if R goes up a bit.

Here are two ensembles of simulations, both tuned the same way to historical data, which gives an R_effective of about 1.1 right now. The step up on the 4th July is a modelling choice I made through choice of prior, in allowing a large change on that one day only rather than a gradual ramping up around that time. In the first set of forecasts, I ramp up R by 0.5 over 30 days through September. For the modellers, I'm actually using R as my underlying parameter, calculating R_effective based on the proportion of people who (it is assumed!) have acquired immunity through prior infection. So typically the underlying R value is going up from 1.2 to 1.7 or thereabouts. You can see the resulting ramp up in R_eff on the plot, with the subsequent drop entirely due to the herd immunity factor kicking in as the second wave peaks. The new peak in deaths is...not pretty. I'm disappointed it is so severe, in my head I'd been assuming that a much lower R number (compared to the 3-3.5 at the start) and non-negligible level of current immunity would have helped to keep it lower.
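For the modellers, the relation between the underlying R and R_effective can be sketched with a bare-bones SIR-style loop. This is not my actual model: the 1.2 to 1.7 ramp over 30 days is as described above, but the starting immune fraction, the number currently infectious, the 5-day infectious period and the day the ramp starts are all assumed values chosen only for illustration.

```python
def underlying_r(day, ramp_start=30, ramp_days=30, r_before=1.2, r_after=1.7):
    """Underlying R: flat, then a linear ramp of 0.5 over 30 days, then flat."""
    frac = min(max((day - ramp_start) / ramp_days, 0.0), 1.0)
    return r_before + frac * (r_after - r_before)


def simulate(n_days=300, population=66_000_000.0, infectious0=20_000.0,
             immune0=0.05, infectious_days=5.0):
    """Return the daily R_effective trace from a simple discrete-time SIR model."""
    susceptible = population * (1.0 - immune0) - infectious0
    infectious = infectious0
    r_eff_trace = []
    for day in range(n_days):
        # R_effective is the underlying R scaled by the susceptible fraction.
        r_eff = underlying_r(day) * susceptible / population
        new_infections = min(susceptible, r_eff * infectious / infectious_days)
        recoveries = infectious / infectious_days
        susceptible -= new_infections
        infectious += new_infections - recoveries
        r_eff_trace.append(r_eff)
    return r_eff_trace


trace = simulate()
peak_day = trace.index(max(trace))
print(f"R_eff starts at {trace[0]:.2f}, peaks on day {peak_day}, ends at {trace[-1]:.2f}")
```

Once the ramp has finished the underlying R is constant, so any later fall in R_eff in this sketch comes purely from the shrinking susceptible fraction.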

The second set of results is a more optimistic assumption where R only goes up by 0.2, this time in a single step when the schools go back near the start of Sept (don't quote me on the date, it was just a guess). It's still not great I'm afraid. The lower R gives a more spread out peak and there is a chance of things turning out not too badly, but a lot of the trajectories still go up pretty high, with most of them exceeding the April max in daily deaths, and sustaining this for quite a while.

So...that's all a bit of a shame. There are however reasons why this may be a bit too pessimistic: it is well-known that this simple model will overestimate the total penetration of the disease as it doesn't account for heterogeneity in the population, which could make a significant difference. Also, I've kept the fatality rate at 0.75% despite advances in treatment which have definitely nudged it lower than it was at the start. On the other hand, the model does not account for loss of immunity here among people who have had the virus. Not clear if that simplification is truly valid over this time scale.

Anyway, these are not predictions, I just put in some reasonable-sounding (to me!) numbers to see what would happen. It does look like any further significant increase in R will have serious consequences.