COVID-19: Mathematical models in the face of uncertainty

A study recently released by Imperial College London has been heavily reported in the media; the authors estimate that in the absence of mitigation, COVID-19 could claim the lives of 510,000 people in Great Britain and 2.2 million people in the United States by the end of August [1]. A collaborative study overseen by Columbia University that incorporated very different assumptions was coincidentally released the same day (March 16th) [2]. The two studies offer a valuable opportunity to learn about the limitations of mathematical models of novel phenomena: here, a virus never before seen in humans, historically unprecedented mitigation efforts underway around the world, and an unknown total number of infections (and thus an unknown true overall fatality rate). Their disparate assumptions draw on premature, incomplete, and uncertain data.

For example, from the Imperial study: “We assume that symptomatic individuals are 50% more infectious than asymptomatic individuals. … Analyses of data from China as well as data from those returning on repatriation flights suggest that 40-50% of infections were not identified as cases.”

Compare this with the Columbia study: “This estimate reveals a very high rate of undocumented infections: 86%. This finding is independently corroborated by the infection rate among foreign nationals evacuated from Wuhan (see supplementary materials). These undocumented infections are estimated to have been half as contagious per individual as reported infections (μ = 0.55; 95% CI: 0.46–0.62).”

In other words, the Imperial study estimates that 40-50% of infections are undocumented, while the Columbia study estimates that 86% are undocumented. Furthermore, the Imperial study assumes that symptomatic individuals are 1.5 times as infectious as asymptomatic individuals, while the Columbia study estimates that documented infections are roughly 1.8 times as contagious as undocumented ones (1/0.55). To put it mildly, these are major discrepancies.

The Imperial study estimates an infection fatality rate (IFR) of 0.9% (95% CI 0.4%-1.4%); that is, the percentage of deaths among all those who are infected, not only the subset detected, tested, and found to be positive, which is the basis of the case fatality rate (CFR). It’s worth noting that if 86% of infections are undocumented (as the Columbia study estimates), and if deaths occur almost entirely among documented cases, then the infection fatality rate will be (1-0.86)*CFR, or about one seventh of the CFR. As of this blog post, the global CFR is 4.1%; assuming that 86% of infections are undocumented implies that the overall infection fatality rate is approximately 0.57%.
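The arithmetic above can be sketched in a few lines of Python. This is only an illustration, not a method taken from either study: the function name `implied_ifr` is hypothetical, and the calculation assumes that every recorded death occurred among documented cases, so that undocumented infections enlarge the denominator only.

```python
def implied_ifr(cfr: float, undocumented_fraction: float) -> float:
    """Scale a case fatality rate down to an implied infection fatality rate.

    Illustrative assumption: all recorded deaths come from documented
    cases, so IFR = CFR * (1 - undocumented fraction).
    """
    if not 0.0 <= undocumented_fraction < 1.0:
        raise ValueError("undocumented_fraction must be in [0, 1)")
    return cfr * (1.0 - undocumented_fraction)


global_cfr = 0.041  # global CFR quoted in the text (4.1%)

# Columbia estimate: 86% of infections undocumented
ifr_columbia = implied_ifr(global_cfr, 0.86)

# Imperial figure: 40-50% of infections not identified as cases
ifr_imperial_low = implied_ifr(global_cfr, 0.50)
ifr_imperial_high = implied_ifr(global_cfr, 0.40)

print(f"86% undocumented:    implied IFR {ifr_columbia:.2%}")
print(f"40-50% undocumented: implied IFR {ifr_imperial_low:.2%} to {ifr_imperial_high:.2%}")
```

Applying the same global CFR to the two undocumented fractions gives very different implied IFRs, which shows how sensitive the result is to that single assumption.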

The Imperial study’s IFR estimate of 0.9% is difficult to reconcile with the disparate case fatality rates of countries with confirmed infections. Let’s consider case fatality rates for countries with greater than 1,000 confirmed infections as of this blog post (March 18, 2020):

Country (ordered from fewest to most lab-confirmed infections, all above 1,000) and case fatality rate (to the nearest tenth of a percent) [3]:
Denmark 0.4%
Sweden 0.8%
Belgium 0.9%
Norway 0.4%
Austria 0.2%
Netherlands 2.8%
United Kingdom 4.0%
Switzerland 1.0%
South Korea 1.0%
France 2.9%
United States 1.6%
Germany 0.2%
Spain 4.0%
Iran 6.5%
Italy 8.3%
China (mainland) 4.0%

There are a few striking details here. For one, South Korea has a CFR that is one fourth that of the UK despite the fact that it has more than three times as many confirmed cases. Germany’s CFR is one twentieth that of the UK, with nearly five times as many confirmed cases. While the three countries with the highest number of confirmed cases also have the highest CFRs, looking through the other countries, it would seem impossible to accurately predict how one country’s CFR will compare to another’s based on any data we currently have. Despite this high variability, uncertainty, and complexity, the Imperial College London paper chose to rely on “a subset of cases from China.”
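The ratios quoted above can be checked directly against the table. The following is a small sketch: the CFR values are copied from the table as of March 18, 2020, while the dictionary and the helper `cfr_ratio` are names introduced here purely for illustration.

```python
# Case fatality rates (percent) copied from the table above, in the same order
cfr_percent = {
    "Denmark": 0.4, "Sweden": 0.8, "Belgium": 0.9, "Norway": 0.4,
    "Austria": 0.2, "Netherlands": 2.8, "United Kingdom": 4.0,
    "Switzerland": 1.0, "South Korea": 1.0, "France": 2.9,
    "United States": 1.6, "Germany": 0.2, "Spain": 4.0, "Iran": 6.5,
    "Italy": 8.3, "China (mainland)": 4.0,
}


def cfr_ratio(country: str, reference: str) -> float:
    """Return the CFR of `country` as a fraction of the CFR of `reference`."""
    return cfr_percent[country] / cfr_percent[reference]


print(f"South Korea / UK: {cfr_ratio('South Korea', 'United Kingdom'):.2f}")  # 0.25
print(f"Germany / UK:     {cfr_ratio('Germany', 'United Kingdom'):.2f}")      # 0.05

# Spread between the highest and lowest CFR in the table
highest = max(cfr_percent.values())
lowest = min(cfr_percent.values())
print(f"Highest CFR is {highest / lowest:.1f} times the lowest")
```

The last line makes the variability concrete: the highest CFR in the table (Italy) is more than forty times the lowest (Austria and Germany).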

“The age-stratified proportion of infections that require hospitalisation and the infection fatality ratio (IFR) were obtained from an analysis of a subset of cases from China. These estimates were corrected for non-uniform attack rates by age and when applied to the GB population result in an IFR of 0.9% with 4.4% of infections hospitalised.”

Where did the authors get this subset? They cite a non-peer-reviewed preprint on medRxiv [4] by the same lead author (Neil Ferguson). It’s worth noting that this preprint estimated an overall IFR for China that’s lower than the Imperial study’s estimate for Great Britain:
“We obtain an overall IFR estimate for China of 0.66% (0.39%,1.33%), again with an increasing profile with age.”

So the Imperial study extrapolated from a study that extrapolated from a subset of cases from China (both by the same lead author), and applied these numbers to the populations of Great Britain and the US.

The assumptions that were fed into this model could have been applied to any country at the beginning of its respective outbreak. Keep in mind that Germany’s case fatality rate of 0.2% is lower than the lower bound (0.4%) of the Imperial study’s 95% confidence interval for the infection fatality rate. As pointed out by the World Health Organization, the infection fatality rate for any country is necessarily lower than the case fatality rate.

Furthermore, the Imperial study forecasts that COVID-19 illnesses in Great Britain and the US will either peak in June if no action is taken, or from the end of June through the middle of July in the best case mitigation scenario.

This sparks the question of whether or not SARS-CoV-2 transmission will be affected by the transition to summer. Despite apparent skepticism from public health officials, as far as I know, all endemic human coronaviruses are seasonal, with transmission rates reliably declining during the summer in temperate climates (like in the US and UK). Another fact strenuously downplayed by some experts is that the original SARS coronavirus epidemic in 2003 ended just as summer was beginning. While the end of the SARS epidemic was undoubtedly facilitated by successful containment efforts and features of the infection that made it easier to contain than SARS-CoV-2 or influenza, the timing is noteworthy. The first day of summer in the northern hemisphere in 2003 was June 21st.

SARS epidemic curve from the World Health Organization.

It should go without saying that influenza virus transmission is seasonal. Note the regular seasonal variation in pneumonia and flu mortality in the figure below. Also note that both the 2016-17 and 2017-18 flu seasons saw flu-associated deaths at epidemic levels. CDC estimates that approximately 61,000 people died in the US alone due to influenza in the 2017-18 flu season, including over 600 deaths among those below the age of 17 [5]. Despite the magnitude of the impact of influenza in the 2017-18 season, I don’t recall much if any coverage at all in the mainstream media (and certainly not in prime time cable news).

Pneumonia and Influenza (P&I) Mortality Surveillance, CDC and National Center for Health Statistics (NCHS).

The expected seasonality of COVID-19 transmission is further corroborated by a recently published analysis available on SSRN [6].

Quoting from the abstract:
“…we find, under a linear regression framework for 100 Chinese cities, high temperature and high relative humidity significantly reduce the transmission of COVID-19, respectively, even after controlling for population density and GDP per capita of cities. One degree Celsius increase in temperature and one percent increase in relative humidity lower R by 0.0383 and 0.0224, respectively. This result is consistent with the fact that the high temperature and high humidity significantly reduce the transmission of influenza. It indicates that the arrival of summer and rainy season in the northern hemisphere can effectively reduce the transmission of the COVID-19.”
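To make the quoted coefficients concrete, here is a minimal sketch of the linear relationship described in the abstract. Only the two coefficients come from the quoted text; the function name `adjusted_r`, the baseline R, and the temperature and humidity changes are hypothetical inputs chosen for illustration. Extrapolating the linear fit beyond the conditions sampled in the 100 Chinese cities is itself an assumption.

```python
# Coefficients quoted in the abstract: reduction in R per unit increase
R_DROP_PER_DEG_C = 0.0383   # per one degree Celsius
R_DROP_PER_PCT_RH = 0.0224  # per one percent relative humidity


def adjusted_r(baseline_r: float, delta_temp_c: float, delta_rh_pct: float) -> float:
    """Apply the quoted linear coefficients to a baseline reproduction number."""
    reduction = R_DROP_PER_DEG_C * delta_temp_c + R_DROP_PER_PCT_RH * delta_rh_pct
    return baseline_r - reduction


# Hypothetical scenario (not from the source): baseline R of 2.0,
# a 10 degree Celsius rise in temperature, a 10 point rise in relative humidity
baseline_r = 2.0
summer_r = adjusted_r(baseline_r, delta_temp_c=10.0, delta_rh_pct=10.0)

print(f"Baseline R: {baseline_r:.2f}")
print(f"Adjusted R: {summer_r:.2f}")  # 2.0 - (0.383 + 0.224) = 1.393
```

The output depends entirely on the made-up baseline and seasonal changes, so it should be read as a demonstration of the arithmetic rather than as a forecast.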

To the best of my knowledge, there is no case in recent history (and possibly all of history) where the number of illnesses and deaths due to a viral pandemic peaked in July in a temperate climate; not with SARS, not with any other coronavirus, not with influenza, not with any viral pandemic that I know of. If I’m wrong, please feel free to contact me and let me know.

Mortality Distributions and Timing of Waves of Previous Influenza Pandemics. The shaded columns indicate normal seasonal patterns of influenza [7].

The bottom line is that a model is only as good as its assumptions, and those assumptions are only as good as the data. The discrepancies shown here reveal that the data regarding COVID-19 are premature, incomplete, and uncertain.

[1] Ferguson N, Laydon D, Nedjati Gilani G, et al. Report 9: Impact of Non-Pharmaceutical Interventions (NPIs) to Reduce COVID19 Mortality and Healthcare Demand. Imperial College London; 2020. doi:10.25561/77482

[2] Li R, Pei S, Chen B, et al. Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2). Science. March 2020:eabb3221. doi:10.1126/science.abb3221

[3] 2019–20 coronavirus pandemic data. Wikipedia. Retrieved 18 March 2020.

[4] Ferguson N, Verity R, Okell LC, Dorigatti I, et al. Estimates of the severity of COVID-19 disease. medRxiv. 2020:2020.03.09.20033357. doi:10.1101/2020.03.09.20033357

[5] CDC: 2017-2018 Estimated Influenza Illnesses, Medical visits, Hospitalizations, and Deaths and Estimated Influenza Illnesses, Medical visits, Hospitalizations, and Deaths Averted by Vaccination in the United States. Retrieved 18 March 2020.

[6] Wang, J., Tang, K., Feng, K., & Lv, W. (2020). High Temperature and High Humidity Reduce the Transmission of COVID-19. SSRN Electronic Journal. doi: 10.2139/ssrn.3551767

[7] Miller MA, Viboud C, Balinska M, Simonsen L. The Signature Features of Influenza Pandemics — Implications for Policy. New England Journal of Medicine. 2009;360(25):2595-2598. doi:10.1056/nejmp0903906