How policymakers should use the wealth of COVID-19 data

Apr 20, 2020

A man washes his hands at a hand-washing station to prevent the spread of coronavirus disease (COVID-19) in Jakarta, Indonesia, March 26, 2020.

Hand washing and social distancing aim to reduce the average number of new infections.

Image: REUTERS/Ajeng Dinar Ulfiana

Rees Kassen

Professor of Evolutionary Biology, McGill University

New technology makes it possible to gather data about coronavirus in near real-time.
Rapid data generation increases risk around the accuracy and reliability of the data.
Decision-makers will maintain trust by adapting policies as new data emerges.

The world has not known, in living memory, a pandemic on the scale of COVID-19. Nor has it ever had access to so much data and analysis on a virus, in this case SARS-CoV-2, much of it generated rapidly and disseminated freely. Navigating a path out of this crisis will require integrating this data effectively into decision-making.

This is not an easy task at the best of times. It is even harder now because the virus causing the pandemic, SARS-CoV-2, is new to humans, having crossed the species barrier from bats. As little as four months ago, we could not answer even the most basic questions about the virus and the disease it causes: how transmissible it is, how virulent (damaging) the disease is to our bodies, and whether we can mount an effective immune response. We are learning as we go.

We’ve been here before, most recently with the coronaviruses that caused Middle East Respiratory Syndrome (MERS) in 2012 and, before that, SARS in 2002. When those diseases were first observed in humans, we knew just as little about them as we did about COVID-19 at the start of this pandemic. The difference between then and now is how fast we are learning the basic biology of the virus and the disease it causes, and how we are navigating the uncertainties along the way.

New technologies for rapid data generation and dissemination are making it possible to gather and analyze data about the virus in near real-time. Never before have we seen this much data generated and shared so quickly, sometimes at the cost of more uncertainty than we would like. But the speed and scale with which this virus spreads and evolves mean that never before have so many needed this data so urgently.

Consider the beginning of the pandemic in January, as the virus started to spread across the world from its origins in Wuhan, China. With no vaccine available, the only public health tool available is containment, and doing this effectively requires knowing where the virus is and how fast it is spreading from person to person. Rapid diagnostics and widespread testing to find cases and trace their contacts at a regional level are key here, as the success of early programmes in places like Taiwan, Singapore, and South Korea demonstrates.

Delay, even by as little as a few days, can be disastrous, as Italy was the first to learn. Looking ahead, new tools for point-of-care diagnosis that can go from swab to signal in less than an hour will be indispensable in expanding the scale and scope of testing.

Getting the numbers right is a real challenge. Right now, the best estimates are that a single case of SARS-CoV-2 infects, on average, 2.5 additional people, a quantity known as the basic reproduction number, R0. But uncertainty around the transmission dynamics of the disease means this number can vary substantially and, along with it, the expected total number of deaths. The predictions from epidemiological models are only as good as the data we feed into them. Policymakers must be willing to live with uncertainty in the predictions, and adjust their recommendations accordingly.

Expected number of global deaths for three levels of R0, based on data from Imperial College. Image: Imperial College
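To see why uncertainty in the average number of new infections per case matters so much, a minimal sketch may help. It uses the textbook SIR final-size relation, z = 1 - exp(-R0 * z), which is an assumption made here for illustration and is not the Imperial College model. The function name and the R0 values other than 2.5 are hypothetical.

```python
import math


def final_attack_rate(r0: float, tol: float = 1e-12, max_iter: int = 10_000) -> float:
    """Fraction of a population eventually infected in a simple, unmitigated SIR epidemic.

    Solves z = 1 - exp(-r0 * z) by fixed-point iteration.
    """
    if r0 <= 1.0:
        return 0.0  # at or below the threshold, a large outbreak does not take off
    z = 0.5  # any starting guess in (0, 1] converges to the positive root
    for _ in range(max_iter):
        z_next = 1.0 - math.exp(-r0 * z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z


for r0 in (2.0, 2.5, 3.0):
    print(f"R0 = {r0:.1f}: about {final_attack_rate(r0):.0%} of the population infected")
```

Under these simplifying assumptions, moving R0 from 2.0 to 3.0 shifts the share of the population eventually infected from roughly 80% to roughly 94%, which is why a modest change in the estimate translates into a large change in projected deaths.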

The objective of public health measures like hand washing, social distancing, and quarantine is to reduce the average number of new infections per case as far as possible, ideally to below 1. At that level, each case gives rise to fewer than one new case on average, so the outbreak shrinks and the virus can be contained. Indeed, this is the lens through which all decisions should be made right now, in the thick of the pandemic. Ahead, hard choices will have to be made, as public health is weighed against the impacts on the economy, personal liberty, and public trust.
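The threshold of 1 can be illustrated with a few lines of arithmetic. The sketch below assumes, purely for illustration, that every case infects exactly the average number of people in each generation of transmission. The function names and the value 0.8 are hypothetical, while 2.5 is the estimate quoted earlier.

```python
def expected_cases_in_generation(r: float, generation: int, seed_cases: float = 1.0) -> float:
    """Expected new cases in a given generation if each case infects r others on average."""
    return seed_cases * r ** generation


def reduction_needed(r0: float) -> float:
    """Fraction by which transmission must fall to bring the average down to exactly 1."""
    return max(0.0, 1.0 - 1.0 / r0)


for r in (2.5, 0.8):
    cases = expected_cases_in_generation(r, generation=10)
    print(f"average of {r} new infections per case: {cases:,.1f} expected cases in generation 10")

print(f"reduction in transmission needed when the average is 2.5: {reduction_needed(2.5):.0%}")
```

With an average of 2.5, a single case grows to roughly 9,500 new cases by the tenth generation; with 0.8, the chain dwindles to about a tenth of a case. Bringing 2.5 down to 1 requires cutting transmission by about 60%.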

This rapidly changing landscape presents an additional challenge to decision-makers. What appears to be true one day may not be true the next, as more evidence comes to light. Initial estimates of the mortality rate due to COVID-19 are a case in point. The case fatality rate (CFR; the proportion of confirmed infections resulting in death) was first reported in January to be as high as 15%, but more and better information has seen a steady revision of this number down to about 1%. Decision-makers need to be ready to revise their recommendations in light of new information, and be ready to explain those changes openly and honestly to the public.
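A revision of this kind follows from how the CFR is calculated: deaths divided by confirmed cases. Early in an outbreak, testing tends to find mainly the most severe cases, so the denominator is too small. The counts below are invented for illustration and are not real surveillance data; the function name is hypothetical.

```python
def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Proportion of confirmed cases that ended in death."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return deaths / confirmed_cases


# Invented numbers: the same 15 deaths, before and after wider testing
# uncovers many milder infections.
early = case_fatality_rate(deaths=15, confirmed_cases=100)
later = case_fatality_rate(deaths=15, confirmed_cases=1_500)
print(f"early estimate: {early:.0%}, later estimate: {later:.0%}")
```

The number of deaths has not changed between the two estimates; only the count of known infections has, which is enough to move the figure from 15% to 1%.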

The sheer rate at which data is being generated presents a unique opportunity. Take the growth in genomic data as an example. The sequence of the original Wuhan strain was uploaded in mid-December 2019. As of this writing, there are just shy of 4,000 SARS-CoV-2 genome sequences available for analysis. This is a rate of growth in data that is, quite simply, staggering. The virus evolves rapidly, accumulating mutations at the rate of one every two weeks or so. Managing and making sense of the data is helped by public repositories like GISAID.org, while analysis platforms like nextstrain.org allow near real-time tracking of viral evolution and spread.

Genomic epidemiology of novel coronavirus - Global subsampling

Real-time tracking of the spread of coronavirus. Image: Nextstrain; GISAID
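The steady accumulation of mutations is what lets genomic data act as a rough clock. A minimal sketch, assuming a constant rate of one mutation every 14 days (the "two weeks or so" mentioned above) and treating the function name and the example dates as hypothetical:

```python
from datetime import date


def expected_mutations(first_sample: date, second_sample: date, days_per_mutation: float = 14.0) -> float:
    """Expected mutations accumulated along one lineage between two sampling dates."""
    elapsed_days = (second_sample - first_sample).days
    if elapsed_days < 0:
        raise ValueError("second_sample must not precede first_sample")
    return elapsed_days / days_per_mutation


# Hypothetical pair of samples collected 12 weeks apart.
print(expected_mutations(date(2020, 1, 20), date(2020, 4, 13)))
```

Real tracking platforms fit far richer statistical models, but the same idea underlies them: the more mutations separate two genomes, the longer ago they probably shared an ancestor.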

This climate of rapid data generation and widespread sharing brings additional risk to the accuracy and reliability of the data. The usually slow, deliberative process for evaluating results has accelerated, increasing the possibility of mistakes. The risks are lessened somewhat by a more informal and open peer review process playing out in online fora such as virological.org and even Twitter, and on pre-print servers (which host academic papers before they are formally peer reviewed and published) like medRxiv and bioRxiv.

Having multiple groups work on the same problem also helps. If all arrive at the same answer while working independently and using slightly different approaches, we can be fairly certain the result is robust. Policymakers would do well to remember the advice of the late biologist Richard Levins on the use of mathematical models: “our truth is the intersection of independent lies.” The same is true for navigating the wealth of scientific literature.
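Levins's advice can be sketched as an overlap of ranges. Assuming each independent group reports a plausible interval for the same quantity, the region on which all of them agree is the part to trust most. The intervals below are invented, and the function name is hypothetical.

```python
from typing import Optional, Sequence, Tuple

Interval = Tuple[float, float]


def common_range(intervals: Sequence[Interval]) -> Optional[Interval]:
    """Return the range shared by every interval, or None if they do not all overlap."""
    if not intervals:
        return None
    low = max(lo for lo, _ in intervals)
    high = min(hi for _, hi in intervals)
    return (low, high) if low <= high else None


# Invented estimates of the same quantity from three independent groups.
estimates = [(2.0, 3.1), (2.2, 2.9), (1.8, 2.7)]
print(common_range(estimates))
```

When the function returns None, the groups disagree outright, which is itself a useful signal that the question is not yet settled.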

SARS-CoV-2 jumped the species boundary into humans and is galloping from country to country along the richly tangled web of global connections that we have woven. This is a problem of our own creation. Fortunately, we now have more effective levers to bring it under control. New tools for rapid data collection and analysis make it easier to put the right type of evidence into decision-makers’ hands.

This is not enough. A strategic forum to establish a harmonized global approach would help, as would embedding epidemiologists into policy shops where they haven’t traditionally been located, like urban planning departments in cities. Most importantly, decision-makers need to maintain public trust. This starts by listening to the science, adapting policies as new data comes to light, and explaining the changes clearly to the public.

License and Republishing

World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

The views expressed in this article are those of the author alone and not the World Economic Forum.