With advances in data collection, information technology and data science in the 21st century, the prospects of targeted disease prevention and precision healthcare have become a reality in many parts of the world. The principle on which they rely is simple and obvious: namely, that the effectiveness with which we can treat any one person depends upon the quantity, quality and variety of data we can obtain from a great number of other people.

It is reasonable to think that people want to enjoy the best attainable standard of healthcare. They might also reasonably believe that achieving this depends on people like them sharing data for use in research. Yet not everyone arrives at the conclusion that they themselves should share data, and so the promise of improved, data-driven healthcare may be delayed, unevenly kept or unfulfilled. So we ask: what are the obstacles that inhibit data-sharing, and how can we move from this paradox, which holds back the promise of precision medicine, to a policy for action that all can accept?

The search for solidarity

The reasons that data-sharing may be limited in practice are partly to do with infrastructural inertia and partly to do with moral conditions. Much of the infrastructural inertia rests in the way that healthcare and biomedical research have been separated historically. These separations run through staff, institutions and governance systems.

The moral conditions have to do with considerations of privacy and public interest. Many of the data we are talking about are personal. They are personal in the sense of being about a particular individual, but also in the sense of having special value to that individual: some contain information that may be regarded as sensitive or consequential, disclosed only in the context of intimate personal relationships or professional doctor-patient relationships. Their use in biomedical research challenges these norms. Information governance is a crucial part of trustworthy systems: we should have credible assurances that our systems will manage information in the way that we agree they should. But the question of how they should do so – the norms governing data use – also needs to be rethought in response to the potential that personal data now have to advance biomedical research. The UK’s Nuffield Council on Bioethics has recommended defining a “set of morally reasonable expectations about the use of data”.

As well as enlarging our understanding of human physiology, advances in data science and data-generating technologies have changed the way we understand the relationship between an individual and the other members of the population of which they are a part.

To take one example: genomic information is, in an important way, trans-personal, both linking us to and differentiating us from others as instances of a family, a population, a species. Further, we are both linked to and differentiated from others by the environment we share, what we eat and how we live, which can affect our epigenomes as well as our state of health. It is as a consequence of these relations that our capacity to help others through sharing data arises. This in turn urges on us the consolidation of research and care: advances in care no longer need to be sought in distinct and rarefied research projects, but can be extracted from the systems in which people are already implicated.

Achieving a social contract

A system that is designed to secure the highest attainable standard of care for the public should be organised in a way to which people can freely and reasonably consent. This is the crux of the problem: people may understand the way to the highest attainable standard of care, but not all of them may consent to it.

One way of creating a bridge between the conclusion that sharing data provides the best standard of care and the policy objective of securing this care is through the idea of a “social contract”. This is the idea that a notional contract would express, in some way, the entitlements and obligations of all participants in a system of healthcare provision. Implicitly, such an arrangement would recognise obligations of data stewardship on professionals and institutions while, on the other side, recognising the obligations of individuals who enjoy the advantages of health systems and biomedicine as a public good. It could also extend to data that are locked up in private health systems and commercial clinical trials. [1]

Achieving a social contract is no small feat. In some systems, notably the US, patients’ electronic health records – de-identified and aggregated with other patient data – are bought, sold and shared at will, with little to no notice and with questionable consent from patients. This is largely because de-identified data are not considered personal health information, despite studies showing that it is fairly easy to re-identify an individual from such a dataset. The consent process often takes place at inappropriate times, when a patient has little ability to opt out of data-sharing. Consent can be a single line embedded in a long hospital admissions form, stating that the hospital owns the rights, in perpetuity, to all data generated about a patient from care, tissue collection and so on. These data can and likely will be used for commercial purposes. Many patients are starting to push back against this model. An effective social contract could address, explicitly and accountably, questions about the appropriate management of data collected in healthcare and research systems. It could also provide assurances that, for example, health information will not be used by parties outside healthcare in a way that is discriminatory or harmful to individuals (such as to raise life insurance premiums, deny housing loans or withhold credit without justification).

Ensuring justice

As the technologies of the Fourth Industrial Revolution diffuse unevenly across the globe, we are still confronted by unacceptable variations in the standard of available healthcare.

A real concern is that data-driven medicine could add to these variations by neglecting whole populations, whether through lack of opportunities for participation or through a failure to meet concerns about privacy and justice. As scale is the main driver of data-driven developments in biomedicine, there is an implicit reason to design systems to be both inclusive and interlinked. The challenge is to ensure that, just as no groups within a population should be left behind, no populations should be left behind in the global diffusion of data-driven biomedicine, since to do so would compromise both medicine and morals. [2]

Health is a universal human interest. The idea of a social contract is to make explicit the link between this interest and the expectations that, being met, can secure the highest attainable standard of health through continuing technology and service innovation.

[1] There is a range of systems that facilitate the sharing of data, including clinical trials databases (e.g. the US National Library of Medicine’s ClinicalTrials.gov) for de-identified clinical trials data and the Global Alliance for Genomics and Health, which provides interoperable standards for sharing genomic research information. Many leading journals now ask authors to confirm whether and how they plan to share data.

[2] This is something that our Global Futures Council (GFC) on Biotechnology is exploring this year.