• The COVID-19 pandemic has demonstrated that sharing data can improve clinical outcomes for patients, but few healthcare organizations are participating in data-sharing initiatives.
• A federated data system can enable sensitive data sharing across borders and ensure data security, patient privacy and data interoperability.
• The World Economic Forum, Australian Genomics and Genomics4RD have developed a genomic data consortium governance model to drive innovation and mitigate potential risks for rare disease.

Data is central to offering an innovative, personalized approach to healthcare. Personalized medicine, or precision medicine, is a growing field that aims to apply scientific processes, technology and evidence to optimize prevention, diagnosis, treatment and wider disease management for individual patients.

COVID-19 has demonstrated that access to data has life-or-death consequences in healthcare delivery. Data is the lifeblood of healthcare: either you can test, treat and trace the virus, or you succumb to the will of an “invisible enemy”, a term coined by David Nott. The hospitals and healthcare systems set up to share data quickly and efficiently have been able to adapt to new, expanding flows of information on the coronavirus, while those ill-equipped to participate in data-sharing efforts have struggled to flatten the curve.

Yet there is a gap between recognizing the value of data collaboration and actually designing, establishing and administering a data consortium, and that gap divides the global healthcare ecosystem. Many want to share data to improve clinical outcomes for patients, but far fewer are actively participating in data-sharing initiatives.

Genomic data exemplifies that health data can have value beyond the rationale for its initial collection; genomic data at scale can provide answers to how we prevent, diagnose and treat disease. Genomic data in isolation isn’t incredibly insightful, but larger datasets linked to de-identified clinical health records and phenotypic data can serve as a treasure trove of information on our most complex and destructive diseases.

As the global health community continues to understand the benefits of data collaboration and the value of participating in a data consortium, how can we advance data interoperability in genomics and other types of sensitive health data and globally deliver a more personalized approach to healthcare?

The technical side of sharing sensitive data

The biggest perceived challenge to creating a data consortium is how to build and use the technology that enables such data sharing and interoperability.

Yet the technology solution is already evident in a range of proof-of-concept examples led by the Global Alliance for Genomics and Health and other international collaborations: a federated data system.

In fact, the problem of developing and delivering a federated data system and its associated management system has been solved before in other applications. The global network of genomic data established and managed by ELIXIR is an example of how aligned organizations with common interests can relatively quickly build a functional federated data system for life sciences data.

Challenges such as managing the security of the query “in flight” to the data system, and similarly securing the returned result, require attention but can be solved. The scale of genomic data storage presents a formidable challenge: annual storage requirements are predicted to reach 2–40 EB (1 EB = 10^18 bytes), a volume commonly compared to multiples of the total global astronomy data or the universe of YouTube data. Multiple technologies ranging from novel storage media to compression standards should support this need.
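One part of protecting a query “in flight” and its returned result can be sketched as message authentication: the sender attaches a keyed tag and the receiver rejects anything whose tag does not match. The sketch below is a minimal illustration, not a description of any consortium's actual protocol. The function names, the message layout and the idea of a pre-shared key are all assumptions, and it covers integrity only; confidentiality would still need an encrypted channel.

```python
import hashlib
import hmac
import json


def sign_message(payload: dict, key: bytes) -> dict:
    """Serialize a query or result deterministically and attach an HMAC tag."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_message(message: dict, key: bytes) -> dict:
    """Recompute the tag and return the payload only if it matches."""
    expected = hmac.new(
        key, message["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if not hmac.compare_digest(expected, message["tag"]):
        raise ValueError("message failed integrity check")
    return json.loads(message["body"])


if __name__ == "__main__":
    shared_key = b"hypothetical-pre-shared-key"  # assumption: agreed out of band
    query = {"op": "count", "variant": "GENE1:c.100A>G"}  # made-up variant label
    envelope = sign_message(query, shared_key)
    print(verify_message(envelope, shared_key))
```

The same pair of functions can wrap the response on its way back, so that both directions of the exchange are checked in the same way.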

Where the challenge does become more formidable is the real need to link clinical records to genomic data. A genomic variant (a variation in an individual’s DNA) coupled with a longitudinal health record is vastly more informative than either piece of data alone. Data ontologies and standards will likely become universal over the coming years, but the vast investment in legacy data systems such as Electronic Health Records (EHRs) will make it very difficult to migrate such pre-existing data to a standardized, machine-readable format. Solving this challenge, however, is possible by aligning the needs of organizations and starting with a specific use case as the first shared dataset.
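To make the linkage idea concrete, the sketch below joins variant calls to a longitudinal list of clinical encounters under a keyed pseudonym, so the linked record carries no raw patient identifier. It is illustrative only: the field names (`patient_id`, `variant`, `date`, `code`), the keyed-hash pseudonym scheme and the sample values are assumptions, not a format used by any of the organizations named here.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret: bytes) -> str:
    """Derive a stable pseudonym from an identifier using a keyed hash."""
    digest = hmac.new(secret, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def link_records(variants, encounters, secret: bytes) -> dict:
    """Group variants and dated encounters under one pseudonym per patient."""
    linked = {}
    for row in variants:
        key = pseudonym(row["patient_id"], secret)
        entry = linked.setdefault(key, {"variants": [], "timeline": []})
        entry["variants"].append(row["variant"])
    for row in encounters:
        key = pseudonym(row["patient_id"], secret)
        entry = linked.setdefault(key, {"variants": [], "timeline": []})
        entry["timeline"].append((row["date"], row["code"]))
    for entry in linked.values():
        entry["timeline"].sort()  # ISO dates sort chronologically as strings
    return linked


if __name__ == "__main__":
    secret = b"hypothetical-site-secret"  # assumption: held only by the data custodian
    variants = [{"patient_id": "P001", "variant": "GENE1:c.100A>G"}]  # made-up data
    encounters = [
        {"patient_id": "P001", "date": "2021-03-01", "code": "follow-up"},
        {"patient_id": "P001", "date": "2020-01-15", "code": "first-visit"},
    ]
    for key, entry in link_records(variants, encounters, secret).items():
        print(key, entry)
```

The hard part described above, mapping legacy EHR exports into rows this regular, is exactly what the sketch takes for granted.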

Health and healthcare

How is the World Economic Forum bringing data-driven healthcare to life?

The application of “precision medicine” to save and improve lives relies on good-quality, easily accessible data on everything from our DNA to lifestyle and environmental factors. The opposite of a one-size-fits-all healthcare system, it has vast, untapped potential to transform the treatment and prediction of rare diseases, and of disease in general.

But there is no global governance framework for such data and no common data portal. This is a problem that contributes to the premature deaths of hundreds of millions of rare-disease patients worldwide.

The World Economic Forum’s Breaking Barriers to Health Data Governance initiative is focused on creating, testing and growing a framework to support effective and responsible access – across borders – to sensitive health data for the treatment and diagnosis of rare diseases.

The data will be shared via a “federated data system”: a decentralized approach that allows different institutions to access each other’s data without that data ever leaving the organization it originated from. This is done via an application programming interface and strikes a balance between simply pooling data (posing security concerns) and limiting access completely.
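The federated pattern described above can be sketched in a few lines: each institution answers a question locally and returns only an aggregate, and a coordinator combines the answers. This is a toy model under stated assumptions, not the consortium's API. The class and function names, the small-count suppression threshold (`min_cohort`) and the sample variant label are all hypothetical, and plain in-process calls stand in for network requests.

```python
from dataclasses import dataclass, field


@dataclass
class SiteNode:
    """One institution. Its patient-level rows stay inside this object."""
    name: str
    records: list = field(default_factory=list, repr=False)

    def count_carriers(self, variant: str, min_cohort: int = 5) -> int:
        """Answer locally and return only a count, never the rows.

        Counts below `min_cohort` are reported as 0 so that very small
        groups are not exposed (a hypothetical disclosure rule).
        """
        n = sum(1 for row in self.records if variant in row["variants"])
        return n if n >= min_cohort else 0


def federated_count(sites, variant: str, min_cohort: int = 5):
    """Send the same query to every site and combine the aggregate answers."""
    per_site = {s.name: s.count_carriers(variant, min_cohort) for s in sites}
    return per_site, sum(per_site.values())


def make_rows(n_carriers: int, n_others: int, variant: str) -> list:
    """Build synthetic rows for the demonstration; not real patient data."""
    rows = [{"variants": {variant}} for _ in range(n_carriers)]
    rows += [{"variants": set()} for _ in range(n_others)]
    return rows


if __name__ == "__main__":
    variant = "GENE1:c.100A>G"  # made-up variant label
    sites = [
        SiteNode("site_a", make_rows(7, 20, variant)),
        SiteNode("site_b", make_rows(2, 30, variant)),
        SiteNode("site_c", make_rows(11, 5, variant)),
    ]
    print(federated_count(sites, variant))
```

The design choice worth noticing is that the coordinator never receives a record, only counts, which is what separates this approach from pooling the data in one place.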

The project is a collaboration between entities in the UK (Genomics England), Australia (Australian Genomics Health Alliance), Canada (Genomics4RD), and the US (Intermountain Healthcare).

Why collaborate and share data as an independent health institution?

The Australian Genomics Health Alliance (Australian Genomics) is an example of a national-scale genomic data consortium that has driven considerable value for partner organizations, collaborators, the health system and individuals and families participating in the research. It operates as a collaborative network of clinicians, researchers and academics dedicated to sharing data to advance scientific knowledge, improve patient outcomes and inform policy.

The clinical, phenotypic and genomic data generated through Australian Genomics research is managed in scalable, standardized systems established to facilitate data sharing. The tools and guidelines implemented by the collaboration are designed to empower Australians to contribute to data sharing, if they choose.

With 25 million people in Australia, international collaboration and data sharing are critical to increase the power and maximize the value of the country’s health and genomic datasets – particularly in the context of rare diseases.

The formation of international genomic data consortia is a compelling means to achieve this. With trusted partners, defined governance and agreed standards and approaches, a data consortium delivers the value of data sharing, mitigates security risks and provides the opportunity to learn from other genomic initiatives globally.

How to create a trusted governance model

The World Economic Forum, Australian Genomics and Genomics4RD in Canada set out to better understand how to provide a clear governance model to drive global innovation via federated data access while still mitigating potential privacy or security risks.

The result is an 8-Step Guide to Sharing Sensitive Health Data in a Federated Data Consortium Model. From finding trustworthy partners (Step 1) to determining a common problem where federating data is beneficial (Step 2), to aligning on incentives and capacities (Step 3), to identifying resourcing (Step 4), to designing and deploying a governance model (Steps 5 and 6), to structuring data (Step 7), to deploying the API technology (Step 8), creating a new health data consortium requires a custom process and extensive dialogue to ensure success and long-term viability.

Eight steps to follow to build a federated data consortium
Image: World Economic Forum

Accessing large volumes of pre-collected health data for further analysis beyond a single country’s border will unlock new innovations and discoveries, but also produce new risks to patient privacy and data security. While it is impossible to implement a single policy safeguarding against all potential hazards of participating in a health data consortium, it is possible to create a consortium governance model by following a calculated development process. Creating a governance model will foster a cohesive, symbiotic relationship between institutions with otherwise differing models of consent, operations, security and technology. It is possible to optimize for the best outcomes in both policy and implementation.

Beyond technical frameworks, operational, legal and ethical frameworks for genomic data sharing will allow for transformation in healthcare delivery, enriched reference databases representative of all ethnicities, large genomic data resources available for clinical deliberation and scientific discovery, and an informed, empowered patient community.