Recently the news has been full of stories about the potential for “big data” to redefine the ways in which businesses, governments and individuals relate to one another. While there is no lack of controversy over the use of big data and the privacy concerns it raises, there is also the question of how big data can be used for the public good. What are the opportunities for the average person? For the poor? For the billions of individuals living in extreme poverty who – through the use of mobile phones – are now members of the networked global economy?
Everywhere in the world and in every business sector we all contribute to creating freely available common goods – public roads and schools, water and sewage systems, police and judges – that allow children to grow, businesses to thrive and the quality of life to improve. Now that personal data is becoming so valuable, it makes sense to ask if we can make a data commons that helps individuals, businesses and government.
In early May of 2013 we saw the public unveiling of what is perhaps the world’s first true data commons, with more than 150 research organizations from around the world reporting results from analysis of data describing the mobility and call patterns of citizens of Ivory Coast. This aggregated anonymous data was donated by the mobile carrier Orange, with help from the University of Louvain (Belgium) and MIT (US), in collaboration with Bouake University (Ivory Coast), the United Nations Global Pulse, the World Economic Forum and the GSMA.
Highlights of this unveiling include an analysis by IBM’s Dublin laboratory of the public transportation system. The results showed that for very little cost, the average commute time in Abidjan, the biggest city in Ivory Coast, could be cut by 10%. Other highlights included analyses of disease spread conducted by groups from Novi Sad University (Serbia), EPFL (Switzerland) and the University of Birmingham (UK), which showed that small changes in the public health system could potentially cut the spread of flu by 20% as well as significantly reduce the spread of HIV and malaria. Other research groups used the data to demonstrate the potential for improvements in government, commerce, agriculture and finance.
These research results have demonstrated the great potential for such a data commons to improve society. From the point of view of Orange, they also demonstrate the potential for new lines of business that combine this shared resource with your personal data: imagine a phone app that advises you about which bus will get you to work quickest, or how to reduce your risk of catching the flu.
The work of these 150 research groups also suggests that many of the privacy concerns associated with the release of this type of human behaviour data can be safely addressed. In this project the data was anonymized and processed by advanced computer algorithms, making it unlikely that any individual would be re-identified. No path to re-identification was discovered, even though several of the research groups studied this specific question. In addition, while the data was freely available for any legitimate research, it was distributed and administered under a legal contract that specified that it could only be used for the purpose proposed and only by the specific people making the proposal.
The use of both advanced computer algorithms and contract law to specify and audit how personal data may be used and shared is the goal of new privacy regulations in the EU, US and elsewhere. Data about human behaviour, such as census data, has always been essential for both government and industry. In this new era of big data, we must make sure that a digital data commons is freely available and at the same time protect the privacy and safety of the individuals whose lives are reflected in that data commons. Indeed, we need a “new deal on data”, where individuals can understand what their information is used for, and the benefits and risks of that use, so that they can choose how data will be shared both individually and collectively through government.
Author: Alex “Sandy” Pentland is Toshiba Professor of Media Arts and Sciences at the Massachusetts Institute of Technology (MIT) and the World Economic Forum’s lead academic for its Big Data and Personal Data Initiatives.
Image: Students from an underprivileged background use a computer at a school in Islamabad REUTERS/Zohra Bensemra