The increasing “datafication” of our society is leading to a world where data equates to power, and the lack of transparency and accountability regarding its use raises questions of trust. Without appropriate data stewardship practices and policy frameworks, technologies such as the internet of things and broad applications of machine learning will only exacerbate the current situation, and undermine the immense promise of emerging data ecosystems.
To avoid this outcome, the public discourse needs to focus on trust – on building user-centred data ecosystems where all stakeholders employ mechanisms that respect individual preferences and empower people to decide how their data is used. All parties, from businesses to regulators, must take full responsibility for creating value- and trust-based frameworks that define, among other things, principles for appropriate uses of data. Such ecosystems are distinct from those where individuals are solely responsible for the use of all data related to them.
Using data in the appropriate context and in a way consistent with individuals’ expectations is essential to re-establishing trust in personal data ecosystems and ensuring their sustainability. Yet, despite the growing recognition of the importance of context in data usage, there is little evidence on how individuals define it across cultures, or on how it can practically be incorporated into systems and policy-making. In 2013, Microsoft and the World Economic Forum initiated a joint interdisciplinary project, involving researchers from Carnegie Mellon University, the University of California, Irvine, and the Rhode Island School of Design, to explore the development of context-aware systems that would respect individuals’ preferences and restore trust.
The Forum’s report on Trust and Context in User-Centered Data Ecosystems summarized the findings of a study of over 9,000 individuals in eight countries. It analysed how a number of factors affect what individuals perceive as acceptable use of personal data (e.g. bank account number, medical history). The context in which data is used clearly matters. What people perceive as acceptable use is nuanced, personal and based on a combination of objective and subjective factors that evolve over time and reflect differences in cultural and social norms. There are no absolutes.
Four factors stood out as consistently important: the collection method, data usage, trust in the service provider and value exchange. Their relative impacts vary across the countries studied. In general, the most favourable scenarios are those in which data collection is active (the user is aware of the data being collected), data usage is as agreed to, the service provider is well known, and the data is used to provide something of perceived value. However, clear cultural differences exist. For example, with regard to value exchange, providing a benefit to the community was perceived much more positively in China and India than in the Western countries studied.
The prominence of data collection and data usage in the acceptability criteria can perhaps be equated to the current approach of notice and consent at the data collection point. However, in a world of ubiquitous data, where collection points and what is considered personal data cannot always be clearly identified, the remaining context variables – particularly trust in the service provider and value exchange – play increasingly important roles in determining how comfortable people are with the way their data is used.
With a better understanding of context, data stewardship can be more user-centred. Technology can play a central role in enabling this more nuanced approach and, in combination with appropriate user experience design, can enable more meaningful engagement and transparency. Conceptually, “recommender systems” can implement these context models and use machine learning to improve predictions of what individuals will consider acceptable data use. These systems can work on behalf of service providers to make more personalized recommendations to their customers, or on behalf of individuals as personal assistants that negotiate with service providers for more acceptable criteria for personal data use. In either case, continuous incorporation of input from different users into the context algorithms will enable the establishment of prevailing acceptable norms on data use, as well as adaptation to the evolution of these norms in different countries. At the point of engagement with the end user, an appropriately designed user experience is necessary – one that leverages the context information to respect individual values or at least to align with commonly acceptable norms.
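To make the idea of a learning context model more concrete, the sketch below shows one way such a system might update its predictions as users respond. It is an illustration only, not the method used in the study: the binary encoding of the four factors, the choice of a small online logistic regression, the class and field names, and the feedback in the usage lines are all assumptions made up for this example.

```python
import math
from dataclasses import dataclass, field

# Hypothetical binary encoding of the four context factors named in the article.
FACTORS = ("active_collection", "usage_as_agreed", "known_provider", "perceived_value")


@dataclass
class AcceptabilityModel:
    """Tiny online logistic regression; all weights start at zero (no prior view)."""
    learning_rate: float = 0.5
    bias: float = 0.0
    weights: dict = field(default_factory=lambda: {name: 0.0 for name in FACTORS})

    def predict(self, scenario: dict) -> float:
        """Estimated probability that a person finds this data use acceptable."""
        z = self.bias + sum(self.weights[n] * float(scenario[n]) for n in FACTORS)
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, scenario: dict, accepted: bool) -> None:
        """One gradient step on log loss, driven by a single user response."""
        error = float(accepted) - self.predict(scenario)
        self.bias += self.learning_rate * error
        for n in FACTORS:
            self.weights[n] += self.learning_rate * error * float(scenario[n])


# Invented feedback for illustration; these are not results from the report.
favourable = dict.fromkeys(FACTORS, True)
unfavourable = dict.fromkeys(FACTORS, False)

model = AcceptabilityModel()
for _ in range(50):
    model.update(favourable, accepted=True)
    model.update(unfavourable, accepted=False)

print(round(model.predict(favourable), 2), round(model.predict(unfavourable), 2))
```

Because every response nudges the weights, a model of this kind can follow norms as they shift over time, and separate instances could in principle be kept per country or per individual. A real system would need far richer features and careful evaluation.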
It is evident from the report that binary approaches to data-use policies, which treat all data equally and apply universally, are neither appropriate nor flexible enough. Such approaches do not encourage meaningful interactions between individuals and service providers; instead, they invite manipulation of the rules and discourage trust. However, incorporating context-related nuances into regulations can be challenging. Technologies similar to those described here offer an alternative, providing a foundation for enabling context-aware policy frameworks that are driven by principles and outcomes.
If context-aware systems are also coupled with other technologies, such as interoperable metadata tags that contain use policies, actual usage and provenance information, user preferences can be carried throughout data ecosystems. This could also facilitate good accountability practices. However, although technology can enable good behaviour, it cannot, by itself, prevent bad behaviour. Appropriate regulations are still needed to discourage violation of use context and prevent the modification of policies without supporting evidence. Regulations can also encourage use of these technologies through incentives.
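As a rough illustration of a metadata tag that travels with the data, the sketch below bundles a value with its use policy, a provenance trail and a log of actual usage. Everything in it is hypothetical: the class names, field names and purpose strings are assumptions for this example and do not correspond to any existing standard or product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsePolicy:
    """Purposes the individual has agreed to (hypothetical purpose labels)."""
    permitted_purposes: frozenset


@dataclass
class TaggedData:
    """A data value bundled with its policy, provenance and usage history."""
    value: str
    policy: UsePolicy
    provenance: list = field(default_factory=list)  # parties the data has passed through
    usage_log: list = field(default_factory=list)   # every recorded use, allowed or not

    def record_use(self, actor: str, purpose: str) -> bool:
        """Log a use and report whether the policy permits it."""
        allowed = purpose in self.policy.permitted_purposes
        self.usage_log.append({
            "actor": actor,
            "purpose": purpose,
            "allowed": allowed,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return allowed

    def share_with(self, recipient: str) -> "TaggedData":
        """Hand the data on; the policy and history travel with the copy."""
        return TaggedData(self.value, self.policy,
                          self.provenance + [recipient], list(self.usage_log))


record = TaggedData("1990-01-01", UsePolicy(frozenset({"age_verification"})),
                    provenance=["user_device"])
record.record_use("online_shop", "age_verification")  # permitted by the policy
record.record_use("online_shop", "advertising")       # logged as not permitted
shared = record.share_with("payment_partner")
```

Note that `record_use` only logs and reports a disallowed use; it does not stop it. That mirrors the point above: a tag of this kind supports accountability, but enforcement still depends on the parties handling the data and on regulation.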
Context is key, even if it is not yet well understood. More research is needed on how context can be defined more concisely and simply, and how it can be practically integrated into systems and interface designs to create meaningful user engagement and engender trust. Such research is essential to developing effective ecosystems and policies. An interdisciplinary dialogue is needed among technologists, social scientists, industry and policy-makers. A balanced approach that incorporates these multifaceted perspectives will be required if we are to create trusted and sustainable user-centred data ecosystems.
Author: Harry Shum is Executive Vice-President, Technology and Research, at Microsoft.
Image: A robotic tape library used for mass storage of digital data is pictured at the Konrad-Zuse Centre for applied mathematics and computer science (ZIB), in Berlin August 13, 2013. REUTERS/Thomas Peter