How AI can develop its own biases

Sep 12, 2018

This article is published in collaboration with Futurism.


Programmers are slowly identifying and addressing these biases. Image: REUTERS/Marko Djurica

Dan Robitzski

Journalist, Futurism


If artificial intelligence were to run the world, it wouldn’t be so bad: it could objectively make decisions on exactly the kinds of things humans tend to screw up. But if we’re gonna hand over the reins to AI, it’s gotta be fair. And so far, it’s not. AI trained on datasets that were annotated or curated by people tends to learn the same racist, sexist, or otherwise bigoted biases of those people.

Slowly, programmers seem to be correcting for these biases. But even if we succeed at keeping our own prejudice out of our code, it seems that AI is now capable of developing it all on its own.

New research published Wednesday in Scientific Reports shows how a network of AI agents autonomously developed not only an in-group preference for other agents that were similar to themselves but also an active bias against those that were different. In fact, the scientists concluded that it takes very little cognitive ability to develop these biases, which means they could pop up in code all over the place.

The AI agents slowly began to favor those that would reciprocate their generosity. Image: Nature

The AI agents were set up so that they could donate virtual money to other agents, with the goal of getting as much as possible in return. Essentially, they had to choose how much to share and which other agents to share with. The researchers behind the experiment saw two trends emerge: AI agents were more likely to donate to others that had been labeled with similar traits (denoted by an arbitrary numeric value), and they showed active prejudice against those that were different. The AI agents seemed to have learned that a donation to the in-group would result in more reciprocation, and that donating to others would actively lead to a loss for them.
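To make the setup concrete, here is a minimal sketch of a donation game along these lines. It is not the researchers' code: the `Agent` class, the `prejudice` probability, and the `cost` and `benefit` values are all assumptions made up for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    group: int            # arbitrary numeric trait label (assumed representation)
    prejudice: float      # assumed: probability of refusing an out-group donation
    payoff: float = 0.0   # running total of virtual money


def donate(donor, recipient, rng, cost=1.0, benefit=3.0):
    """Let the donor decide whether to give; return True if a donation happened.

    The cost and benefit values are illustrative, not taken from the study.
    """
    out_group = donor.group != recipient.group
    if out_group and rng.random() < donor.prejudice:
        return False  # the donor withholds from an agent that is "different"
    donor.payoff -= cost
    recipient.payoff += benefit
    return True


def play_round(agents, rng):
    """Each agent gets one chance to donate to a randomly chosen other agent."""
    for donor in agents:
        recipient = rng.choice([a for a in agents if a is not donor])
        donate(donor, recipient, rng)
```

In a toy model like this, an agent with `prejudice=1.0` never gives to the out-group and one with `prejudice=0.0` gives to anyone, so tracking payoffs over many rounds shows which strategies end up ahead.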

This is a more abstract form of prejudice than what we see in the real world, where algorithms are used to specifically target and oppress black people. The AI agents didn’t develop a specific disdain for a specific minority group, as some people have. Instead, it’s a prejudice against a vague “other,” against anything different from themselves. And, yes, it’s a form of prejudice that’s limited to this particular simulation.

But the research does have big implications for real-world applications. If left unchecked, algorithms like this could lead to greater institutionalized racism and sexism — maybe, in some far-out scenario, even anti-human bias altogether — in spite of our best efforts to prevent it.

There are ways to repair this, according to the paper. For instance, the researchers found that they could reduce the levels of prejudice by forcing AI agents to engage in what they called global learning, or interacting with different AI agents outside of their own bubble. And when populations of the AI agents had more traits in general, prejudice levels also dropped, simply because there was more built-in diversity. The researchers drew a parallel to exposing prejudiced people to other perspectives in real life.


From the paper:

These factors abstractly reflect the pluralism of a society, being influenced by issues beyond the individual, such as social policy, government, historical conflict, culture, religion and the media.
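One way to picture the difference between learning inside a bubble and the global learning described above is a rule for picking a role model. The sketch below is an assumption about how such a rule might look, not the paper's method; the `learn` function, its `global_learning` flag, and the `Agent` fields are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Agent:
    group: int            # arbitrary numeric trait label (assumed representation)
    prejudice: float      # assumed: how readily the agent refuses out-group donations
    payoff: float = 0.0   # running total of virtual money


def learn(agent, population, rng, global_learning):
    """Copy the prejudice level of a better-scoring role model, if one is found.

    With global_learning=False the role model is drawn only from the agent's
    own group; with global_learning=True it is drawn from the whole population.
    """
    pool = [
        other for other in population
        if other is not agent
        and (global_learning or other.group == agent.group)
    ]
    if not pool:
        return  # nobody to learn from inside the bubble
    model = rng.choice(pool)
    if model.payoff > agent.payoff:
        agent.prejudice = model.prejudice
```

Under this rule, an agent restricted to its own group never sees how less prejudiced outsiders are doing, while an agent learning globally can pick up their strategy when it pays off better.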

More broadly, this means that we absolutely cannot leave AI to its own devices and assume everything will work out. In her upcoming book, “Hello World: Being Human in the Age of Algorithms,” an excerpt of which was published in The Wall Street Journal, mathematician Hannah Fry urges us to be more critical, more skeptical of the algorithms that shape our lives. And as this study shows, it’s not bad advice.

No matter how much we would like to believe that AI systems are objective, impartial machines, we need to accept that there may always be glitches in the system. For now, that means we need to keep an eye on these algorithms and watch out for ourselves.


License and Republishing

World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

The views expressed in this article are those of the author alone and not the World Economic Forum.