A team of researchers is working to build trust between humans and artificial intelligence (AI) by creating an “interpreter” that can explain how an AI arrived at the answer to a specific question.

In an age of self-driving cars and autonomous drones, AI is becoming a bigger part of our lives. It’s also getting increasingly savvy. Today, AI can recognize text, distinguish people by their faces, and even identify physical objects, to some degree. But even the best AI systems still get things wrong much of the time.

That poses a big problem, says Kate Saenko, assistant professor of computer science at Boston University.

“If an AI tool makes mistakes, human users quickly learn to discount it, and eventually stop using it altogether,” she says. “I think that humans by nature are not likely to just accept things that a machine tells them.”

A further complication, she adds, is that as AI has become more powerful, the algorithms that drive it have become increasingly opaque to human users. Information goes into one end of a computational “black box,” and an answer comes out the other, yet the set of rules and reasoning used to find that answer is obscured.

Saenko is working to change that relationship. Her research seeks to uncover new ways of getting inside the “mind” of AI, creating a translation tool that explains its decision-making process to human users.

On the surface, that goal may sound trivial. Who cares how a computer came to an answer, as long as it’s right? Getting feedback on why an AI device makes a particular decision, however, may ultimately help improve its accuracy by giving opportunities for humans to offer tiny course corrections, Saenko says.

In the process, it could increase the trust that humans put into a machine, making it a better collaborator on complex jobs. Achieving that sort of openness in today’s AI, though, may not be so simple.

New types of AI

It hasn’t always been hard to look inside the mind of AI. In the past, many artificial intelligence systems, such as those for facial recognition, used rules and guidelines that programmers identified ahead of time: rules for defining skin color, for what shapes make up a nose, for defining light and shadow. All those programmer-defined concepts had to be hard-coded into AI from the start, giving it a framework to do its job.

This method makes it fairly simple to figure out how a machine came to its conclusion: just identify which preprogrammed rules it used to get there. It also fundamentally limits the abilities of AI. Real life is vastly complex, after all, and even the best human programmers can’t come up with every possible rule that a computer might use to make sense of the world.
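To make that traceability concrete, here is a minimal sketch in Python of a rule-based face check. The rule names, feature keys, and thresholds are all hypothetical, invented for illustration and not taken from any real facial recognition system. The point is only that the program can report exactly which hand-written rules fired.

```python
from typing import Callable, Dict, List, Tuple

# Each rule is a (description, test) pair. Everything here is a
# hypothetical, hand-picked illustration, not a real system's rule set.
Rule = Tuple[str, Callable[[Dict[str, float]], bool]]

RULES: List[Rule] = [
    ("has at least two eye-like dark regions", lambda f: f["dark_regions"] >= 2),
    ("has a nose-shaped ridge", lambda f: f["nose_ridge"] > 0.5),
    ("skin-tone pixels cover enough area", lambda f: f["skin_fraction"] > 0.3),
]


def classify_face(features: Dict[str, float]) -> Tuple[bool, List[str]]:
    """Return (is_face, names of the rules that fired)."""
    fired = [name for name, test in RULES if test(features)]
    # The image counts as a face only if every preprogrammed rule fired.
    is_face = len(fired) == len(RULES)
    return is_face, fired


if __name__ == "__main__":
    sample = {"dark_regions": 2, "nose_ridge": 0.8, "skin_fraction": 0.1}
    is_face, fired = classify_face(sample)
    print("face?", is_face)
    for name in fired:
        print("rule fired:", name)
```

The explanation comes for free because the rules are the program. The cost is the one described above: any case the programmer did not anticipate has no rule to catch it.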

“It’s very hard for us to anticipate all possible ways a dog might look in any image anywhere in the world, for example,” says Saenko. “If you have enough processing power and data, a better approach would be to show a computer a million pictures of dogs and let it define them itself.”

In the last five years or so, that approach has become more widely used in the AI world. Instead of working with a single template, these new systems involve a more iterative approach, modeled on the way that our own nervous system works.

These new types of AI, called “deep neural networks,” employ huge numbers of interconnected functions, or nodes, arranged in a vast web. Each one is responsible for parsing a tiny amount of information and progressively builds on the work of the nodes before it.

This sort of incremental process, building bit by bit on simple data, is at the core of a deep neural network. It makes AI flexible, fast, and powerful: some systems can operate with more than 95 percent accuracy. In the few cases where such a system is not accurate, though, deep neural networks make it extremely hard to figure out why. There are no preset coded definitions to turn to, since a neural network creates those big-picture guidelines as it goes.
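A toy example may help show why. The sketch below, in plain Python, passes an input through two small layers of nodes, each layer building on the output of the one before it. The weights are made-up numbers chosen for illustration. A real deep neural network learns its weights from data and has millions of them, but the structure is similar in spirit.

```python
from typing import List, Tuple

# A layer is (weights, biases): one row of weights and one bias per node.
Layer = Tuple[List[List[float]], List[float]]


def relu(x: float) -> float:
    """A common node activation: pass positive values, zero out the rest."""
    return max(0.0, x)


def run_layer(inputs: List[float], layer: Layer) -> List[float]:
    weights, biases = layer
    # Each node takes a weighted sum of the previous layer's outputs.
    return [
        relu(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs: List[float], layers: List[Layer]) -> List[float]:
    activations = inputs
    for layer in layers:
        activations = run_layer(activations, layer)
    return activations


def count_parameters(layers: List[Layer]) -> int:
    return sum(sum(len(row) for row in w) + len(b) for w, b in layers)


# Illustrative, hand-picked weights (assumption: not learned from any data).
NETWORK: List[Layer] = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # first layer, two nodes
    ([[1.0, 1.0]], [-0.5]),                   # second layer, one node
]

if __name__ == "__main__":
    print(forward([2.0, 1.0], NETWORK))
    print(count_parameters(NETWORK))
```

Even in this nine-parameter toy, the only record of how the answer was reached is a handful of numbers. Nothing in it reads like a rule a person wrote down, which is the difficulty that grows with the size of the network.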

“The sheer number of parameters these models can process is the reason they’ve been good at visual and language tasks like automated translation. It lets them soak up a lot of data,” says collaborator Trevor Darrell of the University of California, Berkeley. “But because they have so many parameters, it’s very difficult to directly extract and interpret structures within them.”

Creating an AI ‘interpreter’

Saenko and Darrell are working with Zeynep Akata, a colleague at the University of Amsterdam in the Netherlands, and Kitware, an open-source software company, on ways to crack into deep neural networks and make them more easily understood.

Asking a network like this to explain itself would likely reduce its speed and efficiency, the researchers say, so they’re hoping to create a sort of translation tool—a second network that acts alongside the first, interpreting its choices in real time and reporting them to a human user.

“The primary neural network is just doing its job. All of its processing is just devoted to solving its task, like finding doors or windows in an image, for example,” says Saenko. “That’s why we want to use a second neural network that has access to that machinery and input data, and can learn to translate all that into a textual version that humans can understand.”

This translation is important, Saenko says, because when a deep neural network does make a mistake, it’s probably because it found a pattern in the data that doesn’t quite match the real world. If it’s steering an autonomous car along a poorly maintained road, for instance, it might stop at a shadow, thinking it’s a massive pothole.

If that happens, an “interpreter” AI could prompt a user for more information in plain English. “I want it to be able to say, ‘I stopped driving because I’m not sure if that’s a pothole or a shadow, so tell me what to do here,’” she says.
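As a rough illustration of that behavior, the Python sketch below turns a primary model's confidence scores into a plain-English message and asks for help when the top two guesses are too close to call. This is a deliberately simplified stand-in: the researchers describe a second neural network that learns to do the translation, whereas this sketch uses a hand-written function. The score values, the `margin` threshold, and the wording of the messages are all assumptions made for the example.

```python
from typing import Dict


def explain(scores: Dict[str, float], margin: float = 0.2) -> str:
    """Translate class scores from a (hypothetical) primary model into text.

    `scores` maps each label to the primary model's confidence in it and
    must contain at least two labels. `margin` is an assumed threshold:
    if the top two scores are closer than this, ask the human for help.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (top, top_score), (second, second_score) = ranked[0], ranked[1]
    if top_score - second_score < margin:
        return (
            f"I stopped because I am not sure if that is a {top} "
            f"or a {second}. Tell me what to do here."
        )
    return f"I am fairly confident that is a {top}."


if __name__ == "__main__":
    # Made-up scores for the shadow-versus-pothole case described above.
    print(explain({"pothole": 0.48, "shadow": 0.45, "clear road": 0.07}))
    print(explain({"pothole": 0.05, "shadow": 0.90, "clear road": 0.05}))
```

The design choice mirrors the one in the article: the primary model is left alone to do its job, and a separate component reads its output and handles the conversation with the human.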

“In the future, we’re going to be using AI as a collaboration between humans and computers. We need to be able to communicate with it, understand its strengths, and know what it’s good at, so it can help us with things we’re not so good at—like sorting through a petabyte of video to identify content,” Saenko says. “I see this as creating superhumans. It’s a collaboration between humans and AI.”

Saenko’s research is funded by a grant from the Defense Advanced Research Projects Agency (DARPA).