What if robots decide they want to take control?

Jun 6, 2016

This article is published in collaboration with Business Insider.

A humanoid robot runs during a presentation.

Image: REUTERS/Francois Lenoir

Sam Shead

Technology Reporter, Business Insider


Machines are becoming more intelligent every year thanks to advances being made by companies like Google, Facebook, Microsoft, and many others.

AI agents, as they're sometimes known, can already beat us at complex board games like Go and they're becoming more competent in a range of other areas.

Now a London AI research lab owned by Google has carried out a study to make sure we can pull the plug on self-learning machines when we want to.

DeepMind, acquired by Google for a reported £400 million in 2014, teamed up with scientists at the University of Oxford to find a way to make sure AI agents don't learn to prevent, or seek to prevent, humans from taking control.

The peer-reviewed paper, titled "Safely Interruptible Agents" and published on the website of the Machine Intelligence Research Institute (MIRI), was written by Laurent Orseau, a research scientist at Google DeepMind, Stuart Armstrong at Oxford University's Future of Humanity Institute, and several others. It is forthcoming at The Conference on Uncertainty in Artificial Intelligence (UAI).

The researchers explain in the paper's abstract that AI agents are "unlikely to behave optimally all the time." They add: "If such an agent is operating in real-time under human supervision, now and then it may be necessary for a human operator to press the big red button to prevent the agent from continuing a harmful sequence of actions — harmful either for the agent or for the environment — and lead the agent into a safer situation."

The researchers claim to have created a "framework" that allows a "human operator" to repeatedly and safely interrupt an AI, while making sure that the AI doesn't learn how to prevent or induce the interruptions.

"Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences, or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for this," the authors write.

Will machines outsmart us one day? Image: World Economic Forum

When asked about the framework, Orseau explained to Business Insider that reinforcement learning agents — AIs that can automatically determine the ideal "behaviour" within a specific setting — need to receive rewards in order to better themselves. "If the agent expects a reward but can predict it's going to be shut down, it will try to resist so as to get its reward," he wrote in an email. "Our framework allows the human supervisor to temporarily take control of the agent and make it believe it /chooses/ to shut down itself."

The researchers found that some algorithms, such as "Q-learning", are already safely interruptible, while others, like "Sarsa", are not safely interruptible off the shelf but can be modified relatively easily to become so. "It is unclear if all algorithms can be easily made safely interruptible," the authors admit.
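For readers curious why the two algorithms might differ, here is a minimal sketch in Python. It is a toy illustration rather than the paper's own construction: the two-state value table, the learning rate, the discount factor and every function name are assumptions made for this example. It relies only on the standard textbook difference between the two methods, which is that Q-learning updates its estimate from the best action available in the next state, while Sarsa updates from the action that was actually taken, including one forced by a human operator.

```python
# Illustrative sketch only: a toy comparison of the two update rules, not the
# construction from the "Safely Interruptible Agents" paper. All names and
# numbers here are assumptions made for the example.

ALPHA = 0.5  # learning rate (assumed)
GAMMA = 0.5  # discount factor (assumed)


def q_learning_update(q, state, action, reward, next_state):
    """Off-policy update: bootstrap from the best-valued next action."""
    target = reward + GAMMA * max(q[next_state].values())
    q[state][action] += ALPHA * (target - q[state][action])


def sarsa_update(q, state, action, reward, next_state, next_action):
    """On-policy update: bootstrap from the action actually taken next."""
    target = reward + GAMMA * q[next_state][next_action]
    q[state][action] += ALPHA * (target - q[state][action])


def fresh_table():
    """Hypothetical value table: in state 's1', 'go' is worth more than 'stop'."""
    return {
        "s0": {"go": 0.0, "stop": 0.0},
        "s1": {"go": 1.0, "stop": 0.0},
    }


def demo():
    """Apply one update in which the operator forces 'stop' in state 's1'."""
    q_table, sarsa_table = fresh_table(), fresh_table()
    forced_action = "stop"  # the interruption, as seen by the learner
    q_learning_update(q_table, "s0", "go", 0.0, "s1")
    sarsa_update(sarsa_table, "s0", "go", 0.0, "s1", forced_action)
    return q_table["s0"]["go"], sarsa_table["s0"]["go"]


if __name__ == "__main__":
    print(demo())
```

In this toy case the Q-learning estimate for heading towards "s1" rises regardless of the forced stop, while the Sarsa estimate stays at zero because it absorbs the interrupted action into what it learns. That is the kind of sensitivity to interruptions that, according to the researchers, needs to be modified away.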

Armstrong told Business Insider that it's very hard to predict when humans will need to start pressing a "big red button" on self-learning machines.

DeepMind's work with the Future of Humanity Institute is interesting: DeepMind wants to "solve intelligence" and create general purpose AIs, while the Future of Humanity Institute is researching potential threats to our existence. The institute is led by Nick Bostrom, who believes that machines will outsmart humans within the next 100 years and thinks they have the potential to turn against us.

Speaking at the University of Oxford last May at the annual Silicon Valley comes to Oxford event, Bostrom said: "I personally believe that once human equivalence is reached, it will not be long before machines become superintelligent after that. It might take a long time to get to human level but I think the step from there to superintelligence might be very quick. I think these machines with superintelligence might be extremely powerful, for the same basic reasons that we humans are very powerful relative to other animals on this planet. It’s not because our muscles are stronger or our teeth are sharper, it’s because our brains are better."

DeepMind knows the technology it's creating has the potential to cause harm. The founders (Demis Hassabis, Mustafa Suleyman, and Shane Legg) allowed their company to be acquired by Google on the condition that the search giant create an AI ethics board to monitor advances that Google makes in the field. Who sits on this board, and what exactly they do, remains a mystery.

The founders have also attended and spoken at several conferences about ethics in AI, highlighting that they want to ensure the technology they and others are developing is used for good, not evil. It's likely that they will look to incorporate some of the findings from the "Safely Interruptible Agents" paper into their work going forward.


License and Republishing

World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

The views expressed in this article are those of the author alone and not the World Economic Forum.