
This computer isn't perfect. So it understands speech as well as you


Keith Breene

In a significant breakthrough for artificial intelligence, voice recognition software can now understand language as accurately as humans, although grasping the context behind it remains elusive.

Researchers at Microsoft have created software that has a word error rate of 5.9%, which is about the same as a human transcriber.

“The research milestone doesn’t mean the computer recognized every word perfectly. In fact, humans don’t do that, either. Instead, it means that the error rate – or the rate at which the computer misheard a word like have for is or a for the – is the same as you’d expect from a person hearing the same conversation,” Microsoft said in a blog post.
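Word error rate is conventionally calculated as the number of substituted, deleted and inserted words divided by the number of words in the reference transcript. The sketch below is a minimal illustration of that standard definition, not Microsoft's evaluation code: the function name `word_error_rate` and the sample sentences are assumptions made for this example, and real benchmarks also normalise punctuation and spelling before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name), assuming whitespace-separated
    words and case-insensitive matching.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] = edits needed to turn zero reference words into the first j
    # hypothesis words (j insertions).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: a reference word was dropped
                curr[j - 1] + 1,     # insertion: an extra word was added
                prev[j - 1] + cost,  # match, or substitution of one word
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences using the two mishearings mentioned in the quote:
# "have" heard as "is", and "a" heard as "the".
reference = "they have a plan for the week"
hypothesis = "they is the plan for the week"
print(f"{word_error_rate(reference, hypothesis):.1%}")  # 2 errors in 7 words
```

On this definition, a 5.9% rate corresponds to roughly six word errors for every 100 words in the reference transcript.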

The result has been edging closer for many years and comes just weeks after the same team reported that they had got the error rate down to a tantalising 6.3%.

Twenty years ago, the best published research system had a word error rate above 43%.

Neural networks

Both IBM and Microsoft cite the advent of deep neural networks, which are inspired by the biological processes of the brain, as a key reason for advances in speech recognition.

Computer scientists have for decades been trying to train computer systems to do things like recognize images and comprehend speech, but until recently those systems were plagued with inaccuracies.

The new Microsoft programme relies on these deep neural networks as well as specialized graphics processing units that allow the software to learn at speeds not previously possible.

Booming market

The milestone has far-reaching implications.

Recent research by Tractica forecasts that voice recognition software licences will pass 550 million worldwide by 2024. Consumer and healthcare uses are the strongest growth sectors, but the technology has implications across multiple industries.

Annual Voice and Speech Recognition licences by region, 2015-2024

More to do

Researchers say more work is needed to improve the system in real-life settings, such as places where there is a lot of background noise. Research into identifying individual speakers when multiple people are talking is also a part of longer-term research efforts.

And, as anyone who has spent time shouting at Siri, Cortana or Google Assistant will testify, there is still a lot of work needed to enable computers to not just understand which words are being spoken, but their meaning and context too. It will still be some time before computers can answer questions or follow instructions with the same accuracy as humans.

Harry Shum, who heads the Microsoft Artificial Intelligence and Research group, said: “It will be much longer, much further down the road until computers can understand the real meaning of what’s being said or shown.”

License and Republishing

World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

The views expressed in this article are those of the author alone and not the World Economic Forum.
