At its I/O developer conference on May 8, Google debuted “Duplex,” an AI system that could make virtual assistants as capable as humans available to anyone.
Today’s digital assistants tend to fall down as soon as they depart from their limited scripts. Programs that can understand natural language, converse with people and respond to new context and information like humans still feel like science fiction. Duplex may make them reality.
To show off its capabilities, CEO Sundar Pichai played two recordings of Google Assistant running Duplex, scheduling a hair appointment and a dinner reservation. In each, the person picking up the phone didn’t seem to realize they were talking to a computer. The conversations proceeded back and forth to find the right time and confirm what the customer wanted. Even when conversations didn’t go as expected, the assistant understood the context, responded appropriately, and carried on with the task. Here’s one example:
Restaurant hostess: Hello, how may I help you?
Google Assistant: Hi, umm, I’d like to reserve a table for Wednesday, the 7th.
Restaurant hostess: For 7 people?
Google Assistant: Um, it’s for 4 people
Restaurant hostess: 4 people. When? Today? Tonight?
Google Assistant: Umm, Wednesday at 6 PM.
Restaurant hostess: Uh, actually we reserve upwards of 5 people. For 4 people, you can come.
Google Assistant: How long is the wait usually to be seated?
Restaurant hostess: When tomorrow? or weekday?
Google Assistant: For next Wednesday, uh, the 7th
Restaurant hostess: Oh it’s not too busy. You can come with 4 people. OK?
Google Assistant: Oooh, I gotcha. Thanks
Restaurant hostess: Bye bye.
(You can listen to the recordings here.)
It’s a far more natural conversation than consumers may be used to with digital assistants. The AI’s voice lacks a stilted cadence and comes complete with “ums” and natural pauses (which also help cover up the fact that it is still processing). It uses the phone’s on-board processing, as well as the cloud, to deliver the right response with just the right amount of pause.
Google is taking advantage of its primary asset: data. It trained Duplex on a massive body of “anonymized phone conversations,” according to a release. Every scheduling task will have its own problems to solve when arranging a specific type of appointment, but all will be underpinned by Google’s massive volume of data from searches and recordings that will help the AI hold a conversation.
Still, the technology cannot carry on just any conversation. Even though Duplex can seemingly handle far more context than other systems, it only works within a narrow set of queries (Google hasn’t listed all of them yet). And although Google released six new, more natural-sounding voices for the Assistant product available today, none approached the humanity of its Duplex example.
This wasn’t a live demo. As astounding as the conversations were, they were likely cherry-picked from among the many tests that Google says it has conducted. For now, Pichai says he is in no rush to release the “developing” technology. “We want to get the user experience right for businesses and users,” he said. “We’re going to work hard and get it right.”
It may be years, not months, before Duplex is available, and the problems remain daunting. People often talk quickly, change subjects, use verbal shortcuts, and talk amid loud background noises.
But Google sees this application of AI as a way to make the technology invisible, and indispensable, to our daily lives. Rather than devise a new device or interface, Google wants to slip this seamlessly into our lives and businesses. Google’s Assistant, search tools, maps and photos products will be a collective intelligence that teaches itself and interacts with us like a human.
Where have we seen that before?