OpenAI, the artificial intelligence research company founded by tech heavyweights including Elon Musk and Peter Thiel, says it’s developed the most advanced language-processing algorithm so far.
Sample outputs suggest that the AI system is an extraordinary step forward, producing text rich with context, nuance and even something approaching humor. It’s so good, in fact, that OpenAI says it’s not releasing its code to the public because its researchers are scared it could be misused, according to a new blog post.
The algorithm, GPT-2, was trained on some 8 million web pages, according to the new research. Given a prompt, GPT-2 is tasked with predicting the next word based on how words have been used on the websites it read. In the end, the algorithm churns out passages of text that are far more coherent than past attempts to build AI with contextual knowledge of language.
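To get a feel for what "predicting the next word" means, here is a deliberately tiny sketch in Python. It is not how GPT-2 works internally (GPT-2 is a large neural network, while this toy only counts which word most often follows another), and the function names and sample text are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def predict_next(followers, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

def generate(followers, prompt, length=5):
    """Extend the prompt by repeatedly appending the predicted next word."""
    words = prompt.lower().split()
    for _ in range(length):
        nxt = predict_next(followers, words[-1])
        if nxt is None:
            break  # the last word never appeared in the training text
        words.append(nxt)
    return " ".join(words)

# Hypothetical miniature "web" to learn from.
corpus = (
    "the unicorns spoke english . "
    "the unicorns spoke english well . "
    "the scientists listened ."
)
model = train_bigrams(corpus)
print(generate(model, "the", length=2))
```

A counter like this only ever looks one word back, which is why its output falls apart quickly. The coherence described in the article comes from a model that takes much more of the preceding text into account.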
In the blog post, the OpenAI researchers concede that GPT-2 works only about half the time. But the examples the team showcased were so well written that you’d be hard pressed to say whether they were written by a human.
In one example, the researchers prompted their algorithm with the opening of a fictional news article about scientists who discovered unicorns.
“In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English,” the researchers wrote.
There are occasional glitches and incoherent sentences in the AI-written story, but by and large the algorithm did pretty well.
“The scientist named the population, after their distinctive horn, Ovid’s Unicorn. These four-horned, silver-white unicorns were previously unknown to science,” reads the first of nine AI-written paragraphs of the article, some of which include made-up quotes by fake scientists.
But not everything is unicorns and discoveries. OpenAI chose to keep GPT-2 in-house because the algorithm could easily be used to generate misleading news articles, impersonate people, or do other shady things.
Wired tested out GPT-2, and with nothing more than the prompt “Hillary Clinton and George Soros,” OpenAI’s algorithm churned out the sort of political conspiracy nonsense that regularly appears on non-credible right-wing websites.
“It could be that someone who has malicious intent would be able to generate high quality fake news,” David Luan, OpenAI’s vice president of engineering, told Wired.