
Combining machine learning and Romeo and Juliet to explain what makes a good story


Researchers have mapped out the perfect story. Image: REUTERS

Ana Swanson

A good story often takes its reader on an emotional journey from despair to elation, and hits many notes in between.

It’s something that writers and readers understand intuitively. But it also can be analyzed, quantified and graphed. In a fascinating new study, researchers use machine-learning techniques to analyze 1,327 literary works — including “Romeo and Juliet,” “Frankenstein” and Harry Potter — and reveal what exactly it is about popular stories that makes us love them most.

The idea of graphing the emotional arc of popular stories is not a new one. The author Kurt Vonnegut famously wrote about it in a master’s thesis, which he called his “prettiest contribution to the culture.” The University of Chicago actually rejected the thesis, but Vonnegut went on to write and speak more about the idea.

In one of his most popular lectures, Vonnegut stands before a blackboard and graphs the emotional roller coasters of Cinderella, Hamlet and the Bible. In Vonnegut’s graphs, the horizontal axis tracks the story from beginning to end, while the vertical axis reflects positive or negative change in the characters’ fortunes. When Cinderella gets a fairy godmother, the line rises. When Romeo and Juliet quaff their poison, it plummets.


In their new paper, the researchers from the University of Vermont and the University of Adelaide draw on Vonnegut’s template, but add a hefty dose of modern computing power.

The research draws on a kind of glossary of emotion they created by crowdsourcing emotional ratings for 10,000 of the most common words in the English language. Words such as “death,” “rape,” “cancer” and “die” rank at the bottom of the scale, while words like “love,” “laugh” and “happiness” are at the top. The researchers have used this scale to create other fascinating visualizations, like the graphic below that shows how the emotional content of Twitter changed over the past year.
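As a rough illustration of how such a glossary can score a passage, here is a minimal sketch. The 1-to-9 scale, the ratings in `TOY_LEXICON` and the function name `average_happiness` are assumptions made for this example; they are not the researchers' values or code.

```python
import re

# Hypothetical ratings on an assumed 1 (saddest) to 9 (happiest) scale.
# The numbers are illustrative placeholders, not the study's values.
TOY_LEXICON = {
    "love": 8.4,
    "laugh": 8.2,
    "happiness": 8.5,
    "death": 1.5,
    "cancer": 1.6,
    "die": 1.7,
}


def average_happiness(text, lexicon=TOY_LEXICON):
    """Mean rating of the words in `text` that appear in `lexicon`.

    Returns None when no word in the text has a rating.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores = [lexicon[w] for w in words if w in lexicon]
    if not scores:
        return None
    return sum(scores) / len(scores)
```

Words missing from the glossary are simply skipped here, so a passage is scored only by the rated words it contains.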

 Average happiness for Twitter
Image: Washington Post

The researchers use the glossary to create a snapshot of more than a thousand literary works, mostly fiction, available from the free digital library Project Gutenberg. The result is thousands of graphs of what Andrew Reagan, one of the researchers, calls “the emotional experience of the reader.”

 Harry Potter
Image: Washington Post

The graphic above shows an example: the emotional range of “Harry Potter and the Deathly Hallows,” the final book in J.K. Rowling’s popular series. The emotional content of the story rises and falls with the book's sub-narratives, peaking when Harry hangs out at the home of his friend Ron Weasley, and rising again for the story’s happy-ever-after conclusion.
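One plausible way to turn word ratings into a curve like this is to average them inside a window that slides through the text. The article does not spell out the windowing, so the `window` and `step` parameters and the function name `emotional_arc` below are assumptions for illustration only.

```python
def emotional_arc(word_scores, window, step):
    """Average the scores in a window that slides through the text.

    word_scores: one rating per rated word, in reading order.
    window: number of words averaged for each point on the arc (assumed).
    step: how many words the window advances between points (assumed).
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    arc = []
    # Stop once a full window no longer fits in the remaining text.
    for start in range(0, len(word_scores) - window + 1, step):
        chunk = word_scores[start:start + window]
        arc.append(sum(chunk) / window)
    return arc
```

A larger window smooths out sentence-level noise at the cost of blurring short scenes; a text shorter than one window yields an empty arc.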

For their recent study, the researchers fed the emotional arcs of the more than 1,000 literary works back into a machine-learning algorithm, which then sorted them into broad clusters. As the Harry Potter graph above demonstrates, individual stories may have very complex emotional arcs. But analyzing the emotional arcs very broadly, they found that there were six types that fit 85 percent of the books they had analyzed, Reagan said.
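The article does not name the algorithm the researchers used. As a stand-in, the sketch below groups equal-length arcs with plain k-means, a common clustering technique; the function names and the caller-supplied `initial_centers` are choices made for this example, not details from the study.

```python
def squared_distance(a, b):
    """Sum of squared point-by-point differences between two arcs."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_arc(arcs):
    """Point-by-point average of several equal-length arcs."""
    n = len(arcs)
    return [sum(values) / n for values in zip(*arcs)]


def kmeans(arcs, initial_centers, iterations=20):
    """Group equal-length arcs around the nearest center (plain k-means).

    initial_centers is supplied by the caller so the result is deterministic.
    Returns (labels, centers).
    """
    centers = [list(c) for c in initial_centers]
    labels = []
    for _ in range(iterations):
        # Assign every arc to its closest center.
        labels = [
            min(range(len(centers)), key=lambda k: squared_distance(arc, centers[k]))
            for arc in arcs
        ]
        # Move each center to the average of its members.
        for k in range(len(centers)):
            members = [arc for arc, label in zip(arcs, labels) if label == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = mean_arc(members)
    return labels, centers
```

In practice the arcs would first be resampled to a common length so that books of different sizes can be compared point by point.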

Roughly one-third of the stories were either rags-to-riches stories, in which the emotional arc rises through the bulk of the story, or the opposite, riches-to-rags stories, in which it broadly falls. “Romeo and Juliet” and many of Shakespeare’s tragedies show up in this second category.

 Romeo and Juliet
Image: Washington Post/Reagan et al

The researchers also find evidence for one of the categories that Vonnegut identified, called “Man-in-a-hole,” where the emotional arc of a story falls, then rises. (“Somebody gets into trouble, gets out again. People love that story! They never get sick of it,” Vonnegut says in his lecture.)

The researchers say this story type, which includes “The Adventures of Sherlock Holmes,” graphed below, accounts for nearly another one-third of the stories they analyzed.

 Sherlock Holmes
Image: Washington Post

The researchers also find a subcategory of stories in which the emotional arc rises, then falls, which they label “Icarus,” after the Greek mythological figure who falls into the sea after flying too close to the sun. Another arc, where emotions rise, then fall, then rise again, is labeled “Cinderella,” after the fairy godmother tale. Its opposite, a fall-rise-fall pattern, is labeled “Oedipus,” after the Greek tragedy in which a king unwittingly kills his father and marries his mother. (“Frankenstein,” graphed below, fits the bill.)
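To make the six shapes concrete, the sketch below labels an arc by comparing it with six idealized curves. The cosine templates, the use of Pearson correlation and the names `TEMPLATES` and `label_arc` are all assumptions made for illustration; in the study the shapes emerged from the data rather than from fixed templates like these.

```python
import math

# Hypothetical templates of the form sign * cos(k * pi * t), t from 0 to 1.
TEMPLATES = {
    "Rags to riches": (-1, 1),  # rise
    "Riches to rags": (1, 1),   # fall
    "Man in a hole": (1, 2),    # fall, rise
    "Icarus": (-1, 2),          # rise, fall
    "Cinderella": (-1, 3),      # rise, fall, rise
    "Oedipus": (1, 3),          # fall, rise, fall
}


def correlation(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def label_arc(arc):
    """Name of the template that correlates best with `arc`.

    Returns None for arcs shorter than two points or with no positive match.
    """
    n = len(arc)
    if n < 2:
        return None
    best_name, best_score = None, 0.0
    for name, (sign, k) in TEMPLATES.items():
        template = [sign * math.cos(k * math.pi * i / (n - 1)) for i in range(n)]
        score = correlation(arc, template)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

For example, `label_arc([5, 3, 1, 3, 5])` picks the fall-then-rise template, while a completely flat arc matches nothing and returns `None`.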

Image: Washington Post

Part of Vonnegut’s original idea was that people are drawn to certain story arcs more than others, and that these proclivities vary from culture to culture, just as pottery or musical styles would.

The researchers say there is much more work to be done to compare the popularity of story arcs across cultures and time. But to investigate whether certain story types are more popular than others, they analyze how often stories with certain emotional arcs are downloaded from Project Gutenberg, and find that stories with the Icarus, Oedipus and Man-in-a-hole arcs are downloaded most.
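A comparison like this can be sketched as a simple average of download counts per arc type. Whether the researchers compared means or some other statistic is not stated in the article, so the choice of the mean and the function name `mean_downloads_by_arc` are assumptions.

```python
from collections import defaultdict


def mean_downloads_by_arc(books):
    """Average download count per arc label.

    books: iterable of (arc_label, download_count) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # label -> [sum of downloads, book count]
    for label, downloads in books:
        totals[label][0] += downloads
        totals[label][1] += 1
    return {label: total / count for label, (total, count) in totals.items()}
```

Download counts tend to be dominated by a few very popular titles, so a median or a log scale might be a fairer comparison than the plain mean used here.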

While the researchers admit this is a rough proxy for success, they say the emotional rise and fall of these stories might help them forge a particular connection with readers. “We tend to prefer narratives that fit into certain molds,” Reagan says.


License and Republishing

World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.

The views expressed in this article are those of the author alone and not the World Economic Forum.
