Mapping Emotional Dynamics From Movie Scripts

Updated on December 9, 2022

Canadian researchers have used thousands of movie scripts to develop a machine learning framework that can track emotional arcs of speakers by interpreting the emotional temperature of their dialogue as it evolves over the course of the narrative.

The research, from Carleton University in Ottawa, is entitled Emotion Dynamics in Movie Dialogues and includes analysis of central characters in well-known movies such as The Shining and Casino. It is intended as a potential basis for machine learning analysis and mapping of real-world discourse in diverse channels such as social media threads and transcripts of psychology consultations.

The work proposes a framework of utterance emotion dynamics (UED) based on similar metrics from research in psychology, and is the first to model emotions from story dialogue on a per-character basis, rather than calculating average emotional temperature in aggregated dialogue across the breadth of a movie.

A word-map derived from Jack Nicholson's dialogue in The Shining (1980), color-mapped to valence against the character's resting emotional state. Source: https://arxiv.org/pdf/2103.01345.pdf

The components of UED include home base (a typical or ‘resting' emotional state); variability (the extent to which emotions are volatile and likely to change quickly); and rise/recovery rates (the ability of characters to regulate challenging emotions).

In this abstract example from the paper, the character represented by the black line has a far lower recovery rate from a disturbed or discordant event-driven emotional state than a corresponding character.
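The UED components described above can be illustrated with a minimal sketch. It assumes a one-dimensional series of per-utterance valence scores, treats the home base as a band of one standard deviation around the mean, and measures rise and recovery as distance travelled per step; these definitions and the function name `ued_metrics` are assumptions made for illustration, not the paper's exact formulation.

```python
from statistics import mean, pstdev

def ued_metrics(valence):
    """Toy emotion-dynamics summary for a list of valence scores (assumed 0..1)."""
    home = mean(valence)            # centre of the 'resting' state
    spread = pstdev(valence)        # used here as the variability measure
    low, high = home - spread, home + spread   # assumed home-base band

    rise_rates, recovery_rates = [], []
    i, n = 0, len(valence)
    while i < n:
        if low <= valence[i] <= high:
            i += 1
            continue
        # A displacement: a contiguous run of scores outside the band.
        start = i
        while i < n and not (low <= valence[i] <= high):
            i += 1
        end = i  # first index back inside the band, or n if the series ends
        dists = [abs(v - home) for v in valence[start:end]]
        peak_offset = dists.index(max(dists))
        peak_dist = dists[peak_offset]
        # Rise: distance reached per step taken to get to the peak.
        rise_rates.append(peak_dist / (peak_offset + 1))
        # Recovery: only counted when the character actually returns home.
        if end < n:
            recovery_rates.append(peak_dist / (end - (start + peak_offset)))

    return {
        "home_base": home,
        "variability": spread,
        "rise_rate": mean(rise_rates) if rise_rates else None,
        "recovery_rate": mean(recovery_rates) if recovery_rates else None,
    }

# Two invented characters: one snaps back after a shock, one recovers slowly.
quick = [0.5] * 5 + [0.0] + [0.5] * 5
slow = [0.5] * 5 + [0.0, 0.1, 0.2, 0.3] + [0.5] * 5
quick_metrics = ued_metrics(quick)
slow_metrics = ued_metrics(slow)
```

Under these assumptions the slow character ends up with the lower recovery rate, which is the pattern the caption above describes for the black line.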

The work is engineered to help answer some challenging questions in literary theory, including: the extent to which characters verbalize their emotions directly over a narrative; the extent to which a plot can be inferred directly from dialogue; the point in a narrative at which the central characters are most at odds with each other; and how characters differ in their ability to negotiate difficult emotions, and what consequences follow from that.

In this analysis of dialogue from the two central characters of The Shining, line color indicates narrative time, deepening to red as the narrative draws to a close. Black dotted lines indicate the major and minor axes of an ellipse that encloses each main character's emotional state for 95% of the duration (the ellipse itself is not shown in the graph, for clarity).
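One common way to obtain axes for such an ellipse is the covariance (confidence) ellipse, sketched below. It assumes each utterance has been scored on two emotion dimensions and that the points are roughly bivariate normal; whether the paper constructs its ellipse this way is not stated here, and the function name `ellipse_axes` is hypothetical.

```python
import math

def ellipse_axes(xs, ys, coverage=0.95):
    """Centre, semi-axis lengths and tilt of a covariance ellipse.

    Assumes roughly bivariate-normal points; `coverage` is the share of
    points the ellipse is expected to enclose under that assumption.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Population covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x in xs) / n
    c = sum((y - my) ** 2 for y in ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mid = (a + c) / 2
    half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam_major, lam_minor = mid + half_gap, mid - half_gap
    # Chi-square quantile with 2 degrees of freedom: -2 * ln(1 - p).
    scale = -2 * math.log(1 - coverage)
    major = math.sqrt(scale * lam_major)
    minor = math.sqrt(scale * max(lam_minor, 0.0))
    # Angle of the major axis relative to the x axis, in radians.
    angle = 0.5 * math.atan2(2 * b, a - c)
    return (mx, my), major, minor, angle

# Invented points spread twice as widely along x as along y.
centre, major, minor, angle = ellipse_axes([-2, 2, -2, 2], [-1, -1, 1, 1])
```

The centre of the ellipse plays the role of the home base in two dimensions, and the axis lengths give a rough sense of how far the character strays from it.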

Following The Scripts

The data was generated from 1,123 openly available movie scripts from the Internet Movie Script Database (IMSDB). Only characters with a minimum of 50 inter-character exchanges per movie were considered, which left 2,687 character study subjects from a total of 54,518 characters contained in the corpus of script material.
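The character filter can be sketched in a few lines. The example assumes a script has already been parsed into an ordered list of (speaker, text) pairs and counts one dialogue turn as one exchange; both the input format and that counting rule are assumptions of this sketch, and `eligible_characters` is a hypothetical name.

```python
from collections import Counter

MIN_EXCHANGES = 50  # threshold reported for the study

def eligible_characters(turns, min_exchanges=MIN_EXCHANGES):
    """Return the speakers with at least `min_exchanges` dialogue turns.

    `turns` is an ordered list of (speaker, text) pairs for one movie.
    Treating one turn as one exchange is an assumption of this sketch.
    """
    counts = Counter(speaker for speaker, _ in turns)
    return {speaker for speaker, n in counts.items() if n >= min_exchanges}

# Invented data: one major and one minor character.
turns = [("JACK", "line")] * 60 + [("LLOYD", "line")] * 10
kept = eligible_characters(turns)
```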

The text was processed with the NLTK WordNet Lemmatizer, producing 5,673,201 natural language processing (NLP) word tokens, leaving each character with around 1,376 tokens per movie.

The researchers note that evaluating words in this fashion only takes into account the explicit emotional value of a word, rather than its relationship to surrounding words (either from the same character or from another character in the scene). However, the researchers argue that most words have a dominant primary sense, and that the aggregate word capture compensates for this lack of context.
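A minimal sketch of context-free, word-level scoring shows both the approach and its limitation. The lexicon below is an invented toy, not the one used in the research; lemmatization is omitted to keep the example self-contained; and the rolling average is an assumed smoothing step rather than a documented detail of the paper.

```python
import re

# Hypothetical toy lexicon: word -> valence in [0, 1]. Values are invented.
TOY_VALENCE = {
    "love": 0.95, "happy": 0.90, "home": 0.70,
    "alone": 0.20, "hate": 0.05, "kill": 0.05,
}

def score_tokens(text, lexicon=TOY_VALENCE):
    """Look up each word in isolation; words missing from the lexicon are skipped."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [lexicon[t] for t in tokens if t in lexicon]

def rolling_mean(scores, window=3):
    """Average over a sliding window so that single words do not dominate."""
    if len(scores) < window:
        return []
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

# Negation is invisible to a word-level lookup: both lines score 'love' alike.
plain = score_tokens("I love you")
negated = score_tokens("I do not love you")
```

Because `plain` and `negated` come out identical, any single utterance can be misread; the researchers' argument is that such errors tend to wash out when many words are aggregated.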

Emotional Variability

In developing a reduced 0-100 graph representing the emotional variability of characters across the extracted movie scripts, the paper notes Sharon Stone's character from Casino (1995), though Jill Ritchie's character from Little Athens (2005) tops the league of volatility, with Devin Brochu's character in Hesher (2010) in second place.

Perhaps predictably, Brent Spiner's android creation Cmdr. Data from the Star Trek movie franchise displays the least emotional variability among the characters studied, though only narrowly beating human crew-mate Riker (Jonathan Frakes' character in the series).

The paper confirms our instinctive understanding that emotions are likely to peak and resolve in some way (negatively or positively) in the final 10-15% of the narrative, where developed conflict must in some way be addressed. The research found that negative utterances by characters in a movie increase by 2% over its duration, rising to 91% at the climax of the narrative, while positive words decrease, though less markedly, over the same time frame.

Other Factors

The researchers intend to develop the work in order to apply it to a range of domains, including public policy, public health and social sciences. They note that the findings of the work should not be considered a complete matrix for emotional state evaluation, and provide a 7-point ethical guidelines template that should be considered in utilizing these techniques.

As noted by recent research from the Swedish Media Council, there are many non-textual factors that should be taken into consideration when attempting to gauge the emotional temperature of a narrative, since context, music, visual cues and unspoken temporal factors (such as silence) contribute greatly to the meaning of discourse.

Context is particularly important: one would, for example, learn very little about the emotional state of Keir Dullea's stranded astronaut in Stanley Kubrick's 2001: A Space Odyssey (1968) by studying the script, since that character has been extensively trained to retain a problem-solving mindset in highly stressful circumstances. Additionally, many emotionally discursive movies make sparse use of dialogue.