Overview

An overview article by McKinsey: “AI in storytelling: Machines as cocreators”.

Paper

ICDM 2017 short paper: PDF

ICCV 2017 workshop paper (slightly longer, but contains more detail): PDF

Cite as:

Eric Chu and Deb Roy. “Audio-visual Sentiment Analysis for Learning Emotional Arcs in Movies”. Data Mining (ICDM), 2017 IEEE 17th International Conference on. IEEE, 2017.

The Spotify dataset contains audio features such as valence, energy, and speechiness for over 600,000 songs. The full list of features and their descriptions can be found on the Spotify developer site. Links to the 30-second song samples are provided, and the samples can be retrieved using the Spotify API. These songs can also be cross-referenced with the Million Song Dataset (MSD) for further features.

The Movie clip dataset contains emotional labels for ~1000 30-second clips, taken from ~175 movies. Download links for the movie clips are provided within the csv files. Each clip was annotated by 3 crowdsourced workers, who answered the following questions:

1. How positive or negative is this video clip? (1 being most negative, 7 being most positive)
2. How confident are you in your previous answer? (1 being least confident, 10 being most confident)
3. Which emotion(s) does this video clip contain or convey? (check all that apply or none of the above)
    - Options: anger, anticipation, disgust, fear, joy, sadness, surprise, trust, none of the above
4. Which of the following contributed to your decisions? (check all that apply)
    - Options: audio, dialogue, visual (actions, scene, setting)
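
Because each clip has three separate ratings, anyone using the labels will usually need to combine them into a single score per clip. The sketch below shows one possible way to do that: a confidence-weighted mean of the answers to question 1, using the answers to question 2 as weights. This is an illustration only, not necessarily the aggregation used in the papers. The row layout `(clip_id, valence, confidence)` and the clip ids are hypothetical; the actual csv column names may differ.

```python
from collections import defaultdict

# Hypothetical annotation rows: (clip_id, valence 1-7, confidence 1-10).
# The real csv files may use different column names and layout.
annotations = [
    ("clip_a", 6, 9),
    ("clip_a", 5, 6),
    ("clip_a", 7, 10),
    ("clip_b", 2, 8),
    ("clip_b", 4, 2),
    ("clip_b", 3, 5),
]


def aggregate_valence(rows):
    """Return {clip_id: confidence-weighted mean valence}."""
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for clip_id, valence, confidence in rows:
        # Reject ratings outside the scales described in questions 1 and 2.
        if not (1 <= valence <= 7 and 1 <= confidence <= 10):
            raise ValueError(f"rating out of range for {clip_id}")
        weighted_sum[clip_id] += valence * confidence
        total_weight[clip_id] += confidence
    return {
        clip_id: weighted_sum[clip_id] / total_weight[clip_id]
        for clip_id in weighted_sum
    }


scores = aggregate_valence(annotations)
for clip_id, score in sorted(scores.items()):
    print(f"{clip_id}: {score:.2f}")
```

Weighting by confidence lets a worker who was unsure of their rating count for less than one who was certain. An unweighted mean or a median over the three ratings would be equally reasonable choices.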