
Carmen Tu

Can Movie Characteristics Predict Audience Reception?

Hollywood studios and movie producers are keenly interested in determining what types of story scripts will resonate with audiences and critics. A few budding screenwriters approach you, a social psychology researcher specializing in qualitative content analysis, to conduct an analysis of movie plots in order to help them determine the types of storylines that may appeal to mainstream viewers. 150 movies from the last five years were randomly selected and analyzed for plot and character traits. You decide to analyze the following six narrative characteristics for the exploratory study: 1) genre, 2) plot shape, 3) protagonist goal type, 4) protagonist agency, 5) protagonist cooperativeness, and 6) protagonist assertiveness. You are interested to see how these characteristics relate to the following outcomes: 1) average critic rating of the movie (as a percentage score) and 2) the net profit of the movie (in US dollars).

To analyze these six narrative characteristics, you adopt Brown & Tu’s (2020) scheme for plot and Berry & Brown’s (2017) classification scheme for literary characters. The coding schemes for five of the narrative characteristics are as follows:

  1. Genre
Label Code
Drama 1
Comedy 2
Romance 3
Action 4
Horror 5

 

  2. Plot Shape
Label Code
Fall-Rise 1
Fall-Rise-Fall 2
Rise-Fall 3
Rise-Fall-Rise 4

 

  3. Protagonist Goal
Label Code
Striving 1
Coping 2

 

  4. Protagonist Cooperativeness
Label Code
High 1
Medium 2
Low 3

 

  5. Protagonist Assertiveness
Label Code
High 1
Medium 2
Low 3

You also recruit a second coder in order to determine whether there is inter-rater reliability in your coding method.

  1. Load the data file “P07_dataset.csv”. Run descriptive statistics (e.g., measures of frequency and measures of central tendency, including mean, median, and mode where applicable) on each of the six narrative characteristics, using the coding data from Rater 1 (R1). Answer the following questions:
    • What is the most common type of movie genre in the corpus?
    • What is the most common type of plot shape in the corpus?
    • What is the most common type of protagonist goal in the corpus?
    • What is the most common type of protagonist agency in the corpus?
    • What is the most common type of protagonist cooperativeness in the corpus?
    • What is the most common type of protagonist assertiveness in the corpus?
    • What is the mean protagonist agency across the 150 films in the corpus?
    • What is the mean protagonist cooperativeness across the 150 films in the corpus?
    • What is the mean protagonist assertiveness across the 150 films in the corpus?
  2. Answer the following questions about the two outcome variables:
    • Which are the five films with the highest mean critic rating?
    • Which are the five films with the greatest net profit?
    • Which are the five films with the lowest net profit?
  3. The six narrative variables are a mix of nominal data (genre, plot shape, protagonist goal) and ordinal data (protagonist agency, cooperativeness, assertiveness). Genre is the only variable that is not coded by raters. What are the relationships between the nominal variables? To determine this, please answer the following questions:
    • Which genre of film has the highest percentage of type 1 (fall-rise) plot shapes?
    • What is the percent distribution of plot shapes in each genre?
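
The descriptive and cross-tabulation questions above can be approached along the following lines. This is only a minimal sketch using Python's standard library. The rows are made-up stand-ins for P07_dataset.csv, and the column names (`genre`, `plot_shape`, `assertiveness`, `critic_rating`) and helper functions are assumptions for illustration, not the actual headers of the data file.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical rows standing in for P07_dataset.csv (real column names may differ).
# With the real file, csv.DictReader would yield strings that need converting to numbers.
movies = [
    {"title": "Film A", "genre": 1, "plot_shape": 1, "assertiveness": 1, "critic_rating": 91.0},
    {"title": "Film B", "genre": 1, "plot_shape": 1, "assertiveness": 2, "critic_rating": 78.0},
    {"title": "Film C", "genre": 1, "plot_shape": 3, "assertiveness": 3, "critic_rating": 64.0},
    {"title": "Film D", "genre": 2, "plot_shape": 1, "assertiveness": 2, "critic_rating": 55.0},
    {"title": "Film E", "genre": 2, "plot_shape": 4, "assertiveness": 2, "critic_rating": 83.0},
    {"title": "Film F", "genre": 4, "plot_shape": 2, "assertiveness": 1, "critic_rating": 47.0},
]


def most_common_code(rows, column):
    """Return the most frequent code (the mode) in one column."""
    return Counter(row[column] for row in rows).most_common(1)[0][0]


def row_percentages(rows, row_var, col_var):
    """Percent distribution of col_var within each level of row_var."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[row_var]][row[col_var]] += 1
    return {
        level: {code: 100 * n / sum(cnt.values()) for code, n in cnt.items()}
        for level, cnt in counts.items()
    }


def top_n(rows, column, n=5):
    """Return the n rows with the largest value in the given column."""
    return sorted(rows, key=lambda row: row[column], reverse=True)[:n]


pct = row_percentages(movies, "genre", "plot_shape")
print("Most common genre code:", most_common_code(movies, "genre"))
print("Mean assertiveness code:", mean(m["assertiveness"] for m in movies))
print("Plot shape percentages by genre:", pct)
print("Top two by critic rating:", [m["title"] for m in top_n(movies, "critic_rating", 2)])
```

Because the ordinal codes run from 1 (High) to 3 (Low), a lower mean code corresponds to a higher average level of the trait.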

4. Visualize these plot shape distributions across each genre in a grouped bar plot. Be sure to label the y-axis, x-axis, and legend.

    • As practice, create grouped bar graphs between any other pair of variables in order to visualize any interactions between the variables.

5. Run comparative analyses to see which, if any, of the six characteristics are related to one another. Because the six characteristics are categorical rather than normally distributed, use non-parametric tests, such as the Chi-Square test of independence. For example, is the relationship between genre and plot shape statistically significant?

    • The null hypothesis is that the two variables are independent, so any difference between the observed and expected frequencies is due to chance.
    • A significant Chi-square result will allow us to reject the null hypothesis and consider an alternative hypothesis where the difference may be due to the relationship between the two variables.
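
To make the logic of the test concrete, here is a minimal sketch of a Pearson Chi-Square test of independence written with the standard library only. The contingency table is invented for illustration (it is not drawn from P07_dataset.csv), and the function names are hypothetical. The sketch assumes every row and column contains at least one observation. In practice you would more likely use a statistics package for this.

```python
import math


def chi2_sf(x, dof):
    """Upper-tail probability of the chi-square distribution.

    Uses the series expansion of the regularized lower incomplete gamma function.
    """
    if x <= 0:
        return 1.0
    a, half_x = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-15 * total:
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, p-value) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof, chi2_sf(statistic, dof)


# Hypothetical counts: rows are two genres, columns are three plot shapes.
observed_counts = [[12, 5, 8], [6, 14, 5]]
stat, dof, p_value = chi_square_independence(observed_counts)
print(f"chi-square = {stat:.3f}, df = {dof}, p = {p_value:.4f}")
```

A p-value below your chosen significance level (commonly 0.05) would lead you to reject the null hypothesis of independence. A common rule of thumb is that the approximation becomes unreliable when many expected counts fall below 5, which is worth checking with five genres and four plot shapes spread over 150 films.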

6. Run an analysis of variance (ANOVA) to see which, if any, of the five rater-coded variables are related to either of the two outcome variables. Consider whether you should run a 2-way or 3-way factorial ANOVA. What assumptions do you need to consider and test before running an ANOVA?
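
As a starting point, the sketch below computes the F statistic for a one-way ANOVA by hand, which is a deliberate simplification of the factorial designs mentioned above. The ratings are invented, the grouping by cooperativeness level is only an example, and the function name is hypothetical. It omits the p-value, which would normally come from a statistics package (for instance `scipy.stats.f_oneway` for the one-way case).

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F statistic, between-groups df, within-groups df)."""
    group_means = [mean(g) for g in groups]
    all_values = [value for g in groups for value in g]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual observations around their own group mean.
    ss_within = sum(
        (value - m) ** 2 for g, m in zip(groups, group_means) for value in g
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical critic ratings grouped by protagonist cooperativeness (High, Medium, Low).
ratings_by_level = [
    [82.0, 75.0, 90.0, 68.0],
    [60.0, 71.0, 66.0, 58.0],
    [55.0, 49.0, 63.0, 52.0],
]
f_stat, df_b, df_w = one_way_anova(ratings_by_level)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The usual assumptions to check are independence of observations, approximately normal residuals within each group, and roughly equal variances across groups. A factorial ANOVA adds further sums of squares for each additional factor and for their interactions.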

7. Run a cluster analysis to see which, if any, of the categories in the six narrative characteristic variables cluster together.
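
The task does not prescribe a clustering method, so the following is just one possible approach: a simplified k-modes-style procedure, which groups records of categorical codes by counting mismatches (Hamming distance) and summarizes each cluster by its most frequent code per variable. The records, the choice of two clusters, and the function names are all assumptions for illustration. The cluster modes then show which categories tend to occur together.

```python
from collections import Counter


def hamming(a, b):
    """Number of positions at which two records carry different codes."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, k, max_iter=20):
    """Cluster categorical records; returns (labels, modes)."""
    # Deterministic start: the first k distinct records serve as initial modes.
    modes = []
    for rec in records:
        if rec not in modes:
            modes.append(rec)
        if len(modes) == k:
            break

    labels = None
    for _ in range(max_iter):
        # Assign each record to the nearest mode.
        new_labels = [
            min(range(len(modes)), key=lambda c: hamming(rec, modes[c]))
            for rec in records
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Update each mode to the most frequent code per variable in its cluster.
        for c in range(len(modes)):
            members = [rec for rec, lab in zip(records, labels) if lab == c]
            if members:
                modes[c] = tuple(
                    Counter(column).most_common(1)[0][0] for column in zip(*members)
                )
    return labels, modes


# Hypothetical records of (genre, plot shape, protagonist goal) codes.
records = [(1, 1, 1), (4, 3, 2), (1, 1, 1), (1, 1, 2), (4, 3, 2), (4, 3, 1)]
labels, modes = k_modes(records, k=2)
print("Cluster labels:", labels)
print("Cluster modes:", modes)
```

Results from this kind of procedure depend on the starting modes and on the number of clusters, so a real analysis would typically try several starts and several values of `k` before interpreting the clusters.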

8. Determine the inter-rater reliability between the two coders for each of the five rater-coded variables. A Cohen’s Kappa score greater than 0.8 reflects strong inter-rater agreement.
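
Cohen’s Kappa compares the observed proportion of agreement between the two raters with the agreement expected by chance from each rater's marginal frequencies: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes the unweighted version with the standard library. The two code lists are invented and the function name is hypothetical. With the real data you would pass in the Rater 1 and Rater 2 columns for one variable at a time.

```python
from collections import Counter


def cohens_kappa(rater1, rater2):
    """Unweighted Cohen's kappa for two equal-length sequences of category codes."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("Both raters must code the same non-empty set of items.")
    n = len(rater1)

    # Observed agreement: share of items given the same code by both raters.
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n

    # Chance agreement: product of the raters' marginal proportions, summed over codes.
    counts1, counts2 = Counter(rater1), Counter(rater2)
    expected = sum(counts1[code] * counts2[code] for code in counts1) / (n * n)

    if expected == 1.0:
        raise ValueError("Kappa is undefined when both raters use a single identical code.")
    return (observed - expected) / (1 - expected)


# Hypothetical protagonist goal codes (1 = Striving, 2 = Coping) from two raters.
r1_goal = [1, 1, 2, 2, 1, 2, 1, 1, 2, 2]
r2_goal = [1, 1, 2, 2, 1, 2, 1, 2, 2, 2]
print(f"kappa = {cohens_kappa(r1_goal, r2_goal):.2f}")
```

For the ordinal variables (agency, cooperativeness, assertiveness), a weighted kappa that penalizes near-misses less than distant disagreements may be worth considering. The unweighted version shown here treats every disagreement equally.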

Files to Download:

  1. P07_dataset.csv

References for further reading

Berry, M., & Brown, S. (2017). A classification scheme for literary characters. Psychological Thought, 10(2).

Brown, S., & Tu, C. (2020). The shapes of stories: A “resonator” model of plot structure. Frontiers of Narrative Studies, 6(2), 259-288.