Quantified Feminism and the Bechdel Test

Project Summary

Selen Berkman (ECE, CompSci), Sammy Garland (Math), and Aaron VanSteinberg (CompSci, English) spent ten weeks undertaking a data-driven analysis of the representation of women in film and in the film industry, with special attention to a metric called the Bechdel Test. They worked with data from a number of sources, including fivethirtyeight.com and the-numbers.com.

Themes and Categories
Year: 2017
Contact: Paul Bendich, Mathematics, bendich@math.duke.edu

Project Results: The team showed that movies that pass the Bechdel test have a statistically significantly higher return on investment than films that do not. To pass, a movie must have at least two named female characters who talk to each other about something other than a man.

They also demonstrated a clear positive relationship between the number of women working on the production of a film and the percentage of female dialogue. Finally, the team expanded the binary Bechdel test into a continuous score called the Bechdel score, and they demonstrated a clear positive correlation between this score and the percentage of overall film dialogue spoken by female characters. 
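The pass/fail criteria of the Bechdel test described above can be illustrated with a minimal Python sketch. This is not the team's code: the `Conversation` record, its `about_a_man` flag, and the `passes_bechdel` function are hypothetical names, and the sketch assumes that each conversation in a film has already been annotated by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Conversation:
    """One annotated conversation in a film (hypothetical structure)."""
    speakers: frozenset  # names of the characters taking part
    about_a_man: bool    # assumed hand-annotated flag


def passes_bechdel(named_women, conversations):
    """Return True if at least two named female characters talk to each
    other about something other than a man."""
    for conv in conversations:
        # Named female characters who take part in this conversation.
        women_in_conv = conv.speakers & named_women
        if len(women_in_conv) >= 2 and not conv.about_a_man:
            return True
    return False


# Usage with made-up characters: one qualifying conversation is enough.
women = {"Ana", "Bea"}
film = [
    Conversation(frozenset({"Ana", "Carl"}), about_a_man=False),
    Conversation(frozenset({"Ana", "Bea"}), about_a_man=False),
]
print(passes_bechdel(women, film))
```

The result is strictly binary: a single qualifying conversation is enough to pass, however little of the film's dialogue it represents. That limitation is one reason a continuous score can be more informative.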


Faculty Lead & Project Manager: Stefan Waldschmidt

Room 351: Sharing Project Space and Coding

"I gained valuable program management experience. Given that after the program was over I got hired as a consultant manager at CollegeVine, I'd say it paid off." — Stefan Waldschmidt, Faculty Lead, Project Manager and PhD Candidate, English at Duke University

Related Projects

Producing oil and gas in the North Sea, off the coast of the United Kingdom, requires a lease to extract resources from beneath the ocean floor, and companies bid for those rights. This team will work with ExxonMobil to understand why these leases are acquired and who benefits. This requires historical bid data to investigate what leads to an increase in the number of (a) leases acquired and (b) companies participating in auctions. The goal of this team is to create a well-structured dataset based on company bid history from the U.K. Oil and Gas Authority; these data will come from many different file structures and formats (tabular, PDF, etc.). The team will curate these data to create a single, tabular database of U.K. bid history and work programs.

Producing oil and gas in the Gulf of Mexico requires rights to extract these resources from beneath the ocean floor, and companies bid into the market for those rights. The top bids are sometimes significantly larger than the next highest bids, but it’s not always clear why this differential exists, and some companies seemingly overbid by large margins. This team will work with ExxonMobil to curate and analyze historical bid data from the Bureau of Ocean Energy Management that contains information on company bid history, infrastructure, wells, and seismic survey data, as well as data from the companies themselves and on geopolitical events. The stretch goal of the team will be to see if they can uncover the rationale behind historic bidding patterns. What do the highest bidders know that other bidders do not (if anything)? What characteristics might incentivize overbidding to minimize the risk of losing the right to produce (i.e., ambiguity aversion)?

In this project, we are interested in creating a cohesive data pipeline for generating, modeling and visualizing basketball data. In particular, we are interested in understanding how to extract data from freely available video, how to model such data to capture player efficiency, strength and leadership, and how to visualize such data outcomes. We will have four separate teams as part of this project working on interrelated but separate goals:

Team 1: Video data extraction

This team will explore different video data extraction techniques with the goal of identifying player locations, ball location and events at any given time during a basketball game. The software developed as part of this project will be able to generate a usable dataset of time-stamped basketball plays that can be used to model the game of basketball.

Teams 2 & 3: Modeling basketball data: offense and defense

The two teams will explore different models for the game of basketball. The first team will concentrate on modeling offensive plays and try to answer questions such as: How does the ball advance? What leads to successful plays? The second team will concentrate on defensive plays: What is an optimal strategy for minimizing opponent scoring opportunities? How should we evaluate defensive plays?

Team 4: Visualizing basketball data

This team will work on dynamic and static visualization of elements of a basketball game. The goal of the visualization is to capture information about how players and the ball move around the court. They will develop tools to represent average trajectories in these settings while also capturing the uncertainty about those trajectories.

Faculty Leads: Alexander Volfovsky, James Moody, Katherine Heller

Project Managers: Fan Bu, Greg Spell, 2 more TBD