Queens of Antiquity

Project Summary

Understanding how to generate, analyze, and work with datasets in the humanities is often difficult without learning how to code. In humanities-centered courses, we often privilege close reading or qualitative analysis over other methods of knowing, but by teaching some new quantitative techniques we better prepare students to tackle new forms of reading. This class will work with data from the HathiTrust to develop ideas about how large groups and different discourse communities thought of queens of antiquity like Cleopatra and Dido.

Please refer to https://sites.duke.edu/queensofantiquity/ for more information.

Graduate Student: Grant Glass

Faculty: Dr. Charlotte Sussman

Course: “Queens of Antiquity” (English 390S-7; Spring 2018)

Grant Glass taught this Data Expedition activity to students in ENGL 290, a spring 2019 course aimed at undergraduates. The experience showed that introducing simple “distant reading” or quantitative concepts in an undergraduate humanities classroom lets students use these tools to drive new types of research questions and to think about how reading can include quantitative analysis.

The goals were to give students an introduction to “distant reading,” show how data and collections are created, what algorithms we can apply to those collections, and what types of analysis we can do from the results.
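As a rough illustration of what a "distant reading" algorithm applied to a collection can look like, the sketch below counts how often a few terms of interest appear in each text. It is not the course's workflow: the three sentences are invented stand-ins for full texts, and the names `collection`, `tokenize`, and `term_counts` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-collection: invented sentences standing in for full texts.
collection = {
    "text_a": "Cleopatra the queen of Egypt sailed to meet Antony.",
    "text_b": "Dido, queen of Carthage, mourned as Aeneas sailed away.",
    "text_c": "The queen Cleopatra ruled Egypt; the queen Dido founded Carthage.",
}


def tokenize(text):
    """Lowercase a string and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(docs, terms):
    """Count how often each term of interest appears in each document."""
    table = {}
    for name, text in docs.items():
        counts = Counter(tokenize(text))
        table[name] = {term: counts[term] for term in terms}
    return table


counts = term_counts(collection, ["cleopatra", "dido", "queen"])
for name, row in counts.items():
    print(name, row)
```

A table like this is the kind of result a visualization would then summarize; it shows where terms cluster, but it says nothing about how each text uses them, which is where close reading returns.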

Over the course of two 1.5-hour class sessions, 10 undergraduates created their own datasets and explored the results. For the end product, students wrote posts discussing how the visualizations created from their collections helped them better understand those collections.

Guiding Questions

  • What visualization is the most useful? Why?
  • What does the visualization help you understand about the corpus? What does it obscure?
  • What research questions can you generate from the visualization?

The Dataset


[Image: Elizabeth 1]



In-Class Exercises

Creating Collections with HathiTrust

Understanding the Visualizations

Related Projects

Fluid mechanics is the study of how fluids (e.g., air, water) move and the forces on them. Scientists and engineers have developed mathematical equations to model the motions of fluid and inertial particles. However, these equations are often computationally expensive, meaning they take a long time for the computer to solve.


To reduce the computation time, we can use machine learning techniques to develop statistical models of fluid behavior. Statistical models do not actually represent the physics of fluids; rather, they learn trends and relationships from the results of previous simulation experiments. Statistical models allow us to leverage the findings of long, expensive simulations to obtain results in a fraction of the time. 


In this project, we provide students with the results of direct numerical simulations (DNS), which took many weeks for the computer to solve. We ask students to use machine learning techniques to develop statistical models of the results of the DNS.
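The idea of a statistical model that learns from earlier simulation results can be sketched in a few lines. This is only an illustration under stated assumptions: `expensive_simulation` is a hypothetical stand-in (it simply squares its input and is not a DNS), and ordinary least squares is used here as the simplest possible learner, not as the method the project prescribes.

```python
def expensive_simulation(x):
    """Hypothetical stand-in for a slow solver (NOT a real DNS): returns x squared."""
    return x * x


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Previous simulation experiments": inputs whose results were already computed.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [expensive_simulation(x) for x in train_x]

slope, intercept = fit_line(train_x, train_y)


def surrogate(x):
    """Cheap statistical prediction; it knows nothing about the underlying process."""
    return slope * x + intercept


new_x = 2.5
print("surrogate prediction:", surrogate(new_x))
print("simulation result:   ", expensive_simulation(new_x))
```

The two printed values differ, which is the trade-off described above: the fitted model answers immediately but only approximates the trend in the data it was trained on.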


Female baboons occasionally exhibit large swellings on their behinds. Although these ‘sexual swellings’ may evoke disgust from human onlookers, they provide important information to group members about a female’s reproductive state. To figure out what these sexual swellings mean and whether male baboons notice, we need to look at the data.

This data expedition explores the relationship between female baboon sexual swellings, female estrogen concentrations, and male mating success. The expedition uses long-term data collected on wild baboons by the Amboseli Baboon Research Project. After learning background information about baboon social lives and reproduction, students generate testable predictions for two hypotheses about baboon reproduction. Students then learn how to use the popular R packages dplyr and ggplot2 to calculate descriptive statistics about the dataset. Finally, students perform data visualization to understand and explore patterns in animal mating behavior and sexual signals.

Learning objectives

  • Learn basics of exploratory data analysis (descriptive statistics, generating plots) in R
  • Learn basics of popular R packages dplyr and ggplot2
  • Increase understanding of association between hormones and mating behavior
  • Increase science literacy skills (e.g. generating predictions, interpreting results)

Course Materials


The lesson started with a brief PowerPoint presentation to introduce the class to basic information on baboon sociality and reproduction. At the end of the presentation, students were introduced to two key hypotheses about baboon reproduction that they then explored using R. The class was divided into small groups in which students worked together to propose predictions that would test these hypotheses (filling out the first section of the provided worksheet) before seeing the provided predictions.

Students then worked through the provided R script and accompanying dataset to test these predictions. Students with prior experience with R were able to skip ahead by following instructions on the R script, while most of the class worked through the script step-by-step with guidance from instructors. The course instructors walked the students through most of the script, then let students work independently to complete Data Visualization Part 2. Students filled out the worksheet as they went along.

At the end of the R script, students replicated Figure 1 from Gesquiere et al. 2007. This figure is included as the final slide of the PowerPoint presentation. The end of the class session was used to interpret the figure and discuss how it relates to the two project hypotheses.

Student level

This lesson is designed for undergraduate students who have little to no exposure to R or other programming software. It could be easily adjusted for students who are familiar with R or other programming software. This lesson takes about 75 minutes to complete.

The dataset and the Amboseli Baboon Research Project

The dataset for this expedition is a subset of the long-term database of the Amboseli Baboon Research Project, a project co-directed by Drs. Jeanne Altmann, Susan Alberts, Beth Archie, and Jenny Tung. The Amboseli Baboon Research Project has collected demographic, behavioral, genetic, and endocrinological data on a population of wild baboons since 1971 in order to study questions related to animal behavior, life history, behavioral ecology, genetics, and physiology. The project’s database is managed by Jake Gordon at Duke University and Niki Learn at Princeton University.

The unit of analysis for this dataset is a fecal estrogen sample from a cycling [1] female. For each fecal sample (n = 843), 6 variables are recorded:

  1. female - identity of the female baboon
  2. cyle_day - day of her reproductive cycle
  3. estrogen - fecal estrogen concentration
  4. swelling_size - sexual swelling size [2]
  5. alpha_consort - whether or not the female consorted [3] with an alpha male [4] on that day
  6. nonalpha_consort - whether or not the female consorted with a non-alpha male on that day

This dataset includes data from 93 female baboons, with approximately 10 fecal estrogen samples per female. Minor differences between this dataset and the dataset used in Gesquiere et al. 2007 are due to small, incremental changes in the database over time.
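The lesson itself uses R with dplyr and ggplot2. Purely as an illustration of the kind of grouped descriptive statistic involved (what a dplyr `group_by` followed by `summarise` would produce), here is a plain-Python sketch. The rows are made up and are not values from the Amboseli data, the numeric scales are arbitrary, and the helper `group_mean` is hypothetical; only the column names come from the variable list above.

```python
from collections import Counter, defaultdict
from statistics import mean

# Made-up rows for illustration only; NOT values from the Amboseli dataset.
samples = [
    {"female": "F01", "estrogen": 40.0, "swelling_size": 2,
     "alpha_consort": False, "nonalpha_consort": False},
    {"female": "F01", "estrogen": 80.0, "swelling_size": 8,
     "alpha_consort": True, "nonalpha_consort": False},
    {"female": "F02", "estrogen": 60.0, "swelling_size": 5,
     "alpha_consort": False, "nonalpha_consort": True},
    {"female": "F02", "estrogen": 100.0, "swelling_size": 9,
     "alpha_consort": True, "nonalpha_consort": False},
    {"female": "F02", "estrogen": 50.0, "swelling_size": 3,
     "alpha_consort": False, "nonalpha_consort": False},
]


def group_mean(rows, key, value):
    """Mean of `value` within each level of `key` (a group-by and summarise)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {level: mean(values) for level, values in groups.items()}


# How many fecal samples does each female contribute?
samples_per_female = Counter(row["female"] for row in samples)

# Does mean estrogen differ on days with and without an alpha-male consort?
estrogen_by_alpha = group_mean(samples, "alpha_consort", "estrogen")

print(dict(samples_per_female))
print(estrogen_by_alpha)
```

With the real data, the same two summaries would be a natural first check before plotting: one confirms the sampling effort per female, the other gives a first look at whether hormone levels and alpha-male consortships move together.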


[1] cycling: sexually mature but not pregnant or lactating
[2] female yellow baboons exhibit exaggerated sexual swellings (an enlargement/engorgement of the genital and perineal skin) around ovulation
[3] consortship: a period in which a male mate-guards a female. Virtually all matings and conceptions occur during consorts

A large and growing trove of patient, clinical, and organizational data is collected as a part of the “Help Desk” program at Durham’s Lincoln Community Health Center. Help Desk is a group of student volunteers who connect with patients over the phone and help them navigate to community resources (like food assistance programs, legal aid, or employment centers). Data-driven approaches to identifying service gaps, understanding the patient population, and uncovering unseen trends are important for improving patient health and advocating for the necessity of these resources. Disparities in food security, economic stability, education, neighborhood and physical environment, community and social context, and access to the healthcare system are crucial social determinants of health, which studies indicate account for nearly 70% of all health outcomes.