Data & Digital Humanities

The humanities-based projects within iiD couple big data analysis with the interpretive work usually done by humanists.

The data sets for these projects include collections of texts, images, videos, and audio—in other words, they are digital archives broadly understood. From analyzing the numerous editions of Defoe’s Robinson Crusoe to understanding the narrative created by the thousands of photojournalistic depictions of Syrian refugees to virtually restoring medieval art, these groups ask traditional humanistic questions, but explore them with quantitative as well as qualitative analysis.

Discussing a Data+ and Digital Humanities project

Why Data+ for the Humanities?

These humanities projects originate from English, art history, and mathematics faculty and graduate students. The sponsors and mentors direct projects that represent the historical, methodological, and theoretical interests of their own research and teaching areas. By developing these projects through Data+, however, they can work collaboratively with undergraduate students to meet time-consuming technical and computational challenges using skill sets that often fall outside the usual humanities repertoire. At the same time, undergraduate students are introduced to humanistic studies outside the usual classroom setting, learning how to work attentively and closely with archives and conceptual tools for ten weeks over the summer.

Data+ Projects

A team of students will use a variety of data sets and mapping technologies to determine a feasible location for a deep-sea memorial to the transatlantic slave trade. While scholars have studied the overall mortality of the slave trade, little is known about where these deaths occurred. New mapping technologies can begin to supply this data. Led by English professor Charlotte Sussman, in association with the Representing Migrations Humanities Lab, this team will create a new database that combines previously disparate data and archival sources to discover where on their journeys enslaved persons died, and then visualize these journeys. This project will employ the resources of digital technologies as well as the humanistic methods of history, literature, philosophy, and other disciplines. The project welcomes students from a broad range of disciplines: computer science; mathematics; English and literature; history; African and African American studies; philosophy; art history; visual and media studies; geography; climatology; and ocean science.

 

Image credit:

J.M.W. Turner, Slave Ship, 1840, Museum of Fine Arts, Boston (public domain)

Faculty Lead: Charlotte Sussman

Project Manager: Emma Davenport

The American public first encountered the term “genocide” in a Washington Post op-ed published in 1944; since then, the word’s meaning has been circulated, debated, and shaped by numerous forces, especially by words and images in newspapers. With the support of Dr. Priscilla Wald (English), a team of students led by Nora Nunn (English graduate student) and Astrid Giugni (English and ISS) will analyze how U.S. mass media—particularly newspapers—enlist text and imagery such as press photographs to portray genocide, human rights, and crimes against humanity from World War II to the present. From the Holocaust to Cambodia, from Rwanda to Myanmar, such representation has political consequences. If time allows, students will also study the representation of collective violence in Hollywood film, querying the relationship between human rights and genre. The implications of these findings could inform future coverage of human rights-related issues at home and abroad.

Faculty Leads: Nora Nunn, Astrid Giugni

How Much Profit is Too Much Profit?

A team of students led by history professor Sarah Deutsch will mine newspaper and Congressional databases to investigate the dynamics behind the excess profits tax laws Congress passed between 1918 and 1948 and the concept of price gouging, which continues to shape legislation today. As of 2018, numerous states have price gouging laws. Why? How did they define what was excessive? How did this critique of profit-making become mainstream without endangering capitalism? By searching extant newspaper and Congressional databases for the frequency and context of particular words and phrases, the project will begin to uncover the logic and language, and the partisanship or lack of it, used to critique profits at three moments in U.S. history that resulted in government action to limit profit-making.

(cartoon from The Masses July 1916)

Faculty Lead: Sarah Deutsch

Project Manager: TBD

Have you ever read a book or watched a movie and realized that you have seen the same story before? How do you know if you are watching an adaptation? A team of students led by UNC-Chapel Hill graduate student Grant Glass will develop means to track the movement of adaptations within contemporary culture through machine learning techniques. Drawing upon textual information from historical and digital sources, the project team will have the opportunity to work with many different types of data. Students will identify features of different master narratives, which will be used to demonstrate how certain stories are modified and retold over and over again. After creating this training dataset, the team will use algorithms to identify adaptations in previously unidentified works. This will allow scholars to better understand at scale how certain narratives are adapted into new stories and forms.

Faculty Lead: Grant Glass

Project Manager: TBD

Nathan Liang (Psychology, Statistics), Sandra Luksic (Philosophy, Political Science), and Alexis Malone (Statistics) began their 10-week project as an open-ended exploration of how women are depicted both physically and figuratively in women's magazines, seeking to consider what role magazines play in the imagined and real lives of women.

Click here to read the Executive Summary

In tracing the publication history, geographical spread, and content of “pirated” copies of Daniel Defoe’s Robinson Crusoe, Gabriel Guedes (Math, Global Cultural Studies), Lucian Li (Computer Science, History), and Orgil Batzaya (Math, Computer Science) explored the complications of looking at a data set that saw drastic changes over the last three centuries in terms of spelling and grammar, which offered new challenges to data cleanup. By asking questions of the effectiveness of “distant reading” techniques for comparing thousands of different editions of Robinson Crusoe, the students learned how to think about the appropriateness of myriad computational methods like doc2vec and topic modeling. Through these methods, the students started to ask, at what point does one start seeing patterns that were invisible at a human scale of reading (reading one book at a time)? While the project did not definitively answer these questions, it did provide paths for further inquiry.

The team published their results at: https://orgilbatzaya.github.io/pirating-texts-site/

Click here for the Executive Summary

Ashley Murray (Chemistry/Math), Brian Glucksman (Global Cultural Studies), and Michelle Gao (Statistics/Economics) spent 10 weeks analyzing how the meaning and use of the word “poverty” changed in presidential documents from the 1930s to the present. The students found that American presidential rhetoric about poverty has shifted in measurable ways over time. Presidential rhetoric, however, doesn’t necessarily affect policy change. As Michelle Gao explained, “The statistical methods we used provided another more quantitative way of analyzing the text. The database had around 130,000 documents, which is pretty impossible to read one by one and get all the poverty related documents by brute force. As a result, web-scraping and word filtering provided a more efficient and systematic way of extracting all the valuable information while minimizing human errors.” Through techniques such as linear regression, machine learning, and image analysis, the team effectively analyzed large swaths of textual and visual data. This approach allowed them to zero in on significant documents for closer and more in-depth analysis, paying particular attention to documents by presidents such as Franklin Delano Roosevelt or Lyndon B. Johnson, both leaders in what LBJ famously called “The War on Poverty.”

Click Here for the Executive Summary

Robbie Ha (Computer Science, Statistics), Peilin Lai (Computer Science, Mathematics), and Alejandro Ortega (Mathematics) spent ten weeks analyzing the content and dissemination of images of the Syrian refugee crisis, as part of a general data-driven investigation of Western photojournalism and how it has contributed to our understanding of this crisis.

Selen Berkman (ECE, CompSci), Sammy Garland (Math), and Aaron VanSteinberg (CompSci, English) spent ten weeks undertaking a data-driven analysis of the representation of women in film and in the film industry, with special attention to a metric called the Bechdel Test. They worked with data from a number of sources, including fivethirtyeight.com and the-numbers.com.

Liuyi Zhu (Computer Science, Math), Gilad Amitai (Masters, Statistics), Raphael Kim (Computer Science, Mechanical Engineering), and Andreas Badea (East Chapel Hill High School) spent ten weeks streamlining and automating the process of electronically rejuvenating medieval artwork. They used a 14th-century altarpiece by Francescuccio Ghissi as a working example.

Spenser Easterbrook, a Philosophy and Math double major, joined Biology majors Aharon Walker and Nicholas Branson in a ten-week exploration of the connections between journal publications from the humanities and the sciences. They were guided by Rick Gawne and Jameson Clarke, graduate students from Philosophy and Biology.

Data Expeditions Projects

This data expedition introduced students to “sliding windows and persistence” on time series data, an algorithm that turns a one-dimensional time series into a geometric curve in high dimensions and then quantitatively analyzes hybrid geometric/topological properties of the resulting curve, such as “loopiness” and “wiggliness.”
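The sliding-window step can be illustrated with a minimal sketch. It is not the expedition's own code: the function name `sliding_window`, the parameters `dim` and `tau`, and the toy cosine signal are all assumptions chosen for illustration. Each window of the series becomes one point in `dim`-dimensional space, so a periodic signal traces out a closed loop. The persistence step, which would measure that "loopiness," typically relies on a topological data analysis library and is omitted here.

```python
import math


def sliding_window(series, dim, tau=1):
    """Turn a 1-D series into a list of points in `dim`-dimensional space.

    Point i is (series[i], series[i + tau], ..., series[i + (dim - 1) * tau]).
    `dim` and `tau` are illustrative parameter names, not taken from the source.
    """
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    span = (dim - 1) * tau  # distance between first and last sample in a window
    if len(series) <= span:
        raise ValueError("series is too short for this dim and tau")
    return [
        tuple(series[i + j * tau] for j in range(dim))
        for i in range(len(series) - span)
    ]


# Toy periodic signal (hypothetical data): a cosine with a period of 20 samples.
signal = [math.cos(2 * math.pi * t / 20) for t in range(100)]

# With dim=2 and tau equal to a quarter period, each point is (cos a, -sin a),
# so the resulting curve lies on the unit circle: a single loop.
curve = sliding_window(signal, dim=2, tau=5)
radii = [math.hypot(x, y) for x, y in curve]
```

In this toy case every point sits at distance 1 from the origin, which is the kind of loop structure that persistence would then quantify. A noisy or non-periodic series would give a less regular curve.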

What drove the prices for paintings in 18th Century Paris?