Prolific Pigs? Mating Reproductive Capacity with Market Price in Early Twentieth Century Pig Breeding

Project Summary

Yanmin (Mike) Ma, mathematics/economics major, and Manchen (Mercy) Fang, electrical and computer engineering/computer science major, spent ten weeks studying historical archives and building a model to predict the price of Berkshire pigs from factors such as a pig's age, its number of offspring, and the prevailing pork price.

Contact
Paul Bendich
Mathematics
bendich@math.duke.edu

Project Results

The transaction price of a Berkshire pig is influenced by factors including the pig's age, its number of offspring, and the prevailing pork price. We conclude that between 1910 and 1920, breeders and buyers priced Berkshire pigs rationally to a large extent.

Download the executive summary (PDF).

Disciplines Involved

  • Economics
  • History

Project Team

Undergraduates: Yanmin (Mike) Ma, mathematics/economics major, and Manchen (Mercy) Fang, electrical and computer engineering/computer science major

Client: Gabriel Rosenberg, Assistant Professor, Women's Studies Program

Graduate student mentor: Chris Glynn, Department of Statistical Science


From left to right: Manchen (Mercy) Fang; Chris Glynn; Yanmin (Mike) Ma

"Working with Mercy Fang and Mike Ma has been a pleasure. Throughout the Data+ experience, Mike and Mercy demonstrated creativity, diligence, and technical ability in their research on Berkshire pigs. Our primary objective was to build a data set that would cross-reference genealogical, market, and geographic data on individual Berkshire pigs. To collect this data, Mike and Mercy worked with original scans of archived primary source publications. Together, they developed efficient algorithms and error-correcting mechanisms to extract data from PDF documents. Mike and Mercy taught me a lot about working with text data, the perils of optical character recognition, and the factors that drove hog markets in the early 1900s.

The pride that they took in their research is evident in their Data+ seminar talks and their final poster. Most impressively, Mike and Mercy were extremely independent and self-reliant. They formulated their own questions, developed strategies for answering them, and implemented solutions on their own. They are very promising young researchers." — Chris Glynn, graduate student mentor
