Alumni Gifts and Data Analysis

Project Summary

Building off the work of a 2016 Data+ team, Yu Chen (Economics), Peter Hase (Statistics), and Ziwei Zhao (Mathematics) spent ten weeks working closely with analytical leadership at Duke's Office of University Development. The project goal was to identify distinguishing characteristics of major alumni donors and to model their lifetime giving behavior.

Year
2017
Contact
Paul Bendich
Mathematics
bendich@math.duke.edu

Project Results: Working with a large data set of anonymized alumni donors, the team visualized the distinction between minor and major alumni donors in a number of ways, did an in-depth evaluation of the Development office's current donor affinity metric, and clustered donors into groups by giving behavior. 

Sponsored by Alumni Affairs and Development at Duke.


Faculty Leads: Robert Calderbank, Paul Bendich

Project Leads:

Stephen Bayer

Natalie Spring

Ian Conlon

Project Manager: Sheng Jiang

"Very impressed. Smart kids, driven. Asked some really great questions and showed keen interest in the work and potential application. Beyond the intangible benefit of connecting more with the student experience and academic mission of the university, we benefited from having new eyes on our data, fresh questions and insights, and, ultimately, a starting point for future analysis." — Ian Conlon, Associate Director, Prospect Management and Analytics, Duke Office of Prospect Research, Management and Analytics

Related Projects

A large and growing trove of patient, clinical, and organizational data is collected as a part of the “Help Desk” program at Durham’s Lincoln Community Health Center. Help Desk is a group of student volunteers who connect with patients over the phone and help them navigate to community resources (like food assistance programs, legal aid, or employment centers). Data-driven approaches to identifying service gaps, understanding the patient population, and uncovering unseen trends are important for improving patient health and advocating for the necessity of these resources. Disparities in food security, economic stability, education, neighborhood and physical environment, community and social context, and access to the healthcare system are crucial social determinants of health, which studies indicate account for nearly 70% of all health outcomes.

The Air Force’s F-15E Strike Eagle jets have parts that wear down and break, causing unscheduled maintenance events that take away valuable time in the air for critical missions and training. Our team, Limitless Data, is working with Seymour Johnson Air Force Base to mine manually entered maintenance data to visualize and predict aircraft failures. We created a prototype data visualization product that will help maintainers on the flight line identify and repair critical failures before they happen, keeping jets ready to fly, fight and win.

Faculty Lead: Dr. Emma Rasiel

Client Lead: Lt. Devon Burger

Project Manager: Vignesh Kumaresan

Most phenomena that data scientists seek to analyze are either spatially or temporally correlated. Examples of spatial and temporal correlation include political elections, contaminant transfer, disease spread, housing markets, and the weather. A question of interest is how to incorporate the spatial correlation information into modeling such phenomena.

In this project, we focus on the impact of environmental attributes (such as greenness, tree cover, and temperature), along with other socio-demographics and home characteristics, on housing prices by developing a model that takes into account the spatial autocorrelation of the response variable. To this end, we introduce a test to diagnose spatial autocorrelation and explain how to integrate spatial autocorrelation into a regression model.
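The project description does not name the diagnostic test it uses. As an illustration only, the sketch below assumes Moran's I, a common statistic for spatial autocorrelation, with a permutation-based pseudo p-value. The function names (morans_i, permutation_p_value, chain_weights) and the toy "chain" neighborhood structure are hypothetical and are not taken from the project.

```python
import random


def morans_i(values, weights):
    """Moran's I for observations `values` and an n x n spatial weight matrix.

    I = (n / S0) * sum_ij w_ij * z_i * z_j / sum_i z_i^2,
    where z_i are deviations from the mean and S0 is the sum of all weights.
    """
    n = len(values)
    if len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    denom = sum(d * d for d in z)
    if s0 == 0 or denom == 0:
        raise ValueError("weights and values must both be non-degenerate")
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / s0) * (cross / denom)


def permutation_p_value(values, weights, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive spatial autocorrelation.

    Values are shuffled across locations to simulate the null hypothesis
    of no spatial structure.
    """
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least_as_large = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(shuffled, weights) >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (n_perm + 1)


def chain_weights(n):
    """Toy weight matrix (an assumption): locations on a line, adjacent ones are neighbors."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = 1.0
        w[i + 1][i] = 1.0
    return w


if __name__ == "__main__":
    w = chain_weights(6)
    print(morans_i([1, 2, 3, 4, 5, 6], w))       # smooth trend: positive I
    print(morans_i([1, -1, 1, -1, 1, -1], w))    # alternating pattern: negative I
    print(permutation_p_value([1, 2, 3, 4, 5, 6], w, n_perm=199))
```

On this toy chain, the smooth trend yields I = 0.6 and the alternating pattern yields I = -1.0. In a housing-price setting the statistic would typically be applied to the response variable or to regression residuals, with weights built from geographic proximity; a large positive value suggests that a model ignoring spatial structure is misspecified.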

In this data exploration, students are provided with data collected from remote sensing, census, and Zillow sources. Students are tasked with conducting a regression analysis of real-estate estimates against environmental amenities and other control variables which may or may not include the spatial autocorrelation information.