2020 Projects
Traditional Human Activity Recognition (HAR) uses accelerometry (movement) data to classify activities. This summer, Team #4 examined whether physiological sensors can improve HAR accuracy and generalizability. The team developed ML models that will be available as open source in the Digital Biomarker Discovery Pipeline (DBDP) to enable other researchers...
Our project builds a food recommendation system for Avoidant/Restrictive Food Intake Disorder (ARFID) patients and examines the relationship between ARFID and clinical variables. Our stakeholders include young picky eaters and their parents, as well as clinicians who work with ARFID patients. We created an interactive visualization for ARFID patients...
We all need water to survive, but how many of us really know where our water comes from? Team 3 has built a functioning website from scratch to give consumers more accessible information about who provides their water and how supply compares with the past 30 years....
The Disease Emergence and Richness in Primates team uses existing databases to quantify parasite richness across primates and to identify ecological predictors of parasitism. By integrating phylogenetic generalized least squares regression and network-based approaches, the team ultimately aims to predict missing interactions between primates and parasites, which, combined with...
Between 1935 and 1945, rural electricity access shot up from roughly 10% to 90%. During this time, the Rural Electrification Administration funded an Electric Farm Equipment (EFE) Roadshow as part of its mission to expand electricity access and demand. Digitizing massive amounts of archival data, our team has sought to...
Duke’s enrollment data over the past 50 years speaks volumes about the evolution of the University. Using physical scans courtesy of the Duke University Archives, we manipulate decades of demographic and geospatial data across all Duke schools to create an interactive application (previously available at http://sites.duke.edu/bluedevilbreakdown). Giving users control over how...
The Protecting American Investors project investigates the evolving structure and content of financial advice from the early 20th century to the birth of the Internet. By converting and cleaning thousands of investment advice columns from historical newspapers and magazines, we assembled a large corpus to address our research questions. Through...
We apply word embedding models to corpora from the start of the Early Modern period, when the market economy began to expand dramatically in England. Word embedding models use neural networks to map words to vectors so that semantic relationships are preserved within the vectors’ geometry. Such models have been...
This project aims to analyze assessment and performance data collected from baseball players to make predictions about baseball performance based on vision and physical abilities. We use hierarchical regression analyses to identify characteristics that correlate with batting performance in order to inform scouts about the likely production of developmental prospects....
In light of Duke’s reopening amidst the COVID-19 pandemic, this project aims to track foot traffic in and around the Bryan Center by analyzing Wi-Fi log data from all users connected to wireless networks in the center during February 2020. Our team employed Markov chains, kernel density estimation,...
This project involves predicting the incidence of blindness in glaucoma patients at Duke Eye Center (DEC), specifically the likelihood of a patient presenting legally blind (i.e., with very advanced disease) at their first visit. We will assemble a novel data set of electronic health records from thousands of DEC...
In collaboration with the Data and Analytics Practice at OIT, our team has completed a series of critical analyses to help Duke Facilities Management further optimize campus energy usage. Data cleaning tools, imputation techniques, and a variety of time series prediction methods ranging from autoregressive models to deep learning networks have...
Astronomers from the Dark Energy Survey rely on images of deep space to understand the nature of the universe, but these images are often polluted with “space junk”: asteroids, comets, satellites, or other objects from our own solar system obstructing the telescope’s view. In order to perform their analysis, scientists...
We trained an object detection model to locate wind turbines in overhead satellite imagery. Because these deep learning models require large amounts of training data, and satellite imagery of wind turbines is rare and expensive to collect, we created synthetic satellite imagery using 3D modeling software. We then supplemented our...
Our team used years of unanalyzed data in a cloud computing environment to conduct exploratory data analysis using natural language processing techniques, as well as visualizations, for Fleet Management Limited. Through this, and preliminary predictive modelling, we hope to help management decrease the number of preventable incidents as each one...
Our group aims to reveal the effects of urban and agricultural land use on metabolic productivities of rivers through statistical manipulation and visualization. During this summer, we classified sites and conducted covariate analyses based on patterns of metabolism, and produced reproducible code that can be used by researchers with similar...