2020 Projects
This data expeditions module used three full course sessions to introduce undergraduate hydrology students with minimal programming background to: public water data (water quantity and chemistry); spatial analysis of water data; two core spatial datasets produced by the USGS that enable spatial analysis; the programming language R; R-based tools for water...
Fluid mechanics is the study of how fluids (e.g., air, water) move and the forces on them. Scientists and engineers have developed mathematical equations to model the motions of fluid and inertial particles. However, these equations are often computationally expensive, meaning they take a long time for the computer to...
The Air Force’s F-15E Strike Eagle jets have parts that wear down and break, causing unscheduled maintenance events that take away valuable time in the air for critical missions and training. Our team, Limitless Data, is working with Seymour Johnson Air Force Base to mine manually entered maintenance data to...
Most phenomena that data scientists seek to analyze are either spatially or temporally correlated. Examples of spatially and temporally correlated phenomena include political elections, contaminant transfer, disease spread, housing markets, and the weather. A question of interest is how to incorporate the spatial correlation information into modeling such phenomena. In this...
Over the course of two one-and-a-half-hour sessions, we led students in the Duke Marine Lab Marine Ecology class (Biology 273LA) on a data expedition using the statistical programming environment R. We gave an introduction to big data, the role of big data in ecology, important things...
This project aims to improve the computational efficiency of signal operations, e.g., sampling and multiplying signals. We design machine learning-based signal processing modules that use an adaptive sampling strategy and interpolation to generate a good approximation of the exact output. While ensuring a low error level, improvements in computational efficiency...
Mapping History has focused on the categorization, labelling, digitization, and 3D reconstruction of 16th- and 17th-century maps and atlases of London and Lisbon. Over the course of the summer, the Mapping History team has developed its own unique analytical dataset by painstakingly labelling every element contained within these maps, used Python...
For our Data+ project, we partnered with Rewriting the Code (RTC), a non-profit organization committed to empowering and fostering a community of college women with a passion for technology. We developed company and industry profiles for the recruitment process that included information ranging from interview and offer rate to negotiation...
Our team used artificial intelligence to help Duke University Management Company (DUMAC) operate more efficiently by building a cost optimization tool and analyzing and visualizing venture capital data. In our first project centered around cost optimization, we designed a Python script that suggests optimal cash transfers between prime brokers. In...
We use elements of data science and analysis to scour web logs for potential malicious attacks on Duke’s servers. Additionally, we seek to identify patterns within the data that could be indicative of malicious intent and hope to apply these to real-time data. Project Leads: Phillip Batton, Nick Tripp Project...
Predictive Churn Models for Duke Season Ticket Holders and Annual Donors centers on understanding which annual donors are most likely to churn, i.e., not donate the following year. To solve this problem, the team built different models to predict the profiles and timing of donor churn. The team made...
Our team members have spent the summer working with the North Carolina Division of Public Health Occupational and Environmental Epidemiology Branch to build a pilot environmental public health data dashboard, with the hope that the pilot tool will be used in DPH’s grant proposal to the CDC for a fully-funded...
Carrying forward the work of a 2019-20 Bass Connections team, our Data+ team has worked to better understand the state of the home mortgage market leading up to the financial crisis. The team has built a more in-depth analysis of North Carolina to understand its different regions. We have also...
Our group aims to reveal the effects of urban and agricultural land use on metabolic productivities of rivers through statistical manipulation and visualization. During this summer, we classified sites and conducted covariate analyses based on patterns of metabolism, and produced reproducible code that can be used by researchers with similar...
Our team used years of unanalyzed data in a cloud computing environment to conduct exploratory data analysis using natural language processing techniques, as well as visualizations, for Fleet Management Limited. Through this, and preliminary predictive modelling, we hope to help management decrease the number of preventable incidents as each one...
We trained an object detection model to locate wind turbines in overhead satellite imagery. Because these deep learning models require large amounts of training data, and satellite imagery of wind turbines is rare and expensive to collect, we created synthetic satellite imagery using 3D modeling software. We then supplemented our...
Astronomers from the Dark Energy Survey rely on images of deep space to understand the nature of the universe, but these images are often polluted with “space junk”: asteroids, comets, satellites, or other objects from our own solar system obstructing the telescope’s view. In order to perform their analysis, scientists...
In collaboration with Data and Analytics Practice at OIT, our team has completed a series of critical analyses aiding Duke Facilities Management in further optimizing campus energy usage. Data cleaning tools, imputation techniques, and a variety of time series prediction methods ranging from autoregressive models to deep learning networks have...