PLUS PROGRAM EXPO 2021 August 6th!

The Data+, Code+, and CS+ undergraduate summer programs will hold an online expo to showcase student projects leveraging big data, mobile app and web development, and computer science on August 6th, 2021, from 10:00 a.m. to 12:00 p.m. and 1:00 to 3:00 p.m. EST.


Each 10-week co-curricular summer program gives students valuable experience working with industry professionals as well as Duke faculty and staff to produce results or products that address a question or solve an issue relevant to the Duke community and beyond.


Student teams -- over 150 students in all -- will present their projects in "Zoom rooms" hosted by program administrators and project leads, with time for Q&A.





AUGUST 6th, 2021 10:00 a.m. – 12:00 p.m. and 1:00 – 3:00 p.m.


Meeting ID: 926 4918 5685

*For the most seamless Zoom experience, please make sure your software is up to date and you are signed into your Zoom account!*


10-10:30 am:

Breakout Room 1:

Team 1: Rubenstein Library’s Card Catalog File

Breakout Room 2:

Team 12: Understanding Voting Patterns and Interactions with Gerrymandering

Breakout Room 3:

Team 18: Pivers Island Coastal Observatory


10:30-11 am:

Breakout Room 1:

Team 7: Learning More about Durham Public Schools

Breakout Room 2:

Team 9: Feature Engineering for Unknown Web Attacks



11-11:30 am:

Breakout Room 1:

Team 11: Predicting Baseball Players’ Athletic Performance

Breakout Room 2:

Team 16: Using Natural Language Processing to Facilitate Keyword Search & Context Extraction from PDFs

Breakout Room 3:

Team 22: Ethical Consumption Before Capitalism


11:30 am - noon:

Breakout Room 1:

Team 2: Rainforest XPRIZE: Community Biodiversity Data Collection

Breakout Room 2:

Team 5: The Impact of Virtualized Museum Collections

Breakout Room 3:

Team 23: American Predatory Lending


1-1:30 pm:

Breakout Room 1:

Team 10: Developing Predictive Models for COVID-19 with Wearables Data

Breakout Room 2:

Team 14: Dashboard for Public Investments

Breakout Room 3:

Team 17: Constructing Utopias in Restoration London


1:30-2 pm:

Breakout Room 1:

Team 3: Mental Health and the Justice System in Durham County

Breakout Room 2:

Team 8: From Farm to Table: Data analysis and tools for more equitable and sustainable food systems

Breakout Room 3:

Team 19: Creating artificial worlds with AI to improve energy access data


2-2:30 pm:

Breakout Room 1:

Team 4: Mapping the Training and Career Trajectories of Duke Graduate Students

Breakout Room 2:

Team 13: Agile Waveform Design in the Presence of Opposition: Quantifying Performance

Breakout Room 3:

Team 15: Data Mapping & Visualization to Prioritize Child & Family Health


2:30-3 pm:

Breakout Room 1:

Team 6: Modeling microbial growth and resilience

Breakout Room 2:

Team 20: Racial Disproportionality and Disparities in the Child Welfare System: Child Protective Services and Foster Care

Breakout Room 3:

Team 21: Applications of Graph Similarity Scoring and Matching


All sessions are open to registrants.


To register for these events, please visit our Data+, Code+ and CS+ event pages:

For Data+

For CS+

For Code+


Join us as we celebrate the efforts and accomplishments of our program participants!



Data+ is a full-time ten-week summer research experience that welcomes Duke undergraduate and master's students interested in exploring new data-driven approaches to interdisciplinary challenges. It is suitable for students from all class years and from all majors.

Students join small project teams (at most 3 undergrads and 1 master's student per team), working alongside other teams in a communal environment. They learn how to marshal, analyze, and visualize data, while gaining broad exposure to the modern world of data science. The projects (see below) come from an extremely diverse set of subject areas. It is our hope that students will be able to both dig deeply into their specific project and get a very broad picture of most of the skills needed for modern data science.

Participants will receive a $5,000 stipend, out of which they must arrange their own housing and travel. Funding and infrastructure support are provided by a wide range of departments, schools, and initiatives from across Duke University, as well as by outside industry and community partners.

Data+ is typically a program where students have dedicated workspace within Gross Hall at Duke University. Last summer (2020), Data+ ran entirely remotely due to the pandemic, and was quite successful. We are currently awaiting university guidance for Summer 2021, and hope to update this site shortly as to whether Data+ 2021 will be in-person or remote. In either case, it will happen!

The program runs from Monday, June 1st until Friday, August 6th 2021.

You will find the projects planned for summer 2021 below. Click on the project names to learn more. 

Check out what some of our 2020 Data+ students had to say about their projects and their experience in Data+ below:




  • I learned there’s much more to it than looking at data. It’s also a way of thinking and organizing what you have analyzed to help others who aren’t able to look at data in such a way to understand it. It’s also a bit of storytelling in a way.

    - Jessica Ho, Math and Neuroscience ‘22
    Predicting Baseball Players’ Athletic Performance Utilizing Baseline Assessments of Vision

  • I didn't really know how data science research applied to social science, but Data+ showed me that it can be a really successful avenue for discovery and change.

    - Nick Datto, Neuroscience, Computer Science, and Cultural Anthropology ‘23
    Race and Housing in Durham over the Course of the 20th Century

  • I’ve learned how interdisciplinary data science is, and how a team of people with many different academic trajectories can work together on the same project, something that I don't think happens very often in other areas.

    - Anonymous

  • What I have discovered is that a majority of data research is about communication. How you interact with your teammates and superiors is just as important, if not more important, than being a genius in your field.

    -  Andrew Scofield, Computer Science ’22, Birmingham-Southern College
    For love of greed: tracing the early history of consumer culture

  • I had expected it to be very analytical, but I was surprised at the creativity that was also required. I enjoyed this aspect a lot.

    - Amber Potter, Computer Science ‘23
    Predicting Baseball Players’ Athletic Performance Utilizing Baseline Assessments of Vision

  • I've gained a lot of valuable insight into the career fields of environmental health and epidemiology. I've also learned a lot about project workflow and how to work through the different phases of a long term project with a team. In addition, my skills in R coding and Tableau have improved a ton.

    - Anonymous

  • My group has been focused on cybersecurity and automation methods to prevent and seek out attackers to keep Duke websites and accounts from being compromised. I have learned a lot about cybersecurity, a field that I otherwise might not have pursued. It has been a very interesting and enlightening experience so far and I am excited to continue learning from the Duke OIT staff.

    - John Taylor, Computer Science ‘21
    Applying Security Orchestration, Automation & Response (SOAR) to security threat hunting with Duke’s ITSO

  • I have gained so much knowledge and confidence! And it is not limited to the area of technology, although I have learned to code in R, navigate PACE, and so much more. I have better discovered the benefit of working with a team and received motivation and mentors by seeing female-identifying students, like myself, succeed. Hearing their success stories via panels or team meetings has given me so much more confidence as a young woman wanting to pursue a career in STEM. I see that it is possible! I also have never worked with data before Data+, but never felt behind in my lack of knowledge as my team is super supportive. They Zoom me outside of the workday and send me resources to help me complete my assignments. I've also realized that I do have an interest in Data Science and feel like I'm making a difference in the world through this program. Knowing that my project (Predicting Blindness in Duke's Glaucoma Patient Population) is going to help so many clinicians, government officials, patients and more is so empowering. It is crazy to think that I am just 19 years old and working on such an advanced project with beyond accomplished students, doctors, and professors, but I'm doing it! Data+ truly has given me the opportunity to expand my knowledge and network in a safe environment. I find these takeaways pretty impressive, especially since it is all remote this year.

    - Sydney Hunt, Engineering ‘23
    Predicting Blindness in Duke’s Glaucoma Patient Population

  • Beyond solid technical machine learning skills, I've received a greater appreciation for data science as a tool to understand everything--from aircraft maintenance to the humanities. Before, I'd never expected that conducting humanities research would teach me how to wield and utilize the most cutting-edge research in machine learning and natural language processing. My team is using new package libraries and research papers written by lead researchers this year to conduct our analysis of ancient texts. In Data+, New meets Old.

    - Albert Sun, Computer Science and Public Policy ‘23
    For love of greed: tracing the early history of consumer culture

  • Working remotely has made coordination much more difficult. However, we really have been embracing GitHub and Box to overcome these challenges. I have learned a lot about RNNs and the applications of GRUs and LSTMs and how to implement such layers, in addition to learning how to use PyTorch, as previously I only used TensorFlow.

    - Nathan Warren, MIDS
    Human Activity Recognition using Physiological Data from Wearables

  • As a Biology pre-med, I made the mistake of thinking that coding was irrelevant to me. That changed when I took a biology class where we used R to analyze lab data. That was when I realized that coding (and the problem solving skills that come with it) is invaluable in research. It was difficult at first to jump into Data+, but doing this has benefited me in a few ways. Having to learn Python on my own, in a very short amount of time, with almost no prior coding experience (I didn't even know what a package was) and quickly turning around and using those skills taught me that I am capable of flexibility and learning on the job. Coding also requires an immense amount of problem solving and independence. Although my mentors are fantastic, it's up to me to figure out where I want to take the project and how I want to do it. Finally, Data+ has been a really invaluable exercise in teamwork. This has been especially challenging with remote learning. However, I still feel like our team has grown very close in working toward a common goal.

    - Ellen Mines, Biology and Philosophy ‘21
    Computational Tools to Improve Healthy and Pleasurable Eating in Young Children

  • My coding skills and machine learning knowledge had a huge leap. I learned how to better work in a team as well.

    - Noah Lanier, Psychology ‘22
    Human Activity Recognition using Physiological Data from Wearables

  • I learned a lot about data science and using code to manipulate data. I learned how to properly use a terminal, deep learning/machine learning, pandas, and many other skills. Also, I gained collaboration skills when it comes to developing code.

    - Pavani Jairam, Physics ‘23
    Finding Space Junk with the World’s Biggest Telescopes

  • I’ve learned to work through the entire process of a data science project, from assembling data sources all the way through presenting our findings. I’ve also developed insight into working in a team with people of different backgrounds and interests, which enabled us to contribute to the project in different ways. I’ve taken various lessons and hard skills that will carry with me into my future academic and professional endeavors.

    - Benjamin Chen, Computer Science, Economics ‘22
    Protecting American Investors? Financial Advice from before the New Deal to the Birth of the Internet

  • Data+ absolutely changed my perception of data science research. Learning data science has been more intuitive than expected. There are also resources all over the Internet in addition to team members that are able to provide assistance when one is facing difficulty with an aspect of a project. Data science is also able to be applied to many more scenarios than I expected; I look forward to continuing data science research in the future.

    - Malik Scott, Global Health ‘22
    Predicting Baseball Players’ Athletic Performance Utilizing Baseline Assessments of Vision

  • I gained concrete skills in R and Tableau, the ability to collaborate in a virtual environment, and a better understanding of what data science actually means. I also got a glimpse into the public health field and got to learn what many different public health careers might actually entail.

    - Anna Zolotor, Undeclared
    Piloting an Environmental Public Health Tracking Tool for North Carolina

  • I have gained a significant amount of knowledge of the cybersecurity industry and attack methods due to the nature of the background research I had to do for my project. In addition, I was able to apply my knowledge of statistical analysis to real data and learn new techniques to arrange data such as time series analysis.

    - Matthew Feder, Computer Science ‘22
    Applying Security Orchestration, Automation & Response (SOAR) to security threat hunting with Duke’s ITSO

  • Since I've never participated in research before, especially not research this independently oriented, the main thing I feel I've gained from this experience is confidence. I feel like I have a much better understanding of my own capabilities, and I honestly feel much less intimidated by the idea of pursuing research, not just in Data Science.

    - Donald Pepka, Math, Political Science, and Creative Writing ‘21
    For love of greed: tracing the early history of consumer culture

  • I learned a number of hard skills in terms of coding languages as well as some soft skills along the lines of working with a team and coordinating with a client.

    - Benjamin Williams, ECE ‘21
    ABOUT-US – A BOundary Update Tool for Utility Services

  • I definitely gained a lot of experience in R and in Tableau, but I also learned a ton about the fields of data science and public health. We had several interviews with community partners that helped me learn a lot about the different types of careers in data science, environmental advocacy, and environmental health.

    - Leah Roffman, Environmental Science ‘23
    Piloting an Environmental Public Health Tracking Tool for North Carolina

  • I learned about team communication and organizational skills, time management, and I think I have a greater appreciation for how socio-cultural analysis from a humanities perspective can work in tandem with STEM based modes of collecting information/data.

    - Luci Jones, Environmental Studies, Brown University
    When Black Stories Go Global: Analyzing the Translation of African-American Literature and Film

  • Through the program, I not only developed my technical skills with regards to programming and data visualization, but I also learned a lot more about finance and the intersections of finance and data science. This program really incited my love for programming and problem-solving with data, and has made me even more interested in studying statistical science and data science at Duke. Finally, I learned how to effectively collaborate and communicate with a team in a virtual environment.

    - Helen Chen, Statistics ‘23
    AI in the Investment Office



This team is part of an ongoing project dedicated to exploring how states and local communities responded to the causes of the 2007-09 Global Financial Crisis. Led by faculty from the Global Financial Markets Center at Duke Law, the Data+ team will analyze multiple states' mortgage enforcement databases to gain a better understanding of how state regulators were, or were not, enforcing existing state law pertaining to mortgages leading up to the crisis. Our website has an example of what this will look like, as last year we analyzed North Carolina’s mortgage enforcement actions and displayed them by topic.

Project Lead: Lee Reiners

Project Manager: Malcolm Smith Fraser

Nationally, there is a disproportionate number of children of color (African American & Latino) in the child welfare system. Durham County is no different. However, this problem has not yet been reviewed through the lens of data to formulate or implement possible solutions. Durham County Department of Social Services Child & Family Services would like to evaluate systems to identify where and how disproportionality and disparity are occurring. Is it occurring at the entry point of reporting child abuse and neglect? Is it occurring at the case decision? Is our reunification time different for African American children? Or does it take longer for a child of color to achieve permanence through adoption? Organizing the data to show us our “hot spots” would facilitate further discussion and focus on solutions to an age-old systemic problem.

Faculty Lead: Greg Herschlag

Project Lead: Jovetta L Whitfield

Student teams will develop a benchmark dataset and explore its efficacy in an in-house competition where they will put new, innovative techniques such as machine learning to the test through a series of challenges. A team of students will develop benchmark data pertaining to network performance in the presence of intentional and non-intentional degradation, ranging from sensor failure and additive noise to adversarial interference. The students will analyze the baseline performance of the network, and measure the performance of the degraded network with and without techniques that improve robustness. Students will have the opportunity to present findings to scientists & engineers from the Air Force Research Laboratory.

Faculty leads: Robert Calderbank, Vahid Tarokh, Ali Pezeshki

Client leads: Dr. Lauren Huie, Dr. Elizabeth Bentley, Dr. Zola Donovan, Dr. Ashley Prater-Bennette, Dr. Erin Trip

Project Manager: Suya Wu

A team of students will use visual reporting tools, such as Tableau or Power BI, to create a dynamic dashboard that will enable investment professionals at Duke University Management Company (DUMAC) to review and better understand fund managers’ exposures and positioning across various dimensions. Students will collaborate with teams at DUMAC to develop an intuitive, visual dashboard to help the investment team review individual and portfolio-level exposures across numerous asset classes. This dashboard will be connected to DUMAC’s data warehouse, which refreshes on a daily basis as new data comes in for DUMAC’s public market accounts.

Project Lead: Yi Wang

Project Manager: Christopher Ritter

A team of students will explore how using natural language processing and other data-focused tools to direct the input of files and depository flow can support the investment office at Duke University Management Company (DUMAC). Students will collaborate with investment professionals to investigate and potentially develop tools to facilitate intuitive keyword search and context extraction from various document sources, which would support investment analysis and the legal review process.

Finding similarity and node correspondence among networks has been widely applied in social network analysis, network de-anonymization, protein-protein interaction (PPI) network alignment, and shape matching in computer vision. A team of students led by Professor Jiaming Xu and Ph.D. student Sophie Yu will implement our recently developed graph matching and similarity scoring algorithms, test their empirical performance on real datasets, and further improve the algorithm design by incorporating domain knowledge. Multiple datasets will be explored, such as the Facebook ego network, Covid PPI network, Isobase PPI network, computer vision datasets, and Wikipedia article networks. A visualization app will be built to help researchers and practitioners apply graph matching in relevant applications.

A team of students led by researchers at Duke University and UC Davis will visualize data on child and family health from Yolo County, California. Data varies from single words or numbers per variable (e.g. gender, age) to more complex (e.g. crime and violence, social integration, housing/homeless impact). The visualization dashboard will be used by academic researchers and community service providers in addition to Yolo County community members. The overall goal of the research is to reduce health disparities through strengthening academic-community partnerships.

Project Lead: Leigh Ann Simmons (UC Davis)

Along with the Energy Data Analytics Lab and the Sustainable Energy Transitions Initiative, students will explore how machine learning can be developed to better identify and characterize energy infrastructure and scale up its application across geographies. Using synthetic data generation such as style transfer and/or 3D models to create images of artificial cities and communities, we will examine how well an algorithm trained on one set of training data is able to adapt to new places and types of energy infrastructure. This research will increase the speed and scale of assessment towards an automated, global assessment of energy infrastructure. The resulting data would empower decisionmakers and energy system planners to make better decisions for planning electricity system development for communities with limited electricity access or reliability.

Project Lead: Kyle Bradbury

Project Manager: Wei Hu

This project seeks to understand voting patterns and their effect on election outcomes across geography and time. This involves examining precinct-level votes across a large array of historical votes (including 2020 and as far back as 2012). Students will employ a variety of techniques in dimension reduction to uncover large-scale voting patterns and investigate the evolution of voting patterns across the decade. This work will help answer questions like "did the suburbs vote with the cities?" The students will use voting patterns to explore the "stability" of gerrymandering as they compare election outcomes under certain maps with outcomes under large ensembles of non-partisan maps.

This project is part of an ongoing set of projects by the sponsoring faculty around Voting, Gerrymandering and Democracy. See their blog for more information and projects from previous years of Data+ and Bass Connections.

Project Leads: Greg Herschlag, Jonathan Mattingly

The goal of this project is to use evaluation data collected on baseball athletes to make predictions about their on-field performance in competitive games. Evaluation data includes visual skills assessed with eye tracking, movement screening data and simulated batting performance metrics collected in virtual reality that are measured as part of the USA Baseball Prospect Development Pipeline. Using regression and Bayesian statistical analyses we will identify characteristics of these data that correlate with batting performance in order to inform scouts about the likely production of developmental prospects. The final product will be an application that uses an athlete's assessment results to produce performance summary graphs for the individual compared to other athletes and inferential models for the relationships between assessments and performance.

Project Leads: Marc Richard, Suhail Mithani, Greg Appelbaum

Project Managers: Billy Carson and Hunter Klein

A team of students led by researchers in the BIG IDEAS lab in the biomedical engineering department will build and validate machine learning techniques to classify longitudinal illness trajectories of individuals with infections such as COVID-19 or flu. Students will construct a pipeline to query survey and wearable device data from our newly constructed database in the Microsoft Azure environment and modify existing machine learning and deep learning algorithms for wearables data analysis. This project will build upon the work accomplished by the Duke Bass Connections team and the Duke MIDS capstone project.

Project Lead: Jessilyn Dunn

The Root Causes Fresh Produce Program, led by an interdisciplinary team of graduate and undergraduate students at Duke and UNC, leverages the power of student volunteers and community partnerships to address food insecurity in our community by delivering fresh and locally sourced produce to the doorsteps of patients in the Duke, Lincoln, and Samaritan Community health systems. Our program leadership is looking for team members to join us in exploring the use of data visualization and analysis to improve upon the processes that allow us to carry out our student-run service. Specific projects will include improving the program’s delivery route optimization software, improving the crop assessment and inventory management tools of small farmers and wholesale markets, as well as developing a dashboard integrating client location data and GIS information to identify location-specific service providers. Together, in collaborating with our team of interdisciplinary students as well as local Durham agencies, we hope to give you an opportunity to make an impact on our local community and food system while gaining a deeper appreciation for the role of data visualization and analysis through a population health intervention.

Project Lead: Willis Wong

A team of students led by Zackary Johnson (Associate Professor, Nicholas School of the Environment and Biology) and supported by other faculty in NSOE, Statistics, Biology and Engineering, will perform analyses of a 10+ year oceanographic time‐series dataset sampled near the Duke Marine Laboratory. With >1000 observations and >400 fields, the team will first mature the MATLAB-based data wrangler and analysis scripts. Using this enhanced tool, the team will then focus on clustering, classification and forecasting towards the interpretation of variability and trends of key variables measured by the Pivers Island Coastal Observatory. The long-term goal of this project is to understand the proximal drivers of variability in coastal marine ecosystems as well as longer-term changes associated with climate change.

Project Lead: Zackary Johnson

A team of students led by members of the Duke IT Security Office will develop machine learning features that can be used to identify previously unknown web attacks. Attacks targeting websites and web applications are continuous across the internet, and protecting Duke's web infrastructure is part of the IT Security Office's mission. The team will have a large dataset of logs from web traffic to explore possible features and evaluate their usefulness for identifying new types of web attacks. The results from this project may be incorporated into Duke's IT security infrastructure to help protect Duke's network in the future.

Project Lead: Eric Hope

Project Manager: Pranav Manjunath

This project is inspired by an inter-institutional Bass Connections team from Duke and North Carolina Central University that is committed to developing more responsible and imaginative ways of partnering with Durham Public Schools. Students will use existing data sets combined with historic and contemporary city context to better understand the complex and nuanced details of different school communities. Students will prioritize the public schools that most commonly partner with each respective university. Our research aims to inform future pre-service trainings for university students, support local neighborhood schools in visualizing their communities, and help varied university offices articulate what “community” actually looks like.

Project Lead: Alec Greenwald

Project Manager: Nicolas Restrepo Ochoa

The goal of this Data+ project is to apply and extend custom analytics solutions to discover how life remains resilient in extreme environments. An explosion of data has resulted from recent discoveries of tiny single-celled life hiding out in the most extreme places on Earth. These single-celled creatures, or microbes, thrive in hot springs at boiling temperatures, in brine lakes saturated with salt, and in deserts once thought to be sterile. Studies in the Schmid lab at Duke have generated large data sets tracking how microbes grow and change their gene expression when faced with extreme stress. However, the majority of these data remain to be analyzed. In this Data+ project, students will apply and extend the statistical models to ask how extreme microbes grow and change during exposure to nutrient deprivation, extreme stress, and genetic mutations.

(photo credit: Peter Tonner et al., 2020 PLoS Computational Biology )

Project Lead: Amy Schmid

Project Manager: David Buch

A team of students led by faculty of biology and directors of the largest virtual museum in the world (a Duke-based web repository called MorphoSource) will develop tools for assessing the societal and scholarly impact and importance of museum specimens made available as 3D digital resources over the web. They will analyze patterns of user data from 12,000 users interacting with 140,000 datasets and communicate directly with museum curators, researchers and the general public to understand use cases before developing a number of new tools to collect and provision new kinds of use data. The ultimate goal is to generate use statistics that (1) show the research and education value of virtual museum objects in novel ways to motivate greater sharing and to increase appreciation of museum collections; (2) show how to strategize data storage based on use in a way that minimizes storage costs and maximizes sustainability. They will learn valuable skills in data science, museum informatics, and web development; the project provides an opportunity to contribute to a professional resource utilized by almost a thousand museums around the world and to set the trajectory of online museum research practices for years to come.

Project Leads: Doug Boyer, Julia Winchester

Who should get to decide what a utopian society looks like? After London was razed to the ground in the Great Fire of 1666, its reconstruction into the “emerald gem of Europe” was heavily influenced by the monarchy and aristocratic elites. In building a utopian epicenter focused on political and economic interests, immense sacrifices had to be made by London’s most marginalized citizens. A team of students led by Nicholas Smolenski (PhD Candidate, Musicology) and Dr. Astrid Giugni (Lecturing Fellow, English) will thus explore how London was rebuilt into a utopia; by employing topic modeling and applying the resulting lexicon to seventeenth-century architectural sketches, students will demonstrate how a language of progress became inextricably linked to its own image while also exposing the paradoxes entrenched in utopic representation. This project will additionally show how its framework can apply to current political discourse, as the tearing down of statues and monuments over the past three years has highlighted inescapable tensions between a governing power, a nation’s history, and its people.

A team of students led by Francisco Ramos and Edward Balleisen will explore survey data from degree completers and alumni to establish important correlations, document key patterns and longitudinal trends, and develop visualizations that can inform institutional decision-making. Combining statistics with data science to analyze large datasets, this research will provide a deep understanding of how doctoral training and education can be improved across Duke University.

Mental illness is often over-represented among the incarcerated population. Durham County leaders are aware of this and have taken many steps in recent years to support incarcerated people with mental illness and to reduce incarceration in this population. This work requires a community-wide effort, and Duke University Health System, as the major health care provider in the county, has a potential role to play. This Data+ team will use data from the Durham County Detention Facility and Duke Health System to examine patterns of health-service utilization in the incarcerated population, including those with and without mental illness diagnoses. We will use statistical methods such as longitudinal modeling to analyze the effects of the many interventions recently implemented, including increased mental health services and medication-assisted treatment for addiction, among others.

Project Lead: Nicole Schramm-Sapyta

Project Manager: Ruth Wygle

A team of students will collaborate with the Duke Rainforest XPRIZE team led by Martin Brooke (ECE) and Stuart Pimm (Nicholas School) to use machine learning and image processing to automate conversion of the large amount of biodiversity data the team will generate into images and sound bites that will be suitable for iNaturalist community identification of individual species.  

Is there a right type and amount of consumption? The idea of ethical consumption has gained prominence in recent discourse, both in terms of what we purchase (from fair trade coffee to carbon offsets) and how much we consume (from rechargeable batteries to energy-efficient homes). Concern with the morality of consumption is not new to capitalist societies; we can find debates on the ethics of consumption in the Middle Ages and the Renaissance. A team of students led by Dr. Astrid Giugni (English, Duke) and Dr. Jessica Hines (English, Birmingham Southern College) will analyze approximately 60,000 Medieval and Renaissance digitized texts by performing topic modelling, which will allow them to track associations between the languages of consumer culture and ethical practice. The team will organize their results in a series of visualizations that trace the development of the ethics of ‘consumerism’ in terms of frequency of use, timeline of relevant events, authorship, location of publication and name of printer/printing press, and genres.

Using the digitized cards from the David M. Rubenstein Rare Book and Manuscript Library’s old card catalogs, a team of students will explore extracting structured data to develop searchable and sortable descriptions of manuscript and archival collections. They will use textual analysis tools and natural language processing techniques to prepare an indexed digital collection of structured card catalog metadata for publication in Duke’s Digital Repository. The team will then develop ways to visualize and search this dataset based on different research topics or terms. Ultimately, the Rubenstein Library is seeking a tool that allows users to search, tag, and export description from the card catalog’s dataset. This work is a critical piece of a broader initiative within the Rubenstein Library to find and describe historically marginalized voices in our archival collections, particularly those collections documenting BIPOC history.

Project Leads: Julia Winchester, Doug Boyer

Project Manager: Jocelyn Triplett

Past Projects

The Air Force’s F-15E Strike Eagle jets have parts that wear down and break, causing unscheduled maintenance events that take away valuable time in the air for critical missions and training. Our team, Limitless Data, is working with Seymour Johnson Air Force Base to mine manually entered maintenance data to visualize and predict aircraft failures. We created a prototype data visualization product that will help maintainers on the flight line identify and repair critical failures before they happen, keeping jets ready to fly, fight and win.

This project aims to improve the computational efficiency of signal operations, e.g., sampling and multiplying signals. We design machine learning-based signal processing modules that use an adaptive sampling strategy and interpolation to generate a good approximation of the exact output. While ensuring a low error level, improvements in computational efficiency can be expected for digital signal processing systems using the implemented self-adjusting modules.
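The idea of adaptive sampling followed by interpolation can be illustrated with a minimal sketch. It is not the team's method: the project describes machine learning-based modules, whereas this example substitutes a simple, non-learned bisection rule that adds samples only where a straight-line guess misses the signal. The test signal, the function name `adaptive_sample`, and the seed count, tolerance, and depth limit are all assumptions chosen for illustration.

```python
import numpy as np

def signal(t):
    # Hypothetical test signal: a narrow bump on an otherwise flat line.
    return np.exp(-200.0 * (t - 0.3) ** 2)

def adaptive_sample(f, lo, hi, n_seed=17, tol=1e-3, max_depth=10):
    """Return sorted sample points, denser where linear interpolation is poor.

    Illustrative heuristic only (not a learned strategy): each interval is
    split at its midpoint when the midpoint value differs from the average
    of the endpoint values by more than tol.
    """
    seeds = np.linspace(lo, hi, n_seed)  # coarse uniform starting grid
    points = set(seeds.tolist())

    def refine(a, b, depth):
        mid = 0.5 * (a + b)
        # Error of the straight-line estimate at the midpoint.
        err = abs(f(mid) - 0.5 * (f(a) + f(b)))
        if err > tol and depth < max_depth:
            points.add(mid)
            refine(a, mid, depth + 1)
            refine(mid, b, depth + 1)

    for a, b in zip(seeds[:-1], seeds[1:]):
        refine(float(a), float(b), 0)
    return np.array(sorted(points))

# Sample sparsely, then rebuild the signal on a dense grid by interpolation.
grid = adaptive_sample(signal, 0.0, 1.0)
dense = np.linspace(0.0, 1.0, 2001)
approx = np.interp(dense, grid, signal(grid))
max_err = float(np.max(np.abs(approx - signal(dense))))
print(len(grid), "adaptive samples vs", len(dense), "dense; max error:", max_err)
```

The sample points cluster around the bump and stay sparse on the flat regions, which is where the savings in computation would come from. The midpoint test is only a local check, so the tolerance does not strictly bound the reconstruction error everywhere.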

Mapping History has focused on the categorizing, labelling, digitization, and 3D reconstruction of 16th- and 17th-century maps and atlases of London and Lisbon. Over the course of the summer, the Mapping History team has developed its own unique analytical dataset by painstakingly labelling every element contained within these maps, used Python to digitize this dataset, and, now in the project's final stage, has begun the process of reconstructing these historical perspectives in a 3D game engine.