
Students join small project teams (at most 3 undergrads and 1 master’s student per team), working alongside other teams in a communal environment. They learn how to marshal, analyze, and visualize data, while gaining broad exposure to the modern world of data science. The projects come from an extremely diverse set of subject areas. It is our hope that students will be able to both dig deeply into their specific project and get a broad picture of most of the skills needed for modern data science.
Participants will receive a $5,000 stipend, out of which they must arrange their own housing and travel. Funding and infrastructure support are provided by a wide range of departments, schools, and initiatives from across Duke University, as well as by outside industry and community partners.
Data+ has returned to 100% in-person participation. Participants may not accept employment or take classes during the program; this requirement is strictly enforced and non-negotiable.
Due to the nature of the data involved in some of the projects, human subjects research training will be required of all participants and will be provided after admission to the program. For each project, we have attempted to list majors and interests that might be a good fit, but these should not be seen as requirements in any way! Quantitative STEM majors like mathematics, computer science, statistics, and electrical engineering are relevant to all projects.
Applied Ethics+ is a new full-time ten-week summer learning experience hosted by the Duke Initiative for Science and Society. Interdisciplinary teams of Duke undergraduate and graduate students will work with ground-breaking clients addressing pressing real-world challenges in policy, technology, research, and ethics. Extensive experience in policy, technology, research, and ethics is not a prerequisite to apply but is strongly encouraged. Technical data and computer science experience are also not necessary as projects will focus on research and policy challenges.
The program runs from Monday, May 29 through Friday, August 4, 2023. Students must be able to participate in the entire program, with no exceptions. Applications open the second week of January 2023.
Get Involved
2023 Data+ projects have been posted! See the list.
Partners
Industry partners are essential to Data+. Learn how to become a partner.

Data+ Projects
Students will collaborate to develop a document that outlines the current policy, research, and practice landscape of Whole Genome Sequencing for Newborn Screening. Students will have the opportunity to conduct a thorough review of the existing literature and consult with relevant stakeholders to synthesize a clear and concise report....
Students on this project will assist Illumina with the research and assessment of Community-Based Review Boards to enable greater transparency in the ethical collection and use of patient data. Students should expect to familiarize themselves with the context of different types of ethics review boards and their role in the...
Students on this project will assist Illumina with the development of good genomic data-sharing principles to help garner effective communication and trust among patients, physicians, and genomic researchers. Specifically, students will research and propose data-sharing principles to encourage the ethical collection, sharing, and communication of patient genomic data with their...
Students will coordinate with the Organisation for Economic Co-operation and Development (OECD) Working Party on Bio-, Nano-, and Converging Technology to develop strategies to anticipate the development of emerging technologies and novel applications beginning with the field of Synthetic Biology (synbio). The development of these strategies will help the OECD...
Students will coordinate with the Organisation for Economic Co-operation and Development (OECD) Working Party on Bio-, Nano-, and Converging Technology to support the development of impact assessment tools for the use of neurotechnologies. This project would constitute Phase I of developing a “neurotech” impact assessment tool that analyses the effects...
Students will coordinate with the Organisation for Economic Co-operation and Development (OECD) Working Party on Bio-, Nano-, and Converging Technology to develop a governance framework to both encourage the development of and mitigate the potential harms of emerging technologies. Framework development will require creative and interdisciplinary research to identify different dimensions of risks posed...
A team of students led by researchers within the Saltwater Intrusion and Sea Level Rise (SWISLR) Research Coordination Network will be tasked with creating a geospatial database summarizing the current extent of SWISLR and the current knowledge on SWISLR within the North American Coastal Plain. Students will be responsible for...
A team of students led by Civil & Environmental Engineering Professor Helen Hsu-Kim will develop a resource reserves database of coal ash wastes stored in hundreds of legacy disposal sites in the United States. The team will extract key information from historical datasets on coal energy production, incorporate geochemical information...
Researchers with the Duke River Center and the Watershed Biogeochemistry Lab will investigate patterns of anoxia, or periods of little to no oxygen, in rivers. Oxygen is a necessary element for many organisms to live in rivers, but researchers know little about the timing, duration, and magnitude of low oxygen...
Interested in using satellites and machine learning to help keep decision-makers better informed about climate change? Interested in learning about cutting-edge computer vision techniques for analyzing satellite imagery and how to scale them up globally? Come join a student team collaborating with the Energy Data Analytics Lab as we democratize...
A team of students led by researchers in the Social Science Research Institute and Departments of Mathematics and Statistics will curate a unique video data set for studying how different aspects of social interactions relate to social and psychiatric variables, like trust, empathy, and scores on clinical social competence and...
A team of students led by researchers in the Computer Science Department and the coaching staff of the Duke Women’s Soccer (DWS) team will develop analytical tools to provide quantitative evaluations of individual players and whole teams. By applying machine learning techniques and other data science methods to deep event-level...
The goal of this Data+ project is to apply and extend custom analytics solutions to understand and predict microbial population growth. An explosion of data has resulted from tracking the growth of bacteria in high throughput devices. These data were generated to understand how microbes grow. Better models that fit...
Using data from the Durham Compass and the NC School Report Card among many other sources, this team will continue the development of an interactive R Shiny dashboard that permits exploration of school statistical data. The team aims to explore Durham Public School (DPS) zones through an asset-based lens to support ethical...
A team of students led by researchers from the Nicholas School of the Environment will develop an app to estimate the distance of animals from a camera trap. Students will merge camera trap data with terrestrial lidar (3D imagery of forest around the camera trap) to locate the position of...
A team of students led by researchers in the BIG IDEAs Lab will optimize and further develop an existing cloud-based infection detection platform that populates and translates wearable data from a variety of sources. The project will involve working with existing wearable data pipelines (e.g., APIs) to collect, process, and...
Using digitized card catalogs from the David M. Rubenstein Rare Book and Manuscript Library, a team of students will explore extracting structured data from over 115,000 subject cards to develop searchable and sortable descriptions of manuscript and archival collections. They will prepare the digitized subject cards for online access in...
A team of students led by a data scientist at NetApp will develop means to evaluate technical documentation through machine learning techniques. Students will identify features of language and documents that can be used to demonstrate how effective that documentation is at communicating technical specifications. Students will also apply...
Students will collaborate with CEE Professors David Carlson and Mike Bergin to model the effects of land use on the urban heat island effect using satellite imagery and ground-level temperature measurements. Students will use machine learning to segment satellite images of Durham, North Carolina by land use. They will then...