Research

Research projects at Rhodes iiD focus on building connections. We encourage cross-pollination of ideas across disciplines and the development of new forms of collaboration that will advance research and education across the full spectrum of disciplines at Duke. The topics below show areas of research focus at Rhodes iiD. See all of our research.

Producing oil and gas in the North Sea, off the coast of the United Kingdom, requires a lease to extract resources from beneath the ocean floor, and companies bid for those rights. This team will work with ExxonMobil to understand why these leases are acquired and who benefits. This requires historical data on bid history to investigate what leads to an increase in the number of (a) leases acquired and (b) companies participating in auctions. The goal of this team is to create a well-structured dataset based on company bid history from the U.K. Oil and Gas Authority. These data will come from many different file structures and formats (tabular, PDF, etc.). The team will curate these data to create a single, tabular database of U.K. bid history and work programs.

Producing oil and gas in the Gulf of Mexico requires rights to extract these resources from beneath the ocean floor, and companies bid into the market for those rights. The top bids are sometimes significantly larger than the next highest bids, but it is not always clear why this differential exists, and some companies seemingly overbid by large margins. This team will work with ExxonMobil to curate and analyze historical bid data from the Bureau of Ocean Energy Management that contains information on company bid history, infrastructure, wells, and seismic survey data, as well as data from the companies themselves and geopolitical events. The stretch goal of the team will be to see if they can uncover the rationale behind historic bidding patterns. What do the highest bidders know that other bidders do not (if anything)? What characteristics might incentivize overbidding to minimize the risk of losing the right to produce (i.e. ambiguity aversion)?

A team of students led by researchers in the Energy Access Project will develop means to evaluate non-technical electricity losses (theft) in developing countries through machine learning techniques applied to smart meter electricity consumption data. Students will use data from smart meters installed at transformers and households through a randomized control trial. Students will develop algorithms that can be used to detect anomalies in the electricity consumption data and create a dataset of such indicators.  This project will provide researchers with new ways of incorporating electricity consumption data and applications for electricity utilities in developing country settings.

Saltwater intrusion and sea level rise are issues of serious concern for people throughout the coastal plain. Our Data+ team will collaborate with researchers to create an interactive data visualization platform that compiles remotely sensed estimates of vegetation change throughout the coastal plain and links this data with field salinity estimates. The team will have the opportunity to build educational website content that a) explains how saltwater incursion occurs; b) describes the consequences for coastal forests; c) links this understanding with likely scenarios of coastal climate for the next decade. In each case, we would like to illustrate this content with interactive data graphics.

Faculty Leads: Justin Wright, Emily Bernhardt

Project Manager: Emily Ury

This team will explore how to develop machine learning techniques for analyzing satellite imagery data for identifying energy infrastructure that can be trained once and applied almost anywhere in the world. Led by researchers from the Energy Data Analytics Lab and the Sustainable Energy Transitions Initiative, the team will design two datasets: the first containing satellite imagery from diverse geographies with all energy infrastructure labeled, and the second a synthetic version of the same imagery. These data will enable research into whether synthetic imagery may be used to adapt algorithms to new domains. The better these techniques adapt to new geographies, the more information can be provided to researchers and policymakers to design sustainable energy systems and understand the impact of electrification on the welfare of communities. 

Faculty Lead: Kyle Bradbury

Project Manager: TBD

A team of students led by Jim Heffernan, Nick Bruns, and partners at UNC and EPA will create interactive data visualizations of water quality data in rivers and lakes of the United States. These tools will aid environmental scientists, managers, policy-makers, and students who want to investigate patterns of water pollution across broad scales of space and time. Students will gain experience with manipulation of large data sets, geospatial analysis, and remote sensing of water quality parameters. Opportunities include developing visualization tools to represent spatial and temporal coverage of water quality parameters, georeferencing field observations and remote sensing satellite overpasses with field observations, and assessing spatial and temporal gaps in observations for a variety of water quality parameters.

Faculty Lead: Jim Heffernan

Project Manager: Nick Bruns

Duke must reduce its energy footprint as it strives for carbon neutrality by 2024. To help this cause, a team of students will review troves of utility usage data and attempt to build an attractive and practical monthly energy use report for every building and school at Duke. This report will not only show historical usage but also provide an energy benchmark for comparison and conservation tips that local administrators can act on. Duke Energy has been using a similar report to encourage conservation at the residential level for years. It is time to bring energy use transparency to the broader Duke community and inspire action.

Faculty Lead: Billy Pizer

Project Manager: TBD

Data-enabled approaches present new opportunities to analyze responses of aquatic ecosystems to stressors and to illustrate scientific findings in new formats that are more widely accessible. Our goal is to create a web-based storytelling platform that illustrates the results of freshwater ecosystem studies conducted at the IISD-Experimental Lakes Area in Canada (https://www.iisd.org/ela/). Students on our team will process historical datasets and develop interactive data visualization tools for public outreach on freshwater ecology and conservation. This project is led by water resources professor Kateri Salk (Nicholas School of the Environment) and staff at the IISD-Experimental Lakes Area.

Faculty Lead: Kateri Salk

Project Manager: TBD

A team of students led by faculty and students in Duke's River Center will manipulate, model and visualize time series data derived from hundreds of rivers throughout the world. Students will gain experience working with large datasets derived from environmental sensors and will be able to direct the data project based on their learning interests. Opportunities include developing machine learning tools for data processing and pattern recognition, building software and web interfaces to enable cloud computing, and creating interactive graphics aimed at explaining scientific concepts using Big Data. Tools developed through this project will be hosted on the StreamPulse web platform (streampulse.org).

Faculty Leads: Emily Bernhardt, Jim Heffernan

Project Manager: TBD

Modern Energy Group (MEG) finances and operates various distributed energy resources operating in wholesale energy markets, ranging from solar panels to residential smart thermostats. MEG also does financial trading when it identifies arbitrage opportunities in these markets. One of MEG's main operational risks is the very high volatility in wholesale real time (or spot) energy prices. Where stock markets consider a 30% change in price large, energy markets routinely face changes in price on the order of 300%. This high volatility comes from three main "shocks": 1. power demand changes, due to unpredictable weather, industrial patterns, or human consumption; 2. fuel shortages, driven by trade, extraction/exploration, and gathering/transportation economics; 3. electrical transmission outages, driven by operational failure, extreme weather events, and human behavior.

First, this project team will identify what should be considered an "extreme" price shock from 5-10 years of historical data in PJM. Second, the team will work to automatically identify potential causes for the rare events from news articles, public filings, and MEG's own structured data. Third, the team will build reasonable priors for the occurrences of these rare events, and incorporate potential covariance between the events using copulas or similar methods. Finally, the team will create a simple classifier such as logistic regression to predict the likelihood of a price shock on a given day. The model needs to be evaluated with a walk-forward backtest, training on about 3 years of data at a time, and shifting forward the training window in approximately one-month increments, to smooth out potential bias and overfitting in the model. 
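The walk-forward evaluation described above can be sketched as follows. This is a minimal illustration, not the team's implementation: it assumes one observation per day, treats "about 3 years" as 3 * 365 observations and "approximately one month" as 30 observations, and tests each model on the month immediately following its training window. The function name `walk_forward_splits` and its parameters are hypothetical.

```python
from typing import Iterator, Tuple


def walk_forward_splits(
    n_obs: int,
    train_size: int = 3 * 365,  # assumed: about 3 years of daily observations
    step: int = 30,             # assumed: about one month per shift
) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) for a walk-forward backtest.

    Each test window is the `step` observations immediately after its
    training window, so a model is never evaluated on data it was
    trained on or on data that precedes its training window.
    """
    start = 0
    # Only emit splits whose test window fits entirely inside the data.
    while start + train_size + step <= n_obs:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + step)
        yield train, test
        start += step  # slide the whole window forward by one step


if __name__ == "__main__":
    for train, test in walk_forward_splits(n_obs=1200):
        print(
            f"train days {train.start}-{train.stop - 1}, "
            f"test days {test.start}-{test.stop - 1}"
        )
```

For each split, a classifier such as logistic regression would be fit on the training indices and scored on the test indices; averaging the scores across splits gives the backtest result.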

Project Lead: Eric Butter, Modern Energy Group

Project Manager: TBD

Large publicly available environmental databases are a tremendous resource for both scientists and the general public interested in climate trends and properties. However, without the programming skills to parse and interpret these massive datasets, significant trends may remain hidden from both scientists and the public. In this data exploration, students, over the course of three hours, accessed two large, publicly available datasets, each with greater than 4 million observations. They learned how to use R and RStudio to effectively organize, visualize and statistically explore trends in deep sea physical oceanography.  

Brooke Erikson (Economics/Computer Science), Alejandro Ortega (Math), and Jade Wu (Computer Science) spent ten weeks developing open-source tools for automatic document categorization, PDF table extraction, and data identification. Their motivating application was provided by Power for All’s Platform for Energy Access Knowledge, and they frequently collaborated with professionals from that organization.

Varun Nair (Mechanical Engineering), Tamasha Pathirathna (Computer Science), Xiaolan You (Computer Science/Statistics), and Qiwei Han (Chemistry) spent ten weeks creating a ground-truthed dataset of electricity infrastructure that can be used to automatically map the transmission and distribution components of the electric power grid. This is the first publicly available dataset of its kind, and will be analyzed during the academic year as part of a Bass Connections team.

Marine mammals exhibit extreme physiological and behavioral adaptations that allow them to dive hundreds to thousands of meters underwater despite their need to breathe air at the surface. Through the development of new remote monitoring technologies, we are just beginning to understand the mechanisms by which they are able to execute these extreme behaviors. Long-term animal-borne tags can now record location, dive depth, and dive duration and then transmit these data to satellite receivers, enabling remote access to behavior occurring both many kilometers out to sea and several kilometers below the ocean surface.

Knowing how to manipulate, analyze, and display large datasets is an essential skill in the life sciences. Introducing students to the concepts of coding languages and showing them the diversity of tasks that can be accomplished using a flexible coding language like R is an important step in the training of any life sciences professional. For students taking lab-based courses, who are often required to analyze the datasets they produce in class, learning these techniques can be helpful both in the short term (i.e., during the semester) and for their future careers.

Sophie Guo, Math/PoliSci major, Bridget Dou, ECE/CompSci major, Sachet Bangia, Econ/CompSci major, and Christy Vaughn spent ten weeks studying different procedures for drawing congressional boundaries, and quantifying the effects of these procedures on the fairness of actual election results.

Anna Vivian (Physics, Art History) and Vinai Oddiraju (Stats) spent ten weeks working closely with the director of the Durham Neighborhood Compass. Their goal was to produce metrics for things like ambient stress and neighborhood change, to visualize these metrics within the Compass system, and to interface with a variety of community stakeholders in their work.

Sharrin Manor, Arjun Devarajan, Wuming Zhang, and Jeffrey Perkins explored a large collection of imagery data provided by the U.S. Geological Survey, with the goal of identifying solar panels using image recognition. They worked closely with the Energy Data Analytics Lab, part of the Energy Initiative at Duke.

ECE majors Mitchell Parekh and Yehan (Morton) Mo, along with IIT student Nikhil Tank, spent ten weeks understanding parking behavior at Duke. They worked closely with the Parking and Transportation Office, as well as with Vice President for Administration Kyle Cavanaugh.

David Clancy, a Stats/Math/EnvSci major, and Tianyi Mu, an ECE/CompSci major, spent ten weeks studying the effects of weather, surroundings, and climate on the operational behavior of water reservoirs across the United States. They used a large dataset compiled by the U.S. Army Corps of Engineers, and they worked closely with Lauren Patterson from the Water Policy Program at Duke's Nicholas Institute for Environmental Policy Solutions. Project mentorship was provided by Alireza Vahid, a postdoctoral candidate in Electrical Engineering.

Devri Adams (Environmental Science), Annie Lott (Statistics), and Camila Vargas Restrepo (Visual Media Studies, Psychology) spent ten weeks creating interactive and exploratory visualizations of ecological data. They worked with over sixty years of data collected at the Hubbard Brook Experimental Forest (HBEF) in New Hampshire.

Graduate Students: Kendra Kaiser and John Mallard

Faculty: Michael O’Driscoll

Course: Landscape Hydrology, EOS 323/723

Boning Li (Masters Electrical and Computer Engineering), Ben Brigman (Electrical and Computer Engineering), Gouttham Chandrasekar (Electrical and Computer Engineering), Shamikh Hossain (Computer Science, Economics), and Trishul Nagenalli (Electrical and Computer Engineering, Computer Science) spent ten weeks creating datasets of electricity access indicators that can be used to train a classifier to detect electrified villages. This coming academic year, a Bass Connections Team will use these datasets to automatically find power plants and map electricity infrastructure.

William Willis (Mechanical Engineering, Physics) and Qitong Gao (Masters Mechanical Engineering) spent ten weeks with the goal of mapping the ocean floor autonomously with high resolution and high efficiency. Their efforts were part of a team taking part in the Shell Ocean Discovery XPRIZE, and they made extensive use of simulation software built from Bellhop, an open-source program distributed by HLS Research.

Joy Patel (Math and CompSci) and Hans Riess (Math) spent ten weeks analyzing massive amounts of simulated weather data supplied by Spectral Sciences Inc. Their goal was to investigate ways in which advanced mathematical techniques could assist in quantifying storm intensity, helping to augment today's more qualitatively-based methods.

The team built a ground truth dataset comprising satellite images, building footprints, and building heights (LIDAR) of 40,000+ buildings, along with road annotations. This dataset can be used to train computer vision algorithms to determine a building’s volume from an image, and is a significant contribution to the broader research community, with applications in urban planning, civil emergency mitigation, and human population estimation.

Computer Science majors Erin Taylor and Ian Frankenburg, along with Math major Eric Peshkin, spent ten weeks understanding how geometry and topology, in tandem with statistics and machine learning, can aid in quantifying anomalous behavior in cyber-networks. The team was sponsored by Geometric Data Analytics, Inc., and used real anonymized Netflow data provided by Duke's Information Technology Security Office.

Graduate student: Hamza Ghadyali          

Faculty instructor: Dr. Paul Bendich

Molly Rosenstein, an Earth and Ocean Sciences major, and Tess Harper, an Environmental Science and Spanish major, spent ten weeks developing interactive data applications for use in Environmental Science 101, taught by Rebecca Vidra.