Data Analysis for Ecological Modeling

Project Summary

Ecological data comes in various shapes and sizes. An ecological study commonly combines population data (such as snail counts) with continuous sensor data (such as stream temperature, with 35,000 data points collected each year). Ecologists must reconcile data collected at different spatial and temporal scales in order to make inferences about their study systems. Luckily, there are standard practices and toolsets for doing so. In this data expedition, we ingest, arrange and query data collected in the field through various methods, converting it into formats that can be analyzed. We then use different plot types, data transformations and statistical tests so that our analyses are appropriate for each type of data. We examine both field data collected by students and large open-source datasets that can be scraped from the web and analyzed locally.


Each year, Field Ecology students measure physical, chemical, and biological characteristics of the Eno River. The Eno River has also been continuously monitored for numerous environmental parameters as part of the StreamPulse project (Duke and other collaborators worldwide). StreamPulse collects data from instream sensors, such as temperature and dissolved oxygen, to estimate ecosystem processes such as metabolism. As a result, we are able to compare data collected in the field course with long-term monitoring efforts.


Graduate Students: Emily Ury and Alice Carter

Faculty: Dr. Justin Wright (Dr. Emily Bernhardt helped with the original proposal)

Undergraduate Course: "Field Ecology" (BIO 361) 


Part 1: Observations at the Eno River

  • Students will learn how to ingest their own data into the R programming environment

  • Students will become familiar with different types of ecological data

  • Students will use linear regression and multiple linear regression to examine and predict ecological data

  • Students will try different transformations and statistical tests to examine their data
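
The regression and transformation steps listed above can be sketched roughly as follows. The lesson itself is taught in R; this is a Python sketch using only the standard library, and the depth and snail-count values are invented for illustration (they are not the class measurements). The helper names `fit_line` and `r_squared` are likewise hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical example data (NOT the class measurements)
depth_cm = [5, 10, 15, 20, 25, 30]
snail_count = [40, 22, 13, 7, 4, 2]

# Fit the raw counts against depth
slope_raw, intercept_raw = fit_line(depth_cm, snail_count)
r2_raw = r_squared(depth_cm, snail_count, slope_raw, intercept_raw)

# Log-transform the counts and fit again
log_count = [math.log(c) for c in snail_count]
slope_log, intercept_log = fit_line(depth_cm, log_count)
r2_log = r_squared(depth_cm, log_count, slope_log, intercept_log)

print(f"raw counts:  slope={slope_raw:.3f}, R^2={r2_raw:.3f}")
print(f"log counts:  slope={slope_log:.3f}, R^2={r2_log:.3f}")
```

With count data that falls off steeply, as in this made-up example, the log-transformed fit tends to explain more of the variance than the raw fit, which is one reason to try a transformation before settling on a model.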

Part 2: A Year of Eno River Data

  • Students will explore the StreamPulse project data platform and R package

  • Students will download and examine a year of Eno River monitoring data

  • Students will begin to examine how long-term monitoring data is used to understand field observation data for ecological analysis
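
One way to relate high-frequency monitoring data to field observations is to average the sensor readings by day and then pair each field visit with that day's mean. The sketch below is in Python with only the standard library (the lesson itself uses R and the StreamPulse R package, which is not shown here). The readings are synthetic, the dates and snail counts are invented, and the 15-minute logging interval is an assumption, chosen because it would yield roughly 35,000 readings per year.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta
from statistics import mean

def daily_means(readings):
    """Average (timestamp, value) sensor readings by calendar day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp.date()].append(value)
    return {day: mean(values) for day, values in by_day.items()}

# Synthetic temperature readings every 15 minutes for two days
# (hypothetical values, NOT StreamPulse data)
start = datetime(2019, 9, 1)
readings = [
    (start + timedelta(minutes=15 * i), 18.0 + (i // 96) * 2.0 + (i % 2))
    for i in range(192)
]

# Hypothetical field observations: snail counts on two sampling days
field_counts = {date(2019, 9, 1): 12, date(2019, 9, 2): 30}

# Pair each field observation with the mean sensor temperature for that day
temperature_by_day = daily_means(readings)
paired = [
    (day, temperature_by_day[day], count)
    for day, count in sorted(field_counts.items())
    if day in temperature_by_day
]

for day, temp_c, count in paired:
    print(f"{day}: mean temperature {temp_c:.1f} C, {count} snails")
```

Averaging by day is only one possible choice. Depending on the question, a daily minimum, a daily maximum, or the reading closest to the sampling time may be a better match for a field observation.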

Here are some examples of the plots we made:

Binning stream parameters to understand population distributions:

Snail distribution graph
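
The binning idea behind this plot can be sketched as below, in Python with only the standard library (the class worked in R). The page does not say which stream parameter was binned, so water velocity is used here as an assumption. The values are invented and `bin_counts` is a hypothetical helper.

```python
from collections import defaultdict

def bin_counts(values, counts, bin_width):
    """Sum counts within equal-width bins of values; keys are bin lower edges."""
    totals = defaultdict(int)
    for value, count in zip(values, counts):
        lower_edge = (value // bin_width) * bin_width
        totals[lower_edge] += count
    return dict(sorted(totals.items()))

# Hypothetical example data (NOT the class measurements)
velocity_cm_s = [3, 8, 12, 17, 21, 26, 33, 38]
snails = [1, 4, 9, 12, 10, 6, 2, 0]

# Total snails found within each 10 cm/s velocity bin
binned = bin_counts(velocity_cm_s, snails, bin_width=10)
for lower_edge, total in binned.items():
    print(f"{lower_edge}-{lower_edge + 10} cm/s: {total} snails")
```

Plotting the binned totals as a bar chart gives a rough picture of where along the parameter range the population is concentrated. The bin width is a judgment call and is worth varying.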

Trying out various visual, statistical and modeling approaches:

Different modeling approaches graph

Graph of a year of oxygen levels at the Eno River

Student Feedback

“I’ve never used R before, so I learned how to input data, make plots, and do regression analyses (single + multiple)...Stream data was really cool!”

“Thank you! Super helpful. Always so much to learn with R.”

“[I] learned how to fit a linear trendline to a graph.”

“I learned how to customize the data I am working with.”

Student feedback cards

Attached materials for the lesson

Two R markdown files:

One data file:

Photos from the class

Students in class

