Exploring lemur olfactory communication via statistical analyses in R

Project Summary

Questions asked: Do males and females scent mark equally? Do lemurs scent mark equally in breeding and non-breeding seasons?

Contact
Paul Bendich
bendich@math.duke.edu

Graduate students: Lydia Greene and Kendra Smyth

Faculty instructor: Julie Teichroeb

Course: EVANTH 246: Sociobiology

Data set: The frequency of scent-marking behavior in the Coquerel’s sifaka

Dependent variable: scent-marking frequency

Potential explanatory variables: sex, season, age, group size, free ranging, amount of time observed, individual identity

  • Step 1: Visualizing data and testing for normality (histograms, dotcharts, box plots, Shapiro-Wilk test)
  • Step 2: Choosing an appropriate distribution and test
  • Step 3: Applying the test in R (Wilcoxon tests and GLMMs)
  • Step 4: Interpreting results
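
Steps 1 through 3 can be sketched in base R as follows. This is a minimal illustration on simulated counts, not the class data set: the data frame `sifaka`, its values, and the choice of a two-group comparison by sex are assumptions made for the example.

```r
# Simulated stand-in for the class data (all values are hypothetical).
set.seed(1)
n <- 40
sifaka <- data.frame(
  Sex = rep(c("F", "M"), each = n / 2),
  Scentmark = c(rnbinom(n / 2, size = 1, mu = 3),
                rnbinom(n / 2, size = 1, mu = 6))
)

# Discard plots when run as a script; interactive sessions show them.
if (!interactive()) pdf(NULL)

# Step 1: visualize the response and test it for normality.
hist(sifaka$Scentmark, main = "Scent marks per animal", xlab = "Count")
boxplot(Scentmark ~ Sex, data = sifaka, ylab = "Scent marks")
normality <- shapiro.test(sifaka$Scentmark)

# Steps 2-3: count data are often right-skewed, so compare the sexes
# with a rank-based Wilcoxon test instead of a t-test.
# exact = FALSE uses the normal approximation, which tolerates ties.
sex_test <- wilcox.test(Scentmark ~ Sex, data = sifaka, exact = FALSE)

print(normality$p.value)
print(sex_test$p.value)
```

A small Shapiro-Wilk p-value suggests the counts are not normally distributed, which is the usual reason to move to a nonparametric test or a count-based model in Step 2.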

Model <- glmmadmb(Scentmark ~ Sex + Season + Group.size + Age + FR + (1|Individual) + offset(log(Obs..Time)), data=data, family="nbinom", zeroInflation=TRUE)
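
To illustrate what the `offset(log(...))` term does and how a coefficient might be read in Step 4, here is a deliberately simplified sketch. It uses a plain Poisson `glm` from base R on simulated data, with no random effect, no zero inflation, and no negative binomial family, so it is not the model above. The variable names and the simulated rates are assumptions.

```r
# Simulated data: males mark at twice the hourly rate of females,
# and animals are observed for different lengths of time (hypothetical).
set.seed(2)
n <- 200
obs_time <- runif(n, min = 1, max = 10)   # hours observed
sex <- rep(c("F", "M"), length.out = n)
true_rate <- ifelse(sex == "M", 2, 1)     # marks per hour
marks <- rpois(n, lambda = true_rate * obs_time)

# The offset fixes the coefficient of log(obs_time) at 1, so the model
# describes marks per hour rather than raw counts.
fit <- glm(marks ~ sex + offset(log(obs_time)), family = poisson)

# With a log link, exp(coefficient) is a rate ratio:
# the marking rate of males relative to females.
rate_ratio <- exp(coef(fit)[["sexM"]])
print(rate_ratio)
```

Without the offset, animals watched for longer would appear to mark more often simply because there was more time in which to record marks.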

Related Projects

We led a 75-minute class session for the Marine Mammals course at the Duke University Marine Lab that introduced students to the strengths and challenges of using aerial imagery to survey wildlife populations, and to the growing use of machine learning to address these "big data" tasks.

Most phenomena that data scientists seek to analyze are either spatially or temporally correlated. Examples include political elections, contaminant transfer, disease spread, housing markets, and the weather. A question of interest is how to incorporate spatial correlation into models of such phenomena.

In this project, we focus on the impact of environmental attributes (such as greenness, tree cover, and temperature), along with socio-demographics and home characteristics, on housing prices by developing a model that takes into account the spatial autocorrelation of the response variable. To this end, we introduce a test to diagnose spatial autocorrelation and explain how to integrate spatial autocorrelation into a regression model.
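
The text does not name the diagnostic test. As one common choice, the sketch below assumes Moran's I with a permutation test, written in base R on a simulated grid. The grid, the simulated `price` values, the binary neighbor weights, and the helper `morans_i` are all hypothetical and are not taken from the project.

```r
# Simulated 10 x 10 grid with a smooth spatial trend plus noise.
set.seed(3)
side <- 10
grid <- expand.grid(x = 1:side, y = 1:side)
price <- grid$x + grid$y + rnorm(nrow(grid))

# Binary weights: two cells are neighbors if they share an edge,
# i.e. their centers are exactly one unit apart.
d <- as.matrix(dist(grid))
w <- (d == 1) * 1

# Moran's I: a weighted cross-product of deviations from the mean,
# scaled by the total weight and the overall sum of squares.
morans_i <- function(z, w) {
  zc <- z - mean(z)
  (length(z) / sum(w)) * sum(w * outer(zc, zc)) / sum(zc^2)
}

observed <- morans_i(price, w)

# Permutation test: shuffling values across locations breaks any
# spatial structure and gives a reference distribution for I.
perm <- replicate(999, morans_i(sample(price), w))
p_value <- (1 + sum(perm >= observed)) / (1 + length(perm))

print(observed)
print(p_value)
```

Values of I well above zero, with a small permutation p-value, indicate that nearby locations have similar values, which is the situation in which an ordinary regression that ignores location can be misleading.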

In this data exploration, students are provided with data collected from remote sensing, census, and Zillow sources. Students are tasked with conducting a regression analysis of real-estate estimates against environmental amenities and other control variables which may or may not include the spatial autocorrelation information.

Over the course of two 90-minute sessions, we led students in the Duke Marine Lab Marine Ecology class (Biology 273LA) on a data expedition using the statistical programming environment R. We gave an introduction to big data, the role of big data in ecology, important things to consider when working with data (quality control, metadata, etc.), handling big data in R, what the Tidyverse is, and how to organize tidy data (see class PowerPoint). We then led a hands-on coding workshop in which we explored an open-access citizen science dataset of aquatic plants along the U.S. East Coast (see dataset details below).