TRIPODS@Duke was established in 2019 with a grant from the National Science Foundation to further our understanding of foundational principles in data science and to identify opportunities for innovation. Since then, as many as 25 faculty members, along with graduate students and postdoctoral fellows, have been working across disciplines to develop new tools and methods and to create dialogues in data science throughout North Carolina’s “Research Triangle.”

Project collaborators represent Duke departments including Computer Science, Electrical and Computer Engineering, Mathematics, and Statistical Science, and the team anticipates forging additional partnerships through engagement with the Rhodes Information Initiative at Duke and the regional Statistical and Applied Mathematical Sciences Institute (SAMSI).


This initiative is part of the NSF’s Harnessing the Data Revolution (HDR) Big Idea activity, and all research and outreach efforts will focus on the three key themes below. Duke scientists have already produced significant scholarship on these topics, as reflected in the lists of recent publications and media coverage.

i. Scalable Algorithms with Uncertainty

In most application areas, accurate quantification of uncertainty is of paramount importance when conducting machine learning and statistical inference. This is particularly true for high-dimensional and complex data, as the data collection process is inherently prone to uncertainty. Most methodology developed for these data focuses on producing point estimates without any characterization of uncertainty. Failing to quantify uncertainty creates a risk of over-interpretation and contributes to the replicability crisis in science.

ii. Human Machine Interface

Data analytic and artificially intelligent (AI) systems increasingly have a direct impact on society and human lives. Two technology-based factors are driving this influence. The first is the application of data analytics to high-stakes domains at both the individual level, through applications such as automated bail and parole decisions, and the societal level, through applications such as the data analytics used to form gerrymandered voting districts. The second is the ubiquity and increasing complexity of sensing technologies, from social media to wearable devices to 3D imaging, and their impact on data analysis, from causal inference to medical applications such as clinical trials.

iii. Fundamental Limits

This theme addresses the fundamental limits of the algorithms and models proposed in the previous themes. The types of analysis we consider are lower and upper bounds on conditions with theoretical guarantees, minimax rates for estimators, and approximation theory results for the complexity or expressibility of function classes induced by AI algorithms. Specifically, the challenges we consider include: (a) the fundamental limits of robust optimization with uncertain inputs; (b) characterizing the statistical and approximation power of deep neural network architectures; and (c) the fundamental limits of causal inference in observational studies.



Principal Investigators

Sayan Mukherjee, PI
Professor of Statistical Science, Mathematics and Computer Science

Robert Calderbank, Co-PI
Director of the Rhodes Information Initiative at Duke
Charles S. Sydnor Distinguished Professor of Computer Science, Professor of Mathematics and Professor of Electrical and Computer Engineering

Cynthia Rudin, Co-PI
Professor of Computer Science
Prediction Analysis Lab

Jianfeng Lu, Co-PI
Associate Professor of Mathematics and Chemistry

Rong Ge, Co-PI
Assistant Professor of Computer Science


Senior Personnel

Pankaj Agarwal
Wilkins Aquino
James Berger
Xiuyuan Cheng
Ingrid Daubechies
Katherine Heller
Fan Li
Jonathan Mattingly
Ezra Miller
Kamesh Munagala
Debmalya Panigrahi
Henry Pfister
Galen Reeves
Guillermo Sapiro
Rebecca Steorts
Vahid Tarokh
Alex Volfovsky
Hau-Tieng Wu


Visiting Research Scholars 

Nicolas Fraiman

I am an assistant professor in the Department of Statistics and Operations Research at the University of North Carolina at Chapel Hill, where I am a member of the probability group.



Student and Postdoctoral Research

Youngsoo Baek
My research is in analyzing medical image data to improve clinical diagnoses. In particular, I am interested in drawing ideas from machine learning methods to address the computational burden and likelihood intractability that pose challenges to applying classical statistical models.

Alina Barnett

My research is in designing inherently interpretable models for computer vision. By providing interpretable algorithms, I enable scientists to validate their models and ensure that their models are making predictions based on meaningful representations rather than on confounding information.

Jordan Bryan
I work with cancer genomics data, with a particular eye towards using tensor models for datasets with several experimental modalities. More recently, I have been spending time thinking about statistical inference problems using ideas from optimal transport.

Michele Caprio
My research is focused on imprecise probabilities. I am also interested in more foundational topics, like extended probabilities, and in the theoretical properties of neural networks.

Abraham Frandsen
I work on theoretical machine learning, with a focus on representation learning, control theory, and reinforcement learning.

Henry Kirveslahti 

I work on applied topology, Bayesian statistics and their applications to imaging and biology. I am particularly interested in highly dissimilar data and soft tissue.

Anilesh Kollagunta Krishnaswamy 

I am primarily interested in questions regarding the design of fair and robust algorithms for machine learning, using ideas from optimization and game theory.

Holden Lee 

I am primarily interested in questions regarding the design of fair and robust algorithms for machine learning, using ideas from optimization and game theory.

Zining Ma (Past Participant)

In my research, we propose a method for generating a specific form of 3D shape representation that can be used in statistical analysis. We use Euler characteristic curves to create the representation of shapes and use scaling function bases from diffusion wavelets to generate it. We discuss the details of our method and, finally, apply it to a shape classification problem to test the performance of the representation.

Ezinne Nwankwo (Past Participant)
My primary research interests lie in improving the trustworthiness and reliability of machine learning models and applying machine learning to address societal challenges. Currently, my research focuses on using causal inference methods to identify and mitigate algorithmic bias/discrimination.

Xiang Wang
I am interested in the theory of deep learning, especially the optimization of neural networks.

Lihan Wang (Past Participant)
I am currently interested in the analytic behavior and numerical performance of Markov Chain Monte Carlo sampling algorithms that arise from stochastic processes.


Data Science Pathways in Education

The educational goal of TRIPODS@Duke is to develop and implement a Data Science Pathway that targets the entire “data to knowledge to action” pipeline. The pathway prepares undergraduates, graduate students, and postdoctoral trainees to conduct cutting-edge data science research, to wrangle and analyze data in a responsible and ethical manner, and to work effectively in interdisciplinary teams.


View our interactive Data Science interdisciplinary course map.

Departmental Pathways

Interdepartmental Majors (IDMs) combine two academic disciplines in Trinity College, drawing 7 courses from each to create a major.

Research Opportunities
Reading Groups
Theoretical Aspects of Deep Learning

Fall 2022 date and time to be announced, via Zoom
Contact Jianfeng Lu to join.
The reading group meets to discuss recent work on the mathematical and theoretical analysis of deep learning models and algorithms.

Diversity, Equity, and Inclusion

Our work cannot be complete unless all are given a genuine opportunity to succeed regardless of race, ethnicity, religion, national origin, gender identity, sexual orientation, or disability. We are well aware that past and current injustices have prevented far too many people from living up to their potential. We are committed to maintaining a culture that is welcoming and nurturing to all. We expect everyone to conduct themselves in a manner that is consistent with these values and free from any form of discrimination or harassment.

*This material is based upon work supported by the National Science Foundation under Grant Number 1934964. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
