Predicting Pancreatic Cancer

2016

Albert Antar (Biology) and Zidi Xiu (Biostatistics) spent ten weeks using Duke Electronic Medical Record (EMR) data to build predictive models of pancreatic ductal adenocarcinoma (PDAC). PDAC is the fourth leading cause of cancer deaths in the US and is most often diagnosed at stage IV, with a survival rate of only 1% and life expectancy measured in months. Diagnosing PDAC is very challenging because of the pancreas's deep anatomical placement and the significant risk posed by traditional biopsy. The goal of this project is to use EMR data to identify potential avenues for diagnosing PDAC in the early, treatable stages of the disease.

Project Results

The team first constructed a timeline for each patient leading up to the PDAC diagnosis, using diagnostic codes in the EMR data. They then applied a supervised topic model to the diagnosis code data, which yielded highly interpretable groups of diagnoses and a promising predictive model for pancreatic cancer. The study team is following up with clinical colleagues in the Duke Department of Medicine to initiate further studies based on the team’s work.
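As a rough illustration of the timeline step described above, the following Python sketch groups diagnosis codes by patient, orders them by date, and counts the codes recorded before the first pancreatic cancer code. It is not the team's code: the records are invented, the use of ICD-10 codes and of the `C25` prefix to mark pancreatic cancer are assumptions, and all function names are hypothetical. The supervised topic model itself is not shown.

```python
from collections import Counter, defaultdict
from datetime import date

# Invented example records: (patient_id, diagnosis_date, diagnosis_code).
# ICD-10 codes are an assumption; the source does not name a coding system.
records = [
    ("p1", date(2015, 6, 12), "R10.9"),   # unspecified abdominal pain
    ("p1", date(2014, 3, 1), "K86.1"),    # other chronic pancreatitis
    ("p1", date(2016, 1, 20), "C25.9"),   # pancreatic cancer (index event)
    ("p2", date(2015, 2, 3), "E11.9"),    # type 2 diabetes
    ("p2", date(2015, 9, 9), "R10.9"),
]

# Assumption: ICD-10 category C25 marks a pancreatic cancer diagnosis.
PDAC_PREFIX = "C25"


def build_timelines(rows):
    """Group diagnosis events by patient and sort each group by date."""
    timelines = defaultdict(list)
    for patient_id, when, code in rows:
        timelines[patient_id].append((when, code))
    for events in timelines.values():
        events.sort()
    return dict(timelines)


def index_date(events):
    """Return the date of the first pancreatic cancer code, or None."""
    for when, code in events:
        if code.startswith(PDAC_PREFIX):
            return when
    return None


def code_counts_before_index(events):
    """Count codes recorded strictly before the index date.

    Patients without an index date (controls) keep all of their codes.
    """
    cutoff = index_date(events)
    return Counter(
        code for when, code in events if cutoff is None or when < cutoff
    )


timelines = build_timelines(records)
features = {pid: code_counts_before_index(ev) for pid, ev in timelines.items()}
labels = {pid: index_date(ev) is not None for pid, ev in timelines.items()}
```

Here `features` holds a per-patient count of diagnosis codes and `labels` marks which patients later received a pancreatic cancer code. Counts of this kind are one plausible input for a topic model over diagnosis codes, with the labels supplying the supervision.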

Download the Executive Summary (PDF)

Project Manager

  • Shaobo Han, post-doc, Electrical and Computer Engineering

Participants

  • Albert Antar, Duke University Biology
  • Zidi Xiu, Student Mentor, Master’s student in Biostatistics, Duke University

Disciplines Involved

  • Biostatistics
  • Pre-med
  • All quantitative STEM
