Validating a Topic Model that Predicts Pancreatic Cancer from Latent Structures in the Electronic Medical Record

Project Summary

Building on the work of a 2016 Data+ team, students Siwei Zhang (Master's, Biostatistics) and Jake Ukleja (Computer Science) spent ten weeks building a model to predict pancreatic cancer from electronic medical record (EMR) data. They worked with nine years' worth of EMR data, including ICD9 diagnostic codes, covering more than 200,000 patients.

Year: 2017

Contact: Paul Bendich (bendich@math.duke.edu)

Project Results: The team began with exploratory data analysis illustrating the median time of appearance and the frequency of specific ICD9 codes, with an eye toward understanding how these statistics relate to pancreatic cancer diagnosis. They then trained a topic model that predicted past pancreatic cancer diagnosis from ICD9 codes with high discriminative performance (an AUC of 0.93). Finally, they used the topic model outcomes to identify a pool of high-risk patients for potential future study.
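
For readers unfamiliar with the AUC figure reported above, the following is a minimal sketch of how such a score can be computed. It is not the team's code: the function name `auc` and the toy labels and scores are assumptions made up for illustration, and no project data is used. AUC is the probability that a randomly chosen patient with the diagnosis receives a higher model score than a randomly chosen patient without it.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case (ties count half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up risk scores for six hypothetical patients (not project data).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
toy_auc = auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

In this toy example, 8 of the 9 positive/negative pairs are ordered correctly, so the AUC is about 0.89. A value of 0.5 corresponds to random ranking and 1.0 to perfect separation.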

Project Leads:

Lisa Satterwhite, PhD

James Abbruzzese, MD

Joseph Lucas, PhD

Project Manager: Tyler Massaro

Related Projects

A team of students collaborating with Duke School of Medicine's Root Causes Fresh Produce Program, community members, and physicians throughout the Duke Health network will help integrate data on food deliveries to Duke Health patients with patient health record data and other available data sources. The goal is a dashboard that supports analysis, prediction, and management for Root Causes' "Food as Medicine" program. Specific outcomes will improve the program's quantitative evaluation of its health impact, as well as its efficiency and patient satisfaction. Students will receive assistance with IRB approval and mentorship from faculty and community advisors.

Project Leads: Esko Brummel, Willis Wong

A team of students led by researchers at the Duke Marine Lab will explore the changing distribution of krill around the Antarctic Peninsula. Krill are a key prey species in this ecosystem, supporting a number of animals including whales, seals, and penguins, but they depend on winter sea ice and may be in trouble as climate change progresses. Using data from acoustic zooplankton surveys, students will create maps and other products to visualize the spatial distribution of krill over the past 20 summers, then develop metrics that quantify how that distribution is changing as the climate shifts and ice melts. These results will be key to understanding the impacts of climate change on this polar ecosystem.

Project Lead: Douglas Nowacek

Project Manager: Amanda Lohmann

A team of students will partner closely with the City of Durham's newly formed Community Safety Department. The department's mission is to identify, implement, and evaluate new approaches to enhancing public safety that may not involve a law enforcement response or the criminal justice system. The student team will (1) analyze and identify geographic and temporal patterns in 911 calls for service; (2) conceptualize and build an abstracted data pipeline and tools that enrich currently available 911 data with other social, economic, and health-related data; (3) explore associations between areas of high call volume, indicators of mental health distress, and histories of dispossession; and (4) identify methods by which future researchers could examine connections between varied 911 incident responses (e.g., police response, unarmed response, or joint police and mental health response) and life trajectories (e.g., arrest, jail time, hospitalization, or unemployment).

Project Leads: Greg Herschlag, Anise Van, City of Durham

Project Manager: Deekshita Saikia