Open Data for Tobacco Retailer Mapping

Project Summary

Felicia Chen (Computer Science, Statistics), Nikkhil Pulimood (Computer Science, Mathematics), and James Wang (Statistics, Public Policy) spent ten weeks working with Counter Tools, a local nonprofit that provides support to over a dozen state health departments. The goal of the project was to understand how open source data could be used to build a national database of tobacco retailers.

Themes and Categories
Year: 2017
Contact: Paul Bendich, Mathematics, bendich@math.duke.edu

Project Results: The team performed a feasibility study that addressed both technical accuracy and cost-effectiveness. Working mostly in R, they combined web scraping for data collection, machine learning and text mining for data classification, and Amazon Mechanical Turk (MTurk) for human validation, and were able to construct a viable dataset for North Carolina.
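
As a rough illustration of how automated classification and human validation can fit together, here is a minimal Python sketch. It is not the team's method: they worked mostly in R and used machine learning and text mining, whereas this example swaps in simple keyword matching on business names. The keyword lists, label names, and function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword lists, for illustration only (not from the project).
TOBACCO_TERMS = {"tobacco", "vape", "cigar", "cigarette", "smoke"}
POSSIBLE_SELLER_TERMS = {"mart", "gas", "convenience", "grocery", "pharmacy"}


def tokenize(name):
    """Lowercase a business name and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", name.lower()))


def classify(name):
    """Assign a coarse label to a business name using keyword overlap.

    Names with an explicit tobacco term are labeled as likely retailers.
    Names that only suggest a store type that may sell tobacco are routed
    to human review (the role a crowdsourcing step could play).
    """
    tokens = tokenize(name)
    if tokens & TOBACCO_TERMS:
        return "likely_retailer"
    if tokens & POSSIBLE_SELLER_TERMS:
        return "needs_human_review"
    return "unlikely_retailer"


if __name__ == "__main__":
    # Made-up business names used purely as sample input.
    for business in ["Durham Tobacco & Vape", "Quick Mart #12", "Main Street Dental"]:
        print(business, "->", classify(business))
```

The point of the three-way split is cost: only the ambiguous middle group needs paid human validation, while clear cases are handled automatically.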

They presented findings at an informal briefing of civic leaders and planning officials.

Partially funded by Counter Tools

Project Lead & Project Manager: Mike Dolan Fliss, Counter Tools

"Coming in, I had little knowledge about what data science research entailed. Participating in Data+ was a great step and helped me better realize my career goals. I learned a host of interdisciplinary skills, ranging from web scraping to survey design, that can definitely be applied to future projects." - Felicia Chen, Computer Science & Public Policy

Related Projects

A team of students, led by Electrical and Computer Engineering professor Vahid Tarokh, will develop methods to improve the efficiency of information processing with adaptive decisions according to the structure of new incoming data. Students will have the opportunity to explore data-driven adaptive strategies based on neural networks and statistical learning models, investigate trade-offs between error threshold and computational complexity for various fundamental operations, and implement software prototypes. The outcome of this project can potentially speed up many systems and networks involving data sensing, acquisition, and computation.

Project Leads: Yi Feng, Vahid Tarokh

A team of students will explore new ways of reading pre-modern maps and perspectival views through image tagging, annotation, and 3D modeling. Each student will build a typology of icons found in these early maps (for example, houses, churches, roads, and rivers). By extracting, modeling, and cataloging these features, the team will create a library of 2D and 3D objects that will be used to (a) identify patterns in how space and power are represented across these maps, and (b) create a model for “experiencing” these maps in 3D, using the Unity game engine platform. This is a combined Data+ / Bass Connections project that will instruct students in qualitative and quantitative mapping techniques, basic 3D modeling, and the history of cartography.

Project Leads: Philip Stern, Ed Triplett

Project Manager: Sam Horewood

A team of students will explore ways in which data science can help support the mission of Rewriting the Code, a national non-profit organization dedicated to empowering a community of college women with a passion for technology.

In particular, students will perform statistical analyses of past survey data, build interactive dashboards that help visualize trends in student experience, and help design future survey questions.

Project Lead: Sue Harnett

Faculty Lead: Alexandra Cooper

Project Manager: Imari Smith