Open Data for Tobacco Retailer Mapping

2017

Felicia Chen (Computer Science, Statistics), Nikhil Pulimood (Computer Science, Mathematics), and James Wang (Statistics, Public Policy) spent ten weeks working with Counter Tools, a local nonprofit that provides support to over a dozen state health departments. The project goal was to understand how open data could be used to build a national database of tobacco retailers.

Project Results: The team performed a feasibility study that examined both technical accuracy and cost-effectiveness. Working mostly in R, they combined web scraping for data collection, machine learning and text mining for data classification, and Amazon Mechanical Turk (MTurk) for human validation, and were able to construct a viable dataset for North Carolina.

They presented findings at an informal briefing of civic leaders and planning officials.

Partially funded by Counter Tools

Project Lead & Project Manager: Mike Dolan Fliss, Counter Tools

 

[Image: state facet graph]

[Image: ct partner map]

 

“Coming in, I had little knowledge about what data science research entailed. Participating in Data+ was a great step and helped me better realize my career goals. I learned a host of interdisciplinary skills – ranging from web scraping to survey design – that can definitely be applied to future projects.” — Felicia Chen, Computer Science & Public Policy

Related News

New maps created by Data+ students this summer suggest that tobacco retailers are disproportionately located in low-income neighborhoods and living in a neighborhood with easy access to stores that sell tobacco...