Open Data for Tobacco Retailer Mapping

Project Summary

Felicia Chen (Computer Science, Statistics), Nikkhil Pulimood (Computer Science, Mathematics), and James Wang (Statistics, Public Policy) spent ten weeks working with Counter Tools, a local nonprofit that provides support to over a dozen state health departments. The project goal was to understand how open-source data could be used to build a national database of tobacco retailers.

Themes and Categories
Year
2017
Contact
Paul Bendich
Mathematics
bendich@math.duke.edu

Project Results: The team performed a feasibility study addressing questions of technical accuracy and cost-effectiveness. Working mostly in R, they combined web scraping for data collection, machine learning and text mining for data classification, and Amazon Mechanical Turk (MTurk) for human validation, and were able to construct a viable dataset for North Carolina.

They presented findings at an informal briefing of civic leaders and planning officials.

Partially funded by Counter Tools

Click here for the Executive Summary

Project Lead & Project Manager: Mike Dolan Fliss, Counter Tools

 
 


 

"Coming in, I had little knowledge about what data science research entailed. Participating in Data+ was a great step and helped me better realize my career goals. I learned a host of interdisciplinary skills - ranging from web scraping to survey design – that can definitely be applied to future projects." — Felicia Chen, Computer Science & Public Policy

Related People

Related Projects

The Air Force’s F-15E Strike Eagle jets have parts that wear down and break, causing unscheduled maintenance events that take away valuable time in the air for critical missions and training. Our team, Limitless Data, is working with Seymour Johnson Air Force Base to mine manually entered maintenance data to visualize and predict aircraft failures. We created a prototype data visualization product that will help maintainers on the flight line identify and repair critical failures before they happen, keeping jets ready to fly, fight and win.

 

Faculty Lead: Dr. Emma Rasiel

Client Lead: Lt. Devon Burger

Project Manager: Vignesh Kumaresan

This project aims to improve the computational efficiency of signal operations, e.g., sampling and multiplying signals. We design machine learning-based signal processing modules that use an adaptive sampling strategy and interpolation to generate a good approximation of the exact output. While ensuring a low error level, improvements in computational efficiency can be expected for digital signal processing systems using the implemented self-adjusting modules.

Project Leads: Yi Feng, Vahid Tarokh

 

Click here to view the project team's poster

 


 

Mapping History has focused on the categorizing, labelling, digitization, and 3D reconstruction of 16th- and 17th-century maps and atlases of London and Lisbon. Over the course of the summer, the Mapping History team has developed its own unique analytical dataset by painstakingly labelling every element contained within these maps, used Python to digitize this dataset, and, now in the project's final stage, begun reconstructing these historical perspectives in a 3D game engine.

Project Leads: Philip Stern, Ed Triplett

Project Manager: Sam Horewood

 

View the team's final poster here
