An Integrated System for Accessing Large-Scale, Confidential Social Science Data

Project Summary

Large-scale databases from the social, behavioral, and economic sciences offer enormous potential benefits to society. However, as most stewards of social science data are acutely aware, wide-scale dissemination of such data can result in unintended disclosures of data subjects' identities and sensitive attributes, thereby violating promises, and in some instances laws, to protect data subjects' privacy and confidentiality.

Contact
Jerry Reiter
Statistical Science
jerry@stat.duke.edu

Supported by a grant from the National Science Foundation Data Infrastructure Building Blocks program, we are developing an integrated system for disseminating large-scale social science data. The system includes:

(i) Capability to generate highly redacted, synthetic data intended for wide access, coupled with

(ii) Means for approved researchers to access the confidential data via secure remote access solutions, glued together by

(iii) A verification server that allows users to assess the quality of their analyses with the redacted data so as to be more efficient with their use of remote data access.
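To make the role of the verification server concrete, here is a minimal sketch of the kind of check it might perform. It is not the project's actual method. The function names (`mean_ci`, `interval_overlap`, `verify`), the choice of confidence-interval overlap as the fidelity measure, and the rounding used to coarsen the response are all illustrative assumptions.

```python
import math
import statistics


def mean_ci(values, z=1.96):
    """Normal-approximation 95% confidence interval for the mean."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))
    return (m - z * se, m + z * se)


def interval_overlap(ci_a, ci_b):
    """Average fraction of each interval covered by their intersection (0 to 1)."""
    lo = max(ci_a[0], ci_b[0])
    hi = min(ci_a[1], ci_b[1])
    if hi <= lo:
        return 0.0  # disjoint (or degenerate) intervals
    width = hi - lo
    return 0.5 * (width / (ci_a[1] - ci_a[0]) + width / (ci_b[1] - ci_b[0]))


def verify(synthetic, confidential):
    """Hypothetical server-side check: compare the same analysis on both
    datasets and return only a coarse fidelity score, never the
    confidential estimate itself."""
    score = interval_overlap(mean_ci(synthetic), mean_ci(confidential))
    return round(score, 1)  # assumption: coarsening limits what is revealed
```

In this sketch, a researcher who receives a score near 1 could reasonably trust the result from the redacted data, while a score near 0 would suggest requesting remote access to the confidential data. A production system would also need formal disclosure protection for the score itself, which this example does not provide.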

Related Projects

A team of students collaborating with Duke School of Medicine's Root Causes Fresh Produce Program, community members, and physicians throughout the Duke Health network will help integrate data from food deliveries to Duke Health patients with patient health record data and other available data sources to create a dashboard that can analyze, predict, and manage the Root Causes' "Food as Medicine" program. Specific outcomes will contribute to improving the Program's quantitative evaluation of its health impact as well as efficiency and satisfaction for its patients. Students will be assisted with IRB approval and mentorship from faculty and community advisors.

Project Leads: Esko Brummel, Willis Wong

 

A team of students will partner closely with the City of Durham's newly formed Community Safety Department. The Community Safety Department's mission is to identify, implement, and evaluate new approaches to enhance public safety that may not involve a law enforcement response or the criminal justice system. The student team will (1) analyze and identify geographic and temporal patterns in 911 calls for service; (2) conceptualize and build an abstracted data pipeline and tools that would enrich currently available 911 data with other social, economic, and health-related data; (3) explore associations between areas of high call volume, indicators of mental health distress, and histories of dispossession; and (4) identify methods by which future researchers could examine connections between varied 911 incident responses (e.g., police response, unarmed response, and joint police and mental health response) and life trajectories (e.g., arrest, jail time, hospitalization, and unemployment).

 

Project Leads: Greg Herschlag, Anise Van, City of Durham

Project Manager: Deekshita Saikia

 

The QSIDES Institute has pulled thousands of pages of police records from the Williamstown Police Department. This project will analyze and identify patterns in police behavior in the town, work with journalists and other activists to use the data to develop action plans that address problems and enact public safety reform, and help build an abstracted process and suite of tools that other small towns and municipalities can use to analyze their own data and empower police reform efforts across the nation. This project will interface with the Small Town Policing Accounting, or SToPA Lab. The overall goal is to use the Williamstown policing data as a test case for building a toolkit that supports policing transparency in other small municipalities.
 

Project Leads: Jude Higdon, Greg Herschlag

Project Manager: Devin Cornell