An Integrated System for Accessing Large-Scale, Confidential Social Science Data

Project Summary

Large-scale databases from the social, behavioral, and economic sciences offer enormous potential benefits to society. However, as most stewards of social science data are acutely aware, wide-scale dissemination of such data can result in unintended disclosures of data subjects' identities and sensitive attributes, thereby violating promises, and in some instances laws, to protect data subjects' privacy and confidentiality.

Contact
Jerry Reiter
Statistical Science
jerry@stat.duke.edu

Supported by a grant from the National Science Foundation Data Infrastructure Building Blocks program, we are developing an integrated system for disseminating large-scale social science data. The system includes:

(i) Capability to generate highly redacted, synthetic data intended for wide access, coupled with

(ii) Means for approved researchers to access the confidential data via secure remote access solutions, glued together by

(iii) A verification server that allows users to assess the quality of their analyses with the redacted data so as to be more efficient with their use of remote data access.
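
As a rough illustration of what a verification step in (iii) might compute, the sketch below compares a confidence interval estimated from the redacted (synthetic) data with the same interval estimated from the confidential data, and reports only a coarse overlap score. The interval-overlap measure is one commonly used utility measure for synthetic data; the summary above does not say which measures the actual server uses, so the function names, the threshold, and the rounding are all assumptions made for this example.

```python
import math
import statistics


def mean_interval(values, z=1.96):
    """Normal-approximation 95% confidence interval for the mean."""
    center = statistics.mean(values)
    half_width = z * statistics.stdev(values) / math.sqrt(len(values))
    return (center - half_width, center + half_width)


def interval_overlap(conf_interval, synth_interval):
    """Average share of each interval covered by their intersection (0 to 1)."""
    lo_c, hi_c = conf_interval
    lo_s, hi_s = synth_interval
    shared = min(hi_c, hi_s) - max(lo_c, lo_s)
    if shared <= 0:
        return 0.0  # disjoint (or degenerate) intervals
    return 0.5 * (shared / (hi_c - lo_c) + shared / (hi_s - lo_s))


def verify(confidential, synthetic, threshold=0.5):
    """Hypothetical server-side check: return only a coarse quality signal.

    The threshold and the one-decimal rounding are illustrative assumptions,
    chosen so the response reveals little about the confidential values.
    """
    score = interval_overlap(mean_interval(confidential), mean_interval(synthetic))
    return {"overlap": round(score, 1), "trustworthy": score >= threshold}
```

A user who sees a low overlap for an analysis would know that the redacted data are not reliable for it and that a request for remote access to the confidential data is worth making.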

Related Projects

Neuroscience evidence (e.g., brain scans and mental-illness diagnoses) is increasingly being used in criminal cases to explain criminal behavior and lessen responsibility. A team of students led by researchers within the Science, Law, and Policy Lab will explore a national set of criminal cases in which neuroscience evidence is used, to see which aspects of the criminal trial (e.g., offense, age of offender) may predict the outcome of future cases. Additionally, using our comprehensive 10-year judicial opinion data set (2005-2015), the team will collaborate on creating a computer algorithm to assist in locating and coding online judicial opinions to build upon our comprehensive list of opinions. This tool will provide a strong foundation for the work of understanding neuroscience’s role within a criminal court setting.

Faculty Lead: Nita Farahany

Project Manager: William Krenzer

Students will collaborate with staff at DataWorks NC and the Eviction Diversion Program to explore and develop means of using evictions data to drive meaningful policy change that helps Durham residents stay in their homes. Students will clean and assess the quality of evictions data, look for seasonal and geographic variation in eviction rates, analyze the relationship between evictions, rents, wages, and other economic indicators, develop metrics for the real financial cost of evictions, and build static visualizations or a data dashboard to communicate their results. This project will help housing advocates in Durham assess the impact of their current work and understand which future interventions will be most impactful.

Project Leads: Tim Stallmann, John Killeen, Peter Gilbert

Project Manager: Libby McClure

The American public first encountered the term “genocide” in a Washington Post op-ed published in 1944; since then, the word’s meaning has been circulated, debated, and shaped by numerous forces, especially by words and images in newspapers. With the support of Dr. Priscilla Wald (English), a team of students led by Nora Nunn (English graduate student) and Astrid Giugni (English and ISS) will analyze how U.S. mass media—particularly newspapers—enlist text and imagery such as press photographs to portray genocide, human rights, and crimes against humanity from World War II to the present. From the Holocaust to Cambodia, from Rwanda to Myanmar, such representation has political consequences. If time allows, students will also study the representation of collective violence in Hollywood film, querying the relationship between human rights and genre. The implications of these findings could inform future coverage of human rights-related issues at home and abroad.

Faculty Leads: Nora Nunn, Astrid Giugni