Statistical Disclosure Limitation in the Presence of Edit Rules (2013)

Abstract:

We articulate and investigate issues associated with performing statistical disclosure limitation (SDL) for data subject to edit rules. The central problem is that many SDL methods generate data records that violate these constraints. We propose and study two approaches. In the first, existing SDL methods are applied, and any constraint-violating values they produce are replaced by means of a constraint-preserving imputation procedure. In the second, the SDL methods are modified to prevent them from generating violations. We present a simulation study, based on data from the Colombian Annual Manufacturing Survey, that evaluates several SDL methods from the existing literature. The results suggest that (i) in practice, some SDL methods cannot be implemented with the second approach, and (ii) differences in risk-utility profiles across SDL methods dwarf the differences between the two approaches. Among the SDL strategies, two offer the most attractive risk-utility profiles: microaggregation followed by noise addition, and partially synthetic data.
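
The two approaches described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the report: the two-field record, the edit rules (non-negative values, wages not exceeding total cost), Gaussian noise addition as the SDL method, a uniform redraw as a stand-in for constraint-preserving imputation, and rejection sampling as one possible way to modify an SDL method so that it never releases a violating record.

```python
import random

# Hypothetical edit rules for a two-field record (illustrative only):
# both values are non-negative and wages may not exceed total cost.
def satisfies_edits(rec):
    return 0 <= rec["wages"] <= rec["total_cost"]

# Hypothetical SDL method: independent Gaussian noise on every field.
def add_noise(rec, rng, sd):
    return {key: value + rng.gauss(0, sd) for key, value in rec.items()}

# Stand-in for a constraint-preserving imputation (an assumption, not the
# report's procedure): clamp total cost at zero, then redraw wages
# uniformly inside the feasible interval [0, total_cost].
def impute_valid(rec, rng):
    total = max(rec["total_cost"], 0.0)
    return {"wages": rng.uniform(0.0, total), "total_cost": total}

# Approach 1: apply the SDL method unchanged, then repair violations.
def approach_one(rec, rng, sd):
    noisy = add_noise(rec, rng, sd)
    return noisy if satisfies_edits(noisy) else impute_valid(noisy, rng)

# Approach 2: modify the SDL method so it cannot emit a violation,
# here by redrawing the noise until the record passes the edits.
def approach_two(rec, rng, sd, max_tries=1000):
    for _ in range(max_tries):
        noisy = add_noise(rec, rng, sd)
        if satisfies_edits(noisy):
            return noisy
    raise RuntimeError("no edit-passing draw found within max_tries")

if __name__ == "__main__":
    rng = random.Random(2013)
    record = {"wages": 40.0, "total_cost": 100.0}
    print(approach_one(record, rng, sd=30.0))
    print(approach_two(record, rng, sd=30.0))
```

Both functions return only records that pass the edit rules. The first repairs a violating record after the fact, while the second never produces one; it can fail outright when valid draws are too rare, which is why the sketch raises an error after `max_tries` attempts.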

Keywords:

Confidentiality, Imputation, Survey, Synthetic data 

Author: 
Hang Joon Kim, Alan F. Karr, Jerome P. Reiter
Publication Date: 
Tuesday, October 1, 2013
File Attachment: 
tr184.pdf
Report Number: 
184