Preserving Confidentiality of High-dimensional Tabulated Data: Statistical and Computational Issues (2002)

Abstract:

Dissemination of information derived from large contingency tables formed from confidential data is a major problem faced by statistical agencies. In this paper we present solutions to several computational and algorithmic issues that arise in the dissemination of cross-tabulations (marginal sub-tables) from a single underlying table. These include data structures that exploit sparsity and support efficient computation of marginals, as well as algorithms such as iterative proportional fitting and a generalized form of the shuttle algorithm that computes sharp bounds on (small, confidentiality-threatening) cells in the full table from arbitrary sets of released marginals. We give examples illustrating the techniques.
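
To make the ideas of sparse storage, marginal computation, and cell bounds concrete, here is a minimal sketch in Python. It is not the report's implementation: the dictionary-based table, the function names `marginal` and `frechet_bounds`, and the example counts are all assumptions made for illustration. The bounds shown are the classical Fréchet bounds for a cell of a two-way table given its two one-way marginals, which is a much simpler special case than the generalized shuttle algorithm described in the abstract.

```python
from collections import defaultdict

def marginal(table, axes):
    """Sum a sparse table (dict: cell index tuple -> count) onto the given axes.

    Only nonzero cells are stored, so the cost is proportional to the
    number of nonzero cells rather than to the size of the full table.
    """
    out = defaultdict(int)
    for cell, count in table.items():
        out[tuple(cell[a] for a in axes)] += count
    return dict(out)

def frechet_bounds(row_total, col_total, grand_total):
    """Frechet bounds on one cell of a two-way table, given its row total,
    its column total, and the grand total."""
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper

# Hypothetical 2 x 2 x 2 table; the counts are invented for illustration.
table = {(0, 0, 0): 5, (0, 1, 1): 2, (1, 0, 1): 1, (1, 1, 0): 7}

m01 = marginal(table, (0, 1))   # two-way marginal over variables 0 and 1
m0 = marginal(table, (0,))      # one-way marginal of variable 0
m1 = marginal(table, (1,))      # one-way marginal of variable 1
n = sum(table.values())         # grand total

# Bounds on cell (1, 0) of the two-way marginal if only m0 and m1 are released.
bounds = frechet_bounds(m0[(1,)], m1[(0,)], n)
```

In this invented example the true count in cell (1, 0) of the two-way marginal is 1, and the released one-way marginals only constrain it to lie between 0 and 6. A wide interval like this is the kind of evidence an agency might use to judge that a small cell is adequately protected.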

Keywords:

Branch and bound; Contingency tables; Disclosure limitation; Integer programming; Marginal bounds; Shuttle algorithm. 

Author: 
Adrian Dobra, Alan F. Karr, Ashish Sanil
Publication Date: 
Friday, November 1, 2002
File Attachment: 
tr130.pdf
Report Number: 
130