NISS Data Swapping Toolkit (DSTK)
The NISS DSTK is a comprehensive software package for performing
and analyzing data swapping on categorical data within a risk-utility framework.
The DSTK contains software for:
- Single swaps, using a graphical user interface to perform
  data swapping on categorical databases of essentially unlimited size,
  with user-specified swap rates, swap attributes and (optionally) equality
  or inequality constraints on unswapped attributes. The swapping is performed
  on the user's local machine, allowing the DSTK to be used on confidential
  databases.
- Batch swaps: in-depth investigations of multiple swap
  rates and multiple choices of swap attributes (all one-attribute
  swaps, all two-attribute swaps, or both).
- Risk-utility calculations: for both single and batch swaps,
  the DSTK calculates disclosure risk and data utility, the latter measured
  as the absence of data distortion.
- Record weights, which are employed in data distortion
  calculations. This allows DSTK users to quantify the distortion that
  data swapping causes in population-level estimates derived from survey data.
- Visualization and manipulation of results of batch swaps,
  including visualization of (distortion, risk) frontiers.
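
The distortion and frontier ideas in the list above can be illustrated with a small standalone Java sketch. This page does not say which distortion or risk measures the DSTK uses, so the sketch makes assumptions: distortion is taken to be the Hellinger distance between the weighted contingency tables before and after swapping, risk values are treated as already computed, and the frontier is the set of (distortion, risk) points that no other point beats on both axes. The class and method names are hypothetical and are not part of the DSTK class library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical names, not the DSTK API.
class RiskUtilitySketch {

    // Build a weighted contingency table: cell key -> sum of record weights.
    // Each record is an array of categorical values; weights[i] belongs to record i.
    static Map<String, Double> weightedTable(List<String[]> records, double[] weights) {
        Map<String, Double> table = new HashMap<>();
        for (int i = 0; i < records.size(); i++) {
            String cell = String.join("|", records.get(i));
            table.merge(cell, weights[i], Double::sum);
        }
        return table;
    }

    // Assumed distortion measure: Hellinger distance between the two tables,
    // each normalized to proportions. 0 means identical, 1 means no shared cells.
    // Both tables are assumed to be non-empty with positive total weight.
    static double hellinger(Map<String, Double> before, Map<String, Double> after) {
        double totalBefore = before.values().stream().mapToDouble(Double::doubleValue).sum();
        double totalAfter = after.values().stream().mapToDouble(Double::doubleValue).sum();
        Set<String> cells = new HashSet<>(before.keySet());
        cells.addAll(after.keySet());
        double sum = 0.0;
        for (String cell : cells) {
            double p = before.getOrDefault(cell, 0.0) / totalBefore;
            double q = after.getOrDefault(cell, 0.0) / totalAfter;
            double d = Math.sqrt(p) - Math.sqrt(q);
            sum += d * d;
        }
        return Math.sqrt(sum / 2.0);
    }

    // Keep the {distortion, risk} points that are not dominated by another point,
    // where lower is better on both axes.
    static List<double[]> frontier(List<double[]> points) {
        List<double[]> result = new ArrayList<>();
        for (double[] p : points) {
            boolean dominated = false;
            for (double[] q : points) {
                boolean noWorse = q[0] <= p[0] && q[1] <= p[1];
                boolean strictlyBetter = q[0] < p[0] || q[1] < p[1];
                if (noWorse && strictlyBetter) {
                    dominated = true;
                    break;
                }
            }
            if (!dominated) {
                result.add(p);
            }
        }
        return result;
    }
}
```

With unit weights the table holds plain record counts. With survey weights, the same distance reflects how far swapping moves the population-level cell estimates. Each candidate swap from a batch run would contribute one {distortion, risk} point to `frontier`.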
The DSTK also provides a Java class library for performing customized
data swapping tasks.
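
The class library's API is not described on this page, so the following standalone Java sketch does not use it. It only illustrates what a single swap with a swap rate, one swap attribute and one equality constraint might look like. All names (`SwapSketch`, `swap`, `equalAttr`) are hypothetical, and the pairing strategy (random order, first eligible partner) is an assumption rather than the DSTK's algorithm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative only: hypothetical names, not the DSTK API.
class SwapSketch {

    // Exchanges the value in column swapAttr between randomly paired records
    // until floor(swapRate * n) records have been swapped, or as many as the
    // data allows. A pair is eligible only if the two records differ on swapAttr
    // (so the swap changes something) and agree on column equalAttr.
    // Pass equalAttr = -1 for no constraint. Modifies data in place and returns
    // the number of records actually swapped.
    static int swap(String[][] data, int swapAttr, int equalAttr, double swapRate, long seed) {
        int target = (int) Math.floor(swapRate * data.length);
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < data.length; i++) {
            order.add(i);
        }
        Collections.shuffle(order, new Random(seed));

        boolean[] used = new boolean[data.length];
        int swapped = 0;
        for (int a = 0; a < order.size() && swapped + 2 <= target; a++) {
            int i = order.get(a);
            if (used[i]) {
                continue;
            }
            // Scan later records in the shuffled order for the first eligible partner.
            for (int b = a + 1; b < order.size(); b++) {
                int j = order.get(b);
                if (used[j]) {
                    continue;
                }
                boolean differs = !data[i][swapAttr].equals(data[j][swapAttr]);
                boolean constraintOk = equalAttr < 0
                        || data[i][equalAttr].equals(data[j][equalAttr]);
                if (differs && constraintOk) {
                    String tmp = data[i][swapAttr];
                    data[i][swapAttr] = data[j][swapAttr];
                    data[j][swapAttr] = tmp;
                    used[i] = true;
                    used[j] = true;
                    swapped += 2;
                    break;
                }
            }
        }
        return swapped;
    }
}
```

Because values are only exchanged between records, the one-way counts of the swapped attribute stay the same; what changes is its joint distribution with the unswapped attributes. The partner search here is quadratic in the worst case, so it is meant only for small examples.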
The DSTK was produced by the Digital Government Research Program
at NISS, with support from the National Science Foundation and the National
Center for Education Statistics. It was written by Ashish Sanil, William "Jimmy"
Fulp, Shanti Gomatam, Chunhua "Charlie" Liu and Alan Karr.
| File | Brief Description |
| NISSDSTK.zip | Complete DSTK distribution, including Java executables, a complete set of demonstration files and extensive documentation (both a user guide and detailed descriptions of the software). |
| dstk-doc.pdf | User documentation, in PDF format. |