BAYESIAN INSIGHTS ON DISCLOSURE LIMITATION: MASK OR IMPUTE?
Statistical agencies seek to disseminate useful data while keeping the risk of confidentiality disclosure low. Recognizing that deidentification of data is generally inadequate to protect its confidentiality against attack by a data snooper, agencies restrict the data they release for general use. Typically, these restricted data procedures have involved transformation or masking of the original, collected data through such devices as adding noise, topcoding, data swapping, and recoding. Recently, proposals have been put forth for the release of synthetic data, simulated from models constructed from the original data. This paper gives a framework for comparing masking and synthetic data as two approaches to disclosure limitation, with particular attention to data utility and disclosure risk. Examples of masking and of synthetic data construction illustrate the concepts, with an emphasis on data swapping. Insights drawn from the Bayesian paradigm are provided.
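The distinction between masking and synthetic data named in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the report: the function names, the toy income values, the cap, the swap fraction, and the choice of a normal model for synthesis are all assumptions made for this example.

```python
import random
import statistics


def add_noise(values, sd, rng):
    """Masking: perturb each value with zero-mean Gaussian noise."""
    return [v + rng.gauss(0.0, sd) for v in values]


def topcode(values, cap):
    """Masking: replace every value above `cap` with `cap`."""
    return [min(v, cap) for v in values]


def swap(values, fraction, rng):
    """Masking: exchange values between randomly chosen pairs of records.

    Roughly `fraction` of the records take part in a swap. The multiset of
    values (the marginal distribution) is unchanged; only the link between
    a value and its record is broken.
    """
    out = list(values)
    n_pairs = int(len(out) * fraction / 2)
    idx = rng.sample(range(len(out)), 2 * n_pairs)
    for i, j in zip(idx[::2], idx[1::2]):
        out[i], out[j] = out[j], out[i]
    return out


def synthesize(values, n, rng):
    """Synthetic data: draw new records from a model fitted to the originals.

    A normal model is assumed here purely for illustration.
    """
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    return [rng.gauss(mu, sigma) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
incomes = [31_000, 42_500, 38_000, 55_000, 47_250, 250_000]  # toy data

noisy = add_noise(incomes, sd=1_000, rng=rng)
topcoded = topcode(incomes, cap=100_000)
swapped = swap(topcoded, fraction=0.5, rng=rng)
synthetic = synthesize(incomes, n=len(incomes), rng=rng)
```

In this sketch the masked outputs are transformations of the collected records, whereas the synthetic output contains no original record at all, only draws from the fitted model. That difference is what drives the trade-off between data utility and disclosure risk.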
- Research Organization: Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
- Sponsoring Organization: US Department of Energy (US)
- DOE Contract Number: W-7405-ENG-36
- OSTI ID: 766734
- Report Number(s): LA-UR-00-3771; TRN: AH200038%%610
- Resource Relation: Conference: Conference title not supplied, Conference location not supplied, Conference dates not supplied; Other Information: PBD: 1 Oct 2000
- Country of Publication: United States
- Language: English