A Method of Masking for 2005 Korean Census Microdata

Jeong, Dong-Myeong; Jeong, Mi-Ock

doi:10.5351/KJAS.2008.21.2.313

The Korean Journal of Applied Statistics (응용통계연구)

Volume 21 Issue 2 / Pages 313-325 / 2008 / 1225-066X (pISSN) / 2383-5818 (eISSN)

The Korean Statistical Society (한국통계학회)


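Methods for Limiting Disclosure of Personal Information in Population and Housing Census Microdata (translated from the Korean title)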

Jeong, Dong-Myeong (Research Planning Office, Statistics Research Institute, KNSO, Government Complex Daejeon) ;
Jeong, Mi-Ock (Research Planning Office, Statistics Research Institute, KNSO, Government Complex Daejeon)

Published : 2008.04.30

Abstract

Large amounts of information on individuals are available to many organizations and data users, and government agencies release microdata files drawn from their survey data or administrative records. However, if a microdata file is released without any restriction, an invasion of privacy is likely to occur. Therefore, when creating a microdata file, agencies attempt to eliminate the disclosure risk of the file while maintaining maximum utility of the data. In this paper, we introduce the concepts of disclosure risk, identification, and uniqueness. We also describe the method used to create a 2% microdata file from the 2005 Korean census microdata.

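Demand from data users for microdata keeps growing, and statistical agencies are working to make microdata available. However, microdata contain a great deal of personal information about respondents, so releasing the data as they are carries a high risk of disclosing that information; disclosure must therefore be limited by appropriate methods at the time of release. In this paper, we introduce the concept of respondent information disclosure that arises when microdata are released, together with methods for limiting it, and we describe how a data file was created by applying various disclosure limitation methods in order to release the 2% microdata of the Population and Housing Census conducted by the Korea National Statistical Office (KNSO) in 2005. Specifically, treating the results of the 10% sample survey as the population and working with a systematically drawn sample, we selected as key variables the 12 items that an outsider would be most likely to use for identification, then determined the uniqueness of each combination of these variables and calculated the disclosure risk. The results show that, beyond the reduction of information achieved by the 2% sample itself, applying a series of methods including grouping and coding is considerably effective in limiting the disclosure of personal information in the Population and Housing Census microdata.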
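The procedure outlined in the abstract (systematic sampling, counting unique key-variable combinations, then grouping to reduce uniqueness) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the toy records with `sex` and `age` fields, the 1-in-5 sampling interval, and the risk measure (the share of sample-unique records that are also unique in the population) are all assumptions made for illustration; the abstract does not state the exact risk formula or the 12 key variables.

```python
from collections import Counter
import random


def systematic_sample(records, interval, start=None):
    """Take every `interval`-th record, beginning at a random (or given) start."""
    if start is None:
        start = random.randrange(interval)
    return records[start::interval]


def key_counts(records, key_vars):
    """Count how many records share each combination of key-variable values."""
    return Counter(tuple(r[v] for v in key_vars) for r in records)


def uniqueness_summary(population, sample, key_vars):
    """Summarise sample uniques and how many are also population uniques.

    The 'risk' here is an assumed, simplified measure: the fraction of
    sample-unique key combinations that are unique in the population too.
    """
    pop_counts = key_counts(population, key_vars)
    smp_counts = key_counts(sample, key_vars)
    sample_unique = [k for k, c in smp_counts.items() if c == 1]
    also_pop_unique = [k for k in sample_unique if pop_counts[k] == 1]
    risk = len(also_pop_unique) / len(sample_unique) if sample_unique else 0.0
    return {
        "sample_unique": len(sample_unique),
        "population_unique": len(also_pop_unique),
        "risk": risk,
    }


def group_age(record, width):
    """Return a copy of the record with age recoded to the lower bound of its band."""
    grouped = dict(record)
    grouped["age"] = (record["age"] // width) * width
    return grouped


if __name__ == "__main__":
    # Hypothetical toy "population" standing in for the 10% sample file.
    population = [{"sex": i % 2, "age": i} for i in range(10)]
    sample = systematic_sample(population, interval=5, start=0)
    print(uniqueness_summary(population, sample, ["sex", "age"]))

    # Grouping ages into 10-year bands before release.
    grouped_pop = [group_age(r, 10) for r in population]
    grouped_sample = systematic_sample(grouped_pop, interval=5, start=0)
    print(uniqueness_summary(grouped_pop, grouped_sample, ["sex", "age"]))
```

In this toy setting every record is unique on exact age, so each sampled record is also unique in the population. After grouping ages into bands, the sampled records are still unique within the sample but no longer unique in the population, which is the effect the grouping step is meant to achieve.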
