Development of Simulation Tool to Support Privacy-Preserving Data Collection

프라이버시 보존 데이터 수집을 지원하기 위한 시뮬레이션 툴 개발

  • Kim, Dae-Ho (Department of Computer Science, Sangmyung University)
  • Kim, Jong Wook (Department of Computer Science, Sangmyung University)
  • Received : 2017.12.10
  • Accepted : 2017.12.25
  • Published : 2017.12.31

Abstract

These days, data is being generated explosively across diverse industries. Accordingly, many companies want to collect and analyze this data to improve their products or services. However, collecting user data can lead to significant leakage of personal information. Local differential privacy (LDP), proposed by Google, is a state-of-the-art approach to protecting individual privacy during data collection. LDP protects each user's privacy by perturbing the original data on the user's side, while still allowing the data collector to obtain population statistics from the collected data. However, preventing leakage through such a perturbation mechanism can significantly reduce the utility of the data. The degree of perturbation in LDP should therefore be set according to the purposes of data collection and analysis. In this paper, we develop a simulation tool that helps the data collector choose an appropriate degree of perturbation in LDP by presenting visualized simulation results under various parameter configurations.

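With the arrival of the big data era, a wide variety of data is being generated, and many industries want to collect and analyze it. However, collecting user information can lead directly to leakage of personal information. The local differential privacy technique proposed by Google prevents the leakage that can occur during user data collection by perturbing the data. The greater the degree of perturbation, the more strongly personal information is protected, but the utility of the data drops markedly. The degree of perturbation must therefore be set to suit the purpose of the data collection. The simulation tool presented in this paper lets the data collector apply the various parameter values required for LDP-compliant user data collection to their own collection environment, and thereby supports data collection suited to that environment.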
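The trade-off between perturbation and utility described in the abstract can be illustrated with a minimal sketch. It is not the paper's simulation tool and not RAPPOR. It assumes the simplest LDP mechanism, binary randomized response in the style of Warner, with a privacy budget `epsilon > 0`. The names `perturb`, `estimate_fraction`, and `simulate` are hypothetical and chosen only for this illustration.

```python
import math
import random


def perturb(bit: int, epsilon: float, rng: random.Random) -> int:
    """User side: report the true bit with probability e^eps / (e^eps + 1),
    otherwise report the flipped bit. Assumes epsilon > 0."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit


def estimate_fraction(reports: list, epsilon: float) -> float:
    """Collector side: unbiased estimate of the true fraction of 1s.

    If f is the true fraction and p the truth probability, the expected
    observed fraction is f * (2p - 1) + (1 - p); this inverts that relation.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


def simulate(true_fraction: float, n_users: int, epsilon: float, seed: int = 0) -> float:
    """Perturb a synthetic population and return the collector's estimate."""
    rng = random.Random(seed)
    n_ones = round(true_fraction * n_users)
    true_bits = [1] * n_ones + [0] * (n_users - n_ones)
    reports = [perturb(bit, epsilon, rng) for bit in true_bits]
    return estimate_fraction(reports, epsilon)


if __name__ == "__main__":
    # Hypothetical parameter sweep: smaller epsilon means stronger perturbation.
    for eps in (0.1, 0.5, 1.0, 2.0, 4.0):
        est = simulate(true_fraction=0.3, n_users=10_000, epsilon=eps)
        print(f"epsilon={eps:<4} estimate={est:.4f} abs_error={abs(est - 0.3):.4f}")
```

Running the sweep with different values of `epsilon` and `n_users` gives a rough sense of how estimation error tends to grow as the perturbation becomes stronger. This is the kind of comparison a data collector might want to see before fixing the parameters.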
