• Title/Summary/Keyword: De-Identified Information usefulness measurement

A study on the method of measuring the usefulness of De-Identified Information using Personal Information

  • Kim, Dong-Hyun
    • Journal of the Korea Society of Computer and Information / v.27 no.6 / pp.11-21 / 2022
  • Interest in de-identification measures for the safe use of personal information is growing both in Korea and abroad, yet de-identified information is still being re-identified, either because the de-identification measures were insufficient or through inference. To address these problems and to encourage new de-identification techniques, competitions that evaluate the safety and usefulness of de-identified information are being held in Korea and Japan. This paper analyzes the safety and usefulness indicators used in these competitions, and proposes and verifies new indicators that can measure usefulness more efficiently. Verification with a large population was not possible, because few people working in de-identification processing are experts in mathematics and statistics; even so, the results were very positive regarding the necessity and validity of the new indicators. To use the vast amount of public data in Korea safely as de-identified information, research on such usefulness metrics should continue, and this paper is expected to be a starting point for more active research.

Data Quality Measurement on a De-identified Data Set Based on Statistical Modeling (통계모형의 정확도에 기반한 비식별화 데이터의 품질 측정)

  • Chun, Heuiju; Yi, Hyun Jee; Yeon, Kyupil; Kim, Dongrae
    • The Journal of the Korea Contents Association / v.19 no.5 / pp.553-561 / 2019
  • This study examines a method for measuring the statistical usefulness of de-identified data in terms of the prediction accuracy of statistical models. In the era of the 4th industrial revolution, effective use of big data is essential to innovation through information and communication technology, but personal information concerns constrain the active use of big data. To address this, de-identification guidelines have been established, and the use of various de-identification methods has made the actual re-identification of personal information very unlikely. On the other hand, strong de-identification can have the side effect of degrading the usefulness of the data. We studied the statistical usefulness of data de-identified with the KLT model, a representative de-identification method, and conducted a case study to see how much de-identification degrades prediction accuracy. We also propose a new measure of the usefulness of de-identified data that quantifies how much data must be added to the de-identified data to restore the accuracy of the predictive model.
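
The idea of judging usefulness by prediction accuracy can be illustrated with a toy sketch. This is not the authors' procedure: it does not use the KLT model or their data, and it does not cover the second measure (how much data must be added to restore accuracy). It assumes a made-up dataset of ages with a binary label, a simple banding generalization (`generalize`, hypothetical), and a one-feature threshold classifier (`fit_threshold`, hypothetical). It only shows how the accuracy of a model fitted on generalized values can be compared with one fitted on the original values.

```python
# Toy illustration only: synthetic data, simple age banding, threshold classifier.
# None of these names or choices come from the paper.

def generalize(age, width=20):
    """De-identify an exact age by replacing it with the lower bound of its band."""
    return (age // width) * width

def accuracy(threshold, xs, ys):
    """Share of records where the rule 'predict 1 if x >= threshold' matches the label."""
    return sum((x >= threshold) == bool(y) for x, y in zip(xs, ys)) / len(xs)

def fit_threshold(xs, ys):
    """Choose the cut point (among observed values) with the best training accuracy."""
    return max(sorted(set(xs)), key=lambda t: accuracy(t, xs, ys))

# Synthetic records: ages 20..79, label is 1 when age >= 45 (an assumed rule).
ages = list(range(20, 80))
labels = [1 if a >= 45 else 0 for a in ages]

# Model fitted on the original values.
t_orig = fit_threshold(ages, labels)

# Model fitted on the generalized (de-identified) values.
gen_ages = [generalize(a) for a in ages]
t_deid = fit_threshold(gen_ages, labels)

# Both models are scored on the original records so the numbers are comparable.
acc_orig = accuracy(t_orig, ages, labels)
acc_deid = accuracy(t_deid, ages, labels)
utility_ratio = acc_deid / acc_orig

print(f"threshold (original): {t_orig}, accuracy: {acc_orig:.3f}")
print(f"threshold (de-identified): {t_deid}, accuracy: {acc_deid:.3f}")
print(f"utility ratio: {utility_ratio:.3f}")
```

In this sketch the banded data can only place the cut at a band boundary, so the records between the boundary and the true cut are misclassified. The ratio of the two accuracies is one possible way to express the loss of usefulness. Training and scoring on the same records is a simplification that a real evaluation would replace with held-out data.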