A Study on a Differentially Private Model for Financial Data

  • Received : 2017.09.18
  • Accepted : 2017.10.24
  • Published : 2017.12.31

Abstract

Data de-identification is one of the essential techniques for preserving the privacy of individuals in a dataset while still providing analysts with useful information. However, traditional de-identification techniques such as k-anonymity are fundamentally vulnerable to attacks that exploit an adversary's background knowledge. In contrast, differential privacy offers strong privacy guarantees together with useful utility, and it has therefore been studied intensively in recent years. In this paper, we analyze various techniques based on differential privacy and formalize a differentially private model for financial data, and we show that the resulting model provides both strong privacy guarantees and good utility.
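
For context, a core building block of differential privacy is the Laplace mechanism: a query answer is perturbed with Laplace noise whose scale is the query's sensitivity divided by the privacy budget epsilon. The following is a minimal Python sketch of this mechanism for a counting query over financial-style records; the field names, the predicate, and the budget epsilon = 0.5 are illustrative assumptions, not details taken from the paper.

import numpy as np

def laplace_count(records, predicate, epsilon):
    """Answer a counting query under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the true count by at most 1), so Laplace noise with scale
    1/epsilon is sufficient.
    """
    true_count = sum(1 for r in records if predicate(r))
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise

# Hypothetical financial-style records; the fields are illustrative only.
records = [
    {"credit_amount": 5000, "defaulted": False},
    {"credit_amount": 12000, "defaulted": True},
    {"credit_amount": 800, "defaulted": False},
]

# Noisy count of customers who defaulted, with an assumed budget epsilon = 0.5.
print(laplace_count(records, lambda r: r["defaulted"], epsilon=0.5))

A single noisy count of this kind does not reveal whether any one individual's record is present, but each additional query consumes more of the budget; in practice the total epsilon spent across all released statistics bounds the overall privacy loss.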
