• Title/Summary/Keyword: De-anonymity

Search Result 19, Processing Time 0.025 seconds

The De-identification Technique Using Data Grouping in Relational Database (관계형 데이터베이스에서 데이터 그룹화를 이용한 익명화 처리 기법)

  • Park, Jun-Bum;Jin, Seung-Hun;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.25 no.3
    • /
    • pp.493-500
    • /
    • 2015
  • Personal information exposed in the Internet is increasing by the public data opening and sharing, vitalization of SNS(Social Network Service) and growth of information shared between users. Exposed personal information in the Internet can infringe upon targeted users using linkage attack or background attack. To prevent these attack De-identification models were appeared a few years ago. The 'k-anonymity' has been introduced in the first place, and the '${\ell}$-diversity' and 't-closeness' have been followed up as solutions, and diverse algorithms have been being suggested for performance improvement nowadays. However, industry or public sectors actually needs a whole solution as a system for the de-identification process rather than performance of the de-identification algorithm. This paper explains a way of de-identification techique for 'k-anonymity', '${\ell}$-diversity', and 't-closeness' algorithm using QI(Quasi-Identifier) grouping method in the relational database.

Automated Classification of Unknown Smart Contracts of Ethereum Using Machine Learning (기계학습을 활용한 이더리움 미확인 스마트 컨트랙트 자동 분류 방안)

  • Lee, Donggun;Kwon, Taekyoung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.28 no.6
    • /
    • pp.1319-1328
    • /
    • 2018
  • A blockchain system developed for crypto-currency has attractive characteristics, such as de-centralization, distributed ledger, and partial anonymity, making itself adopted in various fields. Among those characteristics, partial anonymity strongly assures privacy of users, but side effects such as abuse of crime are also appearing, and so countermeasures for circumventing such abuse have been studied continuously. In this paper, we propose a machine-learning based method for classifying smart contracts in Ethereum regarding their functions and design patterns and for identifying user behaviors according to them.

Analysis of k Value from k-anonymity Model Based on Re-identification Time (재식별 시간에 기반한 k-익명성 프라이버시 모델에서의 k값에 대한 연구)

  • Kim, Chaewoon;Oh, Junhyoung;Lee, Kyungho
    • The Journal of Bigdata
    • /
    • v.5 no.2
    • /
    • pp.43-52
    • /
    • 2020
  • With the development of data technology, storing and sharing of data has increased, resulting in privacy invasion. Although de-identification technology has been introduced to solve this problem, it has been proved many times that identifying individuals using de-identified data is possible. Even if it cannot be completely safe, sufficient de-identification is necessary. But current laws and regulations do not quantitatively specify the degree of how much de-identification should be performed. In this paper, we propose an appropriate de-identification criterion considering the time required for re-identification. We focused on the case of using the k-anonymity model among various privacy models. We analyzed the time taken to re-identify data according to the change in the k value. We used a re-identification method based on linkability. As a result of the analysis, we determined which k value is appropriate. If the generalized model can be developed by results of this paper, the model can be used to define the appropriate level of de-identification in various laws and regulations.

An Effective Anonymization Management under Delete Operation of Secure Database (안전한 데이터베이스 환경에서 삭제 시 효과적인 데이터 익명화 유지 기법)

  • Byun, Chang-Woo;Kim, Jae-Whan;Lee, Hyang-Jin;Kang, Yeon-Jung;Park, Seog
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.3
    • /
    • pp.69-80
    • /
    • 2007
  • To protect personal information when releasing data, a general privacy-protecting technique is the removal of all the explicit identifiers, such as names and social security numbers. De-identifying data, however, provides no guarantee of anonymity because released information can be linked to publicly available information to identify them and to infer information that was not intended for release. In recent years, two emerging concepts in personal information protection are k-anonymity and $\ell$-diversity, which guarantees privacy against homogeneity and background knowledge attacks. While these solutions are signigicant in static data environment, they are insufficient in dynamic environments because of vulnerability to inference. Specially, the problem appeared in record deletion is to deconstruct the k-anonymity and $\ell$-diversity. In this paper, we present an approach to securely anonymizing a continuously changeable dataset in an efficient manner while assuring high data quality.

Why Do People Spread Online Rumors? An Empirical Study

  • Jong-Hyun Kim;Gee-Woo Bock;Rajiv Sabherwal;Han-Min Kim
    • Asia pacific journal of information systems
    • /
    • v.29 no.4
    • /
    • pp.591-614
    • /
    • 2019
  • With the proliferation of social media, it has become easier for people to spread rumors online, which can aggravate the issues arising from online rumors. There are many individuals and organizations that are adversely affected by malicious online rumors. Despite their importance, there has been little research into why and how people spread rumors online, thus inhibiting the understanding of factors that affect the spreading of online rumors. With attention seeking to address this gap, this paper draws upon the dual process theory and the de-individuation theory to develop a theoretical model of factors affecting the spreading of an online rumor, and then empirically tests it using survey data from 211 individuals about a specific rumor. The results indicate that the perceived credibility of the rumor affects the individuals' attitudes toward spreading it, which consequently affects the rumor spreading behavior. Vividness, confirmation of prior beliefs, argument strength, and source credibility positively influence the perceived credibility of online rumors. Finally, anonymity moderates the relationship between attitude toward spreading online rumors and the spreading behavior.

De-identifying Unstructured Medical Text and Attribute-based Utility Measurement (의료 비정형 텍스트 비식별화 및 속성기반 유용도 측정 기법)

  • Ro, Gun;Chun, Jonghoon
    • The Journal of Society for e-Business Studies
    • /
    • v.24 no.1
    • /
    • pp.121-137
    • /
    • 2019
  • De-identification is a method by which the remaining information can not be referred to a specific individual by removing the personal information from the data set. As a result, de-identification can lower the exposure risk of personal information that may occur in the process of collecting, processing, storing and distributing information. Although there have been many studies in de-identification algorithms, protection models, and etc., most of them are limited to structured data, and there are relatively few considerations on de-identification of unstructured data. Especially, in the medical field where the unstructured text is frequently used, many people simply remove all personally identifiable information in order to lower the exposure risk of personal information, while admitting the fact that the data utility is lowered accordingly. This study proposes a new method to perform de-identification by applying the k-anonymity protection model targeting unstructured text in the medical field in which de-identification is mandatory because privacy protection issues are more critical in comparison to other fields. Also, the goal of this study is to propose a new utility metric so that people can comprehend de-identified data set utility intuitively. Therefore, if the result of this research is applied to various industrial fields where unstructured text is used, we expect that we can increase the utility of the unstructured text which contains personal information.

An Improved Authentication and Key Agreement scheme for Session Initial Protocol

  • Wu, Libing;Fan, Jing;Xie, Yong;Wang, Jing
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.11 no.8
    • /
    • pp.4025-4042
    • /
    • 2017
  • Session initiation protocol (SIP) is a kind of powerful and common protocols applied for the voice over internet protocol. The security and efficiency are two urgent requirements and admired properties of SIP. Recently, Hamed et al. proposed an efficient authentication and key agreement scheme for SIP. However, we demonstrate that Hamed et al.'s scheme is vulnerable to de-synchronization attack and cannot provide anonymity for users. Furthermore, we propose an improved and efficient authentication and key agreement scheme by using elliptic curve cryptosystem. Besides, we prove that the proposed scheme is provably secure by using secure formal proof based on Burrows-Abadi-Needham logic. The comparison with the relevant schemes shows that our proposed scheme has lower computation costs and can provide stronger security.

Implementation of efficient L-diversity de-identification for large data (대용량 데이터에 대한 효율적인 L-diversity 비식별화 구현)

  • Jeon, Min-Hyuk;Temuujin, Odsuren;Ahn, Jinhyun;Im, Dong-Hyuk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2019.10a
    • /
    • pp.465-467
    • /
    • 2019
  • 최근 많은 단체나 기업에서 다양하고 방대한 데이터를 요구로 하고, 그에 따라서 국가 공공데이터나 데이터 브로커등 데이터를 통해 직접 수집 하거나 구매해야 하는 경우가 많아지고 있다. 하지만 개인정보의 경우 개인의 동의 없이는 타인에게 양도가 불가능하여 이러한 데이터에 대한 연구에 어려움이 있다. 그래서 특정 개인을 추론할 수 없도록 하는 비식별 처리 기술이 연구되고 있다. 이러한 비식별화의 정도는 모델로 나타낼 수가 있는데, 현재 k-anonymity 와 l-diversity 모델 등이 많이 사용된다. 이 중에서 l-diversity 는 k-anonymity 의 만족 조건을 포함하고 있어 비식별화의 정도가 더욱 강하다. 이러한 l-diversity 모델을 만족하는 알고리즘은 The Hardness and Approximation, Anatomy 등이 있는데 본 논문에서는 일반화 과정을 거치지 않아 유용성이 높은 Anatomy 의 구현에 대해 연구하였다. 또한 비식별화 과정은 전체 데이터에 대한 특성을 고려해야 하기 때문에 데이터의 크기가 커짐에 따라 실질적인 처리량이 방대해지는데, 이러한 문제를 Spark 를 통해 데이터가 커짐에 따라서 최대한 안정적으로 대응하여 처리할 수 있는 시스템을 구현하였다.

Privacy Vulnerability Analysis on Shuai et al.'s Anonymous Authentication Scheme for Smart Home Environment (Shuai등의 스마트 홈 환경을 위한 익명성 인증 기법에 대한 프라이버시 취약점 분석)

  • Choi, Hae-Won;Kim, Sangjin;Jung, Young-Seok;Ryoo, Myungchun
    • Journal of Digital Convergence
    • /
    • v.18 no.9
    • /
    • pp.57-62
    • /
    • 2020
  • Smart home based on Internet of things (IoT) is rapidly emerging as an exciting research and industry field. However, security and privacy have been critical issues due to the open feature of wireless communication channel. As a step towards this direction, Shuai et al. proposed an anonymous authentication scheme for smart home environment using Elliptic curve cryptosystem. They provided formal proof and heuristic analysis and argued that their scheme is secure against various attacks including de-synchronization attack, mobile device loss attack and so on, and provides user anonymity and untraceability. However, this paper shows that Shuai et al.'s scheme does not provide user anonymity nor untraceability, which are very important features for the contemporary IoT network environment.

Utility Analysis of Federated Learning Techniques through Comparison of Financial Data Performance (금융데이터의 성능 비교를 통한 연합학습 기법의 효용성 분석)

  • Jang, Jinhyeok;An, Yoonsoo;Choi, Daeseon
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.2
    • /
    • pp.405-416
    • /
    • 2022
  • Current AI technology is improving the quality of life by using machine learning based on data. When using machine learning, transmitting distributed data and collecting it in one place goes through a de-identification process because there is a risk of privacy infringement. De-identification data causes information damage and omission, which degrades the performance of the machine learning process and complicates the preprocessing process. Accordingly, Google announced joint learning in 2016, a method of de-identifying data and learning without the process of collecting data into one server. This paper analyzed the effectiveness by comparing the difference between the learning performance of data that went through the de-identification process of K anonymity and differential privacy reproduction data using actual financial data. As a result of the experiment, the accuracy of original data learning was 79% for k=2, 76% for k=5, 52% for k=7, 50% for 𝜖=1, and 82% for 𝜖=0.1, and 86% for Federated learning.