A System for Identifying Personal Information Leaks on the Web Using a Crawler and a Morphological Analyzer

Lee, Hyeongseon; Park, Jaehee; Na, Cheolhun; Jung, Hoekyung

Proceedings of the Korean Institute of Information and Communication Sciences Conference

The Korea Institute of Information and Communication Engineering

Lee, Hyeongseon (Paichai University)
Park, Jaehee (Paichai University)
Na, Cheolhun (Dept. of Information and Communication Engineering, Mokpo National University)
Jung, Hoekyung (Paichai University)

Published: 2017.10.25

Abstract

As personal information leakage has become a prominent problem, research on data collection and web document classification has been under way. Existing systems only determine whether personal information is present; they do not classify documents about other people who share the user's name or documents that the user posted, so unnecessary data is not filtered out. To solve this problem, we propose a system that uses a crawler and a morphological analyzer to identify the type of leaked data and to distinguish homonyms. The user collects personal information from the web through the crawler. The collected data is classified by the morphological analyzer, after which the user can review the leaked data. In addition, reusing the system yields more accurate results. We expect this to allow users to receive data tailored to them.
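The classification step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything specific in it is an assumption: the crawler is replaced by a hard-coded list of page texts, the morphological analyzer by a plain regex word tokenizer (a Korean system would presumably use a real analyzer), and the labels, the `Profile` type, and the two leak patterns are hypothetical names chosen for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical leak types; the abstract does not list which types are detected.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{2,3}-\d{3,4}-\d{4}\b"),
}


def tokenize(text):
    """Stand-in for a morphological analyzer: lowercase alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


@dataclass
class Profile:
    """Hypothetical user profile: a name plus words known to relate to the user."""
    name: str
    context: set = field(default_factory=set)


def classify(doc, profile, min_overlap=1):
    """Return (label, leaked_types) for one collected document."""
    if profile.name.lower() not in doc.lower():
        return "irrelevant", []
    # Too few shared context words: treat the page as being about a namesake.
    if len(tokenize(doc) & profile.context) < min_overlap:
        return "namesake", []
    leaked = [kind for kind, pattern in PATTERNS.items() if pattern.search(doc)]
    return ("leak" if leaked else "mention"), leaked


# Assumed sample data standing in for crawler output.
pages = [
    "Hong Gildong, Paichai University lab member, email gildong@example.com",
    "Hong Gildong wins the regional baseball final",
    "Weather forecast for Daejeon",
]
profile = Profile("Hong Gildong", {"paichai", "university"})
results = [classify(page, profile) for page in pages]
```

In this sketch, a page is reported as a leak only when it both shares context words with the user's profile and matches one of the patterns, which is one simple way to keep pages about namesakes out of the results.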
