• Title/Summary/Keyword: 스팸 필터링 (spam filtering)

Search results: 85

Designed of personalized mail Filtering System using Support vector machines (멀티모델 기반의 개인화된 메일 필터링 시스템)

  • Park, You-Na;Chang, Hwan;Lee, Bog-Ju
    • Proceedings of the Korean Information Science Society Conference / 2003.10a / pp.172-174 / 2003
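  • E-mail has become an essential means of information exchange alongside the growth of the Internet. Because it is fast and easy to use, many companies and businesses readily use it as an advertising channel, which causes serious harm to individuals and companies. Sorting the mail that is actually needed from spam places considerable mental and physical stress on individuals and organizations. This paper uses the support vector machine (SVM), a statistical learning method, to classify diverse and continually changing spam mail. The experimental results show stable performance in spam classification and also high performance in separating different kinds of spam mail into categories.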

A Technique of Statistical Message Filtering for Blocking Spam Message (통계적 기법을 이용한 스팸메시지 필터링 기법)

  • Kim, Seongyoon;Cha, Taesoo;Park, Jeawon;Choi, Jaehyun;Lee, Namyong
    • Journal of Information Technology Services / v.13 no.3 / pp.299-308 / 2014
  • Because spam messages are received indiscriminately in the information society, they harm not only individuals but also the community. Many spam filtering techniques, such as blocking specific characters, are now being actively studied. Most of these studies are content-based spam filtering techniques that rely on machine learning. As spam transmission techniques evolve, spammers increasingly send messages using term-spamming techniques. Spam messages tend to contain many nouns, repeat words, and insert special characters between the words of a sentence. In this paper, taking these three features into account, we used the SPSS statistical program for parameterization and derived an equation. Based on this equation, we then measured spam classification performance. Compared with previous studies, the technique was confirmed to perform well in terms of FP-rate, further minimizing cost.
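
A minimal sketch of how the three surface features named in this abstract might be counted. The function name `spam_features`, the caller-supplied `nouns` set (a stand-in for a real part-of-speech tagger), and the regular expressions are illustrative assumptions; the paper's actual parameterization and its SPSS-derived equation are not reproduced here.

```python
import re
from collections import Counter


def spam_features(message, nouns):
    """Return (noun_count, repeated_word_count, special_char_count).

    `nouns` is a hypothetical lookup set standing in for a POS tagger.
    """
    # Split on whitespace, then strip non-word characters to recover each word.
    tokens = message.lower().split()
    words = [re.sub(r"[^\w]", "", t) for t in tokens]
    words = [w for w in words if w]

    # Feature 1: how many words are known nouns.
    noun_count = sum(1 for w in words if w in nouns)

    # Feature 2: how many word occurrences are repeats of an earlier word.
    counts = Counter(words)
    repeated = sum(c - 1 for c in counts.values() if c > 1)

    # Feature 3: special characters wedged between word characters ("f*r*e*e").
    special = len(re.findall(r"(?<=\w)[^\w\s](?=\w)", message))

    return noun_count, repeated, special


print(spam_features("f*r*e*e loan loan now, free prize", {"loan", "prize"}))
# prints (3, 2, 3)
```

The resulting tuple could then feed whatever fitted model is in use; the weights would have to come from training data, which this listing does not provide.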

A Classification Method for Deformed Words Using Multiple Sequence Alignment (다중서열정렬을 이용한 변형단어집합의 분류 기법)

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.264-266 / 2012
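  • The problem of handling deformed words on the Internet arises in many fields, including information retrieval, machine translation, web mining, and profanity and spam filtering. In particular, for data collection and analysis such as tracking how a word's variants evolve, it is necessary to determine whether a given word belongs to a class made up of a set of deformed words. This paper proposes a transformation technique that performs multiple sequence alignment on a set of deformed words in the same class so that the set can be treated as a single representative string, and introduces a technique that uses it to classify effectively whether a given word belongs to that class. Experiments confirmed that the proposed technique achieves a specificity of 89.1% at a sensitivity of 93.4%, making classification 16.5 times faster than exhaustive comparison without any loss of performance.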

Email Classification using Dynamic Category Hierarchy and Non-negative Matrix Factorization (비음수 행렬 분해와 동적 분류체계를 사용한 이메일 분류)

  • Park, Sun;An, Dong Un
    • Annual Conference on Human and Language Technology / 2009.10a / pp.35-39 / 2009
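  • With the growing use of e-mail, the need to classify incoming mail efficiently and accurately is steadily increasing. Current e-mail classification is dominated by binary classification for filtering spam, using Bayesian and rule-based methods. Multi-way classification using clustering suffers from low classification accuracy. This paper proposes a new e-mail classification method that combines automatic category-topic generation based on non-negative matrix factorization (NMF) with the dynamic category hierarchy (DCH) method. The method classifies incoming e-mail automatically so that large volumes of mail can be managed efficiently, and when the classification result does not satisfy the user's requirements, it dynamically reclassifies the mail to raise classification accuracy.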

A Study on Tools for Agent System Development (사회공학적 이메일 공격 대비 모의훈련 시스템 설계)

  • Lim, Il-kwon;Kim, Young-Hyuk;Lee, Jae-Pil;Lee, Jae-Gwang;Nam-Gung, Hyun;Lee, Jae-Kwang
    • Annual Conference of KIPS / 2014.04a / pp.471-474 / 2014
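  • A social-engineering attack is one that exploits human psychology to create a security threat. Security solutions for blocking such attacks are therefore inherently limited. This paper accordingly proposes a security training system that prepares users for social-engineering attacks. We designed a security training system that collects spam and phishing e-mails, analyzes the latest social-engineering attack e-mails using signature-based filtering, and then carries out simulated social-engineering e-mail attacks so that trainees become able to cope with the latest attacks.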

Automatic Inter-Phoneme Similarity Calculation Method Using PAM Matrix Model (PAM 행렬 모델을 이용한 음소 간 유사도 자동 계산 기법)

  • Kim, Sung-Hwan;Cho, Hwan-Gue
    • The Journal of the Korea Contents Association / v.12 no.3 / pp.34-43 / 2012
  • Determining the similarity between two strings can be applied in various areas such as information retrieval, spell checking, and spam filtering. Similarity calculation between Korean strings based on dynamic programming first requires a definition of the similarity between phonemes. However, existing methods are limited in that they use manually set similarity scores. In this paper, we propose a method to automatically calculate inter-phoneme similarity from a given set of variant words using a PAM-like probabilistic model. The proposed method first finds pairs of similar words in a given word set and derives derivation rules from text alignment results among the similar word pairs. Similarity scores are then calculated from the frequencies of variations between different phonemes. Experimentally, we show improvements of 10.1%~14.1% and 8.1%~11.8% in sensitivity over the simple match-mismatch scoring scheme and the manually set inter-phoneme similarity scheme, respectively, at a specificity of 77.2%~80.4%.
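
A rough sketch of the general idea behind PAM-style scoring: turn observed substitution frequencies into log-odds scores. This is the generic log-odds formulation, not the paper's exact model; the function name `log_odds_scores` and the romanized symbols `"g"` and `"k"` are hypothetical placeholders for aligned phoneme pairs.

```python
import math
from collections import Counter


def log_odds_scores(aligned_pairs):
    """Compute symmetric log2-odds scores from aligned symbol pairs."""
    pair_counts = Counter()
    symbol_counts = Counter()
    for a, b in aligned_pairs:
        # Treat (a, b) and (b, a) as the same unordered pair.
        pair_counts[tuple(sorted((a, b)))] += 1
        symbol_counts[a] += 1
        symbol_counts[b] += 1

    total_pairs = sum(pair_counts.values())
    total_symbols = sum(symbol_counts.values())

    scores = {}
    for (a, b), n in pair_counts.items():
        observed = n / total_pairs
        pa = symbol_counts[a] / total_symbols
        pb = symbol_counts[b] / total_symbols
        # Chance rate of an unordered pair under independence.
        expected = pa * pb if a == b else 2 * pa * pb
        scores[(a, b)] = math.log2(observed / expected)
    return scores


pairs = [("g", "g")] * 6 + [("g", "k")] * 2 + [("k", "k")] * 2
scores = log_odds_scores(pairs)
```

A positive score means the pair co-occurs in alignments more often than chance would predict; a negative score means less often. Pairs that never occur are simply absent from the result, so a real system would need smoothing or a default penalty.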

Spear-phishing Mail Filtering Security Analysis : Focusing on Corporate Mail Hosting Services (스피어피싱 메일 필터링 보안 기능 분석 : 기업메일 호스팅 서비스 중심으로)

  • Shin, Dongcheon;Yum, Dayun
    • Convergence Security Journal / v.20 no.3 / pp.61-69 / 2020
  • Because spear-phishing mail attacks persistently focus on a particular target to collect and exploit information, they can cause severe damage to the target as part of new, intelligent attacks such as APT attacks and social engineering attacks. Ordinary spam filtering services can be limited in countering spear-phishing mail attacks because the targets, goals, and methods differ. In this paper, we analyze the mail security services of several mail hosting providers used by small and medium-sized enterprises, which are relatively vulnerable in security, to see whether those services can respond effectively to spear-phishing mail attacks. According to the analysis, most mail security hosting services fall short in responding to spear-phishing mail attacks because they mainly provide functions for managing mail, including spam. The analysis result can be used as basic data for deriving effective and systematic countermeasures.

A Study on Conditional Access System for Data Confidential using Smart-Card (스마트 카드를 이용한 자료 유출 제한 시스템에 대한 연구)

  • 김신홍;이광제
    • Journal of the Institute of Electronics Engineers of Korea TE / v.37 no.5 / pp.125-131 / 2000
  • In this paper, we propose a conditional access algorithm for data confidentiality using a smart card. The algorithm consists of a smart card and an e-mail gateway (EG) that restrict users' illegal transmission of confidential data. After the certification procedure is completed on the smart card, each e-mail is forwarded to the EG. The EG selects outgoing e-mail and sends it to the firewall e-mail processing program, which checks whether the mail has an attached file and, if so, writes it to a database. Because the database holds the registered content together with the smart-card certification data, it can later serve as evidence of a user's illegal transmission of confidential data. In addition, the system has a psychological deterrent effect against illegal sending, and it can also block spam mail at the EG.

Automatic e-mail classification using Dynamic Category Hierarchy and Principal Component Analysis (주성분 분석과 동적 분류체계를 사용한 자동 이메일 분류)

  • Park, Sun;Kim, Chul-Won;Lee, Yang-weon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2009.05a / pp.576-579 / 2009
  • The number of incoming e-mails is increasing rapidly with the wide use of the Internet, so it is increasingly necessary to classify incoming e-mails efficiently and accurately. Current e-mail classification techniques focus on two-way classification that filters spam from normal mail, based mainly on Bayesian and rule-based methods. Clustering has been used for multi-way classification of e-mails, but it suffers from low classification accuracy. In this paper, we propose a novel multi-way e-mail classification method that uses PCA for automatic category generation and a dynamic category hierarchy for high classification accuracy. It classifies a huge amount of incoming e-mail automatically, efficiently, and accurately.

Competition Relation Extraction based on Combining Machine Learning and Filtering (기계학습 및 필터링 방법을 결합한 경쟁관계 인식)

  • Lee, ChungHee;Seo, YoungHoon;Kim, HyunKi
    • Journal of KIISE / v.42 no.3 / pp.367-378 / 2015
  • This study addresses the design of a hybrid algorithm for competition relation extraction. Previous work on relation extraction has relied on various lexical and deep parsing indicators and has mostly used machine learning alone. We present a new algorithm that integrates machine learning with various filtering methods. Some simple but useful features for competition relation extraction are also introduced, and an optimum feature set is proposed. The goal of this paper is to increase the precision of competition relation extraction by combining supervised learning with various filtering methods. The filtering methods classify whether a competition relation occurs, apply a distance restriction to filter feature pairs, and classify whether a candidate entity pair is spam. For evaluation, a test set of 2,565 sentences was examined. The proposed method was compared with a rule-based method and a general relation extraction method. The rule-based method achieved a positive precision of 0.812 and an accuracy of 0.568, while the general relation extraction method achieved 0.612 and 0.563, respectively. The proposed system obtained a positive precision of 0.922 and an accuracy of 0.713. These results demonstrate that the developed method is effective for competition relation extraction.
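
For readers comparing the two reported metrics, a small sketch of how precision and accuracy follow from a confusion matrix. It assumes "positive precision" means precision on the positive (competition) class; the function name and the counts below are made up for illustration and are not the paper's data.

```python
def precision_and_accuracy(tp, fp, tn, fn):
    """Return (precision on the positive class, overall accuracy)."""
    predicted_positive = tp + fp
    total = tp + fp + tn + fn
    # Precision: share of predicted positives that are truly positive.
    precision = tp / predicted_positive if predicted_positive else 0.0
    # Accuracy: share of all decisions that are correct.
    accuracy = (tp + tn) / total if total else 0.0
    return precision, accuracy


# Hypothetical counts: a cautious classifier with few false positives
# but many missed positives.
p, a = precision_and_accuracy(tp=8, fp=2, tn=5, fn=5)
print(f"precision={p:.3f} accuracy={a:.3f}")
# prints precision=0.800 accuracy=0.650
```

The example shows why the two figures can diverge: filtering out doubtful candidates raises precision, while missed positives still count against accuracy.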