• Title/Abstract/Keywords: Spam mail filtering

Search Result 54, Processing Time 0.026 seconds

Improved Spam Filter via Handling of Text Embedded Image E-mail

  • Youn, Seongwook;Cho, Hyun-Chong
    • Journal of Electrical Engineering and Technology / v.10 no.1 / pp.401-407 / 2015
  • The increase of image spam, a kind of spam in which the text message is embedded in an attached image to defeat spam filtering techniques, is a major problem for current e-mail systems. For nearly a decade, content-based filtering using text classification or machine learning has been a major trend in anti-spam filtering systems. Recently, spammers have tried to defeat anti-spam filters with many techniques; embedding text in an attached image is one of them. We previously proposed an ontology spam filter. However, that system handles only text e-mail, and the percentage of e-mails with attached images is increasing sharply. The contribution of this paper is that we add image e-mail handling capability to the anti-spam filtering system while keeping the advantages of the previous text-based spam e-mail filtering system. Also, the proposed system gives a low false negative value, which means that a user's valuable e-mail is rarely regarded as spam.

Analyzing the Effect of Lexical and Conceptual Information in Spam-mail Filtering System

  • Kang Sin-Jae;Kim Jong-Wan
    • International Journal of Fuzzy Logic and Intelligent Systems / v.6 no.2 / pp.105-109 / 2006
  • In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from ham (non-spam) mail. The definite information is the mail sender's information, URLs, and a spam keyword list; the less definite information is the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. In the second phase, we used the lexical information and concept codes contained in the e-mail body for SVM learning. According to our results, the ham misclassification rate was reduced when more lexical information was used as features, and the spam misclassification rate was reduced when the concept codes were included as features as well.
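
The two-phase flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' system: the function name `two_phase_filter`, the block-list parameters, and the `second_phase` callable are all hypothetical, and the second phase is left as a pluggable classifier (the paper uses an SVM trained on lexical features and concept codes, which is not reproduced here).

```python
import re
from typing import Callable, List, Set

# Simple pattern for http/https URLs; an assumption made for this sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def two_phase_filter(
    sender: str,
    body: str,
    blocked_senders: Set[str],   # hypothetical list, entries in lowercase
    blocked_urls: Set[str],      # hypothetical list, entries in lowercase
    spam_keywords: Set[str],     # hypothetical list, entries in lowercase
    second_phase: Callable[[List[str]], bool],
) -> str:
    """Return "spam" or "ham" for one message."""
    text = body.lower()

    # Phase 1: definite information (sender, URLs, spam keyword list).
    if sender.lower() in blocked_senders:
        return "spam"
    if any(url.lower() in blocked_urls for url in URL_RE.findall(body)):
        return "spam"
    if any(keyword in text for keyword in spam_keywords):
        return "spam"

    # Phase 2: less definite information from the mail body.
    # Any classifier over the body tokens can be plugged in here.
    return "spam" if second_phase(text.split()) else "ham"


def toy_classifier(tokens: List[str]) -> bool:
    """Stand-in for a trained model: flags mail with two or more suspicious words."""
    suspicious = {"free", "winner", "prize"}
    return sum(token in suspicious for token in tokens) >= 2
```

Running phase 1 first means that messages matching a block list never reach the costlier classifier; only the remaining mail is scored in phase 2.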

Analyzing the correlation of Spam Recall and Thesaurus

  • Kang, Sin-Jae;Kim, Jong-Wan
    • Proceedings of the Korea Society of Information Technology Applications Conference / 2005.11a / pp.21-25 / 2005
  • In this paper, we constructed a two-phase spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information is the mail sender's information, URLs, and a spam list; the less definite information is the word list and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. In the second phase, we used the lexical information and concept codes contained in the e-mail body for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were included as features as well.

A design of the SMBC Platform using the Fit FA-Finder (Fit-FA Finder를 이용한 SMBC 플랫폼 설계)

  • Park, Nho-Kyung;Han, Sung-Ho;Seo, Sang-Jin;Jin, Hyun-Joon
    • Journal of IKEEE / v.10 no.1 s.18 / pp.49-54 / 2006
  • Recently, e-mail has become an important means of communication in IT societies, but the increase in spam mail creates various social problems. Although many organizations and corporations have been conducting research to develop spam mail blocking technologies, the variety of those technologies entails considerable cost and system complexity. In this paper, we design the SMBC (Spam Mail Blocking Center) using the Fit-FA (Filtering Algorithm) Finder. The Fit-FA Finder searches for and applies the most suitable spam mail filtering algorithm according to the type of spam mail. The performance of a spam mail filtering system is determined by the order in which its spam filters are applied. With the designed Fit-FA Finder, unnecessary filtering steps, processing time, and load are reduced compared with the fixed-order filter application of existing spam mail blocking systems.

Spam-mail Filtering based on Lexical Information and Thesaurus (어휘정보와 시소러스에 기반한 스팸메일 필터링)

  • Kang Shin-Jae;Kim Jong-Wan
    • Journal of Korea Society of Industrial Information Systems / v.11 no.1 / pp.13-20 / 2006
  • In this paper, we constructed a spam-mail filtering system based on lexical and conceptual information. Two kinds of information can distinguish spam mail from legitimate mail. The definite information is the mail sender's information, URLs, and a spam keyword list; the less definite information is the word lists and concept codes extracted from the mail body. We first classified spam mail using the definite information, and then used the less definite information. We used the lexical information and concept codes contained in the e-mail body for SVM learning. According to our results, spam precision increased when more lexical information was used as features, and spam recall increased when the concept codes were included as features as well.

A Development of the SMBC platform for supporting advanced performance of blocking spam-mails (향상된 차단 성능 지원을 위한 SMBC 플랫폼 개발)

  • Sso, Sang-Jin;Jin, Hyun-Joon;Park, Noh-Kyung
    • Journal of Internet Computing and Services / v.8 no.2 / pp.89-94 / 2007
  • Although much research has been done on spam mail blocking technologies and systems, the emergence of new types of spam mail causes the spam mail filtering rate to decrease and the number of false-positive mails to increase. Consequently, existing spam mail filtering algorithms suffer from an increasing processing load and decreasing reliability in spam mail blocking systems, owing to a shortage of newly developed algorithms and related research. This paper presents the Fit-FA Finder, which selects the appropriate algorithms and the order in which to apply them, and describes the development of the SMBC platform. The Fit-FA Finder is developed and implemented in the SMBC platform, which employs a recovery process based on privacy information for false-positive mails.

Comparing Feature Selection Methods in Spam Mail Filtering

  • Kim, Jong-Wan;Kang, Sin-Jae
    • Proceedings of the Korea Society of Information Technology Applications Conference / 2005.11a / pp.17-20 / 2005
  • In this work, we compared several feature selection methods in the field of spam mail filtering. The proposed fuzzy inference method outperforms the information gain and chi-squared test methods as a feature selection method in terms of error rate. In the case of junk mail, the mail body contains little text, so it provides insufficient hints to distinguish spam from legitimate mail. To address this problem, we follow hyperlinks contained in the e-mail body, fetch the contents of the remote web pages, and extract hints from both the original e-mail body and the fetched web pages. A two-phase approach is applied to filter spam mail, in which definite hints are used first and less definite textual information is used second. In our experiment, the proposed two-phase method improved recall by 32.4% on average over using the first phase or the second phase alone.
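
For readers unfamiliar with the chi-squared baseline named in this abstract, the sketch below scores terms with the standard 2x2 chi-squared statistic over document frequencies. It illustrates only that baseline, under assumed names (`chi_squared`, `rank_terms`) and toy data; it is not the fuzzy inference method the authors propose, which the abstract does not detail.

```python
from collections import Counter
from typing import List


def chi_squared(a: int, b: int, c: int, d: int) -> float:
    """Chi-squared statistic for a 2x2 term/class contingency table.

    a: spam documents containing the term
    b: ham documents containing the term
    c: spam documents without the term
    d: ham documents without the term
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        return 0.0  # term or class is degenerate; no evidence either way
    return n * (a * d - b * c) ** 2 / denominator


def rank_terms(docs: List[List[str]], labels: List[bool]) -> List[str]:
    """Rank terms by chi-squared score, highest first (ties broken alphabetically).

    docs is a list of token lists; labels[i] is True when docs[i] is spam.
    """
    n_spam = sum(labels)
    n_ham = len(labels) - n_spam
    spam_df, ham_df = Counter(), Counter()
    for tokens, is_spam in zip(docs, labels):
        # Count each term once per document (document frequency).
        (spam_df if is_spam else ham_df).update(set(tokens))

    scores = {}
    for term in set(spam_df) | set(ham_df):
        a, b = spam_df[term], ham_df[term]
        scores[term] = chi_squared(a, b, n_spam - a, n_ham - b)
    return sorted(scores, key=lambda term: (-scores[term], term))
```

Keeping only the top-ranked terms as features is the usual way such a score is applied before training a classifier.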

A spam mail blocking method using URL frequency analysis (URL 빈도분석을 이용한 스팸메일 차단 방법)

  • Baek Ki-young;Lee Chul-soo;Ryou Jae-cheol
    • Journal of the Korea Institute of Information Security & Cryptology / v.14 no.6 / pp.135-148 / 2004
  • Recently, it has become difficult to block constantly changing spam mail with conventional word-based spam detection methods. To solve this problem, this paper proposes a method of generating spam detection rules using URL frequency analysis. The method consists of collecting spam, extracting characteristic URLs from the collected spam mail, normalizing the URLs, generating spam detection rules by time frequency, and blocking mail. It can effectively block various types of spam mail, including spam whose form keeps changing.
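
The pipeline listed in this abstract (collect, extract URLs, normalize, generate rules, block) could look roughly like the sketch below. All names (`normalize_url`, `build_rules`, `is_spam`) and the `min_count` threshold are assumptions for illustration. Normalization is reduced here to the lowercase host name, and the time dimension of the frequency analysis is omitted, so this is a simplification rather than the paper's method.

```python
import re
from collections import Counter
from typing import Iterable, Set
from urllib.parse import urlsplit

# Simple pattern for http/https URLs; an assumption made for this sketch.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def normalize_url(url: str) -> str:
    """Reduce a URL to its lowercase host name without a leading 'www.'."""
    try:
        host = urlsplit(url).hostname or ""
    except ValueError:
        return ""  # malformed URL; ignore it
    host = host.rstrip(".")
    return host[4:] if host.startswith("www.") else host


def build_rules(spam_bodies: Iterable[str], min_count: int = 2) -> Set[str]:
    """Collect hosts that appear in at least min_count collected spam messages."""
    counts: Counter = Counter()
    for body in spam_bodies:
        # Count each host once per message so one mail cannot inflate a host.
        hosts = {normalize_url(url) for url in URL_RE.findall(body)}
        hosts.discard("")
        counts.update(hosts)
    return {host for host, n in counts.items() if n >= min_count}


def is_spam(body: str, rules: Set[str]) -> bool:
    """Block a message when any of its URLs normalizes to a host in the rules."""
    return any(normalize_url(url) in rules for url in URL_RE.findall(body))
```

Because the rule is keyed on the normalized host rather than on body text, rewording the message does not evade it as long as the advertised site stays the same.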

An Architecture for Certificate and Agent Based E-mailing to Block Spam Mail

  • Nam, Sang-Zo
    • Journal of Intelligence and Information Systems / v.9 no.2 / pp.39-50 / 2003
  • Deleting unsolicited email, popularly known as spam mail, is an annoying task for Internet users. Moreover, spam mail causes a variety of social problems. At present, legal restrictions cannot eradicate spam senders. As a result, many technical methods to eliminate spam mail such as spam filtering and online stamps have been introduced. However, the process of blocking spam mail can inadvertently result in suspension of indispensable or beneficial communication. In this paper, we propose a certificate and agent based emailing architecture that can block spam mail, while at the same time approve certified mail. This architecture can be accelerated by synergistic utilization of digital signature and electronic document interchange.

A Proposed Architecture for Certificate and Agent Based E-mailing to Block Spam Mail

  • Nam, Sang-Zo
    • Proceedings of the KAIS Fall Conference / 2003.11a / pp.28-34 / 2003
  • Deleting unsolicited email, popularly known as spam mail, is an annoying task for Internet users. Moreover, spam mail causes a variety of social problems. At present, legal restrictions cannot eradicate spam senders. As a result, many technical methods to eliminate spam mail such as spam filtering and online stamps have been introduced. However, the process of blocking spam mail can inadvertently result in suspension of indispensable or beneficial communication. In this paper, we propose a certificate and agent based emailing architecture that can block spam mail, while at the same time approve certified mail. This architecture can be accelerated by synergistic utilization of digital signature and electronic document interchange.