• Title/Summary/Keyword: spam email

Search Result 31, Processing Time 0.031 seconds

Detection of Zombie PCs Based on Email Spam Analysis

  • Jeong, Hyun-Cheol;Kim, Huy-Kang;Lee, Sang-Jin;Kim, Eun-Jin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.6 no.5
    • /
    • pp.1445-1462
    • /
    • 2012
  • While botnets are used for various malicious activities, it is well known that they are widely used for email spam. Though the spam filtering systems currently in use block IPs that send email spam, simply blocking the IPs of zombie PCs participating in a botnet is not enough to prevent the spamming activities of the botnet because these IPs can easily be changed or manipulated. This IP blocking is also insufficient to prevent crimes other than spamming, as the botnet can be simultaneously used for multiple purposes. For this reason, we propose a system that detects botnets and zombie PCs based on email spam analysis. This study introduces the concept of "group pollution level" - the degree to which a certain spam group is suspected of being a botnet - and "IP pollution level" - the degree to which a certain IP in the spam group is suspected of being a zombie PC. Such concepts are applied in our system that detects botnets and zombie PCs by grouping spam mails based on the URL links or attachments contained, and by assessing the pollution level of each group and each IP address. For empirical testing, we used email spam data collected in an "email spam trap system" - Korea's national spam collection system. Our proposed system detected 203 botnets and 18,283 zombie PCs in a day and these zombie PCs sent about 70% of all the spam messages in our analysis. This shows the effectiveness of detecting zombie PCs by email spam analysis, and the possibility of a dramatic reduction in email spam by taking countermeasure against these botnets and zombie PCs.

Personalized Anti-spam Filter Considering Users' Different Preferences

  • Kim, Jong-Wan
    • Journal of Korea Multimedia Society
    • /
    • v.13 no.6
    • /
    • pp.841-848
    • /
    • 2010
  • Conventional filters using email header and body information equally judge whether an incoming email is spam or not. However this is unrealistic in everyday life because each person has different criteria to judge what is spam or not. To resolve this problem, we consider user preference information as well as email category information derived from the email content. In this paper, we have developed a personalized anti-spam system using ontologies constructed from rules derived in a data mining process. The reason why traditional content-based filters are not applicable to the proposed experimental situation is described. In also, several experiments constructing classifiers to decide email category and comparing classification rule learners are performed. Especially, an ID3 decision tree algorithm improved the overall accuracy around 17% compared to a conventional SVM text miner on the decision of email category. Some discussions about the axioms generated from the experimental dataset are given too.

Study for Tracing Zombie PCS and Botnet Using an Email Spam Trap (이메일 스팸트랩을 이용한 좀비 PC 및 봇넷 추적 방안연구)

  • Jeong, Hyun-Cheol;Kim, Huy-Kang;Lee, Sang-Jin;Oh, Joo-Hyung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.21 no.3
    • /
    • pp.101-115
    • /
    • 2011
  • A botnet is a huge network of hacked zombie PCs. Recognizing the fact that the majority of email spam is sent out by botnets, a system that is capable of detecting botnets and zombie PCS will be designed in this study by analyzing email spam. In this study, spam data collected in "an email spam trail system", Korea's national spam collection system, were used for analysis. In this study, we classified the spam groups by the URLs or attached files, and we measured how much the group has the characteristics of botnet and how much the IPs have the characteristics of zombie PC. Through the simulation result in this study, we could extract 16,030 zombie suspected PCs for one hours and it was verified that email spam can provide considerably useful information in tracing zombie PCs.

Spam-Filtering by Identifying Automatically Generated Email Accounts (자동 생성 메일계정 인식을 통한 스팸 필터링)

  • Lee Sangho
    • Journal of KIISE:Software and Applications
    • /
    • v.32 no.5
    • /
    • pp.378-384
    • /
    • 2005
  • In this paper, we describe a novel method of spam-filtering to improve the performance of conventional spam-filtering systems. Conventional systems filter emails by investigating words distribution in email headers or bodies. Nowadays, spammers begin making email accounts in web-based email service sites and sending emails as if they are not spams. Investigating the email accounts of those spams, we notice that there is a large difference between the automatically generated accounts and ordinaries. Based on that difference, incoming emails are classified into spam/non-spam classes. To classify emails from only account strings, we used decision trees, which have been generally used for conventional pattern classification problems. We collected about 2.15 million account strings from email service sites, and our account checker resulted in the accuracy of $96.3\%$. The previous filter system with the checker yielded the improved filtering performance.

Semantics in Social Web: A Case of Personalized Email Marketing (소셜 웹에서의 시맨틱스: 개인화 이메일 마케팅 개발 사례)

  • Joo, Jae-Hun;Myeong, Sung-Jae
    • The Journal of the Korea Contents Association
    • /
    • v.10 no.6
    • /
    • pp.43-48
    • /
    • 2010
  • Useful emails influence on consumers' purchase behavior and activate them to visit retail stores. Regular contact with consumers by e-mail has positive effects on brand loyalty. However, email marketing has a limitation. Spam now accounts for over half of all e-mail traffic. The increase of email users has resulted in the dramatic increase of spam emails during the past few years. In this paper, we proposed an ontology-based system offering personalized email services to overcome such limitation. Our method is not the ontology-driven spam filtering, but a personalized content service considering personal interests and relations among people by using FOAF and domain ontologies. Our system was successfully tested in email marketing domain.

Intelligent Spam-mail Filtering Based on Textual Information and Hyperlinks (텍스트정보와 하이퍼링크에 기반한 지능형 스팸 메일 필터링)

  • Kang, Sin-Jae;Kim, Jong-Wan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.14 no.7
    • /
    • pp.895-901
    • /
    • 2004
  • This paper describes a two-phase intelligent method for filtering spam mail based on textual information and hyperlinks. Scince the body of spam mail has little text information, it provides insufficient hints to distinguish spam mails from legitimate mails. To resolve this problem, we follows hyperlinks contained in the email body, fetches contents of a remote webpage, and extracts hints (i.e., features) from original email body and fetched webpages. We divided hints into two kinds of information: definite information (sender`s information and definite spam keyword lists) and less definite textual information (words or phrases, and particular features of email). In filtering spam mails, definite information is used first, and then less definite textual information is applied. In our experiment, the method of fetching web pages achieved an improvement of F-measure by 9.4% over the method of using on original email header and body only.

On the Performance of Cuckoo Search and Bat Algorithms Based Instance Selection Techniques for SVM Speed Optimization with Application to e-Fraud Detection

  • AKINYELU, Andronicus Ayobami;ADEWUMI, Aderemi Oluyinka
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.3
    • /
    • pp.1348-1375
    • /
    • 2018
  • Support Vector Machine (SVM) is a well-known machine learning classification algorithm, which has been widely applied to many data mining problems, with good accuracy. However, SVM classification speed decreases with increase in dataset size. Some applications, like video surveillance and intrusion detection, requires a classifier to be trained very quickly, and on large datasets. Hence, this paper introduces two filter-based instance selection techniques for optimizing SVM training speed. Fast classification is often achieved at the expense of classification accuracy, and some applications, such as phishing and spam email classifiers, are very sensitive to slight drop in classification accuracy. Hence, this paper also introduces two wrapper-based instance selection techniques for improving SVM predictive accuracy and training speed. The wrapper and filter based techniques are inspired by Cuckoo Search Algorithm and Bat Algorithm. The proposed techniques are validated on three popular e-fraud types: credit card fraud, spam email and phishing email. In addition, the proposed techniques are validated on 20 other datasets provided by UCI data repository. Moreover, statistical analysis is performed and experimental results reveals that the filter-based and wrapper-based techniques significantly improved SVM classification speed. Also, results reveal that the wrapper-based techniques improved SVM predictive accuracy in most cases.

Comparing Feature Selection Methods in Spam Mail Filtering

  • Kim, Jong-Wan;Kang, Sin-Jae
    • Proceedings of the Korea Society of Information Technology Applications Conference
    • /
    • 2005.11a
    • /
    • pp.17-20
    • /
    • 2005
  • In this work, we compared several feature selection methods in the field of spam mail filtering. The proposed fuzzy inference method outperforms information gain and chi squared test methods as a feature selection method in terms of error rate. In the case of junk mails, since the mail body has little text information, it provides insufficient hints to distinguish spam mails from legitimate ones. To address this problem, we follow hyperlinks contained in the email body, fetch contents of a remote web page, and extract hints from both original email body and fetched web pages. A two-phase approach is applied to filter spam mails in which definite hint is used first, and then less definite textual information is used. In our experiment, the proposed two-phase method achieved an improvement of recall by 32.4% on the average over the $1^{st}$ phase or the $2^{nd}$ phase only works.

  • PDF

A Research on an Email Method based on Sender Mailbox (송신자사서함 기반의 메일 방식에 관한 연구)

  • Kim, Tae-Joon
    • The KIPS Transactions:PartC
    • /
    • v.11C no.5
    • /
    • pp.689-696
    • /
    • 2004
  • The conventional email method based on a recipient mailbox has a structural weakness, which may cause the spam message problem and the extreme waste of recipient mailbox space, and also require an explicit recipient notification scheme. This paper proposes a new email method based on a sender mailbox and evaluates its performance. Under the new email method, a message is stored at sender mailbox instead of recipient one until an intended recipient reads the message, so that the burden of mailbox management such as removing spam message is now shifted to sender side. And also a sender can confirm whether an intended recipient has read his or her message by simply rummaging his or her sender mailbox. The results of Performance evaluation show that 75% of mailbox space and 90% of message traffic are reduced in conditions that the portions of spam message and multicasting message are 90% and 80%, respectively.

From Computing Distribution of Email Responses for Each User Cluster To Construct User Preference based Anti-spam Mail System (사용자 클러스터별 이메일 반응 분포 계산 및 사용자 선호 스팸 메일 대응 시스템 구축)

  • Kim, Jong-Wan
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.19 no.3
    • /
    • pp.343-349
    • /
    • 2009
  • In this paper, it would be shown that individuals can have different responses to the same email based on their preferences through computing the distributions of user clusters' email responses from clustering results based on email users' preference information. This paper presents an approach that incorporates user preferences to construct an anti-spam mail system, which is different from the conventional content-based ones. We consider email category information derived from the email content as well as user preference information. We also build a user preference ontology to formally represent the important concepts and rules derived from a data mining process and then apply a rule optimization procedure to exclude unnecessary rules. Experimental results show that our user preference based system achieves good performance in terms of accuracy, the rules derived from the system and human comprehensibility.