Spam Message Filtering with Bayesian Approach for Internet Communities

Kim, Bum-Bae; Choi, Hyoung-Kee

doi:10.3745/KIPSTC.2006.13C.6.733

The KIPS Transactions: Part C (정보처리학회논문지C)

Volume 13C, Issue 6 (Serial No. 109) / Pages 733-740 / 2006 / pISSN 1598-2858

Korea Information Processing Society (한국정보처리학회)

베이지안을 이용한 인터넷 커뮤니티 상의 유해 메시지 차단 기법

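Kim, Bum-Bae (Department of Computer Engineering, Sungkyunkwan University);
Choi, Hyoung-Kee (School of Information and Communication Engineering, Sungkyunkwan University)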

Published: 2006.10.30

https://doi.org/10.3745/KIPSTC.2006.13C.6.733

Abstract

Spam messages have been causing widespread damage on the Internet. One source of the problem is messages posted anonymously on the bulletin boards of Internet communities. Such spam messages advertise products, harm other people's reputations, deliver religious messages, and so on. In this paper we present a spam message filter based on the Bayesian approach. To make the filter more useful for bulletin boards in Internet communities, we built it so that it can divide spam messages into six categories, including advertisement, pornography, abuse, religion, and other. The filter was tested against messages posted on popular web sites.

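As the damage caused by spam grows rapidly beyond email services and across the whole Internet, anonymity is being abused to post messages unrelated to the shared interests of a community, namely spam messages such as commercial advertisements, mutual slander, and religious promotion, and this is causing serious social problems. To address spam messages in Internet communities, this paper introduces a blocking method that applies the Bayesian approach already used to block spam email. Furthermore, to increase the effectiveness of spam message filtering in Internet communities, the filter is designed so that spam messages can be subdivided into various subcategories, which is intended to meet the needs of the diverse users of Internet communities. The accuracy of the implemented Bayesian filtering technique was measured on sites that are currently in operation.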
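The abstract does not give implementation details, so the following is only a minimal sketch of how a Bayesian filter might assign a bulletin-board message to one of several categories. It assumes a multinomial naive Bayes model with add-one (Laplace) smoothing, which may differ from the authors' actual formulation. The class name `NaiveBayesFilter`, the whitespace tokenizer, the `normal` label for legitimate posts, and the toy training messages are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesFilter:
    """Hypothetical multi-category naive Bayes classifier (add-one smoothing)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> token -> count
        self.doc_counts = Counter()              # category -> training messages seen
        self.vocab = set()                       # all tokens seen during training

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace splitting; a real filter for Korean
        # text would need a proper tokenizer.
        return text.lower().split()

    def train(self, text, category):
        tokens = self.tokenize(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocab.update(tokens)

    def log_scores(self, text):
        """Return log P(category) + sum of log P(token | category) per category."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        tokens = self.tokenize(text)
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for token in tokens:
                # Add-one smoothing keeps unseen tokens from zeroing the score.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            scores[category] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for this sketch (not from the paper).
TRAINING = [
    ("buy cheap watches now discount sale", "advertisement"),
    ("limited offer buy now free shipping", "advertisement"),
    ("you are a stupid idiot loser", "abuse"),
    ("shut up idiot nobody likes you", "abuse"),
    ("join our prayer meeting and find salvation", "religion"),
    ("anyone know a good guide for the new patch", "normal"),
    ("thanks for the guide it fixed my install", "normal"),
]

nb = NaiveBayesFilter()
for message, label in TRAINING:
    nb.train(message, label)

print(nb.classify("buy discount watches"))
```

Because every category gets its own score, a community could in principle block only the categories its users object to, which is one way to read the abstract's point about serving diverse user needs.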

Cited by

Spam Message Filtering for Internet Communities using Collection and Frequency Analysis vol.18C, pp.2, 2011, https://doi.org/10.3745/KIPSTC.2011.18C.2.061
