• Title/Abstract/Keywords: SPAMS


Modeling and Evaluating Information Diffusion for Spam Detection in Micro-blogging Networks

  • Chen, Kan;Zhu, Peidong;Chen, Liang;Xiong, Yueshan
    • KSII Transactions on Internet and Information Systems (TIIS), v.9 no.8, pp.3005-3027, 2015
  • Spam has become one of the top threats to micro-blogging networks, taking the form of rumor spreading, advertisement abuse, and malware distribution. With the increasing popularity of micro-blogging, the problem will worsen. Prior detection tools are either designed for specific types of spam or are not robust enough, so spammers can easily evade detection by adjusting their behavior. In this paper, we present a novel model to quantitatively evaluate information diffusion in micro-blogging networks. Under this model, we found that spam posts differ widely from non-spam ones. First, the propagation of non-spam posts results mostly from the authors' followers, whereas that of spam posts comes mainly from strangers. Second, non-spam posts last relatively longer than spam posts. Third, non-spam posts always get their first reposts/comments much sooner than spam posts. With the features defined in our model, we propose an RBF-based approach to detect spam. Unlike previous works, in which features are extracted from individual profiles or contents, the diffusion features are determined not by any single user but by the crowd. Our method is therefore more robust, because a change in any single user's behavior does not affect its effectiveness. Moreover, although spam varies in type and form, it is propagated in the same way, so our method is effective for all types of spam. Using real data crawled from the leading micro-blogging services of China, we evaluate the effectiveness of our model. The experimental results show that our model achieves high accuracy in both precision and recall.
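
The three diffusion signals named in this abstract (who propagates a post, how long it stays active, and how quickly the first response arrives) can be illustrated with a minimal sketch. The `Repost` record, the `diffusion_features` function, and the exact feature definitions are assumptions made for illustration; the abstract does not give the paper's formulas, and the RBF-based classifier is not shown.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Repost:
    """Hypothetical record of one repost or comment on a post."""
    user: str
    timestamp: float  # seconds, on the same clock as the post time


def diffusion_features(
    post_time: float,
    author_followers: Set[str],
    reposts: List[Repost],
) -> Dict[str, Optional[float]]:
    """Compute illustrative diffusion features for a single post.

    stranger_ratio: share of reposts made by non-followers of the author.
    lifetime: time from posting to the last observed repost.
    first_response_delay: time from posting to the first repost.
    """
    if not reposts:
        # No propagation observed, so the delay is undefined.
        return {"stranger_ratio": 0.0, "lifetime": 0.0,
                "first_response_delay": None}

    times = sorted(r.timestamp for r in reposts)
    strangers = sum(1 for r in reposts if r.user not in author_followers)
    return {
        "stranger_ratio": strangers / len(reposts),
        "lifetime": times[-1] - post_time,
        "first_response_delay": times[0] - post_time,
    }
```

Under the abstract's findings, a spam post would tend to show a high `stranger_ratio`, a short `lifetime`, and a long `first_response_delay`; any thresholds or classifier built on these values would have to be learned from labeled data.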

Spam-Filtering by Identifying Automatically Generated Email Accounts (자동 생성 메일계정 인식을 통한 스팸 필터링)

  • Lee Sangho
    • Journal of KIISE:Software and Applications, v.32 no.5, pp.378-384, 2005
  • In this paper, we describe a novel spam-filtering method that improves the performance of conventional spam-filtering systems. Conventional systems filter emails by examining the word distribution in email headers or bodies. Spammers have now begun creating accounts on web-based email services and sending emails that do not look like spam. Examining the email accounts behind such spam, we noticed a large difference between automatically generated accounts and ordinary ones. Based on that difference, incoming emails are classified as spam or non-spam. To classify emails from account strings alone, we used decision trees, which are widely used for conventional pattern classification problems. We collected about 2.15 million account strings from email service sites, and our account checker achieved an accuracy of 96.3%. Combining the previous filter system with the checker improved filtering performance.
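
As a rough illustration of classifying from account strings alone, the sketch below turns the local part of an address into a few numeric features that a decision tree could split on. The feature set (`length`, `digit_ratio`, `vowel_ratio`, `max_consonant_run`) and the function name `account_features` are hypothetical; the abstract does not say which attributes the authors used.

```python
from typing import Dict

VOWELS = set("aeiou")


def account_features(address: str) -> Dict[str, float]:
    """Extract illustrative features from the account part of an address."""
    local = address.split("@", 1)[0].lower()
    n = len(local)
    if n == 0:
        return {"length": 0, "digit_ratio": 0.0,
                "vowel_ratio": 0.0, "max_consonant_run": 0}

    digits = sum(c.isdigit() for c in local)
    letters = [c for c in local if c.isalpha()]
    vowels = sum(c in VOWELS for c in letters)

    # Longest run of consecutive consonants; machine-generated strings
    # often contain runs that are hard to pronounce.
    longest = run = 0
    for c in local:
        if c.isalpha() and c not in VOWELS:
            run += 1
            longest = max(longest, run)
        else:
            run = 0

    return {
        "length": n,
        "digit_ratio": digits / n,
        "vowel_ratio": vowels / len(letters) if letters else 0.0,
        "max_consonant_run": longest,
    }
```

For example, `account_features("xk7qzt92b@example.com")` yields a vowel ratio of 0 and a digit ratio of one third, whereas `account_features("alice.kim@example.com")` yields a vowel ratio of 0.5 and no digits. Feature vectors like these could then be passed to any decision-tree learner.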

Context-based classification for harmful web documents and comparison of feature selecting algorithms

  • Kim, Young-Soo;Park, Nam-Je;Hong, Do-Won;Won, Dong-Ho
    • Journal of Korea Multimedia Society, v.12 no.6, pp.867-875, 2009
  • More and richer information sources and services become available on the web every day. However, harmful information, such as adult content, is not appropriate for all users, notably children. Because the Internet is a worldwide open network, there is a limit to how far each country's national laws or systems can regulate users who provide harmful content. Nor is it desirable to develop a classification technology specific to one system, because Internet users can encounter harmful content in diverse ways, for example through porn sites, harmful spam, or peer-to-peer networks. Therefore, research and development of context-based core technologies for classifying harmful content is being emphasized. In this paper, we propose an efficient text filter that blocks harmful text in web documents using context-based technologies. Through implementation and experiment, we also examine which feature selection algorithms are suitable for classifying harmful content, where feature selection is the process of choosing, from all content terms occurring in the documents, the terms that are useful as features for text categorization.


Herbal Medicine Treatment of Refractory Epilepsy in Tuberous Sclerosis Complex : A Case Report

  • Son, Kwanghyun;Lee, Jinsoo;Kim, Moonju
    • The Journal of Korean Medicine, v.36 no.2, pp.50-55, 2015
  • Infants with tuberous sclerosis complex (TSC) have a higher chance of experiencing seizures before the age of 1 year; in particular, these seizures are commonly accompanied by infantile spasms. When infantile spasms resulting from TSC are drug-resistant, more severe neuro-developmental and cognitive impairments occur. This case concerns an infant with TSC who continued to experience partial seizures and infantile spasms despite treatment with two different antiepileptic drugs (AEDs). His spasms ceased on the seventh day of taking modified Yukmijihwang-tang (YMJ), at which point he stopped the use of all AEDs. He became seizure-free after a month of treatment, and the modified hypsarrhythmia was found to have resolved on electroencephalography. To date, the infant has been taking YMJ for 16 months and remains seizure-free without side effects. Moreover, his developmental status is continually improving, with significant progress in language and cognitive-adaptive abilities. These results suggest that YMJ can serve as an alternative treatment option for refractory epilepsy.

Characteristics of long-range transported PM2.5 at a coastal city using the single particle aerosol mass spectrometry

  • Cai, Qiuliang;Tong, Lei;Zhang, Jingjing;Zheng, Jie;He, Mengmeng;Lin, Jiamei;Chen, Xiaoqiu;Xiao, Hang
    • Environmental Engineering Research, v.24 no.4, pp.690-698, 2019
  • Air pollution has attracted ever-increasing attention because of its substantial influence on air quality and human health. To better understand the characteristics of long-range transported pollution, single-particle chemical composition and size were investigated by single particle aerosol mass spectrometry in Fuzhou, China, from 17 to 22 January 2016. The results showed that the haze was mainly caused by the transport of a cold air mass at higher wind speed (10 m·s⁻¹) from the Yangtze River Delta region to Fuzhou. The number concentration rose from 1,000 to 4,500 #·h⁻¹, and the contributions of mobile sources and secondary aerosol increased from 24.3% to 30.9% and from 16.0% to 22.5%, respectively. The haze was then eliminated by a clean air mass from the sea, as indicated by a sharp decrease in particle number concentration from 4,500 to 1,000 #·h⁻¹. The contributions of secondary aerosol and mobile sources decreased from 29.3% to 23.5% and from 30.9% to 23.1%, respectively. Particles ranging in size from 0.5 to 1.5 μm were mainly in the accumulation mode. Stationary sources, mobile sources, and secondary aerosol contributed over 70% of the potential sources. These results will help in understanding the physical and chemical characteristics of long-range transported pollutants.

A Distinction Technology for Harmful Web Documents by Rates (등급에 따른 웹 유해 문서 분류 기술)

  • Kim, Yong-Soo;Nam, Taek-Yong;Won, Dong-Ho
    • The KIPS Transactions:PartC, v.13C no.7 s.110, pp.859-864, 2006
  • The openness of the Web allows any user to access almost any type of information easily, at any time and anywhere. However, alongside easy access to useful information, the Internet has the adverse effect of indiscriminately exposing users to harmful content. Some information, such as adult content, is not appropriate for all users, notably children. Even for adults, some content on abnormal porn sites can harm ordinary people's mental health. Meanwhile, because the Internet is a worldwide open network, there is a limit to how far each country's national laws or systems can regulate users who provide harmful content. Nor is it desirable to develop a classification technology specific to one system, because Internet users can encounter harmful content in diverse ways, for example through porn sites, harmful spam, or peer-to-peer networks. Therefore, research and development of context-based core technologies for classifying harmful content is being emphasized. In this paper, we propose an efficient text filter that blocks harmful text in web documents using context-based technologies.

A Study on Spam Document Classification Method using Characteristics of Keyword Repetition (단어 반복 특징을 이용한 스팸 문서 분류 방법에 관한 연구)

  • Lee, Seong-Jin;Baik, Jong-Bum;Han, Chung-Seok;Lee, Soo-Won
    • The KIPS Transactions:PartB, v.18B no.5, pp.315-324, 2011
  • In the Web environment, a flood of spam causes serious social problems such as personal information leaks, monetary loss from phishing, and distribution of harmful content. Moreover, the types and distribution techniques of spam that must be controlled change from day to day. Learning-based spam classification using the Bag-of-Words model has been the most widely used method to date. However, this method is vulnerable to the filter-evasion techniques that recent spam commonly employs, because it classifies spam documents using only the keyword occurrence information obtained while training the classification model. In this paper, we propose a spam document detection method that counters such evasion techniques by exploiting the repetition of words in spam documents. Most recent spam documents tend to repeat the key phrases they are designed to spread, and this tendency can be used as a measure for classifying spam documents. We define six variables that represent characteristics of word repetition and use them as the feature set for constructing a classification model. The effectiveness of the proposed method is evaluated in an experiment with blog posts and e-mail data. The experimental results show that the proposed method outperforms other approaches.
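
To make the idea of word-repetition features concrete, here is a minimal sketch that computes three simple repetition measures for a document. These are hypothetical stand-ins chosen for illustration, not the six variables defined in the paper, which the abstract does not list; the function name `repetition_features` is likewise an assumption.

```python
import re
from collections import Counter
from typing import Dict


def repetition_features(text: str) -> Dict[str, float]:
    """Compute illustrative word-repetition measures for one document."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return {"type_token_ratio": 0.0, "max_term_share": 0.0,
                "repeated_token_share": 0.0}

    counts = Counter(tokens)
    total = len(tokens)
    # Tokens belonging to any word that occurs more than once.
    repeated = sum(c for c in counts.values() if c > 1)

    return {
        # Low values mean few distinct words relative to document length.
        "type_token_ratio": len(counts) / total,
        # Share of the document taken up by its single most frequent word.
        "max_term_share": max(counts.values()) / total,
        # Share of tokens that are instances of a repeated word.
        "repeated_token_share": repeated / total,
    }
```

For the text `"cheap pills cheap pills buy cheap"`, the type-token ratio and the maximum term share are both 0.5, and five of the six tokens belong to repeated words. A real feature set would likely need to discount common function words, which repeat in legitimate text as well.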