• Title/Summary/Keyword: Time-based Clustering

Weighted Bayesian Automatic Document Categorization Based on Association Word Knowledge Base by Apriori Algorithm (Apriori 알고리즘에 의한 연관 단어 지식 베이스에 기반한 가중치가 부여된 베이지안 자동 문서 분류)

  • 고수정;이정현
    • Journal of Korea Multimedia Society / v.4 no.2 / pp.171-181 / 2001
  • Previous Bayesian document categorization methods have two problems: they require a lot of time and effort for word clustering, and they hardly reflect the semantic information between words. In this paper, we propose a weighted Bayesian document categorization method based on an association word knowledge base acquired by a mining technique. The proposed method constructs a weighted association word knowledge base from the documents in the training set. A classifier using Bayesian probability then categorizes documents based on the constructed knowledge base. To evaluate the performance of the proposed method, we compare our experimental results with those of a weighted Bayesian method using a vocabulary dictionary built by mutual information, a weighted Bayesian method, and a simple Bayesian method. The results show that the proposed method improves performance by 0.87%, 2.77%, and 5.09% over these three methods, respectively.
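
The weighted Bayesian scheme described above can be illustrated with a short sketch. The snippet below is a minimal, hypothetical example of naive Bayes classification with per-word weights standing in for the association-word weights mined by Apriori; it is not the paper's implementation, and all documents, categories, and weight values are made up.

```python
# Minimal sketch: naive Bayes with hypothetical per-word weights.
import math
from collections import defaultdict

def train(docs):
    """docs: list of (category, [words]). Returns counts for the classifier."""
    word_counts = defaultdict(lambda: defaultdict(int))
    cat_counts = defaultdict(int)
    vocab = set()
    for cat, words in docs:
        cat_counts[cat] += 1
        for w in words:
            word_counts[cat][w] += 1
            vocab.add(w)
    return word_counts, cat_counts, vocab

def classify(words, word_counts, cat_counts, vocab, weights):
    """weights: hypothetical word -> weight map (e.g. from association mining)."""
    n_docs = sum(cat_counts.values())
    best, best_score = None, float("-inf")
    for cat, n_cat in cat_counts.items():
        total = sum(word_counts[cat].values())
        score = math.log(n_cat / n_docs)
        for w in words:
            p = (word_counts[cat][w] + 1) / (total + len(vocab))  # Laplace smoothing
            score += weights.get(w, 1.0) * math.log(p)            # weighted term
        if score > best_score:
            best, best_score = cat, score
    return best

docs = [("sports", ["game", "score", "team"]), ("tech", ["data", "mining", "algorithm"])]
wc, cc, vocab = train(docs)
print(classify(["data", "algorithm"], wc, cc, vocab, weights={"algorithm": 1.5}))
```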

A Term Weight Mensuration based on Popularity for Search Query Expansion (검색 질의 확장을 위한 인기도 기반 단어 가중치 측정)

  • Lee, Jung-Hun;Cheon, Suh-Hyun
    • Journal of KIISE: Software and Applications / v.37 no.8 / pp.620-628 / 2010
  • With Internet use pervasive in everyday life, people can now retrieve a great deal of information through the web. However, the exponential growth of information on the web limits the performance of online search engines, which return piles of unwanted results. With so much unwanted information, web users now need more time and effort than in the past to find what they need. This paper suggests a query expansion method to bring wanted information to web users quickly. In experiments without a change of search subject, the popularity-based term weight mensuration performed better than TF-IDF and a simple popularity-based term weighting. When the subject changed during a search, the performance degradation of the popularity-based method was smaller than that of the others.
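
As a rough illustration of blending term weights with popularity, the sketch below mixes a standard TF-IDF score with a normalized popularity score when ranking expansion candidates. It is an assumption-laden stand-in, not the paper's measurement: the documents, the popularity counts, and the blending factor `alpha` are all hypothetical.

```python
# Minimal sketch: popularity-adjusted term weighting for query expansion.
import math
from collections import Counter

docs = [
    "time based clustering of sensor data",
    "clustering of web search logs",
    "popularity of search queries over time",
]
popularity = Counter({"clustering": 40, "search": 25, "time": 10})  # hypothetical counts

def tfidf(term, doc, docs):
    tf = doc.split().count(term)
    df = sum(1 for d in docs if term in d.split())
    return tf * math.log(len(docs) / (1 + df))

def popularity_weight(term, doc, docs, alpha=0.5):
    # Blend classic TF-IDF with a normalized popularity score.
    max_pop = max(popularity.values())
    return (1 - alpha) * tfidf(term, doc, docs) + alpha * popularity[term] / max_pop

query = ["clustering"]
candidates = {w for d in docs for w in d.split()} - set(query)
expanded = sorted(candidates,
                  key=lambda w: max(popularity_weight(w, d, docs) for d in docs),
                  reverse=True)[:3]
print(query + expanded)
```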

An Alert Data Mining Framework for Intrusion Detection System (침입탐지시스템의 경보데이터 분석을 위한 데이터 마이닝 프레임워크)

  • Shin, Moon-Sun
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.1 / pp.459-466 / 2011
  • In this paper, we propose a data mining framework for managing alerts in order to improve the performance of intrusion detection systems. The proposed framework performs alert correlation analysis using mining tasks such as axis-based association rules, axis-based frequent episodes, and order-based clustering. It also provides the capability to classify false alarms in order to reduce them. We analyzed the characteristics of the proposed system through its implementation and evaluation. The framework performs not only alert correlation analysis but also false alarm classification, and it can discover unknown patterns in the alerts. It can also be applied to predict attacks in progress, to understand the logical steps and strategies behind a series of attacks using sequences of clusters, and to classify false alerts from the intrusion detection system. The final rules generated by the framework can be used for real-time response by the intrusion detection system.
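
A toy illustration of mining frequent attribute combinations from alerts, which is the flavor of axis-based association analysis the abstract mentions, is sketched below. The alert records, attribute axes, and support threshold are hypothetical, and the framework's episode mining and clustering steps are not reproduced.

```python
# Minimal sketch: frequent attribute combinations over hypothetical IDS alerts.
from itertools import combinations
from collections import Counter

alerts = [
    {"src": "10.0.0.5", "sig": "portscan", "dst_port": 22},
    {"src": "10.0.0.5", "sig": "portscan", "dst_port": 80},
    {"src": "10.0.0.5", "sig": "bruteforce", "dst_port": 22},
    {"src": "10.0.0.7", "sig": "portscan", "dst_port": 22},
]

min_support = 2
itemsets = Counter()
for alert in alerts:
    items = sorted(alert.items())
    for size in (1, 2):
        for combo in combinations(items, size):
            itemsets[combo] += 1          # count each attribute combination

frequent = {combo: n for combo, n in itemsets.items() if n >= min_support}
for combo, n in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(n, dict(combo))
```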

Anomaly Detection in Sensor Data

  • Kim, Jong-Min;Baik, Jaiwook
    • Journal of Applied Reliability / v.18 no.1 / pp.20-32 / 2018
  • Purpose: The purpose of this study is to set up anomaly detection criteria for sensor data coming from a motorcycle. Methods: Five sensor values for accelerator pedal, engine rpm, transmission rpm, gear, and speed are obtained every 0.02 seconds from a motorcycle. Exploratory data analysis is used to find any pattern in the data. Traditional process control methods such as the X control chart and time series models are fitted to find any anomalous behavior in the data. Finally, an unsupervised learning algorithm, k-means clustering, is used to find any anomalous spots in the sensor data. Results: According to the exploratory data analysis, the distribution of accelerator pedal sensor values is very much skewed to the left. The motorcycle seems to have been driven in a city at speeds of less than 45 kilometers per hour. Traditional process control charts such as the X control chart fail due to severe autocorrelation in each sensor's data. However, an ARIMA model found three abnormal points beyond the 2-sigma limits of the control chart. We applied a copula-based Markov chain to perform statistical process control for correlated observations; the copula-based Markov model found anomalous behavior in similar places to the ARIMA model. With the unsupervised learning algorithm, large sensor values are subdivided into two, three, and four disjoint regions, so the extreme sensor values are the ones that need to be tracked down for any sign of anomalous behavior. Conclusion: Exploratory data analysis is useful for finding patterns in the sensor data. Process control charts using the ARIMA model and Joe's copula-based Markov model also give warnings near similar places in the data. The unsupervised learning algorithm shows that the extreme sensor values are the ones to track down for any sign of anomalous behavior.
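
The k-means step can be sketched in a few lines: cluster one sensor's values and inspect the cluster with the most extreme centroid. The snippet below uses synthetic speed readings rather than the study's motorcycle data.

```python
# Minimal sketch: flag the extreme-value cluster in one sensor channel.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
speed = np.concatenate([rng.normal(30, 5, 500), rng.normal(80, 3, 10)])  # synthetic km/h

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(speed.reshape(-1, 1))
extreme_cluster = int(np.argmax(km.cluster_centers_.ravel()))
suspect_idx = np.where(km.labels_ == extreme_cluster)[0]
print(f"{len(suspect_idx)} samples fall in the extreme-value cluster "
      f"(centroid ~ {km.cluster_centers_.ravel()[extreme_cluster]:.1f})")
```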

Cluster-based Image Retrieval Method Using RAGMD (RAGMD를 이용한 클러스터 기반의 영상 검색 기법)

  • Jung, Sung-Hwan;Lee, Woo-Sun
    • The KIPS Transactions: Part B / v.9B no.1 / pp.113-118 / 2002
  • This paper presents a cluster-based image retrieval method. It classifies images into clusters using RAGMD, a clustering technique, and then retrieves images from the related cluster. At retrieval time, images are not searched one by one over the whole image database but only within the similar cluster, a small group of images similar to the query image. This reduces retrieval time while keeping almost the same precision as exhaustive retrieval. In an experiment using an image database of about 2,400 real images, the proposed method is about 18 times faster than the exhaustive method with almost the same precision, and it can retrieve more similar images that belong to the same class as the query image.
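
The cluster-then-search idea can be sketched as follows. Plain k-means stands in for RAGMD here, and the feature vectors are random placeholders; the point is only that the query is compared against cluster centroids first and then searched exhaustively within the single closest cluster.

```python
# Minimal sketch: search only the nearest cluster instead of the whole database.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
features = rng.random((2400, 32))                      # stand-in image features
km = KMeans(n_clusters=20, n_init=10, random_state=1).fit(features)

def retrieve(query, top_k=5):
    # Step 1: pick the most similar cluster by centroid distance.
    cluster = np.argmin(np.linalg.norm(km.cluster_centers_ - query, axis=1))
    members = np.where(km.labels_ == cluster)[0]
    # Step 2: exhaustive search only within that cluster.
    dists = np.linalg.norm(features[members] - query, axis=1)
    return members[np.argsort(dists)[:top_k]]

print(retrieve(rng.random(32)))
```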

A Cluster-based Efficient Key Management Protocol for Wireless Sensor Networks (무선 센서 네트워크를 위한 클러스터 기반의 효율적 키 관리 프로토콜)

  • Jeong, Yoon-Su;Hwang, Yoon-Cheol;Lee, Keon-Myung;Lee, Sang-Ho
    • Journal of KIISE: Information Networking / v.33 no.2 / pp.131-138 / 2006
  • To achieve security in wireless sensor networks (WSN), it is important to be able to encrypt and authenticate messages sent among sensor nodes. Due to resource constraints, many key agreement schemes used in general networks, such as Diffie-Hellman and public-key based schemes, are not suitable for wireless sensor networks. Current pre-distribution of secret keys uses the q-composite random key scheme and allocates keys randomly. However, there is a high probability that no common key exists between two sensor nodes, and finding a common key is inefficient in terms of time and energy consumption. To remove these problems of secret key pre-distribution, we propose a new cryptographic key management protocol, which is based on a clustering scheme but does not depend on probabilistic keys. The protocol can increase key management efficiency because, before distributing keys in the bootstrap phase, using a key shared among nodes removes the processes of sending and receiving keys among sensors. Also, to find out compromised nodes safely on the network, it solves the safety problem by applying a lightweight attack-detection mechanism.
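
As a loose illustration of cluster-scoped keying (not the proposed protocol), the sketch below derives a deterministic pairwise key from a cluster-wide secret, so any two nodes in the same cluster always share a key without probabilistic pre-distribution. The node IDs and the cluster secret are hypothetical.

```python
# Minimal sketch: deterministic pairwise keys within one cluster via HMAC.
import hashlib
import hmac

cluster_secret = b"per-cluster secret installed at bootstrap"  # hypothetical

def pairwise_key(node_a: str, node_b: str) -> bytes:
    # Order the IDs so both nodes derive the same key independently.
    lo, hi = sorted([node_a, node_b])
    return hmac.new(cluster_secret, f"{lo}|{hi}".encode(), hashlib.sha256).digest()

k1 = pairwise_key("node-03", "node-17")
k2 = pairwise_key("node-17", "node-03")
assert k1 == k2                      # both endpoints compute the same key
print(k1.hex()[:16], "...")
```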

Representative Feature Extraction of Objects using VQ and Its Application to Content-based Image Retrieval (VQ를 이용한 영상의 객체 특징 추출과 이를 이용한 내용 기반 영상 검색)

  • Jang, Dong-Sik;Jung, Seh-Hwan;Yoo, Hun-Woo;Sohn, Yong-Jun
    • Journal of KIISE: Computing Practices and Letters / v.7 no.6 / pp.724-732 / 2001
  • In this paper, a new method of extracting the features of major objects to represent an image using Vector Quantization (VQ) is proposed. The principal image features used in a content-based image retrieval system are color, texture, shape, and the spatial positions of objects. Representative color and texture features are extracted from a given image using the VQ clustering algorithm together with a general color and texture feature extraction method. Since these features are used for content-based image retrieval and searched by object, it is possible to retrieve desirable images regardless of the position, rotation, and size of objects. The experimental results show that the representative feature extraction time is much reduced by using VQ, that the highest retrieval rate is obtained when the weights of color and texture are set to 0.5 and 0.5, respectively, and that the proposed method provides up to 90% precision and recall for 'person' query images.
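
A minimal sketch of the VQ idea follows: k-means acts as the vector quantizer over pixel colors, and its codebook vectors serve as the image's representative colors. The pixel data is synthetic, and the texture side of the paper's feature extraction is omitted.

```python
# Minimal sketch: k-means as a vector quantizer for representative colors.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
pixels = rng.integers(0, 256, size=(64 * 64, 3)).astype(float)  # H*W x RGB, synthetic

codebook_size = 8
vq = KMeans(n_clusters=codebook_size, n_init=10, random_state=2).fit(pixels)
codebook = vq.cluster_centers_                     # representative colors
usage = np.bincount(vq.labels_) / len(pixels)      # share of the image each color covers

for color, share in zip(codebook.round(0), usage):
    print(color, f"{share:.2%}")
```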

Design of Face Recognition algorithm Using PCA&LDA combined for Data Pre-Processing and Polynomial-based RBF Neural Networks (PCA와 LDA를 결합한 데이터 전 처리와 다항식 기반 RBFNNs을 이용한 얼굴 인식 알고리즘 설계)

  • Oh, Sung-Kwun;Yoo, Sung-Hoon
    • The Transactions of The Korean Institute of Electrical Engineers / v.61 no.5 / pp.744-752 / 2012
  • In this study, polynomial-based Radial Basis Function Neural Networks (pRBFNNs) are proposed as the recognition part of an overall face recognition system that consists of two parts, a preprocessing part and a recognition part. The design methodology and procedure of the proposed pRBFNNs are presented to solve high-dimensional pattern recognition problems. In the data preprocessing part, Principal Component Analysis (PCA), which is generally used in face recognition, reduces the amount of data while maintaining the recognition rate. However, because it is applied to the whole face image without class information, it cannot guarantee the recognition rate under changes such as viewpoint variation. Thus, to compensate for this defect, Linear Discriminant Analysis (LDA) is used to enhance the separation between different classes. In this paper, we combine the PCA and LDA algorithms and design optimized pRBFNNs for the recognition module. The proposed pRBFNNs architecture consists of three functional modules, the condition part, the conclusion part, and the inference part, expressed as fuzzy rules in 'if-then' format. In the condition part of the fuzzy rules, the input space is partitioned with fuzzy C-means clustering. In the conclusion part, the connection weights of the pRBFNNs are represented as two kinds of polynomials, constant and linear, and their coefficients are identified by back-propagation using the gradient descent method. The output of the pRBFNNs model is obtained by a fuzzy inference method in the inference part. The essential design parameters of the networks (including the learning rate, momentum coefficient, and fuzzification coefficient) are optimized by means of differential evolution. The proposed pRBFNNs are applied to face image datasets (e.g., Yale, AT&T) and evaluated in terms of output performance and recognition rate.
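
The PCA-then-LDA preprocessing chain can be sketched with standard library components, as below. The polynomial RBFNN classifier designed in the paper is replaced by a stock nearest-neighbor classifier purely as a stand-in, and the face data is synthetic.

```python
# Minimal sketch: PCA -> LDA preprocessing with a stand-in classifier.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(3)
X = rng.random((150, 1024))           # 150 synthetic "face images" of 32x32 pixels
y = np.repeat(np.arange(15), 10)      # 15 synthetic subjects, 10 images each

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y, random_state=3)

model = make_pipeline(
    PCA(n_components=50),                         # unsupervised dimensionality reduction
    LinearDiscriminantAnalysis(n_components=10),  # supervised class separation
    KNeighborsClassifier(n_neighbors=1),          # stand-in for the pRBFNN classifier
)
model.fit(X_tr, y_tr)
print("held-out accuracy on synthetic data:", model.score(X_te, y_te))
```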

Comparison of Algorithms for Generating Parametric Images of Cerebral Blood Flow Using ${H_2}^{15}O$ Positron Emission Tomography (PET) (${H_2}^{15}O$ PET을 이용한 뇌혈류 파라메트릭 영상 구성을 위한 알고리즘 비교)

  • Lee, Jae-Sung;Lee, Dong-Soo;Park, Kwang-Suk;Chung, June-Key;Lee, Myung-Chul
    • The Korean Journal of Nuclear Medicine / v.37 no.5 / pp.288-300 / 2003
  • Purpose: To obtain regional blood flow and the tissue-blood partition coefficient from the time-activity curves of ${H_2}^{15}O$ PET, fitting of the parameters in the Kety model is conventionally accomplished by nonlinear least squares (NLS) analysis. However, NLS requires considerable computation time and is therefore impractical for the pixel-by-pixel analysis needed to generate parametric images of these parameters. In this study, we investigated several fast parameter estimation methods for parametric image generation and compared their statistical reliability and computational efficiency. Materials and Methods: These methods included linear least squares (LLS), linear weighted least squares (LWLS), linear generalized least squares (GLS), linear generalized weighted least squares (GWLS), weighted integration (WI), and a model-based clustering method (CAKS). ${H_2}^{15}O$ dynamic brain PET with a Poisson noise component was simulated using the numerical Zubal brain phantom. Error and bias in the estimation of rCBF and the partition coefficient, as well as computation time, were estimated and compared in various noise environments. In addition, parametric images from ${H_2}^{15}O$ dynamic brain PET data acquired from 16 healthy volunteers under various physiological conditions were compared to examine the utility of these methods for real human data. Results: These fast algorithms produced parametric images with similar image quality and statistical reliability. When the CAKS and LLS methods were used in combination, computation time was significantly reduced, to less than 30 seconds for $128{\times}128{\times}46$ images on a Pentium III processor. Conclusion: Parametric images of rCBF and the partition coefficient with good statistical properties can be generated within a computation time that is acceptable in clinical situations.
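
The core of the LLS approach is that integrating the one-tissue Kety model makes it linear in its parameters, so each voxel's curve can be fitted with one least-squares solve. The sketch below demonstrates this on a single synthetic time-activity curve; the input function, frame times, and noise-free simulation are all assumptions, not the study's data.

```python
# Minimal sketch: linear least squares fit of the integrated one-tissue Kety model.
import numpy as np

t = np.linspace(0, 120, 30)                       # frame mid-times (s), synthetic
Ca = np.exp(-t / 40.0) * (t / 10.0)               # synthetic arterial input function

f_true, lam = 0.6 / 60.0, 0.9                     # "true" flow and partition coefficient
dt = np.diff(t, prepend=0.0)
Ct = np.zeros_like(t)
for i in range(1, len(t)):                        # crude forward-Euler tissue curve
    Ct[i] = Ct[i - 1] + dt[i] * (f_true * Ca[i] - (f_true / lam) * Ct[i - 1])

# Linearized model: Ct(T) = f * int(Ca) - (f / lam) * int(Ct), linear in (f, f/lam).
int_Ca = np.cumsum(Ca * dt)
int_Ct = np.cumsum(Ct * dt)
A = np.column_stack([int_Ca, -int_Ct])
p1, p2 = np.linalg.lstsq(A, Ct, rcond=None)[0]
print("flow estimate:", p1, "partition coefficient estimate:", p1 / p2)
```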

SOM-Based $R^{*}-Tree$ for Similarity Retrieval (자기 조직화 맵 기반 유사 검색 시스템)

  • O, Chang-Yun;Im, Dong-Ju;O, Gun-Seok;Bae, Sang-Hyeon
    • The KIPS Transactions: Part D / v.8D no.5 / pp.507-512 / 2001
  • Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects, but the performance of conventional multidimensional data structures tends to deteriorate as the number of dimensions of the feature vectors increases. The $R^{*}-Tree$ is the most successful variant of the R-Tree. In this paper, we propose a SOM-based $R^{*}-Tree$ as a new indexing method for high-dimensional feature vectors. The SOM-based $R^{*}-Tree$ combines the SOM and the $R^{*}-Tree$ to achieve search performance that is more scalable to high dimensionalities. Self-Organizing Maps (SOMs) provide a mapping from high-dimensional feature vectors onto a two-dimensional space. This map, called a topological feature map, preserves the mutual relationships (similarity) in the feature space of the input data, clustering mutually similar feature vectors in neighboring nodes, and each node of the map holds a codebook vector. We experimentally compared the retrieval time cost of the SOM-based $R^{*}-Tree$ with those of the SOM and the $R^{*}-Tree$ using color feature vectors extracted from 40,000 images. The results show that the SOM-based $R^{*}-Tree$ outperforms both the SOM and the $R^{*}-Tree$ due to the reduced number of nodes required to build the $R^{*}-Tree$ and the lower retrieval time cost.
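
A bare-bones SOM training loop, which maps high-dimensional feature vectors onto a 2-D grid of codebook vectors so that similar vectors land in neighboring nodes, is sketched below. The data and grid size are arbitrary, and building the $R^{*}-Tree$ over the resulting codebook vectors is not shown.

```python
# Minimal sketch: train a small self-organizing map over synthetic feature vectors.
import numpy as np

rng = np.random.default_rng(4)
features = rng.random((1000, 16))               # e.g. color feature vectors

grid_h, grid_w, dim = 6, 6, features.shape[1]
codebook = rng.random((grid_h, grid_w, dim))
coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij"), -1)

for epoch in range(20):
    lr = 0.5 * (1 - epoch / 20)                 # decaying learning rate
    sigma = 2.0 * (1 - epoch / 20) + 0.5        # decaying neighborhood radius
    for x in features:
        dists = np.linalg.norm(codebook - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)   # best matching unit
        grid_d = np.linalg.norm(coords - np.array(bmu), axis=2)
        h = np.exp(-(grid_d ** 2) / (2 * sigma ** 2))[..., None]
        codebook += lr * h * (x - codebook)     # pull the neighborhood toward the sample

print("codebook vectors available for indexing:", codebook.reshape(-1, dim).shape)
```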
