• Title/Summary/Keyword: k-mean clustering algorithm

Fuzzy Clustering Method for the Identification of Joint Sets (절리군 분석을 위한 퍼지 클러스터링 기법)

  • 정용복;전석원
    • Tunnel and Underground Space / v.13 no.4 / pp.294-303 / 2003
  • The structural behaviour of rock mass structures such as tunnels and slopes depends critically on the characteristics of their discontinuities. It is therefore important to survey and analyze discontinuities correctly when designing and constructing such structures. One essential step in discontinuity survey and analysis is identifying joint sets from a large volume of raw directional joint data. This identification is generally done graphically, which has shortcomings such as subjective results and the inability to use additional information about the discontinuities. In this study, a computer program for joint set identification based on the fuzzy clustering algorithm was implemented and tested on two kinds of joint data. The applications confirmed that the fuzzy clustering method is effective and valid for identifying joint sets and for estimating the mean direction and degree of clustering in large joint data sets.

Voice Activity Detection Algorithm Based on Radial Basis Function Networks with Dual Threshold (Radial Basis Function Networks를 이용한 이중 임계값 방식의 음성구간 검출기)

  • Kim Hong Ik;Park Sung Kwon
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.12C / pp.1660-1668 / 2004
  • This paper proposes a Voice Activity Detection (VAD) algorithm based on a Radial Basis Function (RBF) network with a dual threshold. k-means clustering and the Least Mean Square (LMS) algorithm are used to update the RBF network to match the underlying speech condition. The inputs to the RBF network are three parameters from a Code Excited Linear Prediction (CELP) coder, which works stably under various background noise levels. Because the threshold value involves a trade-off in the VAD decision, a dual hangover threshold is applied in the RBF-VAD to reduce errors. Experimental results show that the proposed VAD algorithm performs better than G.729 Annex B at all noise levels.
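
The two training ingredients named in this abstract, RBF centers and an LMS weight update, can be sketched roughly as follows. This is a minimal illustration and not the paper's VAD: the function names, the Gaussian width, the learning rate, and the toy data are all assumptions, and the centers are supplied by hand where the paper would obtain them from k-means clustering.

```python
import math

def rbf_features(x, centers, width):
    """Gaussian RBF activation of input x for each center."""
    return [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centers]

def predict(x, centers, width, weights):
    """Network output: weighted sum of the RBF activations."""
    return sum(w * f for w, f in zip(weights, rbf_features(x, centers, width)))

def lms_train(samples, centers, width, lr=0.5, epochs=200):
    """Fit the output weights with the LMS rule: w += lr * error * activation."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, target in samples:
            feats = rbf_features(x, centers, width)
            error = target - sum(w * f for w, f in zip(weights, feats))
            weights = [w + lr * error * f for w, f in zip(weights, feats)]
    return weights

# Toy data (assumed): one input labelled 0 ("noise") and one labelled 1 ("speech").
centers = [(0.0,), (1.0,)]  # hand-picked stand-ins for k-means cluster centers
samples = [((0.0,), 0.0), ((1.0,), 1.0)]
weights = lms_train(samples, centers, width=0.5)
```

After training, `predict((1.0,), centers, 0.5, weights)` should be close to 1 and `predict((0.0,), centers, 0.5, weights)` close to 0. A dual-threshold decision such as the one the abstract describes would then be applied to this output; that step is not sketched here.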

Product Recommendation System on VLDB using k-means Clustering and Sequential Pattern Technique (k-means 클러스터링과 순차 패턴 기법을 이용한 VLDB 기반의 상품 추천시스템)

  • Shim, Jang-Sup;Woo, Seon-Mi;Lee, Dong-Ha;Kim, Yong-Sung;Chung, Soon-Key
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.1027-1038 / 2006
  • Recommendation systems built on a very large database (VLDB) face many technical problems, so it is necessary to study system structures and data-mining techniques suited to large-scale Internet shopping malls. We therefore design and implement a product recommendation system that uses the k-means clustering algorithm and a sequential pattern technique and can be used in a large-scale Internet shopping mall. The system processes user information in batches, defines the various categories as a hierarchical structure, and uses a sequential pattern mining technique for the search engine. For predictive modeling and experiments, we use real data (users' interests and preferences for given categories) extracted from 30 days of log files of a major Internet shopping mall in Korea. For evaluation we define PRP (Predictive Recommend Precision), PRR (Predictive Recommend Recall), and PF1 (Predictive Factor One-measure). In the experiments, the best recommendation time and the best learning time of our system are on the order of O(N), and the system scores well on all three measures.
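
The abstract names PRP, PRR, and PF1 but does not give their formulas. As a rough point of reference only, the sketch below computes the standard precision, recall, and F1 measure over a set of recommended items and a set of items the user actually chose. Treating the paper's measures as these standard ones is an assumption, and the function name and sample item IDs are hypothetical.

```python
def precision_recall_f1(recommended, relevant):
    """Standard precision, recall and F1 (assumed stand-ins for PRP, PRR, PF1)."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)  # recommended items the user really wanted
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Hypothetical example: four recommended products, three actually chosen.
p, r, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
```

Here two of the four recommendations are hits, so precision is 2/4, recall is 2/3, and F1 is their harmonic mean.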

Unsupervised Learning Model for Fault Prediction Using Representative Clustering Algorithms (대표적인 클러스터링 알고리즘을 사용한 비감독형 결함 예측 모델)

  • Hong, Euyseok;Park, Mikyeong
    • KIPS Transactions on Software and Data Engineering / v.3 no.2 / pp.57-64 / 2014
  • Most previous studies of software fault prediction models, which determine the fault-proneness of input modules, have focused on supervised learning models that use a training data set. However, an unsupervised learning model is needed when a supervised model cannot be applied: either no past training data set exists, or one exists but the type of the current project has changed. Building an unsupervised learning model is extremely difficult, which is why only a few studies exist. In this paper, we build unsupervised models using two representative clustering algorithms, EM and DBSCAN, that have not been used in prior studies, and compare them with the previous model based on the K-means algorithm. The results show that the EM model performs slightly better than the K-means model in terms of error rate, and that both significantly outperform the DBSCAN model.

Design of RBFNN-Based Pattern Classifier for the Classification of Precipitation/Non-Precipitation Cases (강수/비강수 사례 분류를 위한 RBFNN 기반 패턴분류기 설계)

  • Choi, Woo-Yong;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.586-591 / 2014
  • In this study, we introduce a Radial Basis Function Neural Network (RBFNN) classifier that uses the Artificial Bee Colony (ABC) algorithm to distinguish precipitation events from non-precipitation events in given radar data. The input data are reconstructed through feature analysis of the meteorological radar data used by the Korea Meteorological Administration. In the condition phase of the proposed classifier, the fitness values are obtained with the Fuzzy C-Means clustering method, and the coefficients of the polynomial function used in the conclusion phase are estimated by the least squares method. In the aggregation phase, the final output is obtained by fuzzy inference. The performance of the proposed classifier is compared and analyzed on both the QC (quality control) data and the CZ (corrected reflectivity) data used by the Korea Meteorological Administration.

Improvement of the PFCM(Possibilistic Fuzzy C-Means) Clustering Method (PFCM 클러스터링 기법의 개선)

  • Heo, Gyeong-Yong;Choe, Se-Woon;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.1 / pp.177-185 / 2009
  • Cluster analysis, or clustering, is an unsupervised learning method in which a set of data points is divided into a given number of homogeneous groups. Fuzzy clustering, one of the most popular clustering methods, allows a point to belong to all clusters with different degrees of membership, and so produces more intuitive and natural clusters than hard clustering does. Moreover, some fuzzy clustering variants are robust to noise. In this paper, we improve Possibilistic Fuzzy C-Means (PFCM), which generates a membership matrix as well as a typicality matrix, using the Gath-Geva (GG) method. The proposed method focuses on the boundaries of clusters, unlike most other methods, which focus on cluster centers. The generated membership values are suitable for classification-type applications. Because the typicality values generated by the algorithm are distributed similarly to the values of a Gaussian density function, they are useful for Gaussian-type density estimation. In addition, the GG method can handle clusters with different numbers of data points, which the other well-known method, by Gustafson and Kessel, cannot. The experimental results bear out all of these points.
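
The idea that a point belongs to every cluster to a different degree can be illustrated with the membership update of standard Fuzzy C-Means. The sketch below shows only that one step; it is not the PFCM or Gath-Geva variant the paper proposes, and the function name, the fuzzifier `m=2.0`, and the sample values are assumptions.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Standard FCM membership update.

    u[i][j] = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)),
    where d_ij is the distance from point i to center j and m > 1.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: share full membership among them.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            memberships.append([h / sum(hits) for h in hits])
            continue
        memberships.append(
            [1.0 / sum((d_j / d_k) ** exponent for d_k in dists) for d_j in dists]
        )
    return memberships

# A point at 1.0 between hypothetical centers at 0.0 and 4.0.
u = fcm_memberships([(1.0,)], [(0.0,), (4.0,)])
```

With distances 1 and 3 and `m = 2`, the point receives membership 0.9 in the nearer cluster and 0.1 in the farther one; each row of memberships sums to 1, unlike a hard assignment.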

Automatic Clustering on Trained Self-organizing Feature Maps via Graph Cuts (그래프 컷을 이용한 학습된 자기 조직화 맵의 자동 군집화)

  • Park, An-Jin;Jung, Kee-Chul
    • Journal of KIISE:Software and Applications / v.35 no.9 / pp.572-587 / 2008
  • The Self-organizing Feature Map (SOFM), an unsupervised neural network, is a very powerful tool for data clustering and visualization of high-dimensional data sets. Although the SOFM has been applied to many engineering problems, similar weights on the trained SOFM must be clustered into one class as a post-processing step, which is done manually in many cases. Traditional clustering algorithms such as k-means, however, do not yield satisfactory results on the trained SOFM, especially when clusters have arbitrary shapes. This paper proposes automatic clustering on the trained SOFM that can handle arbitrary cluster shapes and is globally optimized by graph cuts. When graph cuts are used, the graph must have two additional vertices, called terminals, and the weights between the terminals and the other vertices of the graph are generally set from data obtained manually by users. The proposed method sets these weights automatically, based on mode-seeking on a distance matrix. Experimental results demonstrate the effectiveness of the proposed method in texture segmentation: it improves precision rates over traditional clustering algorithms, because the graph-theoretic clustering can handle arbitrary cluster shapes.

Design of Modeling & Simulator for ASP Realized with the Aid of Polynomial Radial Basis Function Neural Networks (다항식 방사형기저함수 신경회로망을 이용한 ASP 모델링 및 시뮬레이터 설계)

  • Kim, Hyun-Ki;Lee, Seung-Joo;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.62 no.4 / pp.554-561 / 2013
  • In this paper, we introduce a model and a process simulator, developed with the aid of pRBFNNs, for the activated sludge process in a sewage treatment system. The activated sludge process (ASP) in sewage treatment facilities handles the biological treatment reaction and is a very complex system with non-linear characteristics. We carry out the modeling using essential ASP factors such as effluent water quality, the manipulated values of various pumps, and influent water quality. The intelligent algorithms used to construct the process simulator are built on multi-output polynomial Radial Basis Function Neural Networks (pRBFNNs) together with Fuzzy C-Means clustering and Particle Swarm Optimization. Here, the apexes of the antecedent Gaussian functions of the fuzzy rules are determined by the C-means clustering algorithm, and the apexes of the consequent part of the fuzzy rules are learned by back-propagation based on the gradient descent method. The parameters of the fuzzy model are optimized by particle swarm optimization. The coefficients of the consequent polynomials of the fuzzy rules are obtained by Least Square Estimation, and the performance index is the Mean Squared Error. The architecture of the developed process simulator and its operation are also described.

Extensions of X-means with Efficient Learning the Number of Clusters (X-means 확장을 통한 효율적인 집단 개수의 결정)

  • Heo, Gyeong-Yong;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.4 / pp.772-780 / 2008
  • K-means is one of the simplest unsupervised learning algorithms for solving the clustering problem. However, K-means has a basic shortcoming: the number of clusters k has to be known in advance. In this paper, we propose extensions of X-means, which can estimate the number of clusters using the Bayesian information criterion (BIC). We introduce two versions of the algorithm, modified X-means (MX-means) and generalized X-means (GX-means). Both employ one full covariance matrix per cluster and so can estimate the number of clusters efficiently without the severe over-fitting that X-means suffers from because of its spherical cluster assumption. The algorithms start with one cluster and iteratively try to split a cluster to maximize the BIC score. The former uses the K-means algorithm to find a set of optimal clusters for the current k, which makes it simple and fast, but it produces wrongly estimated centers when clusters overlap. The latter uses the EM algorithm to estimate the parameters and produces more stable clusters even when clusters overlap. Experiments with synthetic data show that the proposed methods provide a more robust estimate of the number of clusters and of the cluster parameters than other existing top-down algorithms.
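
The idea of scoring candidate values of k with BIC can be sketched as below. This is a simplified illustration under stated assumptions, not MX-means or GX-means: it uses a spherical Gaussian model with one shared variance (the assumption the abstract attributes to plain X-means), tries every k from 1 to a limit instead of splitting clusters, and initializes K-means deterministically from evenly spaced data points. All function names are hypothetical.

```python
import math

def kmeans(points, k, iters=100):
    """Lloyd's algorithm for a fixed k; returns (centers, labels)."""
    n = len(points)
    centers = [points[i * n // k] for i in range(k)]  # deterministic init (assumption)
    labels = [0] * n
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster loses all its points.
            new_centers.append(
                tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels

def bic_spherical(points, centers, labels):
    """BIC (higher is better) of a hard-assigned spherical Gaussian mixture."""
    n, dim, k = len(points), len(points[0]), len(centers)
    if n <= k:
        return float("-inf")
    sse = sum(math.dist(p, centers[lab]) ** 2 for p, lab in zip(points, labels))
    var = max(sse / (dim * (n - k)), 1e-12)  # shared per-dimension variance
    sizes = [labels.count(j) for j in range(k)]
    loglik = sum(s * math.log(s / n) for s in sizes if s > 0)
    loglik -= n * dim / 2.0 * math.log(2.0 * math.pi * var)
    loglik -= dim * (n - k) / 2.0
    n_params = (k - 1) + k * dim + 1  # mixing weights, centers, variance
    return loglik - n_params / 2.0 * math.log(n)

def choose_k(points, max_k):
    """Return the k in 1..max_k with the highest BIC, plus all scores."""
    scores = {}
    for k in range(1, max_k + 1):
        centers, labels = kmeans(points, k)
        scores[k] = bic_spherical(points, centers, labels)
    return max(scores, key=scores.get), scores

# Two well-separated 1-D groups (toy data).
data = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
best_k, scores = choose_k(data, max_k=3)
```

On this toy data the two-cluster model scores highest. The extra parameters of a third cluster cost more in the BIC penalty than the small gain in likelihood, which is the trade-off that lets BIC estimate k.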

Analysis of the Inner Degradation Pattern by Clustering Algorithm at Distribution Line (군집화 알고리즘을 이용한 배전선로 내부 열화 패턴 분석)

  • Choi, Woon-Shik;Kim, Jin-Sa
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers / v.29 no.1 / pp.58-61 / 2016
  • Power cables used in distribution lines degrade in a variety of ways, depending not only on the wire material and manufacturing method but also on the line environment and the type of load. Local deterioration of a wire can lead to frequent wire breakage accidents, which can cause significant loss of property and life. This study aims to develop an algorithm that uses the K-means algorithm, a signal processing technique, to identify the highest and lowest parts of the signal detected by eddy current.