• Title/Summary/Keyword: K-Means clustering algorithm

Nonlinear Inference Using Fuzzy Cluster (퍼지 클러스터를 이용한 비선형 추론)

  • Park, Keon-Jung;Lee, Dong-Yoon
    • Journal of Digital Convergence
    • /
    • v.14 no.1
    • /
    • pp.203-209
    • /
    • 2016
  • In this paper, we introduce a fuzzy inference system for nonlinear inference using fuzzy clusters. Typically, generating fuzzy rules for nonlinear inference causes the number of rules to grow exponentially as the number of input variables increases. To handle this problem, the fuzzy rules of the model are designed by partitioning the input vector space in scatter form using a fuzzy clustering algorithm. With this method, complex nonlinear processes can be modeled. The premise part of the fuzzy rules is determined by the FCM clustering algorithm. The consequence part of each fuzzy rule takes one of four kinds of polynomial functions, and the coefficient parameters of each rule are estimated using the standard least-squares method. We evaluate the performance and nonlinear characteristics of the model on data widely used for nonlinear processes. Experimental results show that nonlinear inference is possible.
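
The FCM step named in this abstract can be sketched with the textbook fuzzy c-means updates. This is only an illustration, not the authors' implementation: the function names, the fuzzifier `m = 2.0`, and the sample data are assumptions, and the polynomial consequents and least-squares fit are not shown.

```python
import math

def fcm_memberships(points, centers, m=2.0):
    """Return one row per point; row[i] is that point's membership in cluster i.

    Each row sums to 1. m is the fuzzifier (m > 1); m = 2.0 is an assumed default.
    """
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with one or more centers: share full membership there.
            hits = [d == 0.0 for d in dists]
            row = [1.0 / sum(hits) if h else 0.0 for h in hits]
        else:
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]
        memberships.append(row)
    return memberships

def fcm_update_centers(points, memberships, m=2.0):
    """Move each center to the mean of all points weighted by membership ** m.

    Assumes every cluster has nonzero total membership.
    """
    dim = len(points[0])
    centers = []
    for i in range(len(memberships[0])):
        weights = [row[i] ** m for row in memberships]
        total = sum(weights)
        centers.append(tuple(
            sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)
        ))
    return centers

# Hypothetical one-dimensional-on-a-line data and starting centers.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centers)
new_centers = fcm_update_centers(points, u)
```

Alternating the two functions until the centers stop moving gives the usual FCM iteration. Unlike a grid partition, the number of rules then equals the number of clusters rather than growing with the input dimension.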

Document Clustering using Genetic Algorithm and Cluster Measurement (클러스터 측정과 유전자 알고리즘을 이용한 문서 클러스터링)

  • Choi, Lim Cheon;Park, Soon Cheol
    • Annual Conference of KIPS
    • /
    • 2010.11a
    • /
    • pp.490-493
    • /
    • 2010
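  • In this paper, we propose a document clustering algorithm that uses cluster measurement and a genetic algorithm. We map the elements of the genetic algorithm onto clustering and use the cluster measurement as the fitness function to implement document clustering. For performance evaluation, we used the datasets of the 한국일보-20000/한국일보-40075 (Hankook Ilbo) document categorization test collections. The clustering evaluation shows that the AS Index performs better than the DB Index and the RS Index. In addition, the proposed algorithm shows stable, better performance than the K-means clustering algorithm.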

Optimized KNN/IFCM Algorithm for Efficient Indoor Location (효율적인 실내 측위를 위한 최적화된 KNN/IFCM 알고리즘)

  • Lee, Jang-Jae;Song, Lick-Ho;Kim, Jong-Hwa;Lee, Seong-Ro
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.48 no.2
    • /
    • pp.125-133
    • /
    • 2011
  • For any pattern-matching-based algorithm in a WLAN environment, the signal-to-noise ratio (SNR) characteristics of multiple access points (APs) are used to build a database in the training phase, and in the estimation phase the actual two-dimensional coordinates of the mobile unit (MU) are estimated by comparing the newly recorded SNR with the fingerprints stored in the database. As a fingerprinting method, k-nearest neighbor (KNN) has been widely applied to indoor location in wireless local area networks (WLAN), but its performance is sensitive to the number of neighbors k and the positions of the reference points (RPs). The intuitive fuzzy c-means (IFCM) clustering algorithm is therefore applied to improve KNN, giving the KNN/IFCM hybrid algorithm presented in this paper. In the proposed algorithm, k RPs are first chosen through KNN, based on SNR, as the data samples for IFCM. Then the k RPs are classified into different clusters through IFCM based on SNR. Experimental results indicate that the proposed KNN/IFCM hybrid algorithm generally outperforms the KNN, KNN/FCM, and KNN/PFCM algorithms when the location error is less than 2 m.
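
As a rough illustration of the fingerprinting baseline this abstract starts from, the sketch below covers plain KNN only: it picks the k reference points whose stored SNR vectors are closest to the observed one and averages their coordinates. The IFCM clustering step of the paper is not shown, and the function name `knn_locate`, the radio map values, and the averaging rule are assumptions made for the example.

```python
import math

def knn_locate(fingerprints, observed_snr, k=3):
    """Estimate (x, y) as the mean position of the k reference points
    whose stored SNR vectors are nearest (Euclidean) to the observed one."""
    ranked = sorted(fingerprints, key=lambda fp: math.dist(fp[0], observed_snr))
    nearest = ranked[:k]
    x = sum(pos[0] for _, pos in nearest) / len(nearest)
    y = sum(pos[1] for _, pos in nearest) / len(nearest)
    return (x, y)

# Hypothetical radio map: (SNR to three APs in dB, (x, y) position in metres).
radio_map = [
    ((40.0, 20.0, 10.0), (0.0, 0.0)),
    ((38.0, 22.0, 12.0), (1.0, 0.0)),
    ((20.0, 40.0, 15.0), (5.0, 5.0)),
    ((10.0, 20.0, 40.0), (9.0, 1.0)),
]
estimate = knn_locate(radio_map, (39.0, 21.0, 11.0), k=2)
```

The result depends directly on `k` and on where the reference points sit, which is the sensitivity the abstract says the clustering step is meant to reduce.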

Design of Data-centroid Radial Basis Function Neural Network with Extended Polynomial Type and Its Optimization (데이터 중심 다항식 확장형 RBF 신경회로망의 설계 및 최적화)

  • Oh, Sung-Kwun;Kim, Young-Hoon;Park, Ho-Sung;Kim, Jeong-Tae
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.60 no.3
    • /
    • pp.639-647
    • /
    • 2011
  • In this paper, we introduce a design methodology for data-centroid Radial Basis Function (RBF) neural networks with extended polynomial functions. The two underlying design mechanisms of such networks are the K-means clustering method and Particle Swarm Optimization (PSO). The proposed algorithm uses K-means clustering for efficient processing of data, and the model is optimized using PSO. As the connection weights of the RBF neural networks, we can use four types of polynomials: simplified, linear, quadratic, and modified quadratic. The center values of the Gaussian activation functions are selected using K-means clustering. The PSO-based RBF neural networks yield a structurally optimized network with a higher level of flexibility than conventional RBF neural networks. The PSO-based design procedure applied at each node of the RBF neural networks leads to the selection of preferred parameters with specific local characteristics (such as the number of input variables, a specific set of input variables, and the distribution constant of the activation function). To evaluate the performance of the proposed data-centroid RBF neural network with extended polynomial functions, the model is tested on nonlinear process data (2-dimensional synthetic data and Mackey-Glass time series data) and machine learning datasets (NOx emission process data from a gas turbine plant, Automobile Miles per Gallon (MPG) data, and Boston housing data). For the characteristic analysis of the entire nonlinear dataset, as well as the efficient construction and evaluation of the dynamic network model, the dataset is partitioned in two ways: Division I (training and testing datasets) and Division II (training, validation, and testing datasets). A comparative analysis shows that the proposed RBF neural networks produce models with higher accuracy and better predictive capability than previously presented intelligent models.

Color Code Detection and Recognition Using Image Segmentation Based on k-Means Clustering Algorithm (k-평균 클러스터링 알고리즘 기반의 영상 분할을 이용한 칼라코드 검출 및 인식)

  • Kim, Tae-Woo;Yoo, Hyeon-Joong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.7 no.6
    • /
    • pp.1100-1105
    • /
    • 2006
  • Severe color distortions in captured images have made it difficult for color codes to expand their applications. To reduce the effect of color distortions on reading colors, it is more desirable to statistically process as many pixels in each color region as possible than to rely on a few regularly sampled pixels. This process may require segmentation, which usually requires edge detection. However, edges in color codes can be disconnected by various distortions, such as the zipper effect and reflection, making segmentation incomplete. Edge linking is also a difficult process. In this paper, a more efficient approach to reducing the effect of color distortions on reading colors, one that avoids precise edge detection for segmentation, is obtained by employing the k-means clustering algorithm. In detecting color codes, the properties of both the six safe colors and grays are utilized. Experiments were conducted on 144 outdoor images of 4M pixels each. The proposed method achieved a color-code detection rate of 100% for the test images and an average color-reading accuracy of over 99% for the detected codes, while the highest accuracy achieved with an approach employing Canny edge detection was 91.28%.
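
The idea of reading a color from many pixels rather than a few samples can be sketched with a small k-means (Lloyd's algorithm) over RGB values. This is a minimal illustration and not the authors' pipeline: the function name, the seeding with nominal code colors, and the pixel values are all assumptions.

```python
import math

def kmeans_rgb(pixels, initial_centers, iterations=10):
    """Lloyd's algorithm on RGB tuples.

    Returns (centers, labels); labels come from the last assignment step.
    """
    centers = [tuple(float(ch) for ch in c) for c in initial_centers]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in pixels
        ]
        # Update step: each center moves to the mean of its assigned pixels.
        for i in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == i]
            if members:  # an empty cluster keeps its previous center
                centers[i] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return centers, labels

# Hypothetical distorted pixels from a red region and a blue region.
pixels = [(250, 10, 5), (240, 20, 15), (230, 30, 10), (10, 15, 245), (20, 5, 235)]
# Assumed seeds: the nominal code colors pure red and pure blue.
centers, labels = kmeans_rgb(pixels, [(255, 0, 0), (0, 0, 255)])
```

Each resulting center is the average of every pixel assigned to that region, so a single badly distorted pixel shifts the estimate far less than it would under sparse sampling.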

Probabilistic reduced K-means cluster analysis (확률적 reduced K-means 군집분석)

  • Lee, Seunghoon;Song, Juwon
    • The Korean Journal of Applied Statistics
    • /
    • v.34 no.6
    • /
    • pp.905-922
    • /
    • 2021
  • Cluster analysis is an unsupervised learning technique used to discover clusters when there is no prior knowledge of group membership. K-means, one of the most commonly used cluster analysis techniques, may fail when the number of variables becomes large. In such high-dimensional cases, it is common to perform tandem analysis, that is, K-means cluster analysis after reducing the number of variables with a dimension reduction method. However, there is no guarantee that the reduced dimensions reveal the cluster structure properly. Principal component analysis may mask the cluster structure, especially when variables unrelated to the cluster structure have large variances. To overcome this, techniques that perform dimension reduction and cluster analysis simultaneously have been suggested. This study proposes probabilistic reduced K-means, which recasts reduced K-means (De Soete and Carroll, 1994) in a probabilistic framework. Simulations show that the proposed method performs better than tandem clustering or clustering without any dimension reduction. When the number of variables is larger than the number of samples in each cluster, probabilistic reduced K-means forms clusters better than non-probabilistic reduced K-means. In an application to a real data set, it revealed a similar or better cluster structure compared with other methods.

Design of PCA-based pRBFNNs Pattern Classifier for Digit Recognition (숫자 인식을 위한 PCA 기반 pRBFNNs 패턴 분류기 설계)

  • Lee, Seung-Cheol;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.25 no.4
    • /
    • pp.355-360
    • /
    • 2015
  • In this paper, we propose the design of a Radial Basis Function neural network based on PCA for recognizing handwritten digits. The proposed pattern classifier consists of a PCA preprocessing step and a pRBFNNs pattern classification step. In the preprocessing step, feature data are obtained through PCA so as to minimize the information loss of the given data, and these data are then used as the input to the pRBFNNs. The hidden layer of the proposed classifier is built by the Fuzzy C-Means (FCM) clustering algorithm, and the connection weights are defined as linear polynomial functions. In the output layer, the polynomial parameters are obtained using Least Square Estimation (LSE). The MNIST database, a well-known benchmark handwritten digit dataset, is used to evaluate the performance of the proposed classifier. The experimental results of the proposed system are compared with those of other existing classifiers.

A Study on VQ/HMM using Nonlinear Clustering and Smoothing Method (비선형 집단화와 완화기법을 이용한 VQ/HMM에 관한 연구)

  • 정희석;강철호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.18 no.3
    • /
    • pp.35-42
    • /
    • 1999
  • In this paper, a modified clustering algorithm is proposed to improve the discrimination of the discrete HMM (Hidden Markov Model); it increases the recognition rate by 2.16% compared with the original HMM using the K-means or LBG algorithm. In addition, to prevent the recognition rate from dropping because of insufficient training data in the HMM training scheme, a modified probabilistic smoothing method is proposed, which increases the recognition rate by 3.07% for the speaker-independent case. In an experiment applying both proposed algorithms, the average recognition rate increased by 4.66% for the speaker-independent case compared with the original VQ/HMM.

An Image Processing Mechanism for Disease Detection in Tomato Leaf (토마토 잎사귀 질병 감지를 위한 이미지 처리 메커니즘)

  • Park, Jeong-Hyeon;Lee, Sung-Keun
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.14 no.5
    • /
    • pp.959-968
    • /
    • 2019
  • In the agricultural industry, wireless sensor network technology has been applied by utilizing various sensors and embedded systems. In particular, much research is being conducted on diagnosing crop diseases early by using sensor networks. However, traditional approaches to diagnosing crop diseases are not practical for agriculture. This paper proposes an algorithm that analyzes a crop leaf image taken by a camera and detects the infected area within the image. We applied an enhanced k-means clustering method to images captured at a horticulture facility and categorized the areas in each image. We then used an edge detection and edge tracking scheme to decide whether the extracted areas are located inside the leaf. The performance was evaluated using images of tomato leaves. The results of the performance evaluation show that the proposed algorithm outperforms traditional algorithms in terms of classification capability.

SWAT Direct Runoff and Baseflow Evaluation using Web-based Flow Clustering EI Estimation System (웹기반의 유량 군집화 EI 평가시스템을 이용한 SWAT 직접유출과 기저유출 평가)

  • Jang, Won Seok;Moon, Jong Pil;Kim, Nam Won;Yoo, Dong Sun;Kum, Dong Hyuk;Kim, Ik Jae;Mun, Yuri;Lim, Kyoung Jae
    • Journal of Korean Society on Water Environment
    • /
    • v.27 no.1
    • /
    • pp.61-72
    • /
    • 2011
  • To assess hydrologic and nonpoint source pollutant behaviors in a watershed with the Soil and Water Assessment Tool (SWAT) model, the accuracy of the SWAT model should be evaluated before it is applied to the watershed. When calibrating and validating the hydrological components of the SWAT model, the Nash-Sutcliffe efficiency coefficient (EI) has been widely used. However, the EI value is known to be highly sensitive to the large values in the data range. In this study, a Web-based flow clustering EI estimation system using the K-means clustering algorithm was developed and used for SWAT hydrology evaluation. Even though the EI of total streamflow was high, the EI values of the hydrologic components (i.e., direct runoff and baseflow) were not high. Also, when the EI values of flow groups I and II (i.e., the low and high value groups) clustered from direct runoff and baseflow were computed separately, they were much lower, with negative EI values for some flow group comparisons. The values estimated by the SWAT auto-calibration tool also showed negative EI values for most of flow groups I and II of direct runoff and baseflow, although the EI value of total streamflow was high. The result obtained in this study indicates that the SWAT hydrology component should be calibrated until positive EI values are obtained for all four flow groups of direct runoff and baseflow, for better accuracy in both direct runoff and baseflow.
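
The effect described in this abstract, a high overall EI hiding poor fits in the low-flow group, can be reproduced with a small sketch. It uses the standard Nash-Sutcliffe formula, 1 minus the ratio of squared residuals to the variance of the observations about their mean, and a one-dimensional k-means with k = 2 to split the observed flows. The function names, the min/max seeding, and the flow values are assumptions for illustration and are not taken from the paper's system.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - sum((o - s)^2) / sum((o - mean(o))^2)."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance

def two_means_threshold(values, iterations=20):
    """1-D k-means with k = 2, seeded at min and max; returns the split point.

    Assumes the values are not all identical.
    """
    low, high = min(values), max(values)
    for _ in range(iterations):
        threshold = (low + high) / 2.0
        low_group = [v for v in values if v <= threshold]
        high_group = [v for v in values if v > threshold]
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return (low + high) / 2.0

# Hypothetical observed and simulated flows (same units).
observed = [1.0, 2.0, 1.5, 2.5, 50.0, 60.0, 55.0, 65.0]
simulated = [3.0, 0.5, 3.0, 1.0, 52.0, 58.0, 57.0, 63.0]

threshold = two_means_threshold(observed)
pairs = list(zip(observed, simulated))
low_pairs = [(o, s) for o, s in pairs if o <= threshold]
high_pairs = [(o, s) for o, s in pairs if o > threshold]

nse_all = nash_sutcliffe(observed, simulated)
nse_low = nash_sutcliffe(*zip(*low_pairs))
nse_high = nash_sutcliffe(*zip(*high_pairs))
```

With these made-up numbers the overall value is close to 1 because the large flows dominate the variance term, while the low-flow group scores below zero, which is the pattern the study reports for some flow groups.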