• Title/Summary/Keyword: k-mean clustering algorithm

A New Fast EM Algorithm (새로운 고속 EM 알고리즘)

  • 김성수;강지혜
    • Journal of KIISE: Computer Systems and Theory / v.31 no.10 / pp.575-587 / 2004
  • This paper proposes a new Fast Expectation-Maximization (FEM) algorithm. First, the K-means algorithm is modified to reduce the number of iterations needed to find the initial values used in the EM process. Conventionally, the initial values in K-means clustering are chosen randomly, which sometimes forces the clustering to converge to undesired center points. A uniform partitioning method is added to conventional K-means to extract proper initial points for each cluster. Second, the effect of the posterior probability is emphasized so that applying Maximum Likelihood Posterior (MLP) yields fast convergence. The proposed FEM strengthens conventional EM by speeding up its convergence. Experimental results demonstrate the superiority of FEM, showing improved EM results and faster convergence in parameter estimation.
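The abstract does not spell out how the uniform partitioning step works, so the following is only a minimal sketch under an assumption: the data range is split into k equal-width bins and one initial center is placed at the midpoint of each bin, in place of a random draw. The function names `uniform_partition_init` and `kmeans_1d` and the sample data are hypothetical, and the EM stage itself is not shown.

```python
def uniform_partition_init(points, k):
    """Assumed scheme: one initial center at the midpoint of each of k equal-width bins."""
    lo, hi = min(points), max(points)
    width = (hi - lo) / k
    return [lo + (i + 0.5) * width for i in range(k)]


def kmeans_1d(points, centers, max_iter=100):
    """Plain Lloyd iterations on scalars; returns (centers, iterations used)."""
    for iteration in range(1, max_iter + 1):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            return new_centers, iteration
        centers = new_centers
    return centers, max_iter


# Illustrative data with three well-separated groups.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7, 9.0, 9.2, 8.8]
init = uniform_partition_init(data, 3)
centers, n_iter = kmeans_1d(data, init)
print(init, centers, n_iter)
```

Because the starting centers are spread evenly over the data range, this toy run settles within a few iterations. A random start on the same data could place two centers in one group and need more passes, which is the kind of behavior the abstract attributes to random initialization.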

A Study on Clustering using Genetic Algorithm (유전자 알고리즘을 이용한 문서 클러스터링 연구)

  • Song, Wei;Choi, Lim Cheon;Park, Soon Cheol
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.325-326 / 2009
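  • This paper proposes a document clustering system that uses the genetic algorithm (GA), an efficient artificial intelligence algorithm. K-Means, the most widely used clustering algorithm, shows performance that varies considerably with its randomly chosen initial centroid vectors. We therefore developed a GA-based clustering algorithm that is both stable and high-performing. To evaluate the proposed algorithm, we used HANTEC 2.0 and a document categorization collection data set. The proposed method performed far better than a clustering algorithm based on the efficient and fast K-Means.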

A Dispersion Mean Algorithm based on Similarity Measure for Evaluation of Port Competitiveness (항만 경쟁력 평가를 위한 유사도 기반의 이산형 평균 알고리즘)

  • Chw, Bong-Sung;Lee, Cheol-Yeong
    • Journal of Navigation and Port Research / v.28 no.3 / pp.185-191 / 2004
  • The mean and clustering are important data mining methods that are now widely applied to various multi-attribute problems. However, feature weighting and feature selection are important in these methods because features may differ in importance, and such differences need to be considered when mining multi-attribute data. In addition, the arithmetic mean is inadequate for finding the best-fitting result when the evaluation structure has weighted and ranked attributes. Moreover, it is hard to capture the specific characteristics of a user group. In this paper, we propose a dispersion mean algorithm for evaluating a similarity measure based on a geometrical figure. It is also applied to means classified by user group. One of the key issues in evaluating the similarity measure is how to achieve objectivity, so that the result does not change with item ranking during the evaluation process.

Problems in Fuzzy c-means and Its Possible Solutions (Fuzzy c-means의 문제점 및 해결 방안)

  • Heo, Gyeong-Yong;Seo, Jin-Seok;Lee, Im-Geun
    • Journal of the Korea Society of Computer and Information / v.16 no.1 / pp.39-46 / 2011
  • Clustering is a well-known unsupervised learning method in which a data set is grouped into some number of homogeneous clusters. Numerous clustering algorithms are available and have been used in various applications. Fuzzy c-means (FCM), the most well-known partitional clustering algorithm, was established in the 1970s and is still in use. However, FCM has some unsolved problems, and variants of FCM are still under development. In this paper, the problems in FCM are first explained and the available solutions are investigated, with the aim of giving researchers possible directions for future research. Most FCM variants try to solve the problems using domain knowledge specific to a given problem. In this paper, however, we try to give general solutions without using any domain knowledge. Although more remains to be discovered than has been found, this paper may be a good starting point for researchers new to clustering.
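For readers unfamiliar with FCM, a minimal sketch of the standard alternating update on scalar data may help; it is the textbook formulation, not any of the variants or solutions surveyed in the paper. The function name `fcm_1d`, the fuzzifier default `m=2.0`, the `eps` guard, and the sample data are assumptions made for illustration.

```python
def fcm_1d(points, centers, m=2.0, n_iter=50, eps=1e-12):
    """Standard fuzzy c-means on scalars.

    Returns (centers, memberships), where memberships[j][i] is the degree
    to which point j belongs to cluster i, as computed in the last pass.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(n_iter):
        memberships = []
        for x in points:
            # eps avoids division by zero when a point sits exactly on a center.
            dists = [abs(x - c) + eps for c in centers]
            row = [
                1.0 / sum((d_i / d_k) ** power for d_k in dists)
                for d_i in dists
            ]
            memberships.append(row)
        # Each center is the mean of all points weighted by membership ** m.
        centers = [
            sum((u[i] ** m) * x for u, x in zip(memberships, points))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships


data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, memberships = fcm_1d(data, centers=[0.0, 6.0])
print(centers)
print(memberships[0])
```

Unlike K-means, every point contributes to every center, with a weight that falls off with distance. The fuzzifier `m` controls how soft the assignment is, and the result still depends on the starting centers, which is one of the commonly cited weaknesses of FCM.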

Comparison of Blooming Artifact Reduction Using Image Segmentation Method in CT Image (CT영상에서 이미지 분할기법을 적용한 Blooming Artifact Reduction 비교 연구)

  • Kim, Jung-Hun;Park, Ji-Eun;Park, Yu-Jin;Ji, In-Hee;Lee, Jong-Min;Cho, Jin-Ho
    • Journal of Biomedical Engineering Research / v.38 no.6 / pp.295-301 / 2017
  • In this study, we subtracted the calcification blooming artifact from MDCT images of coronary atherosclerosis patients and verified the accuracy and usefulness of the subtraction. We used a coronary artery calcification stenosis phantom and a program that subtracts the calcification blooming artifact by applying 8 different image segmentation methods (Otsu, Sobel, Prewitt, Canny, DoG, Region Growing, Gaussian+K-means clustering, Otsu+DoG). As a result, in the coronary artery calcification stenosis phantom with a 5 mm lumen region, the calcification blooming artifact was subtracted when the combination of Gaussian filtering and K-means clustering was applied, and the resulting value was close to the actual calcification region. These results may help to accurately diagnose coronary artery calcification stenosis.

A Study on the Improvement of Fault Detection Capability for Fault Indicator using Fuzzy Clustering and Neural Network (퍼지클러스터링 기법과 신경회로망을 이용한 고장표시기의 고장검출 능력 개선에 관한 연구)

  • Hong, Dae-Seung;Yim, Hwa-Young
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.3 / pp.374-379 / 2007
  • This paper focuses on improving the fault detection algorithm of the FRTU (feeder remote terminal unit) on the feeders of a power distribution system. The FRTU applies fault detection schemes for phase faults and ground faults. In particular, its cold load pickup and inrush restraint functions distinguish the fault current from the normal load current. The FRTU raises the FI (Fault Indicator) when the fault current exceeds the pickup value or inrush current. STFT (Short Time Fourier Transform) analysis provides the frequency and time information. The FCM (Fuzzy C-Means clustering) algorithm extracts the characteristics of the harmonics. A neural network serving as the fault detector was trained by gradient descent to distinguish the inrush current from the fault state. In this paper, fault detection is improved by using FCM and a neural network. The result data were measured in an actual 22.9 kV power distribution system.

Automatic Photovoltaic Panel Area Extraction from UAV Thermal Infrared Images

  • Kim, Dusik;Youn, Junhee;Kim, Changyoon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.6 / pp.559-568 / 2016
  • For the economic management of photovoltaic power plants, it is necessary to regularly monitor the panels within the plants to detect malfunctions. Thermal infrared cameras are generally used for monitoring, since malfunctioning panels run at higher temperatures than functioning ones. Recently, technologies that observe photovoltaic arrays by mounting thermal infrared cameras on UAVs (Unmanned Aerial Vehicles) have been developed for the efficient monitoring of large-scale photovoltaic power plants. However, the technologies developed so far require the images to be analyzed manually to detect malfunctioning panels, which is time-consuming. In this paper, we propose an automatic photovoltaic panel area extraction algorithm for thermal infrared images acquired via a UAV. In the thermal infrared images, panel boundaries appear as obvious linear features, and the panels are regularly arranged. Therefore, we emphasize the linear features with a vertical and horizontal filtering algorithm, and apply a modified hierarchical histogram clustering method to extract candidate panel boundaries. Among the candidates, initial panel areas are extracted by exclusion editing with the results of the photovoltaic array area detection. In this step, thresholding and image morphological algorithms are applied. Finally, panel areas are refined using the geometry of the surrounding panels. The accuracy of the results is evaluated quantitatively against manually digitized data, and the proposed algorithm obtains a mean completeness of 95.0%, a mean correctness of 96.9%, and a mean quality of 92.1%.

Designing Tracking Method using Compensating Acceleration with FCM for Maneuvering Target (FCM 기반 추정 가속도 보상을 이용한 기동표적 추적기법 설계)

  • Son, Hyun-Seung;Park, Jin-Bae;Joo, Young-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SC / v.49 no.3 / pp.82-89 / 2012
  • This paper presents an intelligent tracking algorithm for a maneuvering target that compensates for the target's positional error. The difference between the measured point and the predicted point is separated into acceleration and noise. Fuzzy c-means clustering and the predicted impact point are used to obtain the optimal acceleration value. Membership functions are determined for the acceleration and noise separated by fuzzy c-means clustering, and the characteristics of the maneuvering target are identified. The separated acceleration and noise are used in the tracking algorithm to compensate for computational error. Because the acceleration has been extracted from the positional error, the filter sees only the remaining noise, so the filtering stage treats the nonlinear maneuvering target as a linear one. After filtering, the target estimate is obtained by compensating with the extracted acceleration. The proposed system improves adaptiveness and robustness by adjusting the parameters of the membership functions of the fuzzy system. To maximize the effectiveness of the proposed system, we construct a multiple model structure. The proposed algorithm can be implemented as an on-line system. Finally, some examples are provided to show the effectiveness of the proposed algorithm.

Design of Optimized Radial Basis Function Neural Networks Classifier with the Aid of Principal Component Analysis and Linear Discriminant Analysis (주성분 분석법과 선형판별 분석법을 이용한 최적화된 방사형 기저 함수 신경회로망 분류기의 설계)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.735-740 / 2012
  • In this paper, we introduce design methodologies for a polynomial radial basis function neural network (RBFNN) classifier with the aid of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Feature data that minimize the information loss of the given data are obtained through PCA and LDA preprocessing and are then used as the input data of the RBFNNs. The hidden layer of the RBFNNs is built with the Fuzzy C-Means (FCM) clustering algorithm instead of receptive fields, and a linear polynomial function is used as the connection weights between the hidden and output layers. To design an optimized classifier, the structural and parametric values, such as the number of eigenvectors of PCA and LDA and the fuzzification coefficient of the FCM algorithm, are optimized by the Artificial Bee Colony (ABC) optimization algorithm. The proposed classifier is applied to several machine learning datasets and its results are compared with those of other classifiers.

Implementation of Elbow Method to improve the Gases Classification Performance based on the RBFN-NSG Algorithm

  • Jeon, Jin-Young;Choi, Jang-Sik;Byun, Hyung-Gi
    • Journal of Sensor Science and Technology / v.25 no.6 / pp.431-434 / 2016
  • Currently, the radial basis function network (RBFN) and various other neural networks are employed to classify gases using chemical sensor arrays, and their performance is steadily improving. In particular, the identification performance of the RBFN algorithm is being improved by optimizing parameters such as the center, width, and weight, and improved algorithms such as the radial basis function network-stochastic gradient (RBFN-SG) and radial basis function network-normalized stochastic gradient (RBFN-NSG) have been announced. In this study, we optimized the number of centers, which is one of the parameters of the RBFN-NSG algorithm, and observed the change in identification performance. For the experiment, repeated measurement data of 8 samples were used, and the elbow method was applied to determine the optimal number of centers for each sample of input data. The experiment was carried out for two cases (a single center per sample, and the optimal number of centers obtained by the elbow method), and the experimental results were compared using the mean square error (MSE). The results show that the case with the optimal number of centers, obtained using the elbow method, achieved better identification performance than the case without any optimization.
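The abstract does not say how the elbow was located, so the sketch below uses one simple heuristic as an assumption: run K-means for k = 1, 2, ... and pick the smallest k after which adding another center lowers the within-cluster sum of squares (WCSS) by less than a fixed fraction of the k = 1 value. The names `kmeans_wcss` and `pick_elbow`, the 5% threshold, and the data are hypothetical; the RBFN-NSG training itself is not shown.

```python
def kmeans_wcss(points, k, max_iter=100):
    """Run 1-D K-means from evenly spaced starting points; return the final WCSS."""
    pts = sorted(points)
    n = len(pts)
    # Deterministic start (an assumption): k points spread evenly through the sorted data.
    centers = [pts[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in pts:
            j = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[j].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min((p - c) ** 2 for c in centers) for p in pts)


def pick_elbow(wcss, min_gain=0.05):
    """Smallest k whose next step lowers WCSS by less than min_gain * wcss[0]."""
    for i in range(len(wcss) - 1):
        if wcss[i] - wcss[i + 1] < min_gain * wcss[0]:
            return i + 1  # wcss[i] belongs to k = i + 1
    return len(wcss)


data = [0.8, 1.0, 1.2, 4.7, 5.0, 5.3, 9.8, 10.0, 10.2]
wcss = [kmeans_wcss(data, k) for k in range(1, 7)]
best_k = pick_elbow(wcss)
print(wcss, best_k)
```

On this toy data the WCSS drops sharply up to three centers and barely changes afterwards, so the heuristic selects three. Other elbow criteria exist and can disagree on less clear-cut curves, so the threshold should be treated as a tunable choice.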