• Title/Summary/Keyword: K-Means clustering algorithm


Improved TI-FCM Clustering Algorithm in Big Data (빅데이터에서 개선된 TI-FCM 클러스터링 알고리즘)

  • Lee, Kwang-Kyug
    • Journal of IKEEE / v.23 no.2 / pp.419-424 / 2019
  • The FCM algorithm finds an optimal solution through iterative optimization. Its execution time varies with the initial cluster centers, the location of noise, and the location and number of dense regions. However, the method updates the center points gradually, and the initial cluster centers are shifted to one side. In this paper, we propose a TI-FCM (Triangular Inequality-Fuzzy C-Means) clustering algorithm that determines the cluster center density by maximizing the distance between clusters using the triangle inequality. The proposed method converges to the real clusters more effectively than FCM, even on large data sets. Experiments show that execution time is reduced compared to the existing FCM.
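
For orientation, the iterative optimization that the abstract attributes to FCM can be sketched as below. This is a minimal sketch of standard FCM only, not the TI-FCM method of the paper; the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the random initialization, and the stopping tolerance are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard FCM sketch: alternate center and membership updates."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships (an assumption); each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        Um = U ** m
        # Centers are membership-weighted means of the data.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center (n x c).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U
```

Because the result depends on the initial memberships, different seeds can take different numbers of iterations, which is the sensitivity to initialization that the abstract describes.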

A Study on Data Clustering Method Using Local Probability (국부 확률을 이용한 데이터 분류에 관한 연구)

  • Son, Chang-Ho;Choi, Won-Ho;Lee, Jae-Kook
    • Journal of Institute of Control, Robotics and Systems / v.13 no.1 / pp.46-51 / 2007
  • In this paper, we propose a new data clustering method using local probability and hypothesis theory. To cluster the test data set, we analyze its local area using the local probability distribution and decide the candidate class of the data using the mean, standard deviation, variance, and so on. To decide the class of each test datum, statistical hypothesis theory is applied to the chosen candidate class. For evaluation, the proposed classification method is compared with the conventional fuzzy c-means method, the k-means algorithm, and the discriminant analysis algorithm. The simulation results show higher accuracy than those of the fuzzy c-means method, the k-means algorithm, and the discriminant analysis algorithm.

Real-Time Traffic Sign Detection Using K-means Clustering and Neural Network (K-means Clustering 기법과 신경망을 이용한 실시간 교통 표지판의 위치 인식)

  • Park, Jung-Guk;Kim, Kyung-Joong
    • Proceedings of the Korean Information Science Society Conference / 2011.06a / pp.491-493 / 2011
  • Traffic sign detection is a domain of automatic driver assistance systems. The literature contains methods for traffic sign detection using color information; however, color-based methods suffer from ill-posed conditions, and extracting the region of interest is difficult. In our work, we propose a method for traffic sign detection using k-means clustering, a back-propagation neural network, and projection histogram features that is robust to ill-posed conditions. Using the color information of traffic signs enables the k-means algorithm to cluster the region of interest efficiently for detection. In each step of clustering, a cluster is verified by the neural network so that the cluster exactly represents the location of a traffic sign. The proposed method is practical and robust to unexpected regions of interest and to multiple detections.

Colorectal Cancer Staging Using Three Clustering Methods Based on Preoperative Clinical Findings

  • Pourahmad, Saeedeh;Pourhashemi, Soudabeh;Mohammadianpanah, Mohammad
    • Asian Pacific Journal of Cancer Prevention / v.17 no.2 / pp.823-827 / 2016
  • Determination of the colorectal cancer stage is possible only after surgery, based on pathology results. However, sometimes this may prove impossible. The aim of the present study was to determine colorectal cancer stage using three clustering methods based on preoperative clinical findings. All patients referred to the Colorectal Research Center of Shiraz University of Medical Sciences for colorectal cancer surgery from 2006 to 2014 were enrolled in the study. Accordingly, 117 cases participated. Three clustering algorithms were used: k-means, hierarchical, and fuzzy c-means. External validity measures such as sensitivity, specificity, and accuracy were used to evaluate the methods. The results revealed maximum accuracy and sensitivity values for the hierarchical method and a maximum specificity value for the fuzzy c-means method. Furthermore, according to the internal validity measures for the present data set, the optimal number of clusters was two (silhouette coefficient), and the fuzzy c-means algorithm was more appropriate than the k-means clustering approach as the number of clusters increased.
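
The external validity measures named in the abstract follow from a two-class confusion matrix. The sketch below assumes that cluster labels have already been mapped to the two reference classes (1 = positive, 0 = negative) and that both classes occur; the function name and the example labels are hypothetical and are not data from the study.

```python
def external_validity(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    sensitivity = tp / (tp + fn)  # share of actual positives that were found
    specificity = tn / (tn + fp)  # share of actual negatives that were found
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy

# Hypothetical labels for illustration only.
reference = [1, 1, 1, 0, 0, 0, 0, 1]
assigned = [1, 0, 1, 0, 0, 1, 0, 1]
print(external_validity(reference, assigned))
```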

Projection Pursuit K-Means Visual Clustering

  • Kim, Mi-Kyung;Huh, Myung-Hoe
    • Journal of the Korean Statistical Society / v.31 no.4 / pp.519-532 / 2002
  • K-means clustering is a well-known method for partitioning multivariate observations. Recently, the method has been widely implemented in data mining software because of its computational efficiency in handling large data sets. However, it does not yield a suitable visual display of multivariate observations, which is especially important in the exploratory stage of data analysis. The aim of this study is to develop a K-means clustering method that enables visual display of multivariate observations in a low-dimensional space, for which the projection pursuit method is adopted. We propose a computationally inexpensive and reliable algorithm and provide two numerical examples.
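
As background for the partitioning method the abstract builds on, plain K-means (Lloyd's iteration) can be sketched as follows. This is not the projection pursuit variant proposed in the paper; the function name `k_means`, the random choice of initial centers, and the handling of empty clusters are assumptions for illustration.

```python
import numpy as np

def k_means(X, k, max_iter=100, seed=0):
    """Plain Lloyd's iteration: assign points, then recompute centers."""
    rng = np.random.default_rng(seed)
    # Pick k distinct observations as initial centers (an assumption).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iter):
        # Assign each observation to its nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels
```

Each pass costs on the order of n times k distance evaluations, which is the computational efficiency the abstract refers to.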

An Improved Clustering Method with Cluster Density Independence

  • Yoo, Byeong-Hyeon;Kim, Wan-Woo;Heo, Gyeongyong
    • Journal of the Korea Society of Computer and Information / v.20 no.12 / pp.15-20 / 2015
  • In this paper, we propose a modified fuzzy clustering algorithm that overcomes the center deviation caused by the Euclidean distance commonly used in fuzzy clustering. Among fuzzy clustering methods, Fuzzy C-Means (FCM) is the best-known clustering algorithm and has been applied successfully to a wide range of problems. In FCM, however, cluster centers tend to lean toward high-density clusters because the Euclidean distance measure makes high-density clusters contribute more to the clustering result. We propose an enhanced algorithm that modifies the FCM objective function by adding a center-scattering term, which keeps centers from being drawn close together by cluster density. Compared to FCM, the proposed method converges closer to the real centers in a small number of iterations. These strengths are verified by experimental results.

A Design of Fuzzy Classifier with Hierarchical Structure (계층적 구조를 가진 퍼지 패턴 분류기 설계)

  • Ahn, Tae-Chon;Roh, Seok-Beom;Kim, Yong Soo
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.4 / pp.355-359 / 2014
  • In this paper, we propose a new fuzzy pattern classifier that hierarchically combines several fuzzy models with simple consequent parts. The basic component of the proposed hierarchical fuzzy pattern classifier is a fuzzy model with a simple consequent part, so the complexity of the classifier is not high. To analyze and divide the input space, we use the Fuzzy C-Means clustering algorithm. In addition, we use the Conditional Fuzzy C-Means clustering algorithm to analyze the subspaces produced by the Fuzzy C-Means clustering algorithm. In each clustered region, we apply a fuzzy model with a simple consequent part and build the hierarchical fuzzy pattern classifier. Because of its hierarchical structure, the proposed classifier can analyze the data distribution of the input space from both macroscopic and microscopic points of view. Finally, machine learning data sets are used to evaluate the classification ability of the proposed pattern classifier.

Segmentation of Color Image by Subtractive and Gravity Fuzzy C-means Clustering (차감 및 중력 fuzzy C-means 클러스터링을 이용한 칼라 영상 분할에 관한 연구)

  • Jin, Young-Goun;Kim, Tae-Gyun
    • Journal of IKEEE / v.1 no.1 s.1 / pp.93-100 / 1997
  • In general, the fuzzy C-means clustering method has been used for the segmentation of true color images. However, this method requires the number of clusters as an input. In this study, we suggest a new method that uses subtractive and gravity fuzzy C-means clustering. We obtain the number of clusters and the initial cluster centers by applying subtractive clustering to the color image. After coarse segmentation of the image, we apply gravity fuzzy C-means to optimize the segmentation. We show the efficiency of the proposed algorithm by qualitative evaluation.
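
The step in which subtractive clustering yields both the number of clusters and the initial centers can be sketched as below. This is a simplified version of the commonly described potential-based procedure, not the paper's implementation; the radius `ra`, the ratio `rb = 1.5 * ra`, the single stopping threshold `stop_ratio`, and the min-max scaling are all assumptions chosen for illustration.

```python
import numpy as np

def subtractive_clustering(X, ra=0.5, stop_ratio=0.15):
    """Pick data points as cluster centers by repeatedly taking the
    highest-potential point and suppressing potential around it."""
    rb = 1.5 * ra  # wider radius used when suppressing potential
    # Scale each feature to [0, 1] so one radius fits all dimensions.
    lo, hi = X.min(axis=0), X.max(axis=0)
    Z = (X - lo) / np.where(hi > lo, hi - lo, 1.0)
    # Pairwise squared distances between scaled points (n x n).
    sq = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    # Potential of a point: how many neighbours lie within roughly ra.
    potential = np.exp(-4.0 * sq / ra**2).sum(axis=1)
    first_peak = potential.max()
    centers = []
    while True:
        i = int(potential.argmax())
        # Simplified stopping rule: remaining peak is too small.
        if potential[i] < stop_ratio * first_peak:
            break
        centers.append(X[i])
        # Subtract the chosen peak's influence from all potentials.
        potential = potential - potential[i] * np.exp(-4.0 * sq[i] / rb**2)
    return np.array(centers)
```

The number of rows returned is the cluster count, and the rows themselves could serve as starting centers for a subsequent fuzzy C-means pass. The pairwise distance matrix is quadratic in the number of points, so for a full image one would typically subsample pixels first.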


Industrial Waste Database Analysis Using Data Mining Techniques

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.455-465 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to find hidden knowledge, unexpected patterns, relations, and new rules in massive data. The methods of data mining include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These outputs can be used for environmental preservation and environmental improvement.


Industrial Waste Database Analysis Using Data Mining

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2006.04a / pp.241-251 / 2006
  • Data mining is a method for finding useful information in large amounts of data in a database. It is used to find hidden knowledge, unexpected patterns, relations, and new rules in massive data. The methods of data mining include decision trees, association rules, clustering, neural networks, and so on. We analyze an industrial waste database using data mining techniques. We use the k-means algorithm for clustering, the C5.0 algorithm for decision trees, and the Apriori algorithm for association rules. These analysis outputs can be used for environmental preservation and environmental improvement.
