• Title/Summary/Keyword: K-Mean++ clustering

Classifying Color Codes Via k-Mean Clustering and L*a*b* Color Model (k-평균 클러스터링과 L*a*b* 칼라 모델에 의한 칼라코드 분류)

  • Yoo, Hyeon-Joong
    • The Journal of the Korea Contents Association / v.7 no.2 / pp.109-116 / 2007
  • To reduce the effect of color distortions on reading colors, it is desirable to statistically process as many pixels in each color region as possible. This may require segmentation, which usually relies on edge detection. However, edges in color codes can be disconnected by various distortions such as dark current, color cross, the zipper effect, shade, and reflection, to name a few, and edge linking is itself a difficult process. In this paper, k-means clustering was performed on the images for which edge detectors failed to produce a segmentation. Experiments were conducted on 311 images taken in different environments with different cameras. The primary and secondary colors were randomly selected for each color code region. While the segmentation rate with edge detectors was 89.4%, the proposed method increased it to 99.4%. Color recognition was performed based on the hue, a*, and b* components, with an accuracy of 100% for the successfully segmented cases.
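
As a rough illustration of the clustering step mentioned in this abstract, the following is a minimal sketch of plain k-means (Lloyd's algorithm) applied to per-pixel (a*, b*) chroma values. The abstract does not say which features were clustered or how the algorithm was initialized, so the choice of (a*, b*) pairs, the function name `kmeans`, and the synthetic pixel values are all assumptions made for illustration, not the author's implementation.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (hypothetical helper); returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest center (squared Euclidean distance).
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its points; keep it if the cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

# Synthetic (a*, b*) pairs for two color regions; the values are made up.
rng = np.random.default_rng(1)
region_1 = rng.normal(loc=[60.0, 40.0], scale=2.0, size=(100, 2))
region_2 = rng.normal(loc=[-50.0, 30.0], scale=2.0, size=(100, 2))
pixels_ab = np.vstack([region_1, region_2])

centers, labels = kmeans(pixels_ab, k=2)
```

With these well-separated synthetic regions, the two clusters coincide with the two regions, so averaging the pixels in each cluster gives one representative color per region.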

Product Recommendation System on VLDB using k-means Clustering and Sequential Pattern Technique (k-means 클러스터링과 순차 패턴 기법을 이용한 VLDB 기반의 상품 추천시스템)

  • Shim, Jang-Sup;Woo, Seon-Mi;Lee, Dong-Ha;Kim, Yong-Sung;Chung, Soon-Key
    • The KIPS Transactions:PartD / v.13D no.7 s.110 / pp.1027-1038 / 2006
  • Recommendation systems based on a very large database (VLDB) pose many technical problems, so it is necessary to study a system structure and a data-mining technique suitable for large-scale Internet shopping malls. We therefore design and implement a product recommendation system, using the k-means clustering algorithm and a sequential pattern technique, that can be used in a large-scale Internet shopping mall. The system processes user information in batches, defines the various categories in a hierarchical structure, and uses a sequential pattern mining technique for the search engine. For predictive modeling and experiments, we use real data (users' interests and preferences for given categories) extracted from 30 days of log files of a major Internet shopping mall in Korea. For evaluation, we define PRP (Predictive Recommend Precision), PRR (Predictive Recommend Recall), and PF1 (Predictive Factor One-measure). In the experiments, the best recommendation time and the best learning time of our system are on the order of O(N), and the values of the evaluation measures are very good.

Unsupervised Learning Model for Fault Prediction Using Representative Clustering Algorithms (대표적인 클러스터링 알고리즘을 사용한 비감독형 결함 예측 모델)

  • Hong, Euyseok;Park, Mikyeong
    • KIPS Transactions on Software and Data Engineering / v.3 no.2 / pp.57-64 / 2014
  • Most previous studies of software fault prediction models, which determine the fault-proneness of input modules, have focused on supervised learning models that use a training data set. However, an unsupervised learning model is needed when a supervised model cannot be applied: either no past training data set exists, or one exists but the type of the current project has changed. Building an unsupervised learning model is extremely difficult, which is why only a few studies exist. In this paper, we build unsupervised models using representative clustering algorithms, EM and DBSCAN, that have not been used in prior studies, and compare them with the previous model that uses the K-means algorithm. The results show that the EM model performs slightly better than the K-means model in terms of error rate, and that these two models significantly outperform the DBSCAN model.

An Efficient Clustering Algorithm based on Heuristic Evolution (휴리스틱 진화에 기반한 효율적 클러스터링 알고리즘)

  • Ryu, Joung-Woo;Kang, Myung-Ku;Kim, Myung-Won
    • Journal of KIISE:Software and Applications / v.29 no.1_2 / pp.80-90 / 2002
  • Clustering is a useful technique for grouping data points such that points within a single group, or cluster, have similar characteristics. Many clustering algorithms have been developed and used in engineering applications, including pattern recognition and image processing. Recently, clustering has drawn increasing attention as one of the important techniques in data mining. However, clustering algorithms such as K-means and Fuzzy C-means suffer from two difficulties: the number of clusters must be determined a priori, and the clustering results depend on the initial set of clusters, so a poor initial set can fail to yield desirable results. In this paper, we propose a new clustering algorithm that solves these problems. Our method uses an evolutionary algorithm to solve the local optima problem, in which clustering that starts from an inappropriate set of clusters converges to an undesirable state. We also adopt a new measure that represents how well the data are clustered. The measure is determined in terms of both intra-cluster dispersion and inter-cluster separability. Using this measure, our method determines the number of clusters automatically as the result of the optimization process. In addition, we combine heuristics, that is, problem-specific knowledge, with the evolutionary algorithm to speed up its search. We have tested our algorithm on several sets of multi-dimensional data, and the results show that our algorithm outperforms the existing algorithms.
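
The dependence on the initial set of clusters described in this abstract can be seen on a toy data set. The sketch below runs plain k-means (Lloyd's algorithm) from two hand-picked initializations on the four corners of a rectangle. The data, the function name `lloyd`, and the use of the sum of squared errors (SSE) as the quality measure are assumptions chosen for illustration; this is not the evolutionary algorithm or the measure proposed in the paper.

```python
import numpy as np

def lloyd(points, centers, n_iter=100):
    """Run Lloyd's k-means from the given initial centers; returns (centers, sse)."""
    centers = np.asarray(centers, dtype=float)
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center as the mean of its assigned points.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Sum of squared distances from each point to its nearest center.
    dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, dists.min(axis=1).sum()

# Four corners of a 4-by-2 rectangle.
points = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])

_, sse_good = lloyd(points, [[0.0, 1.0], [4.0, 1.0]])  # splits left / right
_, sse_bad = lloyd(points, [[2.0, 0.0], [2.0, 2.0]])   # splits bottom / top
print(sse_good, sse_bad)
```

Both initializations are already fixed points of the update, so each run stops immediately: the left/right split has an SSE of 4, while the bottom/top split is stuck at an SSE of 16. The second run is a local optimum that k-means cannot leave on its own.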

Design of RBFNN-Based Pattern Classifier for the Classification of Precipitation/Non-Precipitation Cases (강수/비강수 사례 분류를 위한 RBFNN 기반 패턴분류기 설계)

  • Choi, Woo-Yong;Oh, Sung-Kwun;Kim, Hyun-Ki
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.6 / pp.586-591 / 2014
  • In this study, we introduce a Radial Basis Function Neural Network (RBFNN) classifier that uses the Artificial Bee Colony (ABC) algorithm to distinguish precipitation events from non-precipitation events in given radar data. The input data are reconstructed through feature analysis of the meteorological radar data used by the Korea Meteorological Administration. In the condition phase of the proposed classifier, the fitness values are obtained with the Fuzzy C-Means clustering method, and the coefficients of the polynomial function used in the conclusion phase are estimated by the least squares method. In the aggregation phase, the final output is obtained with a fuzzy inference method. The performance of the proposed classifier is compared and analyzed using both the QC (quality control) data and the CZ (corrected reflectivity) data used by the Korea Meteorological Administration.

Design of Optimized Radial Basis Function Neural Networks Classifier with the Aid of Principal Component Analysis and Linear Discriminant Analysis (주성분 분석법과 선형판별 분석법을 이용한 최적화된 방사형 기저 함수 신경회로망 분류기의 설계)

  • Kim, Wook-Dong;Oh, Sung-Kwun
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.6 / pp.735-740 / 2012
  • In this paper, we introduce design methodologies for a polynomial radial basis function neural network classifier with the aid of Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Feature data are obtained by preprocessing the given data with PCA and LDA, which minimizes information loss, and are then used as the input to the RBFNNs. The hidden layer of the RBFNNs is built with the Fuzzy C-Means (FCM) clustering algorithm instead of receptive fields, and linear polynomial functions are used as the connection weights between the hidden and output layers. To design an optimized classifier, structural and parametric values such as the number of eigenvectors of PCA and LDA and the fuzzification coefficient of the FCM algorithm are optimized with the Artificial Bee Colony (ABC) optimization algorithm. The proposed classifier is applied to several machine learning datasets, and its results are compared with those of other classifiers.

A Study on the Improvement of Fault Detection Capability for Fault Indicator using Fuzzy Clustering and Neural Network (퍼지클러스터링 기법과 신경회로망을 이용한 고장표시기의 고장검출 능력 개선에 관한 연구)

  • Hong, Dae-Seung;Yim, Hwa-Young
    • Journal of the Korean Institute of Intelligent Systems / v.17 no.3 / pp.374-379 / 2007
  • This paper focuses on improving the fault detection algorithm of the FRTU (feeder remote terminal unit) on the feeders of a power distribution system. The FRTU is applied to fault detection schemes for phase faults and ground faults. In particular, its cold load pickup and inrush restraint functions distinguish the fault current from the normal load current. The FRTU shows the FI (Fault Indicator) when the fault current exceeds the pickup value or the inrush current. STFT (Short Time Fourier Transform) analysis provides frequency and time information. The FCM (Fuzzy C-Means clustering) algorithm extracts the characteristics of the harmonics. The neural network used as a fault detector was trained by gradient descent to distinguish the inrush current from the fault status. In this paper, fault detection is improved by using FCM and a neural network. The result data were measured in an actual 22.9 kV distribution power system.

Improvement of the PFCM(Possibilistic Fuzzy C-Means) Clustering Method (PFCM 클러스터링 기법의 개선)

  • Heo, Gyeong-Yong;Choe, Se-Woon;Woo, Young-Woon
    • Journal of the Korea Institute of Information and Communication Engineering / v.13 no.1 / pp.177-185 / 2009
  • Cluster analysis, or clustering, is a kind of unsupervised learning method in which a set of data points is divided into a given number of homogeneous groups. Fuzzy clustering, one of the most popular clustering methods, allows a point to belong to all the clusters with different degrees of membership, so it produces more intuitive and natural clusters than hard clustering does. Moreover, some fuzzy clustering variants are immune to noise. In this paper, we improve Possibilistic Fuzzy C-Means (PFCM), which generates a membership matrix as well as a typicality matrix, using the Gath-Geva (GG) method. The proposed method focuses on the boundaries of clusters, unlike most other methods, which focus on the centers of clusters. The generated membership values are suitable for classification-type applications. Because the typicality values generated by the algorithm have a distribution similar to the values of a Gaussian density function, they are useful for Gaussian-type density estimation. Moreover, the GG method can handle clusters with different numbers of data points, which the other well-known method, by Gustafson and Kessel, cannot. All of these points are evident in the experimental results.
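
To make the idea of a point belonging to all clusters with different degrees concrete, here is a minimal sketch of the membership and center updates of standard Fuzzy C-Means. It is not the PFCM or Gath-Geva variant this paper proposes. The function names, the fuzzifier value m = 2, and the sample points are assumptions chosen for illustration.

```python
import numpy as np

def fcm_memberships(points, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update: rows are points, columns are clusters."""
    # Euclidean distance from every point to every center (eps avoids division by zero).
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + eps
    # u[i, j] = 1 / sum_k (d[i, j] / d[i, k]) ** (2 / (m - 1))
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)

def fcm_centers(points, u, m=2.0):
    """Standard FCM center update: weighted mean with weights u ** m."""
    w = u ** m
    return (w.T @ points) / w.sum(axis=0)[:, None]

# Two tight groups plus one point halfway between them (made-up data).
points = np.array([[0.0, 0.0], [1.0, 0.0], [9.0, 0.0], [10.0, 0.0], [5.0, 0.0]])
centers = np.array([[0.5, 0.0], [9.5, 0.0]])

u = fcm_memberships(points, centers)
new_centers = fcm_centers(points, u)
```

Each row of `u` sums to 1. The point halfway between the two centers receives a membership of 0.5 in each cluster, while points close to a center belong almost entirely to that cluster. A hard clustering method would have to assign the halfway point to one side.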

Design of video surveillance system using k-means clustering (k-means 클러스터링을 이용한 CCTV의 효율적인 운영 설계)

  • Hong, Ji-Hoon;Kim, Seung ho;Lee, Keun-Ho
    • Journal of Internet of Things and Convergence / v.3 no.2 / pp.1-5 / 2017
  • As CCTV technology develops, it is being used in various fields. This paper examines CCTV operation in detail. CCTV is causing operational problems in many fields, and we plan to design a new system to solve them. In this paper, we analyze data using k-means so that CCTV can be operated efficiently, and add new technology and functions to the existing system to improve its imaging technology and operating efficiency. In addition, we design a new system for CCTV technology using k-means so that CCTV can be operated efficiently at the center, and propose it as a solution to these problems.

Identifying Classes for Classification of Potential Liver Disorder Patients by Unsupervised Learning with K-means Clustering (K-means 클러스터링을 이용한 자율학습을 통한 잠재적간 질환 환자의 분류를 위한 계층 정의)

  • Kim, Jun-Beom;Oh, Kyo-Joong;Oh, Keun-Whee;Choi, Ho-Jin
    • Proceedings of the Korean Information Science Society Conference / 2011.06c / pp.195-197 / 2011
  • This research deals with an issue of preventive medicine in bioinformatics. Liver conditions can be diagnosed reasonably well, and liver cirrhosis prevented, by classifying liver disorder patients into fatty liver and high-risk groups. The classification proceeds in two steps. Classification rules are first built by clustering five attributes (MCV, ALP, ALT, ASP, and GGT) of the blood test dataset provided by the UCI Repository. The clusters are formed with the K-means method, which analyzes multi-dimensional attributes. We analyze the properties of each cluster and divide the clusters into fatty liver, high-risk, and normal classes; the classification rules are generated from this analysis. In this paper, we suggest a method to diagnose and predict the liver condition of alcoholic patients by risk level, applying the classification rules to new blood test results. The K-means classifier was found to be more accurate on the blood test results and provides the risk of fatty liver relative to normal liver conditions.