• Title/Abstract/Keywords: k-means algorithms

Search results: 400 items (processing time 0.024 s)

Information Granulation-based Fuzzy Inference Systems by Means of Genetic Optimization and Polynomial Fuzzy Inference Method

  • Park Keon-Jun;Lee Young-Il;Oh Sung-Kwun
    • International Journal of Fuzzy Logic and Intelligent Systems / Vol. 5, No. 3 / pp.253-258 / 2005
  • In this study, we introduce a new category of fuzzy inference systems based on information granulation to carry out the model identification of complex and nonlinear systems. Informally speaking, information granules are viewed as linked collections of objects (data, in particular) drawn together by the criteria of proximity, similarity, or functionality. To identify the structure of the fuzzy rules we use genetic algorithms (GAs). Granulation of information with the aid of the Hard C-Means (HCM) clustering algorithm helps determine the initial parameters of the fuzzy model, such as the initial apexes of the membership functions and the initial values of the polynomial functions used in the premise and consequence parts of the fuzzy rules. The initial parameters are then tuned with the aid of the genetic algorithms and the least square method (LSM). The performance of the proposed model is contrasted with that of conventional fuzzy models in the literature.

The Document Clustering using Multi-Objective Genetic Algorithms

  • 이정송;박순철
    • 한국산업정보학회논문지 / Vol. 17, No. 2 / pp.57-64 / 2012
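  • In this paper, we propose multi-objective genetic algorithms for document clustering, which plays an important role in text mining. One of the key elements of document clustering is the clustering algorithm that groups similar documents. Much of the research on document clustering so far has used k-means clustering, genetic algorithms, and similar methods. However, the performance of k-means clustering varies widely with the initial cluster centers, and genetic algorithms easily fall into local optima depending on the objective function. To remedy these drawbacks, we apply multi-objective genetic algorithms to document clustering and compare and analyze their accuracy against existing algorithms. Performance tests show that the proposed multi-objective genetic algorithms improve markedly on k-means clustering (by about 20%) and on the conventional genetic algorithm (by about 17%).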
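
The sensitivity of k-means to its initial cluster centers, mentioned in the abstract above, can be seen on a toy example. The following is a minimal sketch of plain Lloyd-style k-means on one-dimensional data, not the method of the paper; the function name `kmeans_1d` and the data values are made up for illustration.

```python
def kmeans_1d(data, centers, max_iter=100):
    """Plain k-means on 1-D data, starting from the given centers.

    Returns the final centers and the sum of squared errors (SSE).
    """
    centers = list(centers)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for x in data:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[k] for k, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min((x - c) ** 2 for c in centers) for x in data)
    return centers, sse


# Hypothetical data: three well-separated pairs of points.
data = [0, 1, 10, 11, 20, 21]

good_centers, good_sse = kmeans_1d(data, [0, 10, 20])
bad_centers, bad_sse = kmeans_1d(data, [0, 1, 10])

print(good_centers, good_sse)  # [0.5, 10.5, 20.5] 1.5
print(bad_centers, bad_sse)    # [0.0, 1.0, 15.5] 101.0
```

Both runs converge, but the second start is stuck in a much worse local optimum, which is the kind of behaviour that motivates search-based alternatives such as genetic algorithms.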

Extensions of X-means with Efficient Learning the Number of Clusters

  • 허경용;우영운
    • 한국정보통신학회논문지 / Vol. 12, No. 4 / pp.772-780 / 2008
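  • K-means remains one of the most widely used clustering methods because the algorithm is simple and can be implemented efficiently. However, K-means has the fundamental problem that the number of clusters must be determined in advance. In this paper, we propose two algorithms that extend the X-means algorithm, which efficiently estimates the number of clusters using the BIC (Bayesian information criterion) score. The proposed methods basically follow X-means but allow clusters to have arbitrary covariance matrices, which mitigates the over-fitting that arises because X-means permits only spherical clusters. The proposed methods are top-down: they start from a single cluster and keep splitting, at each step splitting the cluster that increases the BIC score the most. The two proposed algorithms are Modified X-means (MX-means) and Generalized X-means (GX-means); the former uses the K-means algorithm and the latter the EM algorithm to find the best split among the current clusters. MX-means is faster than GX-means but cannot find the correct clusters when clusters overlap. GX-means is slower but finds clusters reliably even when they overlap. A series of experiments confirms these points and shows that the proposed methods outperform existing ones.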
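
The BIC-driven splitting described in the abstract above can be illustrated with a small comparison between keeping a cluster whole and splitting it in two. This is only a sketch under stated assumptions, not MX-means or GX-means: it works on one-dimensional data, uses a hard-assignment Gaussian likelihood, scores with BIC = log-likelihood - (p/2) ln n (higher is better), and counts 2 parameters for one Gaussian and 5 for two (two means, two variances, one mixing weight). All function names are hypothetical.

```python
import math


def gaussian_loglik(xs):
    """Maximized log-likelihood of xs under one 1-D Gaussian (MLE mean and variance)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var <= 0:
        raise ValueError("need at least two distinct values")
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def bic(loglik, n_params, n_points):
    """BIC in the 'higher is better' form (assumed here)."""
    return loglik - 0.5 * n_params * math.log(n_points)


def bic_unsplit(xs):
    """Score for keeping xs as a single cluster (mean + variance = 2 parameters)."""
    return bic(gaussian_loglik(xs), 2, len(xs))


def bic_split(left, right):
    """Score for splitting into two hard-assigned clusters (assumed 5 parameters)."""
    n = len(left) + len(right)
    loglik = 0.0
    for part in (left, right):
        # log of the mixing weight for each point, plus the within-cluster fit
        loglik += len(part) * math.log(len(part) / n) + gaussian_loglik(part)
    return bic(loglik, 5, n)


# Hypothetical data: two distant groups versus one evenly spread group.
separated = ([0, 1, 2], [100, 101, 102])
uniform = ([0, 1, 2], [3, 4, 5])

for left, right in (separated, uniform):
    keep = bic_unsplit(left + right)
    split = bic_split(left, right)
    print("split" if split > keep else "keep", round(keep, 2), round(split, 2))
```

With these toy values the split is accepted for the separated data and rejected for the evenly spread data, because the parameter penalty outweighs the small gain in fit in the second case.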

Automatic Fuzzy Rule Generation Utilizing Genetic Algorithms

  • Hee, Soo-Hwang;Kwang, Bang-Woo
    • 한국지능시스템학회논문지 / Vol. 2, No. 3 / pp.40-49 / 1992
  • In this paper, an approach to identifying fuzzy rules is proposed. The optimal number of fuzzy rules is decided by means of fuzzy c-means clustering. The parameters of the fuzzy implications are identified by use of genetic algorithms. For efficient and fast parameter identification, a technique for reducing the search areas of the genetic algorithms is proposed. The feasibility of the proposed approach is evaluated through the identification of a fuzzy model that describes the input-output relation of a gas furnace. Despite the simplicity of the proposed approach, the accuracy of the identified fuzzy model of the gas furnace is superior to that of other fuzzy models.

A Study on Performance Evaluation of Clustering Algorithms using Neural and Statistical Method

  • 윤석환;신용백
    • 기술사 / Vol. 29, No. 2 / pp.71-79 / 1996
  • This paper evaluates the clustering performance of a neural network and a statistical method. The algorithms used in this paper are GLVQ (Generalized Learning Vector Quantization) as the neural method and the k-means algorithm as the statistical clustering method. To compare the two methods, we calculate Rand's c statistic. As a result, the mean of the c values obtained with GLVQ is higher than that obtained with the k-means algorithm, while the standard deviation of the c values is lower. The experimental data sets were Fisher's IRIS data and patterns extracted from handwritten numerals.
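
For readers unfamiliar with the comparison statistic, the sketch below assumes that "Rand's c statistic" refers to the Rand index: the fraction of point pairs on which two labelings agree (both put the pair in the same cluster, or both put it in different clusters). The function name `rand_index` and the toy labelings are illustrative and are not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (range 0 to 1)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two points")
    agree = 0
    for i, j in pairs:
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        if same_in_a == same_in_b:
            agree += 1
    return agree / len(pairs)


# Label names do not matter, only the grouping does.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
# A grouping that cuts across the reference agrees on only 2 of 6 pairs.
print(rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

A higher mean with a lower standard deviation over repeated runs, as reported for GLVQ, would indicate clusterings that match the reference labels both more closely and more consistently.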

Path based K-means Clustering for RFID Data Sets

  • Yun, Hong-Won
    • Journal of information and communication convergence engineering / Vol. 6, No. 4 / pp.434-438 / 2008
  • Massive data are continuously produced at a rate of several terabytes or more every day. Applications that handle such data need effective clustering algorithms to achieve high overall computational performance. In this paper, we propose an approach to clustering that uses an ancestor as the cluster center, a modified K-means algorithm using ancestors. We present a clustering architecture and a clustering algorithm that minimize the number of I/Os and show excellent performance. Our experimental performance evaluation shows that the algorithm can improve the I/O speed and the query processing time.

The Design of Fuzzy Controller by Means of Genetic Optimization and Estimation Algorithms

  • Oh, Sung-Kwun;Rho, Seok-Beom
    • KIEE International Transaction on Systems and Control / Vol. 12D, No. 1 / pp.17-26 / 2002
  • In this paper, a new design methodology for the fuzzy controller is presented. The performance of the fuzzy controller is sensitive to variations in the scaling factors. The design procedure is based on evolutionary computing (more specifically, a genetic algorithm) and an estimation algorithm, which adjust and estimate the scaling factors, respectively. The tuning of the scaling factors of the fuzzy controller is essential to the entire optimization process. We then estimate the scaling factors of the fuzzy controller by means of two types of estimation algorithms, HCM (Hard C-Means) and a Neuro-Fuzzy model[7]. The validity and effectiveness of the proposed estimation algorithm for the fuzzy controller are demonstrated on the inverted pendulum system.

Clustering Approaches to Identifying Gene Expression Patterns from DNA Microarray Data

  • Do, Jin Hwan;Choi, Dong-Kug
    • Molecules and Cells / Vol. 25, No. 2 / pp.279-288 / 2008
  • The analysis of microarray data is essential for large amounts of gene expression data. In this review we focus on clustering techniques. The biological rationale for this approach is the fact that many co-expressed genes are co-regulated, and identifying co-expressed genes could aid in functional annotation of novel genes, de novo identification of transcription factor binding sites and elucidation of complex biological pathways. Co-expressed genes are usually identified in microarray experiments by clustering techniques. There are many such methods, and the results obtained even for the same datasets may vary considerably depending on the algorithms and dissimilarity metrics used, as well as on user-selectable parameters such as the desired number of clusters and the initial values. Therefore, biologists who want to interpret microarray data should be aware of the weaknesses and strengths of the clustering methods used. In this review, we survey the basic principles of clustering DNA microarray data, from crisp clustering algorithms such as hierarchical clustering, K-means and self-organizing maps, to complex clustering algorithms like fuzzy clustering.

The Identification of Multi-Fuzzy Model by means of HCM and Genetic Algorithms

  • 박병준;이수구;오성권;김현기
    • 대한전기학회:학술대회논문집 / 대한전기학회 2000 Summer Conference, Proceedings D / pp.3007-3009 / 2000
  • In this paper, we design a Multi-Fuzzy model for a nonlinear system by means of a clustering method and genetic algorithms. The HCM clustering method is used to determine the structure of the proposed Multi-Fuzzy model. The parameters of the membership functions of the Multi-Fuzzy model are identified by genetic algorithms. We use simplified inference and linear inference as the inference methods of the proposed Multi-Fuzzy model, and the standard least square method to estimate its consequence parameters. Finally, we use numerical data to evaluate the proposed Multi-Fuzzy model and discuss its usefulness.

Exponential Probability Clustering

  • Yuxi, Hou;Park, Cheol-Hoon
    • 대한전자공학회:학술대회논문집 / 대한전자공학회 2008 Summer Conference / pp.671-672 / 2008
  • K-means is a popular clustering algorithm that minimizes the mutual Euclidean distance among the sample points. However, K-means has some drawbacks, such as dependence on initial conditions, unsupervised learning, and local optima. The Mahalanobis distance can deal with these cases well. In this paper, the authors propose a new clustering algorithm, named exponential probability clustering, which applies the Mahalanobis distance to K-means clustering. The new algorithm has not only a probabilistic interpretation but also the merits of clustering. Finally, simulation results demonstrate its good performance compared to the K-means algorithm.
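
To show what replacing the Euclidean distance with the Mahalanobis distance changes in the assignment step, here is a minimal sketch for two-dimensional points. It is not the exponential probability clustering algorithm of the paper; the per-cluster covariance matrices are assumed to be given, and the names `mahalanobis_2d` and `assign` are hypothetical.

```python
import math


def mahalanobis_2d(point, mean, cov):
    """Mahalanobis distance of a 2-D point from mean, for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    # v^T * inverse(cov) * v, using the closed-form inverse of a 2x2 matrix
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


def assign(point, means, covs):
    """Index of the cluster whose center is nearest in Mahalanobis distance."""
    dists = [mahalanobis_2d(point, m, s) for m, s in zip(means, covs)]
    return dists.index(min(dists))


# Hypothetical clusters: cluster 0 is wide along x, cluster 1 is tight.
means = [(0.0, 0.0), (3.0, 0.0)]
covs = [
    [[4.0, 0.0], [0.0, 1.0]],
    [[0.25, 0.0], [0.0, 0.25]],
]

point = (2.0, 0.0)
print(mahalanobis_2d(point, means[0], covs[0]))  # 1.0
print(mahalanobis_2d(point, means[1], covs[1]))  # 2.0
print(assign(point, means, covs))                # 0
```

In Euclidean terms the point is closer to the second center (distance 1 versus 2), but once each cluster's spread is taken into account it is assigned to the first, wider cluster.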
