• Title/Summary/Keyword: K-means cluster analysis method (K-평균 군집분석법)

Search Result 52, Processing Time 0.033 seconds

K-means clustering using a center of gravity for grid-based sample (그리드 기반 표본의 무게중심을 이용한 케이-평균군집화)

  • Lee, Sun-Myung;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society / v.21 no.1 / pp.121-128 / 2010
  • K-means clustering is an iterative algorithm in which items are moved among sets of clusters until the desired set is reached. It has been widely used in applications such as market research, pattern analysis and recognition, and image processing. It can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can take many hours to produce the desired k clusters because it is a primitive, exploratory procedure. In this paper we propose a new k-means clustering method that uses the center of gravity of a grid-based sample. It is faster than traditional clustering methods while maintaining accuracy.
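The abstract does not spell out the procedure, so the following is only one plausible reading and not the authors' method: bin the points into grid cells, replace each cell by its center of gravity and point count, and run a weighted k-means on those summaries instead of on the raw data. The names `grid_centers_of_gravity`, `weighted_kmeans`, and `cell_size` are hypothetical.

```python
import random
from collections import defaultdict


def grid_centers_of_gravity(points, cell_size):
    """Bin 2-D points into square cells; return (centroid, count) per non-empty cell."""
    cells = defaultdict(list)
    for x, y in points:
        cells[(int(x // cell_size), int(y // cell_size))].append((x, y))
    summary = []
    for members in cells.values():
        n = len(members)
        cx = sum(p[0] for p in members) / n
        cy = sum(p[1] for p in members) / n
        summary.append(((cx, cy), n))
    return summary


def weighted_kmeans(summary, k, iterations=50, seed=0):
    """Lloyd-style k-means on cell centroids, each weighted by its point count."""
    rng = random.Random(seed)
    centers = [c for c, _ in rng.sample(summary, k)]  # assumed: random initial centers
    for _ in range(iterations):
        sums = [[0.0, 0.0, 0] for _ in range(k)]
        for (x, y), w in summary:
            # Assign the cell to its nearest current center.
            j = min(range(k),
                    key=lambda i: (x - centers[i][0]) ** 2 + (y - centers[i][1]) ** 2)
            sums[j][0] += w * x
            sums[j][1] += w * y
            sums[j][2] += w
        # Weighted mean per cluster; an empty cluster keeps its old center.
        new_centers = [
            (sx / w, sy / w) if w else centers[i]
            for i, (sx, sy, w) in enumerate(sums)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Under this reading, each iteration costs time proportional to the number of non-empty cells rather than the number of points, which is where a speed-up would come from.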

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

  • Kim, Jae-Hee;Ko, Yoon-Sil
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.745-758 / 2009
  • Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed for grouping observations into clusters, and a variety of measures have been suggested for validating the results of a cluster analysis. In this paper, we compare complete linkage, Ward's method, K-means and model-based clustering, and compute validity measures such as connectivity, the Dunn index and silhouette on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull cooked by the BBQ method.
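Two of the validity measures named above can be sketched in a few lines. This is a minimal illustration using Euclidean distance and the standard textbook definitions, not the paper's code; connectivity is omitted, and the function names are hypothetical. It assumes at least two clusters and no duplicated points.

```python
from math import dist


def dunn_index(clusters):
    """Smallest between-cluster distance divided by largest within-cluster diameter."""
    min_between = min(
        dist(p, q)
        for i, a in enumerate(clusters)
        for b in clusters[i + 1:]
        for p in a
        for q in b
    )
    max_diameter = max(dist(p, q) for c in clusters for p in c for q in c)
    return min_between / max_diameter


def mean_silhouette(clusters):
    """Average silhouette width s = (b - a) / max(a, b) over all points."""
    scores = []
    for i, own in enumerate(clusters):
        for p in own:
            if len(own) == 1:
                scores.append(0.0)  # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the point's own cluster.
            a = sum(dist(p, q) for q in own) / (len(own) - 1)
            # b: mean distance to the nearest other cluster.
            b = min(
                sum(dist(p, q) for q in other) / len(other)
                for j, other in enumerate(clusters)
                if j != i
            )
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

For both measures, larger values indicate more compact and better separated clusters, so competing partitions of the same data can be ranked by them.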

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

  • Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
  • We perform cluster analyses of yeast cell cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients, applying an FDR procedure. We compare the results obtained on the yeast data by model-based clustering, K-means, PAM, SOM, the hierarchical Ward method and the fuzzy method. As validity measures for the clustering results, connectivity, Dunn index and silhouette values are computed and compared. A biological interpretation using GO analysis is also included.

A Reinterpretation of Multiple Correspondence Analysis Using K-Means Cluster Analysis (K-평균 군집분석을 활용한 다중대응분석의 재해석)

  • 김경희;최용석
    • Proceedings of the Korean Statistical Society Conference / 2001.11a / pp.175-178 / 2001
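  • Multiple correspondence analysis, which graphically displays the correspondence among the categories of a multi-way contingency table, generally yields a two-dimensional correspondence plot with low goodness-of-fit, because the principal inertias account for only a small proportion of the total inertia. Benzecri's formula can be used to overcome this: it raises the low principal inertias and produces a new two-dimensional correspondence plot. Even this new plot, however, does not clearly show the correspondence among the categories (Greenacre and Blasius, 1994, chapter 10). One can try to reinterpret multiple correspondence analysis by clustering the categories with an Andrews plot, but interpretation becomes difficult when the number of categories is large. In this short paper, we use K-means cluster analysis in such cases to make multiple correspondence analysis easier to interpret.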

Selecting Technique of Accident Sections using K-mean Method (K-평균법을 이용한 고속도로 사고분석구간 분할기법 개발)

  • Lee, Ki-Young;Chang, Myung-Soon
    • International Journal of Highway Engineering / v.7 no.4 s.26 / pp.211-219 / 2005
  • Analysis sections for traffic accidents are selected in order to identify the causes of accidents clearly by grouping similar accidents, and to increase the effect of improvement projects by prioritizing accidents. The existing method divides the road uniformly by mileage and does not consider similarities among accidents. Consequently, a slider-length method that considers accident types rather than road mileage has recently come into wide use. This study proposes a method that applies the K-mean method, a non-hierarchical grouping technique used in cluster analysis, to the slider-length method, classifying accidents that occurred at the nearest mileages into one group. To verify the proposed method, the K-mean method was compared with division at regular intervals on data covering a total length of 25.6 km along the Kyung-bu freeway in the Pusan direction, and the K-mean method proved to be effective in accounting for the similarities and adjacencies of accidents.

Comparison of clustering with yeast microarray gene expression data (효모 마이크로어레이 유전자발현 데이터에 대한 군집화 비교)

  • Lee, Kyung-A;Kim, Jae-Hee
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.741-753 / 2011
  • We perform cluster analyses of yeast cell cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM and the hierarchical Ward method on the yeast data. As validity measures for the clustering results, connectivity, Dunn index and silhouette values are computed and compared.

Hierarchical Clustering Analysis of Water Main Leak Location Data (상수관로 누수위치 자료를 이용한 계층적 군집분석)

  • Park, Su-Wan;Im, Gwang-Chae;Choi, Chang-Lok;Kim, Kyu-Lee
    • Journal of Korea Water Resources Association / v.42 no.3 / pp.177-190 / 2009
  • Rehabilitation projects for old water mains typically require considerable capital investment. One economical way of pursuing such projects is to focus on a specific area within the entire region under management. In this paper, hierarchical clustering methods that analyze the spatial inter-relationship of location data are applied to about 8,000 water leak location records collected in a case study area from 1992 to 1997. Among the hierarchical clustering methods, the Single, Complete, and Average Linkage Methods are used to identify clusters of water leak locations and to divide the area according to the defined clusters. By comparing the clusters identified by each method, the best clustering method for the case study area is suggested. The areas are then prioritized for maintenance based on the water leak incident intensity of each clustered area, using the suggested best clustering method.
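The three linkage rules differ only in how the distance between two clusters is derived from the pairwise point distances. The sketch below is a naive agglomerative procedure for small inputs, assuming Euclidean distance and planar coordinates; `agglomerate` and `LINKAGES` are hypothetical names, and this is not the paper's implementation.

```python
from math import dist

# Each rule reduces the list of pairwise distances between two clusters to one number.
LINKAGES = {
    "single": min,                            # closest pair
    "complete": max,                          # farthest pair
    "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
}


def agglomerate(points, n_clusters, linkage="average"):
    """Naive agglomerative clustering: repeatedly merge the two closest clusters."""
    combine = LINKAGES[linkage]
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = combine([dist(p, q) for p in clusters[i] for q in clusters[j]])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters
```

A usage example on made-up coordinates: `agglomerate([(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)], 2, "single")` separates the two groups of three points, and sorting the resulting clusters by size gives a crude stand-in for ranking areas by leak incident intensity.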

A Comparison of cluster analysis based on profile of LPGA player profile in 2009 (2009년 여자프로골프선수 프로파일을 이용한 군집방법비교)

  • Min, Dae-Kee
    • Journal of the Korean Data and Information Science Society / v.21 no.3 / pp.471-480 / 2010
  • Cluster analysis is a useful method for finding the number of groups in a data set and the group to which each member belongs. With the rapid development of computer applications in statistics, a variety of new clustering methods have been studied, such as the EM algorithm and self-organizing maps. The goal of cluster analysis is to find the number of groupings that are meaningful to the analyst. If the data are analyzed perfectly with cluster analysis, the same results can be obtained from discriminant analysis.

Clustering analysis of Korea's meteorological data (우리나라 기상자료에 대한 군집분석)

  • Yeo, In-Kwon
    • Journal of the Korean Data and Information Science Society / v.22 no.5 / pp.941-949 / 2011
  • In this paper, 72 weather stations in Korea are clustered by the hierarchical agglomerative procedure based on the average linkage method. We compare our clusters with the grouping of stations by mountain chains, which has been applied in studies analyzing the impact of climate change on foodborne disease outbreaks.

A Major DNA Marker Mining of microsatellite loci in Hanwoo Chromosome 17

  • Lee, Yong-Won;Lee, Je-Yeong
    • Proceedings of the Korean Data and Information Science Society Conference / 2005.04a / pp.54-58 / 2005
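  • QTL (quantitative trait loci) analysis was performed on the genetic map of Hanwoo chromosome 17, and the significance of the selected loci was tested using a permutation test. In addition, DNA markers for superior economic traits were identified using K-means clustering. Confidence intervals for the DNA markers of the selected locus were also obtained using the bootstrap method. Based on the QTL analysis, the K-means method and the bootstrap method, the superior DNA markers 85 and 105 of BMS941 on Hanwoo chromosome 17 were selected.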
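The two resampling tools mentioned above can be illustrated generically. The sketch below is not the QTL procedure from the paper: it assumes two groups of hypothetical trait measurements, tests the difference in their means by permutation, and computes a percentile bootstrap interval for a mean. The function names and default resample counts are assumptions.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling under the null hypothesis
        diff = abs(mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):]))
        hits += diff >= observed
    return (hits + 1) / (n_perm + 1)  # add-one correction keeps p above zero


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    # Resample with replacement, same size as the original sample.
    means = sorted(mean(rng.choices(values, k=len(values))) for _ in range(n_boot))
    lo = means[int(n_boot * alpha / 2)]
    hi = means[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi
```

A small p-value suggests the group difference is unlikely under random relabelling, and the interval from `bootstrap_ci` indicates how stable the estimated mean is under resampling.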
