• Title/Summary/Keyword: Dunn 지수 (Dunn index)


The Analysis of Optimal Cluster Number of Precipitation Region with Dunn Index (Dunn 지수를 이용한 최적 강수지역 군집수 분석)

  • Um, Myoung-Jin;Jeong, Chang-Sam;Nam, Woo-Sung;Jung, Young-Hun;Heo, Jun-Haeng
    • Proceedings of the Korea Water Resources Association Conference / 2011.05a / pp.87-91 / 2011
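  • Precipitation is a natural phenomenon whose occurrence patterns vary greatly by region. Hydrology has tried various methods to analyze precipitation effectively and to estimate probability precipitation. In Korea, probability precipitation has mainly been estimated through at-site frequency analysis, but the regional frequency analysis proposed by Hosking and Wallis (1997) has recently been actively promoted. Compared with at-site frequency analysis, regional frequency analysis has several advantages for estimating probability precipitation, such as making better use of limited precipitation data. To apply it, however, the precipitation regions must first be delineated. Several techniques exist for this, but clustering methods such as K-means and Fuzzy c-means have mainly been applied in recent work. Because K-means and Fuzzy c-means contain no built-in algorithm for determining the optimal number of clusters, that number has to be chosen arbitrarily. To overcome this drawback, this study proposes an optimal number of clusters using the Dunn index, one of the cluster validity indices. The factors used to delineate the precipitation regions were monthly mean precipitation, annual mean precipitation, monthly maximum precipitation, longitude, latitude, and elevation, and the regions were delineated with K-means, PAM, and affinity propagation. The Dunn index was computed while the number of clusters was increased step by step, and the optimal number of clusters was determined from the results.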

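The abstract above selects the number of clusters by computing the Dunn index for each candidate clustering and comparing the values. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the common definition of the Dunn index (smallest distance between points of different clusters divided by the largest within-cluster diameter) with Euclidean distance. The function name `dunn_index` and the toy points are illustrative assumptions.

```python
import numpy as np

def dunn_index(X, labels):
    """Dunn index: min inter-cluster distance / max intra-cluster diameter.

    Assumes Euclidean distance, single-linkage separation between clusters,
    and the maximum pairwise distance as cluster diameter.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ids = np.unique(labels)
    if len(ids) < 2:
        raise ValueError("the Dunn index needs at least two clusters")

    # Full pairwise Euclidean distance matrix (fine for small data sets).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Largest diameter over all clusters.
    max_diam = max(dist[np.ix_(labels == c, labels == c)].max() for c in ids)

    # Smallest distance between points that belong to different clusters.
    min_sep = min(
        dist[np.ix_(labels == a, labels == b)].min()
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
    )

    if max_diam == 0.0:  # every cluster is a single point
        return float("inf")
    return min_sep / max_diam

# Toy data (hypothetical): three well-separated pairs of points.
X = [[0, 0], [0, 1], [10, 0], [10, 1], [20, 0], [20, 1]]
three_clusters = [0, 0, 1, 1, 2, 2]
two_clusters = [0, 0, 0, 0, 1, 1]

print(dunn_index(X, three_clusters))  # larger value: compact, well separated
print(dunn_index(X, two_clusters))    # smaller value: one cluster is stretched
```

A larger Dunn index indicates more compact and better separated clusters, so in a scan over candidate cluster counts the labeling with the highest value would be preferred. The labelings themselves would come from whichever clustering method is being evaluated.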

Gene Screening and Clustering of Yeast Microarray Gene Expression Data (효모 마이크로어레이 유전자 발현 데이터에 대한 유전자 선별 및 군집분석)

  • Lee, Kyung-A;Kim, Tae-Houn;Kim, Jae-Hee
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.1077-1094 / 2011
  • We perform cluster analyses of yeast cell-cycle microarray expression data. To reflect the characteristics of time-course data, we screen the genes using test statistics based on Fourier coefficients, applying an FDR procedure. We compare the results of model-based clustering, K-means, PAM, SOM, the hierarchical Ward method, and a fuzzy method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared. A biological interpretation based on GO analysis is also included.

Comparison of clustering with yeast microarray gene expression data (효모 마이크로어레이 유전자발현 데이터에 대한 군집화 비교)

  • Lee, Kyung-A;Kim, Jae-Hee
    • Journal of the Korean Data and Information Science Society / v.22 no.4 / pp.741-753 / 2011
  • We perform cluster analyses of yeast cell-cycle microarray expression data. We compare model-based clustering, K-means, PAM, SOM, and the hierarchical Ward method on the yeast data. As validity measures for the clustering results, connectivity, the Dunn index, and silhouette values are computed and compared.

A Comparison of Cluster Analyses and Clustering of Sensory Data on Hanwoo Bulls (군집분석 비교 및 한우 관능평가데이터 군집화)

  • Kim, Jae-Hee;Ko, Yoon-Sil
    • The Korean Journal of Applied Statistics / v.22 no.4 / pp.745-758 / 2009
  • Cluster analysis is the automated search for groups of related observations in a data set. Many techniques have been proposed for grouping observations into clusters, and a variety of measures for validating the results of a cluster analysis have been suggested. In this paper, we compare complete linkage, Ward's method, K-means, and model-based clustering, and compute validity measures such as connectivity, the Dunn index, and silhouette on simulated data from multivariate distributions. We also select a clustering algorithm and determine the number of clusters of Korean consumers based on their palatability scores for Hanwoo bull beef prepared by the BBQ cooking method.

Comparison of the Cluster Validation Methods for High-dimensional (Gene Expression) Data (고차원 (유전자 발현) 자료에 대한 군집 타당성분석 기법의 성능 비교)

  • Jeong, Yun-Kyoung;Baek, Jang-Sun
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.167-181 / 2007
  • Many clustering algorithms and cluster validation techniques for high-dimensional gene expression data have been suggested. These cluster validation techniques, however, have seldom been evaluated. In this paper we compare various cluster validity indices on low-dimensional simulation data and real gene expression data. Among the internal measures, Dunn's index is the most effective and robust, the Silhouette index is next, and the Davies-Bouldin index ranks last. Among the external measures, the Jaccard index is much more effective than the Goodman-Kruskal index and the adjusted Rand index.

Selection of Optimal Variables for Clustering of Seoul using Genetic Algorithm (유전자 알고리즘을 이용한 서울시 군집화 최적 변수 선정)

  • Kim, Hyung Jin;Jung, Jae Hoon;Lee, Jung Bin;Kim, Sang Min;Heo, Joon
    • Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.175-181 / 2014
  • The Korean government proposed a new initiative, 'government 3.0', under which the administration will open its datasets to the public before they are requested. The City of Seoul is the front runner in the disclosure of government data. If we know which attributes are the governing factors for a given segmentation, the outcomes can be applied to real-world problems in marketing and business strategy and in administrative decision making. For the City of Seoul, however, selecting optimal variables from an open dataset with up to several thousand attributes would require an enormous amount of computation time, because it may require combinatorial optimization while maximizing dissimilarity measures between clusters. In this study, we acquired a dataset of 718 attributes from Statistics Korea and conducted an analysis to select the most suitable variables, those that differentiate Gangnam from other districts, using a genetic algorithm and Dunn's index. We also used the Microsoft Azure cloud computing system to reduce processing time. As a result, 28 optimal variables were selected, and validation showed that these 28 variables effectively separate Gangnam from other districts under Ward's minimum variance method and the K-means algorithm.

Analysis of Relative Settlement Behavior of Retaining Wall Backside Ground Using Clustering (군집분류를 이용한 흙막이 벽체 배면 지반의 상대적 침하거동 분석)

  • Young-Jun Kwack;Heui-Soo Han
    • The Journal of Engineering Geology / v.33 no.1 / pp.189-200 / 2023
  • As urbanization and industrialization drive development in downtown areas, damage due to ground settlement continues to occur. Building collapse in urban areas carries a high risk of large-scale loss of life and property. However, methods for analyzing measurement data have rarely been studied for cases where uneven loads are applied to the excavated ground and there is no prior knowledge of the ground. Accordingly, this study analyzes relative settlement behavior and correlation by processing time-series surface settlement data from urban construction sites. The average index of difference in settlement and the average of relative difference in settlement are defined and calculated, then plotted in a coordinate system to analyze the relative settlement behavior over time. In addition, because there was no prior knowledge of the ground, a standard for classifying clusters was needed, and the observation points were classified into clusters using k-means clustering and the Dunn index. The analysis confirmed that all clusters moved to the stable region as the settlement amount converged. The clusters were segmented. Based on the analysis results, it was possible to distinguish independent displacement areas from areas of the same behavior by analyzing the correlation between measurement points. If the relative settlement behavior between stations can be analyzed and the behavior areas classified, the results can help in settlement and stability management, such as uplift of the surrounding area, prediction of ground failure areas, and prevention of activity failure.

The Flora of Vascular Plants in the Construction Site of the National DMZ Native Botanic Garden (국립 DMZ자생식물원 조성 부지의 관속식물상)

  • Shin, Hyun-Tak;Yi, Myung-Hoon;Lee, Chang-Hyeon;Sung, Jung-Won;Kim, Ki-Song;Kwon, Yeong-Han;Kim, Sang-Jun;An, Jong-Bin;Heo, Tae-Im;Yoon, Jung-Won
    • Korean Journal of Plant Resources / v.27 no.4 / pp.293-308 / 2014
  • This study was carried out to investigate the vascular plants in the construction site of the National DMZ Native Botanic Garden. The survey period was from May 2012 to November 2013. Vascular plants based on voucher specimens were summarized as 313 taxa, comprising 79 families, 211 genera, 272 species, 4 subspecies, 32 varieties, 4 forms, and 1 hybrid. The rare plant species designated by the Korea Forest Service numbered 8 taxa, including Galium boreale L., Eleutherococcus senticosus (Rupr. & Maxim.) Maxim., Eranthis stellata Maxim., and Lloydia triflora (Ledeb.) Baker. Endemic plant species numbered 4 taxa: Salix koriyanagi Kimura, Clematis trichotoma Nakai, Philadelphus schrenkii Rupr., and Cirsium setidens (Dunn) Nakai. Furthermore, 51 taxa in the investigated area were listed as phytogeographically specific plant species. Naturalized plants numbered 11 taxa, and the Naturalization Ratio and Urbanization Index were 3.51% and 3.43%, respectively.

The Flora of Vascular Plants in the West Side of DMZ Area (DMZ 일원의 관속식물상 I - 민통선 이북 서부지역(파주-연천) -)

  • Lee, Seung-Hyuk;Choi, Seung-Se;Lee, Doo-Bum;Hwang, Seung-Hyun;Ahn, Jin-Kap
    • Korean Journal of Environment and Ecology / v.30 no.1 / pp.1-18 / 2016
  • This study was carried out to investigate the flora of the western front (Paju-Yencheon Area) of the Civilian Control Zone. The vascular plants collected in these areas totaled 558 taxa, composed of 501 species, 3 subspecies, 48 varieties, and 1 forma of 330 genera under 109 families. This shows that 11% of the 4,880 vascular plant species known to exist in Korea are distributed in the western part of the DMZ. One taxon of endangered species designated by the Ministry of Environment was found: Polygonatum stenophyllum Maxim., at the edge of the military operation road from Taepung observatory to the Imjin river. Of the floristically specific plants of the Korean floristic zones, 3 taxa of the $5^{th}$ grade, 3 taxa of the $4^{th}$ grade, 13 taxa of the $3^{rd}$ grade, 13 taxa of the $2^{nd}$ grade, and 22 taxa of the $1^{st}$ grade were found. Of the endemic species of Korea, 4 taxa including Cirsium setidens (Dunn) Nakai were confirmed, distributed mostly on slopes or cut areas. Among the collected rare plants (11 taxa), there were 1 taxon of endangered species, 4 taxa of vulnerable species, and 6 taxa of least concern species. In addition, 51 taxa of naturalized plants were identified, including 4 taxa of ecosystem-disturbing organisms designated by the Ministry of Environment. The urbanization index and naturalization index for all species were estimated at 15.89% and 9.14%, respectively. This survey is expected to serve as primary data on biological diversity and the ecological axis in the DMZ and its western part. The results suggest that policies for the conservation and protection of the DMZ need to be established.