• Title/Summary/Keyword: Rand index (Rand지수)

Search Result 3, Processing Time 0.016 seconds

Reproducibility Assessment of K-Means Clustering and Applications (K-평균 군집화의 재현성 평가 및 응용)

  • 허명회;이용구
    • The Korean Journal of Applied Statistics / v.17 no.1 / pp.135-144 / 2004
  • We propose a procedure for assessing the reproducibility (validity) of K-means cluster analysis: the data set is randomly partitioned into three parts, two of which are used to develop clustering rules and one to test the consistency of those rules. As an alternative to the Rand index and the corrected Rand index, we also propose an entropy-based consistency measure between two clustering rules and apply it to determining the number of clusters in K-means clustering.
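For readers unfamiliar with the two baseline measures named in this abstract, the following is a minimal sketch of how the Rand index and a chance-corrected Rand index can be computed from two labelings of the same items. It assumes that "corrected Rand index" refers to the usual Hubert-Arabie adjustment, which the abstract does not confirm, and it does not implement the entropy-based measure the paper proposes. The function name `rand_indices` is hypothetical.

```python
from collections import Counter
from math import comb


def rand_indices(labels_a, labels_b):
    """Return (Rand index, adjusted Rand index) for two labelings of the same items.

    Assumption: the adjusted value uses the Hubert-Arabie correction for chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        raise ValueError("need at least two items")

    # Number of item pairs placed in the same cluster by both rules, by A, and by B.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    # Pairs separated by both rules, obtained by inclusion-exclusion.
    apart_both = total_pairs - together_a - together_b + together_both
    rand = (together_both + apart_both) / total_pairs

    # Chance correction: subtract the expected agreement, rescale by the maximum.
    expected = together_a * together_b / total_pairs
    max_index = (together_a + together_b) / 2
    if max_index == expected:
        adjusted = 1.0  # degenerate case: both labelings are trivial
    else:
        adjusted = (together_both - expected) / (max_index - expected)
    return rand, adjusted
```

In a reproducibility check of the kind described, `labels_a` and `labels_b` would be the cluster assignments that two independently developed clustering rules give to the same test subset. Both indices depend only on which items are grouped together, so renaming the clusters does not change the result.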

Comparison of the Cluster Validation Methods for High-dimensional (Gene Expression) Data (고차원 (유전자 발현) 자료에 대한 군집 타당성분석 기법의 성능 비교)

  • Jeong, Yun-Kyoung;Baek, Jang-Sun
    • The Korean Journal of Applied Statistics / v.20 no.1 / pp.167-181 / 2007
  • Many clustering algorithms and cluster validation techniques have been proposed for high-dimensional gene expression data, but the validation techniques themselves have seldom been evaluated. In this paper we compare various cluster validity indices on low-dimensional simulated data and real gene expression data. Among the internal measures, Dunn's index is the most effective and robust, the Silhouette index is next, and the Davies-Bouldin index performs worst. Among the external measures, the Jaccard index is much more effective than the Goodman-Kruskal index and the adjusted Rand index.
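As an illustration of the internal measure this abstract ranks highest, here is a minimal sketch of Dunn's index. Several variants of the index exist; this one assumes Euclidean distance, the smallest point-to-point distance between clusters as separation, and the largest within-cluster point-to-point distance as diameter. The abstract does not say which variant the authors used, and the function name `dunn_index` is hypothetical.

```python
from itertools import combinations
from math import dist


def dunn_index(points, labels):
    """Dunn's index: minimum between-cluster distance / maximum cluster diameter.

    Assumption: Euclidean distance, single-linkage separation, complete diameter.
    """
    if len(points) != len(labels):
        raise ValueError("points and labels must have the same length")

    # Group the points by cluster label.
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    # Largest distance between two points that share a cluster.
    max_diameter = max(
        (dist(p, q) for members in clusters.values() for p, q in combinations(members, 2)),
        default=0.0,
    )
    if max_diameter == 0.0:
        raise ValueError("every cluster has zero diameter; the index is undefined")

    # Smallest distance between two points that lie in different clusters.
    min_separation = min(
        dist(p, q)
        for a, b in combinations(clusters.values(), 2)
        for p in a
        for q in b
    )
    return min_separation / max_diameter
```

Larger values indicate compact, well-separated clusters. The pairwise loops are quadratic in the number of points, so this form is suited to small illustrations rather than large expression data sets.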

Multivariate Process Capability Indices for Skewed Populations with Weighted Standard Deviations (가중표준편차를 이용한 비대칭 모집단에 대한 다변량 공정능력지수)

  • Jang, Young Soon;Bai, Do Sun
    • Journal of Korean Institute of Industrial Engineers / v.29 no.2 / pp.114-125 / 2003
  • This paper proposes multivariate process capability indices (PCIs) for skewed populations using $T^2$ and modified process region approaches. The proposed methods are based on a multivariate version of the weighted standard deviation method, which adjusts the variance-covariance matrix of the quality characteristics and approximates the probability density function using several multivariate normal distributions with the adjusted variance-covariance matrix. The performance of the proposed PCIs is investigated using Monte Carlo simulation, and the finite-sample properties of the estimators are studied by means of relative bias and mean square error.