• Title/Summary/Keyword: Optimal Cluster Size

A Optimal Cluster Size in Stratified Two-Stage Cluster Sampling (층화 2-단 표본 추출시 최적 집락의 크기 결정)

  • 신민웅;신기일
    • The Korean Journal of Applied Statistics / v.13 no.2 / pp.207-224 / 2000
  • Cluster size is generally predetermined in stratified two-stage cluster sampling. When the cluster sizes vary greatly, however, one may want to make them approximately equal. In this paper we study the optimal cluster size in stratified two-stage cluster sampling. We also find the optimal primary sampling unit sizes and optimal secondary sampling unit sizes under a given cost restriction.

A Hybrid Genetic Algorithm for K-Means Clustering

  • Jun, Sung-Hae;Han, Jin-Woo;Park, Minjae;Oh, Kyung-Whan
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2003.09a / pp.330-333 / 2003
  • In partitioning methods, the initial cluster size (the number of clusters) strongly affects the clustering result. In the K-means algorithm, the result of cluster analysis changes with the cluster size K. The initial cluster size is usually determined from prior, subjective information, which may not be optimal, so a more objective method is needed. In our research, we propose a hybrid genetic algorithm, a tree-induction-based evolutionary algorithm, for determining the optimal cluster size. The initial population of this algorithm is determined by the number of terminal nodes of the tree induction, and the optimal cluster size is generated from this decision-tree-based initial population. Our fitness function is defined as the inverse of a dissimilarity measure, and a bagging approach is used to save computational time.

Determination of Optimal Cluster Size Using Bootstrap and Genetic Algorithm (붓스트랩 기법과 유전자 알고리즘을 이용한 최적 군집 수 결정)

  • Park, Min-Jae;Jun, Sung-Hae;Oh, Kyung-Whan
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.1 / pp.12-17 / 2003
  • The choice of cluster size (the number of clusters) affects the clustering result. In the K-means algorithm, clustering performance differs greatly depending on the initial K, yet in most clustering processes the initial cluster size is determined by prior knowledge or subjective judgment, which may not be optimal. In this paper, we propose a genetic-algorithm-based approach that determines the cluster size automatically and improves the clustering result. An initial population based on attributes is generated to search for the optimal cluster size. The fitness value is defined as the inverse of the summed dissimilarity, so the search converges toward better overall performance. A mutation operation is used to address the local minima problem. Finally, bootstrap resampling is used to reduce computational time.
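As a rough illustration of the fitness evaluation mentioned in this abstract, the sketch below runs a plain K-means on a bootstrap resample and scores a candidate K by the inverse of the summed within-cluster dissimilarity. It is not the authors' algorithm: the helper names (`make_blobs`, `kmeans_wss`, `fitness`), the squared Euclidean dissimilarity, and the toy data are all assumptions made for this example. Selection, crossover, and mutation are omitted, and since this raw score tends to grow with K, whatever keeps K from growing without bound in the paper is not reproduced here.

```python
import random

def make_blobs(rng, per_blob=30):
    """Hypothetical toy data: three well-separated 2-D Gaussian blobs."""
    centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centers for _ in range(per_blob)]

def kmeans_wss(points, k, rng, n_iter=20):
    """Lloyd's algorithm; returns the within-cluster sum of squared distances."""
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for x, y in points:
            nearest = min(range(k),
                          key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2)
            groups[nearest].append((x, y))
        # Move each center to its group mean; keep the old center if the group is empty.
        centers = [
            (sum(px for px, _ in g) / len(g), sum(py for _, py in g) / len(g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return sum(min((x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centers)
               for x, y in points)

def fitness(points, k, rng, sample_size=100):
    """Inverse summed dissimilarity, evaluated on a bootstrap resample to save time."""
    boot = [rng.choice(points) for _ in range(sample_size)]  # sample with replacement
    return 1.0 / (kmeans_wss(boot, k, rng) + 1e-12)

rng = random.Random(0)
data = make_blobs(rng)
scores = {k: fitness(data, k, rng) for k in (1, 2, 3, 4)}
```

In a genetic search, `scores` would rank the candidate K values in the current population; the resample size trades accuracy of the score against run time.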

The Impact of Network Coding Cluster Size on Approximate Decoding Performance

  • Kwon, Minhae;Park, Hyunggon
    • KSII Transactions on Internet and Information Systems (TIIS) / v.10 no.3 / pp.1144-1158 / 2016
  • In this paper, delay-constrained data transmission is considered over error-prone networks. Network coding is deployed for efficient information exchange, and an approximate decoding approach is deployed to overcome potential all-or-nothing problems. Our focus is on determining the cluster size and its impact on approximate decoding performance. Decoding performance is quantified, and we show that performance is determined only by the number of packets. Moreover, the fundamental tradeoff between approximate decoding performance and data transfer rate improvement is analyzed: as the cluster size increases, the data transfer rate improves but decoding performance degrades. This tradeoff can lead to an optimal cluster size for network coding-based networks that achieves the target decoding performance of applications. A set of experimental results confirms the analysis.

A Study of Sample Size for Two-Stage Cluster Sampling (이단계 집락추출에서의 표본크기에 대한 연구)

  • Song, Jong-Ho;Jea, Hea-Sung;Park, Min-Gue
    • The Korean Journal of Applied Statistics / v.24 no.2 / pp.393-400 / 2011
  • In large-scale surveys, a cluster sampling design, in which sets of observation units called clusters are selected, is often used to satisfy practical restrictions on time and cost. In particular, a two-stage cluster sampling design is preferred when a strong intra-class correlation exists among observation units. The sample sizes of the primary sampling units (PSUs) and secondary sampling units (SSUs) for a two-stage cluster sample are determined by the survey cost and the precision of the estimator. In this study, we derive the optimal sample PSU and SSU sizes when the number of population SSUs differs across PSUs, extending the result obtained under the assumption that all PSUs have the same number of SSUs. The sample size results are then applied to the $4^{th}$ Korea Hospital Discharge results and compared with the conventional method. We also propose the optimal sample SSU (discharged patients) size for the $7^{th}$ Korea Hospital Discharge Survey.
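For orientation, the equal-size baseline that this abstract extends has a well-known textbook closed form. The sketch below is not the paper's unequal-size result. It assumes a linear cost `c1*n + c2*n*m` (n sampled PSUs, m SSUs per PSU) and the simplified variance model `var_between/n + var_within/(n*m)` with finite population corrections ignored; all function and parameter names are illustrative.

```python
import math

def optimal_two_stage(c1, c2, var_between, var_within, budget):
    """Classical equal-size allocation under the assumptions stated above.

    Minimizing var_between/n + var_within/(n*m) subject to
    c1*n + c2*n*m = budget gives m = sqrt(c1*var_within / (c2*var_between)).
    Returns real-valued (n, m); round to integers in practice.
    """
    m = math.sqrt((c1 * var_within) / (c2 * var_between))
    n = budget / (c1 + c2 * m)  # spend the whole budget
    return n, m

def variance(n, m, var_between, var_within):
    """Assumed variance of the estimated mean (no finite population correction)."""
    return var_between / n + var_within / (n * m)

# Toy inputs, chosen only so the arithmetic is easy to follow.
n_opt, m_opt = optimal_two_stage(c1=100.0, c2=4.0,
                                 var_between=1.0, var_within=16.0,
                                 budget=10_000.0)
# m_opt = sqrt(100 * 16 / (4 * 1)) = 20, n_opt = 10000 / 180
```

Written in terms of the intra-class correlation ρ, with `var_between` proportional to ρ and `var_within` proportional to 1 − ρ, the same formula reads m = sqrt((c1/c2)·(1 − ρ)/ρ): a stronger intra-class correlation calls for fewer SSUs per PSU and more PSUs.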

Optimal Allocations in Two-Stage Cluster Sampling

  • Koh, Bong-Sung
    • Communications for Statistical Applications and Methods / v.6 no.3 / pp.749-754 / 1999
  • The cost is known to be proportional to the sample size. We consider a cost function of the form $\text{Cost} = c_1 n^p + c_2 n^p m^q$, where $c_1$, $c_2$, $p$, and $q$ are all positive constants. This cost function is used to find an optimal allocation in two-stage cluster sampling. The optimal allocation of $n$ and $m$ has the properties of uniqueness under some conditions and of monotonicity with $p>0$ when $q=1$.
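A small numerical sketch can make this kind of cost function concrete. It reads the cost as `c1*n**p + c2*n**p*m**q` and pairs it with an assumed variance model `var_between/n + var_within/(n*m)`. The abstract does not state the variance model, so that part, the grid search, and every name below are assumptions for illustration only, not the paper's derivation.

```python
def best_allocation(c1, c2, p, q, var_between, var_within, budget, m_max=200):
    """Grid-search the per-cluster sample size m under a power-form cost.

    For each integer m, n is set so that c1*n**p + c2*n**p*m**q == budget,
    and the assumed variance var_between/n + var_within/(n*m) is evaluated.
    Returns the (n, m, variance) triple with the smallest variance.
    """
    best = None
    for m in range(1, m_max + 1):
        n = (budget / (c1 + c2 * m ** q)) ** (1.0 / p)  # exhaust the budget
        v = var_between / n + var_within / (n * m)
        if best is None or v < best[2]:
            best = (n, m, v)
    return best

# Linear special case p = q = 1 with toy inputs.
n_lin, m_lin, v_lin = best_allocation(c1=100.0, c2=4.0, p=1.0, q=1.0,
                                      var_between=1.0, var_within=16.0,
                                      budget=10_000.0)

# A non-linear case: cost grows with the square of the number of clusters.
n_sq, m_sq, v_sq = best_allocation(c1=100.0, c2=4.0, p=2.0, q=1.0,
                                   var_between=1.0, var_within=16.0,
                                   budget=10_000.0)
```

With p = q = 1 the search returns m = 20, which agrees with the classical closed form sqrt(c1·var_within / (c2·var_between)) for linear costs under the same assumed variance model.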

Optimizing the maximum reported cluster size for normal-based spatial scan statistics

  • Yoo, Haerin;Jung, Inkyung
    • Communications for Statistical Applications and Methods / v.25 no.4 / pp.373-383 / 2018
  • The spatial scan statistic is a widely used method for detecting spatial clusters. The method imposes a large number of scanning windows with pre-defined shapes and varying sizes on the entire study region. The likelihood ratio test statistic comparing the inside and outside of each window is then calculated, and the window with the maximum value of the test statistic becomes the most likely cluster. The results of cluster detection are sensitive to the shape and the maximum size of the scanning windows. The shape of the scanning window has been extensively studied; however, relatively little attention has been paid to the maximum scanning window size (MSWS) or maximum reported cluster size (MRCS). The Gini coefficient has recently been proposed by Han et al. (International Journal of Health Geographics, 15, 27, 2016) as a powerful tool to determine the optimal value of MRCS for the Poisson-based spatial scan statistic. In this paper, we apply the Gini coefficient to normal-based spatial scan statistics. Through a simulation study, we evaluate the performance of the proposed method. We illustrate the method using a real data example of female colorectal cancer incidence rates in South Korea for the year 2009.
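To make the Gini-based selection concrete, here is a minimal sketch of the Gini coefficient itself and of picking the candidate MRCS with the largest value. How the per-cluster quantities are obtained from the scan output is specified in the cited papers and is not reproduced here; the `candidates` mapping, its numbers, and the function names are hypothetical, and the rule of preferring the largest Gini coefficient is our reading of Han et al.'s proposal rather than something stated in this abstract.

```python
def gini(values):
    """Gini coefficient of nonnegative values (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form for sorted data:
    # G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n, with i = 1..n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def pick_mrcs(candidates):
    """Return the candidate MRCS whose reported clusters give the largest Gini."""
    return max(candidates, key=lambda size: gini(candidates[size]))

# Hypothetical inputs: candidate MRCS (fraction of the population) mapped to
# some per-cluster quantity derived from the clusters reported at that setting.
candidates = {
    0.05: [4.0, 5.0, 6.0],
    0.20: [1.0, 2.0, 12.0],
    0.50: [15.0],
}
chosen = pick_mrcs(candidates)
```

A single reported cluster yields a Gini coefficient of zero in this form, so settings that merge everything into one large cluster are not favored by the rule.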

A Layer-based Dynamic Unequal Clustering Method in Large Scale Wireless Sensor Networks (대규모 무선 센서 네트워크에서 계층 기반의 동적 불균형 클러스터링 기법)

  • Kim, Jin-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.12 / pp.6081-6088 / 2012
  • Unequal clustering in wireless sensor networks is a technique that forms clusters of different sizes. It reduces overall energy consumption by solving the hot spot problem. In this paper, I propose a layer-based dynamic unequal clustering method built on the unequal clustering model. The method reduces overall energy consumption and keeps it evenly distributed by using the optimal number of clusters and optimal cluster head positions. I also show that the proposed method outperforms previous clustering methods in terms of network lifetime.

Traffic based Estimation of Optimal Number of Super-peers in Clustered P2P Environments

  • Kim, Ju-Gyun;Lee, Jun-Soo
    • Journal of Korea Multimedia Society / v.11 no.12 / pp.1706-1715 / 2008
  • In a super-peer-based P2P network, the network is clustered and each cluster is managed by a special peer called a super-peer. A super-peer holds information about all the peers in its cluster. This type of clustered P2P model is known to provide more efficient information search and a lower traffic load than an unclustered P2P model. In this paper, we compute the message traffic cost incurred by peers' query, join, and update actions within a cluster as well as between clusters. With these values, we estimate the optimal number of super-peers that minimizes the traffic cost for super-peer-based P2P networks of various sizes.

An Analysis of Body Feature to the Optimal Size of Industrial Products (산업제품의 표준치 설정을 위한 체형특성의 인간공학적 연구)

  • 유병철;이상도
    • Journal of Korean Society of Industrial and Systems Engineering / v.22 no.49 / pp.11-21 / 1999
  • The purpose of this study is to present a method for selecting optimal sizes for industrial products that are closely related to human body size. To this end, we analyzed the human factors that need to be considered in setting standards, such as body characteristics, body features, and preferences in product selection. The analysis selects the optimal sizes that minimize the losses caused by the difference between the sizes demanded by customers and those supplied by manufacturers. Using a loss function, an iterative calculation algorithm based on the bisearch method was applied to select the demand and supply sizes that minimize the total expected loss. The IMSL routine DNORDF was used for the cumulative normal distribution probability. In the case study, elderly women's body sizes were measured and sorted through factor analysis and cluster analysis to extract feature factors, and the results calculated with the presented algorithm were compared with those of the process currently used by KS and ISO. The results can thus be used as a basis for establishing industrial product standards.