• Title/Summary/Keyword: maximum cluster size


Optimizing the maximum reported cluster size for normal-based spatial scan statistics

  • Yoo, Haerin;Jung, Inkyung
    • Communications for Statistical Applications and Methods
    • /
    • v.25 no.4
    • /
    • pp.373-383
    • /
    • 2018
  • The spatial scan statistic is a widely used method to detect spatial clusters. The method imposes a large number of scanning windows with pre-defined shapes and varying sizes on the entire study region. The likelihood ratio test statistic comparing inside versus outside each window is then calculated and the window with the maximum value of test statistic becomes the most likely cluster. The results of cluster detection respond sensitively to the shape and the maximum size of scanning windows. The shape of scanning window has been extensively studied; however, there has been relatively little attention on the maximum scanning window size (MSWS) or maximum reported cluster size (MRCS). The Gini coefficient has recently been proposed by Han et al. (International Journal of Health Geographics, 15, 27, 2016) as a powerful tool to determine the optimal value of MRCS for the Poisson-based spatial scan statistic. In this paper, we apply the Gini coefficient to normal-based spatial scan statistics. Through a simulation study, we evaluate the performance of the proposed method. We illustrate the method using a real data example of female colorectal cancer incidence rates in South Korea for the year 2009.
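The scanning procedure described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes a one-dimensional row of regions, contiguous windows, and the common-variance normal log-likelihood ratio LLR = (N/2) ln(σ₀²/σ₁²); the function names and the `max_size` parameter are hypothetical. It performs no significance testing and does not compute the Gini coefficient.

```python
import math

def normal_llr(values, inside):
    """Log-likelihood ratio for one window, assuming a normal model with a
    common variance and separate means inside and outside the window."""
    n = len(values)
    ins = [v for v, flag in zip(values, inside) if flag]
    out = [v for v, flag in zip(values, inside) if not flag]
    if not ins or not out:
        return 0.0
    mean_all = sum(values) / n
    var_null = sum((v - mean_all) ** 2 for v in values) / n
    if var_null == 0:
        return 0.0  # all values identical: no window can stand out
    mean_in = sum(ins) / len(ins)
    mean_out = sum(out) / len(out)
    ss_alt = (sum((v - mean_in) ** 2 for v in ins)
              + sum((v - mean_out) ** 2 for v in out))
    var_alt = ss_alt / n
    if var_alt == 0:
        return math.inf  # the window separates the data perfectly
    return (n / 2) * math.log(var_null / var_alt)

def most_likely_cluster(values, max_size):
    """Scan all contiguous windows of up to max_size regions (hypothetical
    1-D layout) and return (best LLR, (start, end)) with end exclusive."""
    n = len(values)
    best_llr, best_window = 0.0, None
    for start in range(n):
        for size in range(1, max_size + 1):
            end = start + size
            if end > n:
                break
            inside = [start <= i < end for i in range(n)]
            llr = normal_llr(values, inside)
            if llr > best_llr:
                best_llr, best_window = llr, (start, end)
    return best_llr, best_window
```

Calling `most_likely_cluster` on the same data with different `max_size` values shows how the reported cluster depends on the maximum window size, which is the sensitivity the abstract refers to.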

Molecular Dynamics study of Aluminum growth using Aluminum Cluster Deposition (알루미늄 덩어리를 사용한 알루미늄 성장에 관한 분자동력학 연구)

  • J.W. Kang;K.R. Byun;W.H. Mun;E.S. Kang;H.J. Hwang
    • Proceedings of the IEEK Conference
    • /
    • 2000.06b
    • /
    • pp.306-309
    • /
    • 2000
  • In this work, we investigated Al cluster deposition on the Al (100) surface using molecular dynamics simulation. The simulations showed that large clusters with low energy are suitable for growing smooth films without craters at low temperatures. We investigated the maximum substrate temperature, and the time taken for the substrate temperature to reach its maximum, as a function of cluster size for the case of the same total energy and the case of the same energy per atom. Correlated collisions play an important role in the interaction between an energetic cluster and the surface, and as cluster size and cluster energy increase, the correlated-collision effect influences this interaction.


CONDENSATION IN DENSITY DEPENDENT ZERO RANGE PROCESSES

  • Jeon, Intae
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • v.17 no.4
    • /
    • pp.267-278
    • /
    • 2013
  • We consider zero range processes with density dependent jump rates $g$ given by $g=g(n,k)=g_1(n)g_2(k/n)$ with $g_1(x)=x^{-\alpha}$ and $$g_2(x)=\begin{cases}x^{-\alpha} & \text{if } a<x,\\ Mx^{-\alpha} & \text{if } x\leq a.\end{cases}\qquad(0.1)$$ In this case, with $1/2<a<1$ and $\alpha>0$, we show that non-complete condensation occurs with maximum cluster size $an$. More precisely, for any $\epsilon>0$, there exists $M^*>0$ such that, for any $0<M\leq M^*$, the maximum cluster size is between $(a-\epsilon)n$ and $(a+\epsilon)n$ for large $n$. This provides a simple example of non-complete condensation under perturbation of rates which are deep in the range of perfect condensation (e.g. $\alpha\gg 1$) and supports the instability of the condensation transition.

Improved Classification Algorithm using Extended Fuzzy Clustering and Maximum Likelihood Method

  • Jeon Young-Joon;Kim Jin-Il
    • Proceedings of the IEEK Conference
    • /
    • summer
    • /
    • pp.447-450
    • /
    • 2004
  • This paper proposes a classification method for remotely sensed images based on a fuzzy c-means clustering algorithm that uses the average intra-cluster distance. The average intra-cluster distance is the average over the set of vectors belonging to each cluster and is proportional to the cluster's size and density. We classify each pixel according to its membership grade with respect to the fuzzy c-means cluster centers, which are obtained from the mean values of the training data for each class. The fuzzy c-means algorithm takes into account the membership degree between the clusters of each class. We then evaluate the degree of overlap between clusters. A pixel with a high degree of overlap is passed to the maximum likelihood classification method. Finally, we decide the category by comparing the fuzzy membership degree with the likelihood rate. The proposed method is applied to an IKONOS remote sensing satellite image for verification.
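The membership-then-fallback idea in this abstract can be illustrated with the standard fuzzy c-means membership formula. This is a rough sketch under stated assumptions, not the paper's extended algorithm: it uses scalar pixel values, fixed cluster centers, and plain absolute distance rather than the average intra-cluster distance. The function names and the `margin` threshold are hypothetical.

```python
def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one scalar point in each cluster.

    u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    dists = [abs(point - c) for c in centers]
    if any(d == 0 for d in dists):
        # The point coincides with one or more centers: share membership there.
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
            for d_i in dists]

def needs_ml_fallback(memberships, margin=0.2):
    """Flag a pixel as overlapping when its two largest memberships are close.

    The margin is an illustrative assumption; such a pixel would be handed
    to a maximum likelihood classifier, which is not implemented here.
    """
    top = sorted(memberships, reverse=True)
    return (top[0] - top[1]) < margin
```

A pixel halfway between two centers gets equal memberships and is flagged for the fallback, while a pixel close to one center is classified directly by its largest membership.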


Mass Spectrometric Study of Carbon Cluster Formation in Laser Ablation of Graphite at 355 nm

  • Koo, Young-Mi;Choi, Young-Ku;Lee, Kee-Hag;Jung, Kwang-Woo
    • Bulletin of the Korean Chemical Society
    • /
    • v.23 no.2
    • /
    • pp.309-314
    • /
    • 2002
  • The ablation dynamics and cluster formation of $C_n^+$ ions ejected from 355 nm laser ablation of a graphite target in vacuum are investigated using a reflectron time-of-flight (RTOF) mass spectrometer. At low laser fluence, odd-numbered cluster ions with $3{\leq}n{\leq}15$ are predominantly produced. Increasing the laser fluence shifts the maximum size distribution towards small cluster ions, implying the fragmentation of larger clusters within the hot plume. The temporal evolution of $C_n^+$ ions was measured by varying the delay time of the ion extraction pulse with respect to the laser irradiation, providing significant information on the characteristics of the ablated plume. Above a laser fluence of $0.2\;J/cm^2$, large cluster ions ($n{\geq}30$) are produced at relatively long delay times, indicating that atoms or small carbon clusters aggregate during plume propagation. The dependence of the intensity of ablated $C_n^+$ ions on delay time after laser irradiation shows that the most probable velocity of each cluster ion decreases with cluster size.

Performance Factor of Distributed Processing of Machine Learning using Spark (스파크를 이용한 머신러닝의 분산 처리 성능 요인)

  • Ryu, Woo-Seok
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.16 no.1
    • /
    • pp.19-24
    • /
    • 2021
  • In this paper, we study the performance factors of machine learning in a distributed environment using Apache Spark and present an efficient distributed processing method through experiments. This work first presents the performance factors that arise when performing machine learning in a distributed cluster, classifying them into cluster performance, data size, and configuration of the Spark engine. In addition, a performance study of regression analysis using Spark MLlib running on a Hadoop cluster is performed while changing the configuration of the nodes and the Spark executors. The experiments confirmed that the effective number of executors is affected by the number of data blocks but, depending on the cluster size, its maximum and minimum values are limited by the number of cores and the number of worker nodes, respectively.

Two-stage Sampling for Estimation of Prevalence of Bovine Tuberculosis (이단계표본추출을 이용한 소결핵병 유병률 추정)

  • Pak, Son-Il
    • Journal of Veterinary Clinics
    • /
    • v.28 no.4
    • /
    • pp.422-426
    • /
    • 2011
  • For a national survey in which a wide geographic region or an entire country is targeted, a multi-stage sampling approach is widely used to overcome the problems of simple random sampling, to consider both herd- and animal-level factors associated with disease occurrence, and to adjust for the clustering effect of disease in the population when calculating sample size. The aim of this study was to establish the sample size for estimating bovine tuberculosis (TB) in Korea using a stratified two-stage sampling design. The sample size was determined by taking into account the possible clustering of TB-infected animals within individual herds to increase the reliability of survey results. In this study, the country was stratified into nine provinces (administrative units) and the herd, the primary sampling unit, was considered as a cluster. For all analyses, a design effect of 2, a between-cluster prevalence of 50% to yield the maximum sample size, and a mean herd size of 65 were assumed due to the lack of available information. Using a two-stage sampling scheme, the number of cattle sampled per herd was 65, regardless of the confidence level, prevalence, and mean herd size examined. The number of clusters to be sampled at a 95% level of confidence was estimated to be 296, 74, 33, 19, 12, and 9 for desired precision of 0.01, 0.02, 0.03, 0.04, 0.05, and 0.06, respectively. Therefore, the total sample size with a 95% confidence level was 172,872, 43,218, 19,224, 10,818, 6,930, and 4,806 for desired precision ranging from 0.01 to 0.06. The sample size was increased with desired precision and design effect. In a situation where the number of cattle sampled per herd is fixed, ranging from 5 to 40 with a 5-head interval, the total sample size with a 95% confidence level was estimated to be 6,480, 10,080, 13,770, 17,280, 20,925, 24,570, 28,350, and 31,680, respectively. The percent increase in total sample size resulting from the use of an intra-cluster correlation coefficient of 0.3 was 22.2, 32.1, 36.3, 39.6, 41.9, 42.9, 42.2, and 44.3%, respectively, in comparison to the use of a coefficient of 0.2.
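The cluster counts quoted in this abstract can be reproduced with the textbook sample-size formula for a proportion, inflated by the design effect and divided by the mean herd size. The sketch below is an assumption about how such figures can be obtained, not the paper's stated procedure: the formula, the value z = 1.96 for 95% confidence, the rounding up, and the function name are all illustrative choices. It does not attempt to reproduce the total sample sizes.

```python
import math

def clusters_needed(precision, prevalence=0.5, design_effect=2.0,
                    mean_herd_size=65, z=1.96):
    """Number of herds (clusters) to sample for a given absolute precision.

    Assumes n = z^2 * p * (1 - p) / d^2 for simple random sampling,
    multiplied by the design effect and divided by the mean herd size.
    """
    n_srs = z ** 2 * prevalence * (1 - prevalence) / precision ** 2
    n_adjusted = n_srs * design_effect
    return math.ceil(n_adjusted / mean_herd_size)

# Precisions taken from the abstract; output is one herd count per precision.
for d in (0.01, 0.02, 0.03, 0.04, 0.05, 0.06):
    print(d, clusters_needed(d))
```

Under these assumptions the loop yields 296, 74, 33, 19, 12, and 9 herds, matching the counts given for a 95% confidence level.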

Maximizing Information Transmission for Energy Harvesting Sensor Networks by an Uneven Clustering Protocol and Energy Management

  • Ge, Yujia;Nan, Yurong;Chen, Yi
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.14 no.4
    • /
    • pp.1419-1436
    • /
    • 2020
  • For an energy harvesting sensor network, when the network lifetime is not the only primary goal, maximizing the network performance under environmental energy harvesting becomes a more critical issue. However, clustering protocols that aim at providing maximum information throughput have not been thoroughly explored in Energy Harvesting Wireless Sensor Networks (EH-WSNs). In this paper, clustering protocols are studied for maximizing the data transmission in the whole network. Based on a long short-term memory (LSTM) energy predictor and node energy consumption and supplement models, an uneven clustering protocol is proposed where the cluster head selection and cluster size control are thoroughly designed for this purpose. Simulations and results verify that the proposed scheme can outperform some classic schemes by having more data packets received by the cluster heads (CHs) and the base station (BS) under these energy constraints. The outcomes of this paper also provide some insights for choosing clustering routing protocols in EH-WSNs, by exploiting the factors such as uneven clustering size, number of clusters, multiple CHs, multihop routing strategy, and energy supplementing period.

Classification and Identification of Korean Hand Shapes based on Anthropometric Hand Data Analysis (손 관련 인체측정자료를 이용한 한국인의 손 모양 유형 분류 및 특성 분석)

  • Kim, Sang-Ho;Kee, Do-Hyung
    • Journal of the Korea Safety Management & Science
    • /
    • v.14 no.1
    • /
    • pp.75-85
    • /
    • 2012
  • In this study, the representative hand shapes of adult Koreans were analyzed by factor analysis and cluster analyses. The analyses were conducted on anthropometric data of 58 hand dimensions from 325 subjects with nonhomogeneous demographics. Maximum hand circumference, first phalanx length of the index finger, and the ratio between the two measures were the independent variables for the cluster analyses. The results of the study showed that Korean hand shapes can be divided into 2 clusters irrespective of their size for each of the male and female groups. There were slight differences in the component ratio of hand shapes with respect to occupation and age, but these differences were not statistically significant. The representative Korean hand shapes and their anthropometric dimensions could be used to design and establish a proper sizing system for various hand-operated devices.

A Multistriped Checkpointing Scheme for the Fault-tolerant Cluster Computers (다중 분할된 구조를 가지는 클러스터 검사점 저장 기법)

  • Chang, Yun-Seok
    • The KIPS Transactions:PartA
    • /
    • v.13A no.7 s.104
    • /
    • pp.607-614
    • /
    • 2006
  • Checkpointing schemes should reduce process delay by managing the checkpoints of each node to fit the network load, in order to enhance the performance of processes running on a cluster system that writes its checkpoints into global stable storage. For this reason, a cluster system with a single IO space on a distributed RAID chooses a suitable checkpointing scheme to get the maximum IO performance and the best rollback recovery efficiency. In this paper, we improved the striped checkpointing scheme with a dynamic stripe group size that adapts to the network bandwidth variation at the point of checkpointing. To analyze the performance of the multistriped checkpointing scheme, we ran the Linpack HPC benchmark with MPI on our own cluster system with a maximum of 512 virtual nodes. The benchmark results showed that the multistriped checkpointing scheme performs better than the striped checkpointing scheme in checkpoint writing efficiency and rollback recovery under heavy system load.