• Title/Summary/Keyword: model-based cluster

Gene Expression Pattern Analysis via Latent Variable Models Coupled with Topographic Clustering

  • Chang, Jeong-Ho;Chi, Sung Wook;Zhang, Byoung Tak
    • Genomics & Informatics / v.1 no.1 / pp.32-39 / 2003
  • We present a latent variable model-based approach to the analysis of gene expression patterns, coupled with topographic clustering. The aspect model, a latent variable model for dyadic data, is applied to extract latent patterns underlying complex variations in gene expression levels. Topographic clustering is then performed to find coherent groups of genes, based on the extracted latent patterns as well as individual gene expression behaviors. Applied to cell cycle-regulated genes of the yeast Saccharomyces cerevisiae, the proposed method discovered biologically meaningful patterns related to characteristic expression behavior in particular cell cycle phases. In addition, displaying the variation in the composition of these latent patterns on the cluster map made the resulting cluster structure easier to interpret. From this, we argue that latent variable models, coupled with topographic clustering, are a promising tool for exploratory analysis of gene expression data.

Analyzing Clustered and Interval-Censored Data based on the Semiparametric Frailty Model

  • Kim, Jin-Heum;Kim, Youn-Nam
    • The Korean Journal of Applied Statistics / v.25 no.5 / pp.707-718 / 2012
  • We propose a semiparametric model for analyzing clustered and interval-censored data; in addition, we incorporate a gamma frailty into the model to measure the association among members of the same cluster. We propose an estimation procedure based on the EM algorithm. Simulation results showed that our estimation procedure may yield unbiased estimates. The standard error was smaller than expected and gave conservative results for the coverage rate; however, this trend gradually disappeared as the number of members in the same cluster increased. We also illustrate the proposed method with data from diabetic retinopathy studies that evaluate the effectiveness of laser photocoagulation in delaying or preventing the onset of blindness in individuals with diabetic retinopathy.

A New Distributed Parallel Algorithm for Pattern Classification using Neural Network Model

  • Kim, Dae-Su;Baeg, Soon-Cheol
    • ETRI Journal / v.13 no.2 / pp.34-41 / 1991
  • In this paper, a new distributed parallel algorithm for pattern classification based on the Self-Organizing Neural Network (SONN) [10-12] is developed. The system works without any prior information about the number of clusters or the cluster centers. The SONN model performed well in finding classification information, cluster centers, the number of salient clusters, and membership information. However, the sequential version takes a considerable amount of time when the input data set is very large, so a parallel algorithm is desirable. A new distributed parallel algorithm is developed and experimental results are presented.

An Improved Hybrid Canopy-Fuzzy C-Means Clustering Algorithm Based on MapReduce Model

  • Dai, Wei;Yu, Changjun;Jiang, Zilong
    • Journal of Computing Science and Engineering / v.10 no.1 / pp.1-8 / 2016
  • The fuzzy c-means (FCM) algorithm is frequently used at present. However, the clustering quality and convergence rate of FCM depend on the initial cluster centers, so an improved FCM algorithm based on the canopy clustering concept is proposed to analyze datasets quickly. Taking advantage of the canopy algorithm's rapid acquisition of cluster centers, the proposed algorithm takes the canopy clustering results as its input. In this way, the convergence of the FCM algorithm is accelerated. In addition, a MapReduce scheme for the proposed FCM algorithm is designed for a cloud environment. Experimental results demonstrate that the hybrid canopy-FCM clustering algorithm, run on MapReduce, achieves better clustering quality and higher operation speed.
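
The idea of seeding FCM with canopy centers can be sketched roughly as follows. This is a minimal sequential illustration, not the authors' MapReduce implementation: the function names, the thresholds `t1` and `t2`, the fuzzifier `m = 2`, and the stopping tolerance are all assumptions made for the example.

```python
import math

def canopy_centers(points, t1, t2):
    """One canopy pass; returns the mean of each canopy as a seed center.

    t1 is the loose threshold, t2 the tight threshold (t1 > t2).
    """
    assert t1 > t2
    remaining = list(points)
    centers = []
    while remaining:
        pivot = remaining[0]
        # Every point within the loose threshold belongs to this canopy.
        members = [p for p in remaining if math.dist(pivot, p) < t1]
        centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
        # Points within the tight threshold cannot start another canopy.
        remaining = [p for p in remaining if math.dist(pivot, p) >= t2]
    return centers

def fuzzy_c_means(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Standard FCM iterations starting from the given centers (m > 1)."""
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        # Membership update: u[j][i] is the degree of point j in cluster i.
        u = []
        for p in points:
            d = [max(math.dist(p, c), 1e-12) for c in centers]
            u.append([1.0 / sum((di / dk) ** exponent for dk in d) for di in d])
        # Center update: weighted mean with weights u ** m.
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * p[k] for wj, p in zip(w, points)) / total
                for k in range(len(points[0]))
            ))
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    seeds = canopy_centers(data, t1=4.0, t2=2.0)
    final_centers, memberships = fuzzy_c_means(data, seeds)
    print(seeds, final_centers)
```

Because the canopy pass also fixes the number of clusters, FCM here needs no separate choice of c; in a MapReduce setting the membership and center updates would be split across mappers and reducers, which this sketch does not attempt.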

Distributed Authentication Model using Multi-Level Cluster for Wireless Sensor Networks (무선센서네트워크를 위한 다중계층 클러스터 기반의 분산형 인증모델)

  • Shin, Jong-Whoi;Yoo, Dong-Young;Kim, Seog-Gyu
    • Journal of the Korea Society for Simulation / v.17 no.3 / pp.95-105 / 2008
  • In this paper, we propose DAMMC (Distributed Authentication Model using Multi-level Cluster) for wireless sensor networks. In the proposed model, one cluster header in the m-layer acts as a CA (Certificate Authority), but it authenticates only the sensor nodes in the lower layer, which provides efficient authentication without authentication overhead among clusters. Here, the m-layer used for authentication can be predefined by the user in consideration of various network environments. DAMMC also uses certificates based on a threshold cryptography scheme for a more reliable WSN configuration. Experimental results show that the costs of certificate generation and reconfiguration decrease while security performance increases compared to the existing method.

Optimizing the maximum reported cluster size for normal-based spatial scan statistics

  • Yoo, Haerin;Jung, Inkyung
    • Communications for Statistical Applications and Methods / v.25 no.4 / pp.373-383 / 2018
  • The spatial scan statistic is a widely used method to detect spatial clusters. The method imposes a large number of scanning windows with pre-defined shapes and varying sizes on the entire study region. The likelihood ratio test statistic comparing inside versus outside each window is then calculated, and the window with the maximum value of the test statistic becomes the most likely cluster. The results of cluster detection are sensitive to the shape and the maximum size of the scanning windows. The shape of the scanning window has been extensively studied; however, relatively little attention has been paid to the maximum scanning window size (MSWS) or maximum reported cluster size (MRCS). The Gini coefficient has recently been proposed by Han et al. (International Journal of Health Geographics, 15, 27, 2016) as a powerful tool to determine the optimal value of MRCS for the Poisson-based spatial scan statistic. In this paper, we apply the Gini coefficient to normal-based spatial scan statistics. Through a simulation study, we evaluate the performance of the proposed method. We illustrate the method using a real data example of female colorectal cancer incidence rates in South Korea for the year 2009.
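
The scanning step described above, and the role of the maximum window size, can be illustrated with a small sketch. It assumes circular windows centered on each location, an unweighted normal model with a common variance (so the log-likelihood ratio reduces to (N/2) ln(SS_null / SS_window)), and a two-sided statistic; the function names and the `max_fraction` parameter are hypothetical, and the Gini-based choice of MRCS and the Monte Carlo significance test are not shown.

```python
import math

def normal_llr(values, inside):
    """Log-likelihood ratio for one window under a common-variance normal model."""
    n = len(values)
    in_vals = [v for v, flag in zip(values, inside) if flag]
    out_vals = [v for v, flag in zip(values, inside) if not flag]
    if not in_vals or not out_vals:
        return 0.0
    mean_all = sum(values) / n
    mean_in = sum(in_vals) / len(in_vals)
    mean_out = sum(out_vals) / len(out_vals)
    ss_null = sum((v - mean_all) ** 2 for v in values)
    ss_alt = (sum((v - mean_in) ** 2 for v in in_vals)
              + sum((v - mean_out) ** 2 for v in out_vals))
    if ss_null == 0.0:
        return 0.0
    if ss_alt == 0.0:
        return math.inf
    return 0.5 * n * math.log(ss_null / ss_alt)

def most_likely_cluster(coords, values, max_fraction=0.5):
    """Scan circular windows up to max_fraction of all locations; return (llr, members)."""
    n = len(coords)
    max_size = max(1, int(max_fraction * n))
    best_llr, best_members = 0.0, []
    for i in range(n):
        # Grow the window around location i by adding nearest neighbours one at a time.
        order = sorted(range(n), key=lambda j: math.dist(coords[i], coords[j]))
        for size in range(1, max_size + 1):
            members = set(order[:size])
            llr = normal_llr(values, [j in members for j in range(n)])
            if llr > best_llr:
                best_llr, best_members = llr, sorted(members)
    return best_llr, best_members

if __name__ == "__main__":
    coords = [(i, 0) for i in range(10)]
    values = [1.0, 1.2, 0.8, 1.1, 0.9, 1.0, 5.0, 5.2, 4.8, 1.1]
    print(most_likely_cluster(coords, values, max_fraction=0.5))
```

Lowering `max_fraction` restricts how large a reported cluster can be, which is the quantity the abstract proposes to tune.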

Water consumption forecasting and pattern classification according to demographic factors and automated meter reading (인구통계학적 요인 및 원격검침 자료를 활용한 가정용 물 사용패턴 분류 및 물 사용량 예측 연구)

  • Kim, Kibum;Park, Haekeum;Kim, Taehyeon;Hyung, Jinseok;Koo, Jayong
    • Journal of Korean Society of Water and Wastewater / v.36 no.3 / pp.149-165 / 2022
  • The water consumption data of individual consumers must be analyzed and forecast to establish an effective water demand management plan. This study develops a k-means cluster model that can monitor water use characteristics based on demographic factors and hourly water consumption data measured using automated meter reading devices. In addition, a quantification model that can estimate daily water consumption is developed. K-means cluster analysis with four clusters yields an average silhouette coefficient of 0.63, and the silhouette coefficient of each cluster exceeds 0.60, which supports the reliability of the cluster analysis. Furthermore, the clusters are clearly classified based on water usage and water usage patterns. The correlation coefficients of the four quantification models for estimating water consumption exceed 0.74, confirming that the models can accurately simulate the investigated demographic data. The statistical significance of the models is considered reasonable; hence, they are applicable in the field. Because automated smart water meters have become increasingly popular in recent years, water consumption is now metered remotely in many areas. The proposed methodology and the results obtained in this study are expected to facilitate improvements in the usability of smart water meters in the future.
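
For readers unfamiliar with the silhouette coefficient quoted above, the following sketch computes it from scratch for a labeled set of points. It assumes Euclidean distance, at least two clusters, and the common convention of scoring singleton clusters as 0; the function name and the toy data are illustrative and unrelated to the study's water consumption data.

```python
import math

def silhouette_scores(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b); requires at least two clusters."""
    cluster_ids = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        # a: mean distance to the other members of the point's own cluster.
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # singleton cluster, scored 0 by convention
            continue
        a = sum(own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for j, q in enumerate(points) if labels[j] == c)
            / labels.count(c)
            for c in cluster_ids if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores

if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    labs = [0, 0, 1, 1]
    s = silhouette_scores(pts, labs)
    print(sum(s) / len(s))  # average silhouette coefficient
```

Values close to 1 indicate compact, well-separated clusters, which is how averages such as 0.63 are usually read.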

Unveiling Quenching History of Cluster Galaxies Using Phase-space Analysis

  • Rhee, Jinsu;Smith, Rory;Yi, Sukyoung K.
    • The Bulletin of The Korean Astronomical Society / v.44 no.1 / pp.40.1-40.1 / 2019
  • We utilize times since infall of cluster galaxies obtained from the Yonsei Zoom-in Cluster Simulation (YZiCS), a set of cosmological hydrodynamic N-body simulations, and star formation rates from SDSS Data Release 10 to study how quickly late-type galaxies are quenched in cluster environments. In particular, we confirm that the distributions of simulated and observed galaxies in phase-space diagrams are comparable and that each location in phase space provides information on the times since infall and star formation rates of cluster galaxies. Then, by limiting the phase-space location of simulated and observed galaxies, we associate their star formation rates at z ~ 0.08 with times since infall using an abundance matching technique that employs the 10 quantiles of each probability distribution. Using a flexible quenching model covering different quenching scenarios, we find the star formation history of satellite galaxies that best reproduces the obtained relationship between time since infall and star formation rate at z ~ 0.08. Based on the derived star formation history, we constrain the quenching timescale (2 - 7 Gyr) with a clear stellar mass trend and confirm that the refined model is consistent with the "delayed-then-rapid" quenching scenario: a constant delayed phase of ~ 2.3 Gyr and quenching efficiencies (i.e., e-folding timescales) outside and inside clusters of ~ 2 - 4 Gyr (${\propto}M_*^{-1}$) and 0.5 - 1.5 Gyr (${\propto}M_*^{-2}$), respectively. Finally, we suggest that (i) ram pressure is the main driver of quenching of satellite galaxies in the local Universe, and (ii) the quenching trend with stellar mass at z > 0.5 indicates that other quenching mechanisms are the main driver.
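
A "delayed-then-rapid" star formation history of the kind mentioned above can be sketched as a two-phase exponential decline. The functional form below is a simplification assumed for illustration, since the abstract does not give the authors' exact parameterization; the function name is hypothetical, and the default values are picked from the ranges quoted in the abstract (delay of about 2.3 Gyr, fast e-folding time of about 1 Gyr). Times are in Gyr and are measured from infall (t >= 0).

```python
import math

def delayed_then_rapid_sfr(t, sfr0, t_delay=2.3, tau_fast=1.0, tau_slow=math.inf):
    """Star formation rate at time t since infall.

    During the delay phase the rate declines with e-folding time tau_slow
    (constant if tau_slow is infinite); afterwards it declines with tau_fast.
    """
    if t <= t_delay:
        return sfr0 * math.exp(-t / tau_slow)
    sfr_at_delay_end = sfr0 * math.exp(-t_delay / tau_slow)
    return sfr_at_delay_end * math.exp(-(t - t_delay) / tau_fast)

if __name__ == "__main__":
    for t in (0.0, 1.0, 2.3, 3.3, 5.0):
        print(t, delayed_then_rapid_sfr(t, sfr0=2.0, tau_slow=3.0))
```

Setting `tau_slow` to a finite value such as 3 Gyr mimics slow quenching before the rapid phase, while the default reproduces a flat delay phase.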

Uniform Color Image Transformation based on Color Cluster Model (칼라 클러스터 모델에 근거한 균일 칼라 영상 변환)

  • Lee, Jeong-Hwan;Park, Se-Hyeon;Kim, Jung-Su
    • The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1646-1657 / 1996
  • This paper presents a color transformation method based on a uniform color image model. First, color variation factors are grouped into an identical (multiplicative) factor and an independent (additive) factor for the color model, and they are modeled by Gaussian functions. The shape of a color cluster in (R, G, B) feature space is an ellipsoid whose elongated major axis corresponds to the direction of the mean vector. Second, the transformation of a color cluster using the model is studied, and a transformation method for three-dimensional coordinates is described. The proposed method is applied to artificial and natural color images. The experimental results show that the elongated major axis of each cluster making up the transformed color image agrees with the direction of its mean vector.

New Galaxy Catalog of the Virgo Cluster

  • Kim, Suk;Rey, Soo-Chang;Jerjen, Helmut;Lisker, Thorsten;Sung, Eon-Chang;Lee, Youngdae;Chung, Jiwon;Pak, Mina;Yi, Wonhyeong;Lee, Woong
    • The Bulletin of The Korean Astronomical Society / v.39 no.2 / pp.50-50 / 2014
  • We present a new catalog of galaxies in the wider region of the Virgo cluster, based on the Sloan Digital Sky Survey (SDSS) Data Release 7. The Extended Virgo Cluster Catalog (EVCC) covers an area of 725 deg$^2$ or 60.1 Mpc$^2$. It is 5.2 times larger than the footprint of the classical Virgo Cluster Catalog (VCC) and reaches out to 3.5 times the virial radius of the Virgo cluster. We selected 1324 spectroscopically targeted galaxies with radial velocities less than 3000 km s$^{-1}$. In addition, 265 galaxies that have been missed in the SDSS spectroscopic survey but have available redshifts in the NASA Extragalactic Database are also included. Our selection process secured a total of 1589 galaxies of which 676 galaxies are not included in the VCC. The certain and possible cluster members are defined by means of redshift comparison with a cluster infall model. We employed two independent and complementary galaxy classification schemes: the traditional morphological classification based on the visual inspection of optical images and a characterization of galaxies from their spectroscopic features. SDSS u, g, r, i, and z passband photometry of all EVCC galaxies was performed using Source Extractor. We compare the EVCC galaxies with the VCC in terms of morphology, spatial distribution, and luminosity function. The EVCC defines a comprehensive galaxy sample covering a wider range in galaxy density that is significantly different from the inner region of the Virgo cluster. It will be the foundation for forthcoming galaxy evolution studies in the extended Virgo cluster region, complementing ongoing and planned Virgo cluster surveys at various wavelengths.