• Title/Summary/Keyword: Markov chain clustering


Estimation of Defect Clustering Parameter Using Markov Chain Monte Carlo (Markov Chain Monte Carlo를 이용한 반도체 결함 클러스터링 파라미터의 추정)

  • Ha, Chung-Hun;Chang, Jun-Hyun;Kim, Joon-Hyun
    • Journal of Korean Society of Industrial and Systems Engineering, v.32 no.3, pp.99-109, 2009
  • The negative binomial yield model for semiconductor manufacturing has two parameters: the average number of defects per die and the clustering parameter. Estimating the clustering parameter is difficult because it has no clear closed form. In this paper, a Bayesian approach using Markov chain Monte Carlo is proposed to estimate the clustering parameter. To find an appropriate estimation method, two typical estimators, the method of moments estimator and the maximum likelihood estimator, are compared with the proposed Bayesian estimator with respect to the mean absolute deviation between the real yield and the estimated yield. Experimental results show that both the proposed Bayesian estimator and the maximum likelihood estimator perform very well, and that the choice of method depends on the purpose of use.
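As a rough illustration of the two-parameter model described in this abstract, the sketch below uses the standard negative binomial yield formula Y = (1 + λ/α)^(−α) and the textbook method-of-moments estimate of α. It is not the paper's Bayesian estimator; the function names and the defect counts are hypothetical.

```python
from statistics import mean, pvariance

def nb_yield(lam, alpha):
    """Negative binomial yield: probability that a die has zero defects."""
    return (1.0 + lam / alpha) ** (-alpha)

def moments_alpha(defect_counts):
    """Method-of-moments estimate of the clustering parameter alpha.

    Uses variance = mean + mean**2 / alpha, so alpha = mean**2 / (variance - mean).
    """
    m = mean(defect_counts)
    v = pvariance(defect_counts)
    if v <= m:
        raise ValueError("variance must exceed the mean (no overdispersion)")
    return m * m / (v - m)

# Hypothetical defect counts per die, for illustration only.
counts = [0, 0, 0, 1, 0, 5, 0, 0, 2, 0]
lam_hat = mean(counts)              # average number of defects per die
alpha_hat = moments_alpha(counts)   # clustering parameter
yield_hat = nb_yield(lam_hat, alpha_hat)
```

A smaller α means stronger clustering of defects; the estimate is undefined when the sample variance does not exceed the mean, which is why the sketch raises an error in that case.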

Comparison of graph clustering methods for analyzing the mathematical subject classification codes

  • Choi, Kwangju;Lee, June-Yub;Kim, Younjin;Lee, Donghwan
    • Communications for Statistical Applications and Methods, v.27 no.5, pp.569-578, 2020
  • Various graph clustering methods have been introduced to identify communities in social or biological networks. This paper studies entropy-based and Markov chain-based methods for clustering undirected graphs. We compare the performance of these two clustering methods with that of conventional methods using quality measures of clustering. As a real application, we collect the mathematical subject classification (MSC) codes of research papers from published mathematical databases and construct a weighted code-to-document matrix to which the graph clustering methods are applied. We aim to group MSC codes into the same cluster if they appear together in many papers. We compare the MSC clustering results using several assessment measures and conclude that the Markov chain-based method is suitable for clustering the MSC codes.
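The abstract does not say which Markov chain-based method is used. As one possible illustration, the sketch below implements the general idea of Markov clustering (alternating expansion and inflation of a random-walk matrix) on a small hypothetical graph. The function names, the inflation value, and the iteration count are assumptions, not details from the paper.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (a column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def markov_clustering(adj, inflation=2.0, iterations=30):
    """Cluster an undirected graph given as a dense adjacency matrix."""
    n = len(adj)
    # Add self-loops, then turn the graph into a random-walk matrix.
    m = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        m = matmul(m, m)  # expansion: take two random-walk steps
        # inflation: strengthen strong flows, weaken weak ones
        m = normalize_columns([[x ** inflation for x in row] for row in m])
    # Rows that keep mass are attractors; the columns they cover form a cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return clusters

# Hypothetical graph: two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
adj = [[0.0] * 6 for _ in range(6)]
for u, v in edges:
    adj[u][v] = adj[v][u] = 1.0
clusters = markov_clustering(adj)
```

For a weighted code-to-document setting like the one in the abstract, the adjacency entries would presumably be co-occurrence weights rather than 0/1 values.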

Music Composition Using Markov Chain and Hierarchical Clustering (마르코프 체인과 계층적 클러스터링 기법을 이용한 작곡 기법)

  • Kwon, Ji-Yong;Lee, In-Kwon
    • 한국HCI학회:학술대회논문집, 2008.02a, pp.744-748, 2008
  • In this paper, we propose a novel technique that generates a new song from given example songs. Our system uses a k-th order Markov chain in which each state represents the notes in a measure. Because using the notes in a measure directly as a Markov chain state would require a very high-dimensional state space, we apply hierarchical clustering to the given example songs and use each cluster as a state. Each example can then be represented as a sequence of cluster IDs, and these sequences serve as training data for the Markov chain. The resulting Markov chain effectively produces new songs similar to the given examples.
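The generation step described in this abstract can be sketched roughly as follows, assuming the hierarchical clustering has already mapped every measure to a cluster ID. The cluster-ID sequences, the order of 2, and the function names are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict

def train_markov(sequences, order=2):
    """For each context of `order` cluster IDs, record the IDs that followed it."""
    table = defaultdict(list)
    for seq in sequences:
        for i in range(len(seq) - order):
            table[tuple(seq[i:i + order])].append(seq[i + order])
    return table

def generate(table, start, length, rng):
    """Extend `start` by sampling successors until `length` or a dead end."""
    out = list(start)
    order = len(start)
    while len(out) < length:
        successors = table.get(tuple(out[-order:]))
        if not successors:
            break  # context never seen in training
        out.append(rng.choice(successors))
    return out

# Hypothetical example songs, each already encoded as cluster IDs per measure.
songs = [[0, 1, 2, 1, 0, 1, 2, 3], [0, 1, 2, 3, 2, 1, 0, 1]]
table = train_markov(songs, order=2)
new_song = generate(table, start=(0, 1), length=8, rng=random.Random(0))
```

Storing successors as a list with repeats means frequent transitions are sampled proportionally more often, without computing explicit probabilities. Turning cluster IDs back into notes would need a further step that this sketch leaves out.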


A Bayesian Wavelet Threshold Approach for Image Denoising

  • Ahn, Yun-Kee;Park, Il-Su;Rhee, Sung-Suk
    • Communications for Statistical Applications and Methods, v.8 no.1, pp.109-115, 2001
  • Wavelet coefficients are known to have decorrelating properties, since the wavelet transform is an orthonormal transformation. Empirically, however, the wavelet coefficients of images, such as those at edges, are not statistically independent. Jansen and Bultheel (1999) developed an empirical Bayes approach that improves the classical threshold algorithm using local characterization in a Markov random field. They model the clustering of significant wavelet coefficients with a uniform distribution. In this paper, we develop a wavelet thresholding algorithm using the Laplacian distribution, which is a more realistic model.


A Bayesian Model-based Clustering with Dissimilarities

  • Oh, Man-Suk;Raftery, Adrian
    • Proceedings of the Korean Statistical Society Conference, 2003.10a, pp.9-14, 2003
  • A Bayesian model-based clustering method is proposed for clustering objects on the basis of dissimilarities. This combines two basic ideas. The first is that the objects have latent positions in a Euclidean space, and that the observed dissimilarities are measurements of the Euclidean distances with error. The second idea is that the latent positions are generated from a mixture of multivariate normal distributions, each one corresponding to a cluster. We estimate the resulting model in a Bayesian way using Markov chain Monte Carlo. The method carries out multidimensional scaling and model-based clustering simultaneously, and yields good object configurations and good clustering results with reasonable measures of clustering uncertainty. In the examples we studied, the clustering results based on low-dimensional configurations were almost as good as those based on high-dimensional ones. Thus the method can be used as a tool for dimension reduction when clustering high-dimensional objects, which may be especially useful for visual inspection of clusters. We also propose a Bayesian criterion for choosing the dimension of the object configuration and the number of clusters simultaneously. This is easy to compute and works reasonably well in simulations and real examples.


A Proposed Simple Method for Multisite Point Rainfall Generation (일강우자료의 다지점 모의 발생을 위한 간단한 방법 제안)

  • Yu, Cheol-Sang;Lee, Dong-Ryul
    • Journal of Korea Water Resources Association, v.33 no.1, pp.99-110, 2000
  • In this study we propose a simple method for generating multi-site daily rainfall based on a first-order Markov chain that also accounts for spatial correlation. The occurrence of rainfall is simulated by a simple first-order Markov chain, and its intensity is chosen randomly from the observed data. The spatial correlation between sites is preserved because the rainfall intensity at each site is chosen from the same point in time as that of the target site throughout the generation. The generated daily rainfall data reproduce general characteristics of the observed data, such as the average, standard deviation, and average number of wet and dry days, but the level of clustering in time is somewhat weakened. As a result, the lag-1 correlation coefficient of the generated data is smaller than that of the observed data, and the average lengths of wet and dry runs and the wet-to-wet and dry-to-dry probabilities are also slightly lower than observed. This drawback can be partly overcome by choosing a site that represents the overall basin characteristics or by using more detailed states of rainfall occurrence.
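One reading of the procedure in this abstract can be sketched as follows: fit a two-state (dry/wet) first-order Markov chain to the target site, then copy the rainfall of a randomly chosen observed wet day at all sites together. The data, the function names, and the choice to set every site to zero on simulated dry days are assumptions made for illustration.

```python
import random

def fit_occurrence(wet):
    """Estimate P(wet today | state yesterday) from a 0/1 occurrence series."""
    counts = {0: [0, 0], 1: [0, 0]}  # counts[previous][current]
    for prev, curr in zip(wet, wet[1:]):
        counts[prev][curr] += 1
    return {prev: c[1] / (c[0] + c[1]) for prev, c in counts.items()}

def simulate(observed, n_days, rng):
    """observed: per-day tuples of rainfall at each site; site 0 is the target."""
    wet = [1 if day[0] > 0 else 0 for day in observed]
    p_wet = fit_occurrence(wet)
    wet_days = [day for day in observed if day[0] > 0]
    n_sites = len(observed[0])
    state, out = wet[0], []
    for _ in range(n_days):
        # First-order chain: today's state depends only on yesterday's.
        state = 1 if rng.random() < p_wet[state] else 0
        # On wet days, reuse one observed day at all sites simultaneously,
        # so the between-site relationship of that day is kept.
        out.append(rng.choice(wet_days) if state else (0.0,) * n_sites)
    return out

# Hypothetical two-site record (mm/day), target site first.
observed = [(0.0, 0.0), (3.2, 1.1), (5.0, 4.2), (0.0, 0.0),
            (0.0, 0.4), (1.0, 0.0), (2.5, 2.0), (0.0, 0.0)]
simulated = simulate(observed, n_days=20, rng=random.Random(1))
```

Because intensities are drawn independently from day to day, this kind of resampling carries no memory beyond the wet/dry state, which is consistent with the weakened temporal clustering the abstract reports.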


Bayesian Clustering of Prostate Cancer Patients by Using a Latent Class Poisson Model (잠재그룹 포아송 모형을 이용한 전립선암 환자의 베이지안 그룹화)

  • Oh, Man-Suk
    • The Korean Journal of Applied Statistics, v.18 no.1, pp.1-13, 2005
  • Latent class models have recently been considered by many researchers and practitioners as a tool for identifying heterogeneous segments or groups in a population and for grouping objects into those segments. In this paper we consider data on prostate cancer patients from the Korean National Cancer Institute and propose a method for grouping the patients using a latent class Poisson model. A Bayesian approach equipped with a Markov chain Monte Carlo method is used to overcome the limits of classical likelihood approaches. Advantages of the proposed Bayesian method are easy estimation of parameters with their standard errors, segmentation of objects into groups, and provision of uncertainty measures for the segmentation. In addition, we provide a method for determining an appropriate number of segments for the given data, so that the method automatically chooses the number of segments and partitions objects into heterogeneous segments.

A Method for Group Mobility Model Construction and Model Representation from Positioning Data Set Using GPGPU (GPGPU에 기반하는 위치 정보 집합에서 집단 이동성 모델의 도출 기법과 그 표현 기법)

  • Song, Ha Yoon;Kim, Dong Yup
    • KIPS Transactions on Software and Data Engineering, v.6 no.3, pp.141-148, 2017
  • Advances in mobile devices now enable users to collect sequences of their positions using positioning technology, and research on positioning and location information is growing accordingly. Individual mobility models based on positioning and time data are already established, whereas group mobility models are not. This research studies the group mobility model, an extension of the individual mobility model, and the process of constructing it. Building on previous research that derived a group mobility model from two individual mobility models, a group mobility model covering more than two individual models is established, and the transition pattern of the model is represented by a Markov chain. For practical application, the computing time needed to construct the group mobility model from very large positioning data sets is drastically reduced by using GPGPU rather than traditional multicore systems.

A Use of Expectation Maximization Clustering for Constructing a Markov Chain of Human Mobility Model (기대치 최대화 기반의 군집화를 통한 인간 이동 패턴의 마르코프 연쇄모델 도출)

  • Kim, Hyunuk;Song, Ha Yoon
    • Proceedings of the Korea Information Processing Society Conference, 2012.04a, pp.864-867, 2012
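  • As people use portable positioning devices or smartphones, it has become possible to collect positioning data that record human movement. In this paper, we use such positioning data to represent a model of human mobility. Mobility data can be divided into a staying (Stay) state and a moving (Moving) state; of these, the staying observations can be clustered so that each cluster appears as a single state in the chain model. The transition probabilities between the states of the chain model can also be computed from the mobility data. Through this series of steps, using expectation-maximization-based clustering, we express human mobility in the form of a continuous-time chain model. The model can also represent representative (macro) clusters and their subordinate (micro) clusters, which appear as small clusters nested within large representative clusters.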

Anomaly Detection in Sensor Data

  • Kim, Jong-Min;Baik, Jaiwook
    • Journal of Applied Reliability, v.18 no.1, pp.20-32, 2018
  • Purpose: The purpose of this study is to set up anomaly detection criteria for sensor data coming from a motorcycle. Methods: Five sensor values (accelerator pedal, engine rpm, transmission rpm, gear, and speed) are obtained every 0.02 seconds from a motorcycle. Exploratory data analysis is used to find patterns in the data. Traditional process control methods, such as the X control chart, and time series models are fitted to find anomalous behavior in the data. Finally, an unsupervised learning algorithm, k-means clustering, is used to find anomalous spots in the sensor data. Results: According to the exploratory data analysis, the distribution of accelerator pedal sensor values is heavily skewed to the left. The motorcycle appears to have been driven in a city at speeds below 45 kilometers per hour. Traditional process control charts such as the X control chart fail because of severe autocorrelation in each sensor's data. However, the ARIMA model found three abnormal points beyond the 2-sigma limits of the control chart. We applied a copula-based Markov chain to perform statistical process control for correlated observations. The copula-based Markov model found anomalous behavior in places similar to those found by the ARIMA model. With the unsupervised learning algorithm, large sensor values are subdivided into two, three, and four disjoint regions, so extreme sensor values are the ones that need to be tracked for any sign of anomalous behavior. Conclusion: Exploratory data analysis is useful for finding patterns in the sensor data. Process control charts using ARIMA and Joe's copula-based Markov model also give warnings at similar places in the data. The unsupervised learning algorithm shows that extreme sensor values are the ones that need to be tracked for any sign of anomalous behavior.