• Title/Summary/Keyword: Expectation and Maximization Algorithm

Controlling Linkage Disequilibrium in Association Tests: Revisiting APOE Association in Alzheimer's Disease

  • Park, Lee-Young
    • Genomics & Informatics
    • /
    • v.5 no.2
    • /
    • pp.61-67
    • /
    • 2007
  • The allele frequencies of markers, as well as their linkage disequilibrium (LD), can be altered in cases because of LD between the markers and the disease allele, producing spurious marker associations. To identify the true association, classical statistical tests for handling confounders have been applied to decide whether the association of a variant arises from LD with the known disease allele. However, a more direct test that accounts for LD using estimated haplotype frequencies may be more efficient. The null hypothesis is that the different allele frequencies of a variant between cases and controls derive solely from the increased disease allele frequency and the LD relationship with the disease allele. The haplotype frequencies of controls are estimated from the genotype data using the expectation-maximization (EM) algorithm. The estimated frequencies are then used to calculate the expected haplotype frequencies in cases corresponding to the increase or decrease of the causative or protective alleles. The suggested method was applied to previously published data, and several APOE variants showed association with Alzheimer's disease independent of the APOE ε4 variant, rs429358, with significant simulated p-values regardless of LD. The test results support the possibility that there may be more than one common disease variant in a locus.
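
To make the EM step above concrete, here is a minimal sketch (my own, not the authors' code) of EM haplotype-frequency estimation for two biallelic SNPs from unphased genotype counts; only the double heterozygote has ambiguous phase, and the 3×3 count layout and variable names are assumptions for illustration.

```python
def em_haplotype_freqs(n, tol=1e-10, max_iter=1000):
    """EM haplotype-frequency estimation for two biallelic SNPs.
    n[i][j] = count of people with i copies of allele 'a' at SNP 1 and
    j copies of allele 'b' at SNP 2 (i, j in {0, 1, 2}); only the double
    heterozygote n[1][1] has ambiguous phase (AB/ab vs. Ab/aB)."""
    total = 2.0 * sum(map(sum, n))                 # number of haplotypes
    p = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
    for _ in range(max_iter):
        # E-step: split double heterozygotes between the two phases
        d = p["AB"] * p["ab"] + p["Ab"] * p["aB"]
        w = p["AB"] * p["ab"] / d if d > 0 else 0.5
        # Expected haplotype counts: unambiguous part + split n[1][1]
        c = {"AB": 2 * n[0][0] + n[0][1] + n[1][0] + w * n[1][1],
             "Ab": 2 * n[0][2] + n[0][1] + n[1][2] + (1 - w) * n[1][1],
             "aB": 2 * n[2][0] + n[1][0] + n[2][1] + (1 - w) * n[1][1],
             "ab": 2 * n[2][2] + n[2][1] + n[1][2] + w * n[1][1]}
        # M-step: renormalize expected counts into frequencies
        q = {h: c[h] / total for h in c}
        if max(abs(q[h] - p[h]) for h in p) < tol:
            return q
        p = q
    return p

# e.g. strong LD: most of the mass lands on haplotypes AB and ab
print(em_haplotype_freqs([[40, 10, 1], [12, 30, 9], [2, 11, 35]]))
```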

Semi-Supervised Recursive Learning of Discriminative Mixture Models for Time-Series Classification

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.13 no.3
    • /
    • pp.186-199
    • /
    • 2013
  • We pose pattern classification as a density estimation problem in which we consider mixtures of generative models under partially labeled data setups. Unlike traditional approaches that estimate density everywhere in data space, we focus on the density along the decision boundary, which can yield more discriminative models with superior classification performance. We extend our earlier work on the recursive estimation method for discriminative mixture models to semi-supervised learning setups where some of the data points lack class labels. Our model exploits the mixture structure in the functional gradient framework: it searches for the base mixture component model in a greedy fashion, maximizing the conditional class likelihoods for the labeled data while minimizing the uncertainty of class label prediction for unlabeled data points. The objective can be effectively imposed as individual mixture component learning on weighted data, so our mixture learning typically becomes highly efficient for popular base generative models like Gaussians or hidden Markov models. Moreover, compared with the expectation-maximization algorithm, the proposed recursive estimation has several advantages, including no need for a pre-determined mixture order and robustness to the choice of initial parameters. We demonstrate the benefits of the proposed approach on a comprehensive set of evaluations consisting of diverse time-series classification problems in semi-supervised scenarios.
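
The recursive component search is beyond a short sketch, but the semi-supervised objective the abstract describes can be written down directly. Below is a hedged NumPy/SciPy sketch that scores class-conditional Gaussian mixtures by labeled conditional log-likelihood minus unlabeled predictive entropy; the function names and mixture representation are mine, not the paper's.

```python
import numpy as np
from scipy.stats import multivariate_normal

def class_posteriors(X, mixtures, priors):
    """mixtures[c] = list of (weight, mean, cov) tuples for class c."""
    like = np.stack([sum(w * multivariate_normal.pdf(X, m, C)
                         for w, m, C in mixtures[c])
                     for c in range(len(mixtures))], axis=1)   # p(x | c)
    joint = like * np.asarray(priors)                          # p(x, c)
    return joint / joint.sum(axis=1, keepdims=True)            # p(c | x)

def semi_supervised_objective(Xl, yl, Xu, mixtures, priors):
    """Labeled conditional log-likelihood minus unlabeled predictive
    entropy -- the quantity the greedy component search would maximize."""
    Pl = class_posteriors(Xl, mixtures, priors)
    cond_ll = np.log(Pl[np.arange(len(yl)), yl] + 1e-12).sum()
    Pu = class_posteriors(Xu, mixtures, priors)
    entropy = -(Pu * np.log(Pu + 1e-12)).sum()
    return cond_ll - entropy
```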

Power Consumption Patterns Analysis Using Expectation-Maximization Clustering Algorithm and Emerging Pattern Mining (기대치-최대화 군집 알고리즘과 출현 패턴 마이닝을 이용한 전력 소비 패턴 분석)

  • Jin Hyoung Park;Heon Gyu Lee;Jin-Ho Shin;Keun Ho Ryu;Hiseok Kim
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2008.11a
    • /
    • pp.261-264
    • /
    • 2008
  • For efficient utility operation and competition in the electricity market, customers' power consumption patterns must be analyzed and accurately predicted. To this end, this paper proposes a mining technique that can accurately predict customers' power consumption patterns from nationwide high-voltage customer data collected by an automatic meter reading system. First, nine feature vectors were extracted to accurately distinguish load patterns matching the characteristics of domestic customers by contract type, and 34 representative load profiles were generated using the expectation-maximization clustering algorithm. Finally, an emerging-pattern-based classification model for each representative profile was built from the extracted feature vectors to classify customers' power consumption patterns. Experiments on data from a total of 3,895 high-voltage customers measured by the domestic automatic meter reading system showed a classification accuracy of about 91%.
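
As a rough illustration of the clustering step only (not the authors' pipeline), the following sketch fits an EM-trained Gaussian mixture to 9-dimensional feature vectors with scikit-learn and reads off representative profiles; the random features stand in for the real meter data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
features = rng.random((3895, 9))      # stand-in for the 9 extracted features

# EM clustering into representative load profiles (34 in the paper)
gmm = GaussianMixture(n_components=34, covariance_type="diag",
                      random_state=0).fit(features)
profile_id = gmm.predict(features)    # representative-profile label per customer
profiles = gmm.means_                 # the representative load profiles
```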

Color Image Segmentation Based on Morphological Operation and a Gaussian Mixture Model (모폴로지 연산과 가우시안 혼합 모형에 기반한 컬러 영상 분할)

  • Lee Myung-Eun;Park Soon-Young;Cho Wan-Hyun
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.43 no.3 s.309
    • /
    • pp.84-91
    • /
    • 2006
  • In this paper, we present a new segmentation algorithm for color images based on mathematical morphology and a Gaussian mixture model (GMM). We use morphological operations to determine the number of components in the mixture model and to detect the mode of each mixture component. Next, we adopt the GMM to represent the probability distribution of color feature vectors and use the deterministic annealing expectation-maximization (DAEM) algorithm to estimate the parameters of the GMM that statistically represents the multi-colored objects. Finally, we segment the color image using the posterior probability of each pixel computed from the GMM. The experimental results show that the morphological operations are effective for determining the number of components and the initial mode of each component in the mixture model. They also show that the proposed DAEM provides a globally optimal solution for parameter estimation in the mixture model, and that natural color images are segmented efficiently using the GMM with parameters estimated by the morphological operations and the DAEM algorithm.
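
For reference, a minimal NumPy/SciPy sketch of deterministic-annealing EM for a GMM, showing the tempered E-step that distinguishes DAEM from plain EM. The annealing schedule is illustrative, and the initial means, which the paper obtains from morphological mode detection, are plain inputs here.

```python
import numpy as np
from scipy.stats import multivariate_normal

def daem_gmm(X, means, covs, weights,
             betas=(0.2, 0.4, 0.6, 0.8, 1.0), iters_per_beta=20):
    """Deterministic-annealing EM for a GMM. The initial `means` would come
    from the paper's morphological mode detection; here they are inputs."""
    K = len(weights)
    for beta in betas:                     # anneal the inverse temperature
        for _ in range(iters_per_beta):
            # E-step: tempered responsibilities (beta = 1 is plain EM)
            p = np.stack([w * multivariate_normal.pdf(X, m, c)
                          for w, m, c in zip(weights, means, covs)], axis=1)
            p = p ** beta
            r = p / p.sum(axis=1, keepdims=True)
            # M-step: standard responsibility-weighted updates
            nk = r.sum(axis=0)
            weights = nk / len(X)
            means = [(r[:, k:k + 1] * X).sum(axis=0) / nk[k]
                     for k in range(K)]
            covs = [(r[:, k, None, None] *
                     np.einsum('ni,nj->nij', X - means[k], X - means[k])
                     ).sum(axis=0) / nk[k] + 1e-6 * np.eye(X.shape[1])
                    for k in range(K)]
    return weights, means, covs, r
```

Segmentation would then assign each pixel to the component with the largest posterior, e.g. `r.argmax(axis=1)`.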

Depth Map Pre-processing using Gaussian Mixture Model and Mean Shift Filter (혼합 가우시안 모델과 민쉬프트 필터를 이용한 깊이 맵 부호화 전처리 기법)

  • Park, Sung-Hee;Yoo, Ji-Sang
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.15 no.5
    • /
    • pp.1155-1163
    • /
    • 2011
  • In this paper, we propose a new pre-processing algorithm for depth maps to improve coding efficiency. The 3DV/FTV group in MPEG is currently working on the 3DVC (3D video coding) standard, but a compression method for depth map images has not yet been confirmed. In the proposed algorithm, after dividing the histogram distribution of a given depth map by EM clustering based on a GMM, we classify the depth map into several layered images. We then apply a different mean shift filter to each classified image according to whether it contains background or foreground. In other words, we try to maximize coding efficiency while keeping the boundary of each object and averaging over the inner region of the boundary. Experiments on many test images show that the proposed algorithm achieves a bit reduction of 19%~20% and also reduces computation time.
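
A hedged sketch of the layering step only: fit a GMM to the depth-value histogram by EM and split the map into layered images. The layer count is illustrative, and the per-layer mean shift filtering described in the abstract is omitted.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def split_depth_layers(depth, n_layers=4):
    """EM/GMM clustering of depth values into layered images; a different
    mean shift filter would then be applied to each layer depending on
    whether it holds foreground or background (omitted here)."""
    vals = depth.reshape(-1, 1).astype(float)
    labels = GaussianMixture(n_components=n_layers,
                             random_state=0).fit_predict(vals)
    labels = labels.reshape(depth.shape)
    return [np.where(labels == k, depth, 0) for k in range(n_layers)]

depth = np.random.default_rng(0).integers(0, 256, (48, 64))  # toy depth map
layers = split_depth_layers(depth)
```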

Optimization of Gaussian Mixture in CDHMM Training for Improved Speech Recognition

  • Lee, Seo-Gu;Kim, Sung-Gil;Kang, Sun-Mee;Ko, Han-Seok
    • Speech Sciences
    • /
    • v.5 no.1
    • /
    • pp.7-21
    • /
    • 1999
  • This paper proposes an improved training procedure for speech recognition based on the continuous density hidden Markov model (CDHMM). Of the three parameters governing the CDHMM (initial state distribution probability, state transition probability, and the output probability density function (p.d.f.) of each state), we focus on the third and propose an efficient algorithm that determines the p.d.f. of each state. It is known that the resulting CDHMM converges to a local maximum of the parameter estimates via the iterative expectation-maximization procedure. Specifically, we propose two independent algorithms that can be embedded in the segmental K-means training procedure by replacing relevant key steps: adaptation of the number of Gaussian mixture p.d.f.s and initialization using previously estimated CDHMM parameters. The proposed adaptation algorithm searches for the optimal number of Gaussian mixture components so that the p.d.f. is consistently re-estimated, enabling the model to converge toward the global maximum. The optimized number of Gaussian mixture components is determined by applying an appropriate threshold to the amount of collective change in the weighted variances. The initialization algorithm exploits the previously estimated CDHMM parameters as the basis for the current initial segmentation subroutine; it captures the trend of the previous training history, whereas uniform segmentation discards it. The recognition performance of the proposed adaptation procedure, together with the suggested initialization, is verified to be consistently better than that of the existing training procedure using a fixed number of Gaussian mixture p.d.f.s.
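
The paper's exact threshold criterion is not given in the abstract, so the following is only a speculative sketch of the idea: grow the mixture count until the collective change of weighted variances between re-estimations falls below a threshold. `train_state` is a hypothetical callback standing in for one segmental K-means re-estimation pass.

```python
import numpy as np

def weighted_variance_change(w_old, v_old, w_new, v_new):
    """Collective change of the weighted variances between two
    re-estimations; the exact formula here is an assumption."""
    old = sum(w * np.asarray(v) for w, v in zip(w_old, v_old))
    new = sum(w * np.asarray(v) for w, v in zip(w_new, v_new))
    return np.abs(new - old).sum() / np.abs(old).sum()

def adapt_mixture_count(train_state, m=1, threshold=0.05, m_max=16):
    """Grow the per-state mixture count until the re-estimated weighted
    variances stabilize. `train_state(m)` is a hypothetical callback that
    runs one segmental K-means pass with m mixtures and returns
    (mixture_weights, mixture_variances)."""
    w_old, v_old = train_state(m)
    while m < m_max:
        w_new, v_new = train_state(m + 1)
        if weighted_variance_change(w_old, v_old, w_new, v_new) < threshold:
            break
        m, (w_old, v_old) = m + 1, (w_new, v_new)
    return m
```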

Factored MLLR Adaptation for HMM-Based Speech Synthesis in Naval-IT Fusion Technology (인자화된 최대 공산선형회귀 적응기법을 적용한 해양IT융합기술을 위한 HMM기반 음성합성 시스템)

  • Sung, June Sig;Hong, Doo Hwa;Jeong, Min A;Lee, Yeonwoo;Lee, Seong Ro;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38C no.2
    • /
    • pp.213-218
    • /
    • 2013
  • One of the most popular approaches to parameter adaptation in hidden Markov model (HMM) based systems is the maximum likelihood linear regression (MLLR) technique. In our previous study, we proposed factored MLLR (FMLLR), in which each MLLR parameter is defined as a function of a control vector, and presented a method to train the FMLLR parameters based on a general framework of the expectation-maximization (EM) algorithm. Using the proposed algorithm, supplementary information that cannot be included in the models is effectively reflected in the adaptation process. In this paper, we apply the FMLLR algorithm to a pitch sequence as well as spectrum parameters. In a series of experiments on artificial generation of expressive speech, we evaluate the performance of the FMLLR technique and compare it with other approaches to parameter adaptation in HMM-based speech synthesis.
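
As a loose illustration of the FMLLR idea (the precise parameterization is not stated in the abstract), the sketch below makes the MLLR mean transform a linear function of a control vector via basis matrices, so the adapted mean is A(v)·µ + b(v); the basis form and all names are assumptions.

```python
import numpy as np

def factored_mllr_mean(mu, bases_A, bases_b, v):
    """Adapted mean mu' = A(v) @ mu + b(v), with the MLLR transform taken
    to be a linear combination of basis matrices weighted by the control
    vector v (an assumed parameterization, for illustration only)."""
    A = sum(vj * Aj for vj, Aj in zip(v, bases_A))
    b = sum(vj * bj for vj, bj in zip(v, bases_b))
    return A @ mu + b

d, J = 3, 2                                  # feature dim, control-vector dim
rng = np.random.default_rng(0)
bases_A = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(J)]
bases_b = [0.1 * rng.standard_normal(d) for _ in range(J)]
mu = rng.standard_normal(d)
print(factored_mllr_mean(mu, bases_A, bases_b, v=[1.0, 0.5]))
```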

Performance Evaluation of Reconstruction Algorithms for DMIDR (DMIDR 장치의 재구성 알고리즘 별 성능 평가)

  • Kwak, In-Suk;Lee, Hyuk;Moon, Seung-Cheol
    • The Korean Journal of Nuclear Medicine Technology
    • /
    • v.23 no.2
    • /
    • pp.29-37
    • /
    • 2019
  • Purpose: DMIDR (Discovery Molecular Imaging Digital Ready, General Electric Healthcare, USA) is a PET/CT scanner designed to allow application of PSF (point spread function), TOF (time of flight), and the Q.Clear algorithm. In particular, Q.Clear is a reconstruction algorithm that can overcome the limitations of OSEM (ordered subset expectation maximization) and reduce image noise at the voxel level. The aim of this paper is to evaluate the performance of the reconstruction algorithms and to optimize the algorithm combination to improve the accuracy of SUV (standardized uptake value) measurement and lesion detectability. Materials and Methods: A PET phantom was filled with ¹⁸F-FDG at hot-to-background radioactivity concentration ratios of 2:1, 4:1, and 8:1. Scans were performed using the NEMA protocols. The scan data were reconstructed with the following combinations: (1) VPFX (VUE Point FX (TOF)), (2) VPHD-S (VUE Point HD + PSF), (3) VPFX-S (TOF + PSF), (4) QCHD-S-400 (VUE Point HD + Q.Clear (β-strength 400) + PSF), (5) QCFX-S-400 (TOF + Q.Clear (β-strength 400) + PSF), (6) QCHD-S-50 (VUE Point HD + Q.Clear (β-strength 50) + PSF), and (7) QCFX-S-50 (TOF + Q.Clear (β-strength 50) + PSF). CR (contrast recovery) and BV (background variability) were compared, and the SNR (signal-to-noise ratio) and RC (recovery coefficient) of counts and SUV were compared respectively. Results: VPFX-S showed the highest CR value for the 10 and 13 mm spheres, and QCFX-S-50 showed the highest value for spheres larger than 17 mm. In the comparison of BV and SNR, QCFX-S-400 and QCHD-S-400 showed good results. The SUV measurements were proportional to the H/B ratio; the RC for SUV was inversely proportional to the H/B ratio, with QCFX-S-50 showing the highest value, and the Q.Clear reconstructions using a β-strength of 400 showed lower values. Conclusion: When a higher β-strength was applied, Q.Clear showed better image quality by reducing noise. Conversely, with a lower β-strength, Q.Clear showed increased sharpness and decreased PVE (partial volume effect), making it possible to measure SUV with a high RC compared to conventional reconstruction conditions. An appropriate choice among these reconstruction algorithms can improve accuracy and lesion detectability, so it is necessary to optimize the algorithm parameters according to the purpose.
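
The CR and BV figures referenced above follow the usual NEMA definitions, which are easy to state in code; the numbers in the demo below are illustrative, not the paper's measurements.

```python
import numpy as np

def contrast_recovery(hot_mean, bkg_mean, true_ratio):
    """NEMA-style percent contrast recovery for a hot sphere."""
    return 100.0 * (hot_mean / bkg_mean - 1.0) / (true_ratio - 1.0)

def background_variability(bkg_roi_means):
    """Percent background variability across background ROIs."""
    b = np.asarray(bkg_roi_means, float)
    return 100.0 * b.std(ddof=1) / b.mean()

# e.g. a 4:1 hot-to-background phantom (illustrative numbers)
print(contrast_recovery(hot_mean=3.2, bkg_mean=1.0, true_ratio=4.0))  # ~73%
print(background_variability([0.98, 1.02, 1.01, 0.97, 1.03]))
```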

Method for Channel Estimation in Ambient Backscatter Communication (주변 후방산란 통신에서의 채널 추정기법)

  • Kim, Soo-Hyun;Lee, Donggu;Sun, Young-Ghyu;Sim, Issac;Hwang, Yu-Min;Shin, Yoan;Kim, Dong-In;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.4
    • /
    • pp.7-12
    • /
    • 2019
  • Because of transmit-power constraints, ambient backscatter communication cannot rely on pilot-based channel estimation, the standard approach in current RF communication. In this limited-transmission-power environment, previous research on ambient backscatter communication has assumed an ideal channel without signal distortion due to channel conditions. In this paper, we propose the expectation-maximization (EM) algorithm, one of the blind channel estimation techniques, as a channel estimation method for an ambient backscatter communication system whose channel state follows a normal distribution. For the proposed system model, simulations confirm that the channel estimate obtained through the EM algorithm approaches the lower bound on the mean square error given by the Bayesian Cramér-Rao bound (BCRB). This shows that the channel parameters can be estimated in an ambient backscatter communication system.
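
A simplified, real-valued stand-in for the blind estimation idea: with the backscatter bit unknown, the received samples form a two-level Gaussian mixture, and EM recovers the levels (hence the channel gain) without pilots. The system model here is an assumption for illustration, not the paper's.

```python
import numpy as np

def em_channel_estimate(y, iters=50):
    """Blind EM fit of a two-level Gaussian mixture to unlabeled received
    samples y; the two means are the received levels for backscatter bit
    0 and 1, so their difference estimates the backscatter channel gain."""
    mu = np.percentile(y, [25.0, 75.0])           # initialize the two levels
    sigma2, pi1 = y.var(), 0.5
    for _ in range(iters):
        # E-step: posterior that each sample carries bit 1 (shared variance,
        # so the Gaussian normalizing constant cancels)
        l0 = (1 - pi1) * np.exp(-(y - mu[0]) ** 2 / (2 * sigma2))
        l1 = pi1 * np.exp(-(y - mu[1]) ** 2 / (2 * sigma2))
        g = l1 / (l0 + l1)
        # M-step: re-estimate levels, shared variance, and bit prior
        mu[0] = ((1 - g) * y).sum() / (1 - g).sum()
        mu[1] = (g * y).sum() / g.sum()
        sigma2 = ((1 - g) * (y - mu[0]) ** 2 + g * (y - mu[1]) ** 2).mean()
        pi1 = g.mean()
    return mu, sigma2

rng = np.random.default_rng(1)
bits = rng.integers(0, 2, 2000)
y = 0.2 + 0.8 * bits + 0.1 * rng.standard_normal(2000)  # synthetic samples
print(em_channel_estimate(y)[0])                         # approx. [0.2, 1.0]
```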

A Study on GPU-based Iterative ML-EM Reconstruction Algorithm for Emission Computed Tomographic Imaging Systems (방출단층촬영 시스템을 위한 GPU 기반 반복적 기댓값 최대화 재구성 알고리즘 연구)

  • Ha, Woo-Seok;Kim, Soo-Mee;Park, Min-Jae;Lee, Dong-Soo;Lee, Jae-Sung
    • Nuclear Medicine and Molecular Imaging
    • /
    • v.43 no.5
    • /
    • pp.459-467
    • /
    • 2009
  • Purpose: Maximum likelihood-expectation maximization (ML-EM) is a statistical reconstruction algorithm derived from a probabilistic model of the emission and detection processes. Although ML-EM has many advantages in accuracy and utility, its use is limited by the computational burden of iterative processing on a CPU (central processing unit). In this study, we developed a parallel computing technique on a GPU (graphics processing unit) for the ML-EM algorithm. Materials and Methods: Using a GeForce 9800 GTX+ graphics card and CUDA (compute unified device architecture), NVIDIA's technology, the projection and backprojection in the ML-EM algorithm were parallelized. The time spent per iteration on computing the projection, the errors between measured and estimated data, and the backprojection was measured. The total time included the latency of data transmission between RAM and GPU memory. Results: The total computation times of the CPU- and GPU-based ML-EM with 32 iterations were 3.83 and 0.26 sec, respectively; the computing speed was improved about 15-fold on the GPU. When the number of iterations increased to 1,024, the CPU- and GPU-based computations took 18 min and 8 sec in total, respectively. The improvement was about 135-fold and was caused by delays in the CPU-based computation after a certain number of iterations. In contrast, the GPU-based computation showed very little variation in time per iteration owing to the use of shared memory. Conclusion: The GPU-based parallel computation for ML-EM significantly improved computing speed and stability. The developed GPU-based ML-EM algorithm could easily be modified for other imaging geometries.
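
The ML-EM update itself is compact; the paper's contribution is mapping its projection and backprojection steps to parallel CUDA kernels. A plain NumPy sketch of the update, with a toy system matrix rather than a real scanner geometry:

```python
import numpy as np

def mlem(A, y, n_iter=32):
    """Plain ML-EM update x <- x / (A^T 1) * A^T (y / (A x)). The paper's
    speed-up comes from running the forward projection (A x) and the
    backprojection (A^T .) as parallel CUDA kernels on the GPU."""
    x = np.ones(A.shape[1])
    sens = A.T @ np.ones(A.shape[0])          # sensitivity image, A^T 1
    for _ in range(n_iter):
        proj = A @ x                          # forward projection
        ratio = np.where(proj > 0, y / proj, 0.0)
        x *= (A.T @ ratio) / np.maximum(sens, 1e-12)
    return x

rng = np.random.default_rng(0)
A = rng.random((64, 16))                      # toy system matrix
x_true = rng.random(16)
x_hat = mlem(A, A @ x_true)                   # reconstruct a noiseless sinogram
```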