• Title/Summary/Keyword: GMM model

Search Result 131, Processing Time 0.025 seconds

Gaussian Mixture Model using Minimum Classification Error for Environmental Sounds Recognition Performance Improvement (Minimum Classification Error 방법 도입을 통한 Gaussian Mixture Model 환경음 인식성능 향상)

  • Han, Da-Jeong;Park, Aa-Ron;Park, Jun-Qyu;Baek, Sung-June
    • The Journal of the Korea Contents Association / v.11 no.12 / pp.497-503 / 2011
  • In this paper, we propose minimum classification error (MCE) training as a GMM training method to improve the performance of environmental sound recognition. For discriminative training, we model the environmental sound data with a newly defined misclassification function that uses the log likelihood of the corresponding class and the log likelihoods of the remaining classes. The model parameters are estimated from the loss function using GPD (generalized probabilistic descent). To compare recognition performance, we extracted 12th-order features from 9 kinds of environmental sounds using preprocessing and MFCC (mel-frequency cepstral coefficients) and carried out GMM classification experiments. According to the experimental results, the MCE training method showed the best performance, an average of 87.06% with 19 mixtures. This result confirms that MCE training can be used effectively as a GMM training method in environmental sound recognition.
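
The abstract does not give its newly defined misclassification function, so the following is only a minimal sketch of the widely used textbook MCE form: the negative log likelihood of the true class plus a smoothed maximum over the competing classes, passed through a sigmoid loss. The function names and the parameters `eta` and `gamma` are assumptions for illustration, not the paper's notation.

```python
import math

def misclassification(log_liks, k, eta=1.0):
    """Textbook MCE measure (assumed form, not the paper's definition):
    d_k = -g_k + (1/eta) * log( mean over j != k of exp(eta * g_j) ),
    where g_j is the log likelihood of class j. d_k < 0 means class k wins."""
    rivals = [g for j, g in enumerate(log_liks) if j != k]
    m = max(rivals)  # shift by the maximum for a numerically stable log-sum-exp
    mean_exp = sum(math.exp(eta * (g - m)) for g in rivals) / len(rivals)
    return -log_liks[k] + m + math.log(mean_exp) / eta

def sigmoid_loss(d, gamma=1.0):
    """Smooth 0-1 loss; GPD would take gradient steps on this quantity."""
    return 1.0 / (1.0 + math.exp(-gamma * d))
```

A sample whose true class has the highest log likelihood yields a negative measure and a loss below 0.5, so gradient steps on the loss mainly adjust the models around confusable samples.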

Background Subtraction based on GMM for Night-time Video Surveillance (야간 영상 감시를 위한 GMM기반의 배경 차분)

  • Yeo, Jung Yeon;Lee, Guee Sang
    • Smart Media Journal / v.4 no.3 / pp.50-55 / 2015
  • In this paper, we present a background modeling method based on the Gaussian mixture model (GMM) for background subtraction in night-time video surveillance. In night-time video, it is hard to distinguish an object from the background because background pixels are similar to object pixels. To solve this problem, in a preprocessing step we use scaled histogram stretching to map the pixels of the input frame to values better suited to building the Gaussian mixture model. Using the scaled pixel values of the input frame, we then apply the GMM to estimate the ideal background pixel by pixel. When the pixel of the next frame does not fall within any Gaussian, the matching test in the conventional GMM method discards stored background information by eliminating the Gaussian distribution with low weight. We therefore retain the accumulated data by applying the difference between the old mean and the new pixel intensity to the new mean instead of removing the low-weight Gaussian. Experiments demonstrate the superiority of the proposed background modeling method.
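
For orientation, the conventional per-pixel matching test that the abstract contrasts itself with can be sketched as below. This is a simplified illustration for one grayscale pixel, not the paper's method: the function name, the dictionary layout, and the parameters `alpha`, `k`, `init_var`, and `init_weight` are assumptions, and the single learning rate `alpha` stands in for the per-component rate used in fuller formulations. The `else` branch is the step that throws away the low-weight Gaussian; the proposed modification described above would replace that branch.

```python
import math

def update_pixel(gaussians, x, alpha=0.05, k=2.5, init_var=225.0, init_weight=0.05):
    """Conventional (simplified) GMM update for one pixel intensity x.
    gaussians: list of dicts with keys 'w', 'mu', 'var'.
    Returns True if x matched an existing Gaussian."""
    matched = None
    for g in gaussians:
        # Matching test: x lies within k standard deviations of the mean.
        if abs(x - g['mu']) <= k * math.sqrt(g['var']):
            matched = g
            break
    # Decay every weight and reinforce the matched component.
    for g in gaussians:
        g['w'] = (1 - alpha) * g['w'] + (alpha if g is matched else 0.0)
    if matched is not None:
        d = x - matched['mu']
        matched['mu'] += alpha * d
        matched['var'] = (1 - alpha) * matched['var'] + alpha * d * d
    else:
        # Conventional step: overwrite the lowest-weight Gaussian,
        # discarding whatever background information it held.
        weakest = min(gaussians, key=lambda g: g['w'])
        weakest.update(w=init_weight, mu=float(x), var=init_var)
    total = sum(g['w'] for g in gaussians)
    for g in gaussians:
        g['w'] /= total  # keep the weights summing to 1
    return matched is not None
```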

Implementation of Vocabulary-Independent Keyword Spotting System (가변어휘 핵심어 검출 시스템의 구현)

  • Shin Young Wook;Song Myung Gyu;Kim Hyung Soon
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.167-170 / 2000
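  • In this paper, we construct keyword models with HMMs that use triphones as the basic unit, and implement a vocabulary-independent keyword spotter that lets users add and change keywords arbitrarily. We compare the performance of two non-keyword modeling methods: one using monophone clustering and one using a GMM. In the post-processing stage, we use anti-subword models suited to the vocabulary-independent recognition structure and examine post-processing performance for several implementation schemes. Experimental results show that using a GMM as the non-keyword model yields a slight improvement in recognition performance over clustering monophones, and that in post-processing the anti-subword modeling scheme based on the Kullback distance outperforms the other schemes.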

A Content-Based Image Retrieval Technique Using the Shape and Color Features of Objects (객체의 모양과 색상특징을 이용한 내용기반 영상검색 기법)

  • 박종현;박순영;오일환
    • The Journal of Korean Institute of Communications and Information Sciences / v.24 no.10B / pp.1902-1911 / 1999
  • In this paper, we present a content-based image retrieval algorithm using visual feature vectors that describe the spatial characteristics of objects. The proposed technique uses the Gaussian mixture model (GMM) to represent multi-colored objects, and the expectation maximization (EM) algorithm is employed to estimate the maximum likelihood (ML) parameters of the model. After image segmentation is performed based on the GMM, shape and color features are extracted from each object using Fourier descriptors and color histograms, respectively. Image retrieval consists of two steps: first, a shape-based query finds the candidate images whose objects have shapes similar to those in the query image; second, a color-based query follows. The experimental results show that the proposed algorithm is effective for image retrieval using the spatial and visual features of segmented objects.
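
The EM estimation of GMM parameters mentioned in this abstract can be illustrated with a minimal one-dimensional sketch. The paper works on color image data, so the scalar data, the function names, and the `var_floor` safeguard here are assumptions chosen only to keep the example short.

```python
import math

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, mus, variances, weights, n_iter=50, var_floor=1e-6):
    """Run EM for a 1-D GMM starting from the given initial parameters."""
    mus, variances, weights = list(mus), list(variances), list(weights)
    K = len(mus)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [weights[k] * normal_pdf(x, mus[k], variances[k]) for k in range(K)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate mean, variance, and weight of each component.
        for k in range(K):
            nk = sum(r[k] for r in resp)
            mus[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mus[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var_floor, var)  # guard against collapse
            weights[k] = nk / len(data)
    return mus, variances, weights
```

Each iteration does not decrease the data likelihood, which is why EM is the usual route to the ML parameters; segmentation would then assign each pixel to the component with the largest responsibility.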

A Neuro-Fuzzy Modeling using the Hierarchical Clustering and Gaussian Mixture Model (계층적 클러스터링과 Gaussian Mixture Model을 이용한 뉴로-퍼지 모델링)

  • Kim, Sung-Suk;Kwak, Keun-Chang;Ryu, Jeong-Woong;Chun, Myung-Geun
    • Journal of the Korean Institute of Intelligent Systems / v.13 no.5 / pp.512-519 / 2003
  • In this paper, we propose a neuro-fuzzy modeling method that improves performance using hierarchical clustering and the Gaussian Mixture Model (GMM). The hierarchical clustering algorithm produces unique parameters for the given data because it does not use an objective function to perform the clustering. After optimizing the obtained parameters using the GMM, we apply them as initial parameters for the Adaptive Network-based Fuzzy Inference System. Here, the number of fuzzy rules equals the number of clusters. In this way, we can improve the performance index and reduce the number of rules simultaneously. The proposed method is verified by applying it to neuro-fuzzy modeling of the Box-Jenkins gas furnace data and Sugeno's nonlinear system, where it yields better results than previous ones.

A Study on Background Speaker Model Design for Portable Speaker Verification Systems (휴대용 화자확인시스템을 위한 배경화자모델 설계에 관한 연구)

  • Choi, Hong-Sub
    • Speech Sciences / v.10 no.2 / pp.35-43 / 2003
  • General speaker verification systems improve their recognition performance by normalizing the log likelihood ratio, using the model of the speaker to be verified and its background speaker model. These systems therefore rely heavily on the availability of large speaker-independent databases for background speaker model design. This constraint, however, may be a burden in practical portable devices such as palm-top computers or wireless handsets, which place a premium on computation and memory. In this paper, a new approach to GMM-based background model design for portable speaker verification systems is presented for the case where the enrollment data is available. The approach modifies three parameters of the GMM speaker model (mixture weights, means, and covariances) along with a reduced mixture order. In an experiment on a 20-speaker population from the YOHO database, we found that this method shows promise for effective use in a portable speaker verification system.
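
The log likelihood ratio scoring that such systems rely on can be sketched as follows. This is a generic illustration with diagonal-covariance GMMs, not the background model design proposed in the paper; the function names and the `(weights, means, variances)` tuple layout are assumptions.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log likelihood of one feature vector under a diagonal-covariance GMM."""
    logs = []
    for w, mu, var in zip(weights, means, variances):
        ll = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            ll += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        logs.append(ll)
    m = max(logs)  # log-sum-exp over mixture components
    return m + math.log(sum(math.exp(l - m) for l in logs))

def llr_score(frames, speaker, background):
    """Average per-frame log likelihood ratio between the claimed speaker's
    model and the background model; each model is (weights, means, variances)."""
    total = sum(gmm_log_likelihood(x, *speaker) - gmm_log_likelihood(x, *background)
                for x in frames)
    return total / len(frames)
```

A claim would be accepted when the score exceeds a threshold tuned on held-out data. Subtracting the background log likelihood is the normalization step, which is why the quality of the background model matters so much on small devices.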

Performance Improvement in GMM-based Text-Independent Speaker Verification System (GMM 기반의 문맥독립 화자 검증 시스템의 성능 향상)

  • Hahm Seong-Jun;Shen Guang-Hu;Kim Min-Jung;Kim Joo-Gon;Jung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.131-134 / 2004
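  • In this paper, we implement a text-independent speaker verification system using the GMM (Gaussian Mixture Model) and carry out speaker verification experiments with a normalization method based on the arctan function. As feature parameters, we use cepstral coefficients obtained by linear prediction together with their regression coefficients, and we apply CMN (Cepstral Mean Normalization) to account for variation in the speakers' utterances. In the training stage, the speaker models are built with the GMM, which represents the acoustic characteristics of a speaker's utterances well. In the verification stage, the similarity is computed using ML (Maximum Likelihood), and the score normalized by the conventional method or by the arctan-based method is compared with a predetermined threshold. The experiments confirm that the method with the added arctan function always yields a better EER than the conventional method.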

Fast MOG Algorithm Using Object Prediction (객체 예측을 이용한 고속 MOG 알고리즘)

  • Oh, Jeong-Su
    • Journal of the Korea Institute of Information and Communication Engineering / v.18 no.11 / pp.2721-2726 / 2014
  • In a MOG algorithm that uses the GMM to subtract the background, the model parameter computation and the object classification performed at every pixel require heavy computation and are the chief obstacles to its use. This paper proposes a fast MOG algorithm that, on the basis of object prediction, partly adopts a simple model parameter computation and an object classification skip. The former is applied to pixels that have little effect on the model parameters, and the latter is applied to pixels whose object prediction is firmly trusted. In comparative experiments between the conventional and proposed algorithms using videos, the proposed algorithm carries out the simple model parameter computation and the object classification skip over 77.75% and 92.97%, respectively, yet it retains more than 99.98% and 99.36% in image-unit and moving-object-unit average classification accuracy, respectively.

LSTM RNN-based Korean Speech Recognition System Using CTC (CTC를 이용한 LSTM RNN 기반 한국어 음성인식 시스템)

  • Lee, Donghyun;Lim, Minkyu;Park, Hosung;Kim, Ji-Hwan
    • Journal of Digital Contents Society / v.18 no.1 / pp.93-99 / 2017
  • A hybrid approach using a Long Short Term Memory (LSTM) Recurrent Neural Network (RNN) has shown great improvement in speech recognition accuracy. Training an acoustic model with the hybrid approach requires forced alignment of the HMM state sequence from a Gaussian Mixture Model (GMM)-Hidden Markov Model (HMM). However, training the GMM-HMM requires a long computation time. This paper proposes an end-to-end approach for LSTM RNN-based Korean speech recognition to improve learning speed. A Connectionist Temporal Classification (CTC) algorithm is used to implement this approach. The proposed method showed almost equal recognition rate, while learning was 1.27 times faster.

Classification of Seoul Metro Stations Based on Boarding/Alighting Patterns Using Machine Learning Clustering (기계학습 클러스터링을 이용한 승하차 패턴에 따른 서울시 지하철역 분류)

  • Min, Meekyung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.4 / pp.13-18 / 2018
  • In this study, we classify Seoul metro stations according to boarding and alighting patterns using machine learning techniques. The target data is the hourly number of boarding and alighting passengers per day at 233 subway stations from 2008 to 2017, provided by the public data portal. The Gaussian mixture model (GMM) and K-means clustering are used as the machine learning techniques to classify the subway stations. The distributions of passengers' boarding and alighting times can be modeled by the Gaussian mixture model. The K-means clustering algorithm is then used for unsupervised learning on the data obtained by GMM modeling. As a result, Seoul metro stations are classified into four groups according to boarding and alighting patterns. The results of this study can serve as basic knowledge for analyzing the characteristics of Seoul subway stations from economic, social, and cultural perspectives. The method can also be applied to public data and big data in other areas that require clustering.