• Title, Summary, Keyword: Gaussian mixture model

Analysis and Implementation of Speech/Music Classification for 3GPP2 SMV Based on GMM (3GPP2 SMV의 실시간 음성/음악 분류 성능 향상을 위한 Gaussian Mixture Model의 적용)

  • Song, Ji-Hyun;Lee, Kye-Hwan;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.8 / pp.390-396 / 2007
  • In this letter, we propose a novel approach to improving speech/music classification in the selectable mode vocoder (SMV) of 3GPP2 using a Gaussian mixture model (GMM) based on the expectation-maximization (EM) algorithm. We first analyze the features and the classification method adopted in the conventional SMV. Feature vectors for the GMM are then selected from the relevant SMV parameters for efficient speech/music classification. The proposed algorithm is evaluated under various conditions and yields better results than the conventional SMV scheme.

Measuring of Effectiveness of Tracking Based Accident Detection Algorithm Using Gaussian Mixture Model (가우시안 배경혼합모델을 이용한 Tracking기반 사고검지 알고리즘의 적용 및 평가)

  • Oh, Ju-Taek;Min, Jun-Young
    • International Journal of Highway Engineering / v.14 no.3 / pp.77-85 / 2012
  • Most automatic accident detection algorithms have the problem of detecting an accident as traffic congestion. In practice, traffic center managers handle accidents by watching CCTV or relying on driver reports even when they run an automatic accident detection system. This is because of detection errors, such as flagging non-accidents as accidents, which reduce the system's overall reliability. An automatic accident detection algorithm should therefore have not only a high detection probability but also a low false alarm probability, and it must locate the accident spot accurately. This study verifies and evaluates the effectiveness of using a Gaussian mixture model and individual vehicle tracking to adapt an accident detection algorithm to a center management system, by measuring the accident detection probability and the frequency of false alarms in real accidents.

A Realization of Injurious moving picture filtering system with Gaussian Mixture Model and Frame-level Likelihood Estimation (Gaussian Mixture Model과 프레임 단위 유사도 추정을 이용한 유해동영상 필터링 시스템 구현)

  • Kim, Min-Joung;Jeong, Jong-Hyeog
    • Journal of Korean Institute of Intelligent Systems / v.23 no.2 / pp.184-189 / 2013
  • In this paper, we propose an injurious moving picture filtering system that uses characteristic sounds contained in injurious moving pictures to filter such content, which is distributed without restriction on the Internet and in Internet storage spaces. For this purpose, a Gaussian mixture model, which represents the characteristics of the sounds well, is used, and frame-level likelihood estimation is used to calculate the likelihood between the filtering target data and the sound models. In addition, a pruning method that reduces the number of data comparisons is applied for real-time processing, and the MWMR method, which has performed well in speaker identification, is applied for high-precision discrimination. In the identification experiments, when the frame rate (the proportion of high-likelihood frames among all frames) is set to 50%, the identification error rate is 6.06%, and when it is set to 60%, the error rate is 3.03%. As a result, the proposed system can effectively distinguish between general and injurious moving pictures.
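As a rough illustration of frame-level likelihood estimation with a frame-rate threshold, the sketch below scores each frame under two one-dimensional GMMs and counts the share of frames that favor the target model. It is not the authors' implementation: the scalar features, the comparison against a second "general" model, the toy parameters, and the names `gmm_log_likelihood`, `high_likelihood_frame_ratio`, and `is_target` are all assumptions made for illustration, and the pruning and MWMR steps are omitted.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a scalar frame feature x under a 1-D Gaussian mixture."""
    terms = []
    for w, m, v in zip(weights, means, variances):
        log_pdf = -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        terms.append(math.log(w) + log_pdf)
    # log-sum-exp keeps the mixture sum numerically stable
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def high_likelihood_frame_ratio(frames, target_gmm, general_gmm):
    """Fraction of frames that score higher under target_gmm than general_gmm."""
    hits = sum(
        1 for x in frames
        if gmm_log_likelihood(x, *target_gmm) > gmm_log_likelihood(x, *general_gmm)
    )
    return hits / len(frames)

def is_target(frames, target_gmm, general_gmm, frame_rate_threshold=0.5):
    """Flag the clip when enough frames favor the target sound model."""
    ratio = high_likelihood_frame_ratio(frames, target_gmm, general_gmm)
    return ratio >= frame_rate_threshold

# Hypothetical toy models, each given as (weights, means, variances)
target_gmm = ([0.5, 0.5], [4.0, 6.0], [1.0, 1.0])
general_gmm = ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
frames = [5.1, 4.8, 0.2, 6.3, -0.5]

ratio = high_likelihood_frame_ratio(frames, target_gmm, general_gmm)
flagged = is_target(frames, target_gmm, general_gmm, frame_rate_threshold=0.5)
```

In this toy setup three of the five frames favor the target model, so the clip is flagged at a 50% threshold but would not be at 70%. Raising the threshold trades missed detections for fewer false alarms.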

Performance Comparison of Background Estimation in the Video (영상에서의 배경추정알고리즘 성능 비교)

  • Do, Jin-Kyu;Kim, Gyu-Yeong;Park, Jang-Sik;Kim, Hyun-Tae;Yu, Yun-Sik
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / pp.808-810 / 2011
  • Background estimation algorithms have a significant impact on the performance of image processing and recognition. In this paper, background estimation algorithms are analyzed in terms of complexity and performance as a preprocessing step for image recognition. The performance of the Gaussian running average, mixture of Gaussians, and KDE algorithms is evaluated. The simulation results show that the KDE algorithm outperforms the other algorithms.
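For readers unfamiliar with the simplest of the three methods, the following is a minimal sketch of a per-pixel Gaussian running average. It assumes a common textbook variant with a selective update (pixels flagged as foreground do not update the background statistics). The function name `update_background`, the learning rate `rho`, and the threshold factor `k` are illustrative choices, not values from the paper.

```python
def update_background(mean, var, frame, rho=0.05, k=2.5):
    """One step of a per-pixel Gaussian running average.

    mean, var, and frame are equal-length lists of pixel values.
    Returns (new_mean, new_var, foreground_mask).
    """
    new_mean, new_var, mask = [], [], []
    for m, v, x in zip(mean, var, frame):
        d = x - m
        # A pixel is foreground when it lies more than k standard deviations away
        is_foreground = d * d > (k * k) * v
        mask.append(is_foreground)
        if is_foreground:
            # Selective update: leave the background statistics unchanged
            new_mean.append(m)
            new_var.append(v)
        else:
            # Exponentially weighted update of the mean and variance
            new_mean.append(rho * x + (1 - rho) * m)
            new_var.append(rho * d * d + (1 - rho) * v)
    return new_mean, new_var, mask

# Toy example: three pixels, the last one hit by a bright moving object
mean, var, mask = update_background(
    [100.0, 100.0, 100.0], [4.0, 4.0, 4.0], [101.0, 99.0, 140.0]
)
```

Each pixel keeps only a mean and a variance, which is why this method is cheap compared with a mixture of Gaussians or KDE, both of which store several components or samples per pixel.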

Clustering In Tied Mixture HMM Using Homogeneous Centroid Neural Network (Homogeneous Centroid Neural Network에 의한 Tied Mixture HMM의 군집화)

  • Park Dong-Chul;Kim Woo-Sung
    • The Journal of Korean Institute of Communications and Information Sciences / v.31 no.9C / pp.853-858 / 2006
  • The tied mixture hidden Markov model (TMHMM) is an important approach to reducing the number of free parameters in speech recognition. However, this model suffers from degraded recognition accuracy due to clustering errors in its Gaussian probability density functions (GPDFs). This paper proposes a clustering algorithm, called the homogeneous centroid neural network (HCNN), to cluster acoustic feature vectors in the TMHMM. The HCNN uses a heterogeneous distance measure to allocate more code vectors in heterogeneous areas where the probability densities of different states overlap. When applied to isolated-word recognition of Korean digits, the HCNN reduces the error rate by 9.39% compared with CNN clustering and by 14.63% compared with traditional K-means clustering.

Speaker Identification Using PCA Fuzzy Mixture Model (PCA 퍼지 혼합 모델을 이용한 화자 식별)

  • Lee, Ki-Yong
    • Speech Sciences / v.10 no.4 / pp.149-157 / 2003
  • In this paper, we propose the principal component analysis (PCA) fuzzy mixture model for speaker identification. The PCA fuzzy mixture model is derived by combining PCA with the fuzzy version of a mixture model with diagonal covariance matrices. In this method, the feature vectors are first transformed by each speaker's PCA transformation matrix to reduce the correlation among their elements. The fuzzy mixture model for each speaker is then obtained from these transformed feature vectors with reduced dimensions. The orthogonal Gaussian mixture model (GMM) can be derived as a special case of the PCA fuzzy mixture model. In our experiments, with the same number of mixtures, the proposed method requires less training time and less storage and achieves a better speaker identification rate than the conventional GMM. The proposed method also shows identification performance equal to or better than that of the orthogonal GMM.

A Fuzzy Rule Extraction by EM Algorithm and A Design of Temperature Control System (EM 알고리즘에 의한 퍼지 규칙생성과 온도 제어 시스템의 설계)

  • 오범진;곽근창;유정웅
    • Journal of the Korean Institute of Illuminating and Electrical Installation Engineers / v.16 no.5 / pp.104-111 / 2002
  • This paper presents a fuzzy rule extraction method using the expectation-maximization (EM) algorithm and a design method for adaptive neuro-fuzzy control. The EM algorithm is used to obtain maximum likelihood estimates of the parameters of a Gaussian mixture model (GMM) and the cluster centers. The estimated clusters are used to automatically construct the fuzzy rules and membership functions for an adaptive neuro-fuzzy inference system (ANFIS). Finally, we apply the proposed method to a water temperature control system and obtain better results, in terms of the number of rules and the sum of absolute errors (SAE), than previous techniques such as the conventional fuzzy controller.
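To make the EM step concrete, here is a minimal sketch of EM for a one-dimensional GMM in plain Python. It is a generic textbook version under simplifying assumptions (scalar data, caller-supplied initial means, unit initial variances, a fixed iteration count), not the authors' procedure. The name `fit_gmm_1d` and the toy data are hypothetical, and the mapping from fitted components to fuzzy rules is not shown.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def fit_gmm_1d(data, init_means, n_iter=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM; returns (weights, means, variances)."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)  # toy code: assumes no point underflows to zero
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2
                        for r, x in zip(resp, data)) / nj
            variances[j] = max(min_var, var_j)  # floor avoids collapse
    return weights, means, variances

# Hypothetical readings forming two clusters, around 1.0 and 5.0
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
weights, means, variances = fit_gmm_1d(data, init_means=[0.0, 6.0])
```

The fitted means play the role of cluster centers, and each component's mean and variance could in principle seed one Gaussian membership function. A production version would work in the log domain and check convergence of the log-likelihood instead of running a fixed number of iterations.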

Speaker Verification Using SVM Kernel with GMM-Supervector Based on the Mahalanobis Distance (Mahalanobis 거리측정 방법 기반의 GMM-Supervector SVM 커널을 이용한 화자인증 방법)

  • Kim, Hyoung-Gook;Shin, Dong
    • The Journal of the Acoustical Society of Korea / v.29 no.3 / pp.216-221 / 2010
  • In this paper, we propose a speaker verification method using a support vector machine (SVM) kernel with a Gaussian mixture model (GMM) supervector based on the Mahalanobis distance. The proposed GMM-supervector SVM kernel method combines the GMM with the SVM. GMM-supervectors are generated from the GMM parameters of the target speaker's and other speakers' utterances. The verification threshold for the GMM-supervectors is determined by an SVM kernel based on the Mahalanobis distance, to improve speaker verification accuracy. Experimental results for text-independent speaker verification with 20 speakers demonstrate the performance of the proposed method in comparison with the GMM, the SVM, the GMM-supervector SVM kernel based on Kullback-Leibler (KL) divergence, and the GMM-supervector SVM kernel based on the Bhattacharyya distance.

Voice-Pishing Detection Algorithm Based on Minimum Classification Error Technique (최소 분류 오차 기법을 이용한 보이스 피싱 검출 알고리즘)

  • Lee, Kye-Hwan;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.3 / pp.138-142 / 2009
  • We propose an effective voice-phishing detection algorithm based on discriminative weight training. Voice phishing is detected using a Gaussian mixture model (GMM) incorporating the minimum classification error (MCE) technique. Specifically, the MCE technique operates on log-likelihoods computed from the decoding parameters of the selectable mode vocoder (SMV), which are extracted directly from the decoding process in the mobile phone. According to the experimental results, the proposed approach is effective for voice-phishing detection.

Emotion Recognition Algorithm Based on Minimum Classification Error incorporating Multi-modal System (최소 분류 오차 기법과 멀티 모달 시스템을 이용한 감정 인식 알고리즘)

  • Lee, Kye-Hwan;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.4 / pp.76-81 / 2009
  • We propose an effective emotion recognition algorithm based on the minimum classification error (MCE) technique incorporating a multi-modal system. Emotion recognition is performed with a Gaussian mixture model (GMM) using the MCE method applied to log-likelihoods. In particular, the proposed technique is based on the fusion of feature vectors derived from the voice signal and from the galvanic skin response (GSR) measured by a body sensor. The experimental results indicate that the proposed approach, based on MCE incorporating the multi-modal system, outperforms the conventional approach.