• Title/Summary/Keyword: Gaussian Mixture Component


Statistical Extraction of Speech Features Using Independent Component Analysis and Its Application to Speaker Identification

  • Jang, Gil-Jin;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.4E
    • /
    • pp.156-163
    • /
    • 2002
  • We apply independent component analysis (ICA) to extract an optimal basis for the problem of finding efficient features for representing the speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions; the distribution of a speaker's speech segments is thus modeled by adapting the basis functions so that each source component is statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics. To assess their efficiency, we performed speaker identification experiments and compared our results with those of the conventional Fourier basis. Our results show that the proposed method is more efficient than the conventional Fourier-based features in that it obtains a higher speaker identification rate.

Statistical Extraction of Speech Features Using Independent Component Analysis and Its Application to Speaker Identification

  • Jang, Gil-Jin;Oh, Yung-Hwan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.4
    • /
    • pp.156-156
    • /
    • 2002
  • We apply independent component analysis (ICA) to extract an optimal basis for the problem of finding efficient features for representing the speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions; the distribution of a speaker's speech segments is thus modeled by adapting the basis functions so that each source component is statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics. To assess their efficiency, we performed speaker identification experiments and compared our results with those of the conventional Fourier basis. Our results show that the proposed method is more efficient than the conventional Fourier-based features in that it obtains a higher speaker identification rate.

Dynamic Control of Learning Rate in the Improved Adaptive Gaussian Mixture Model for Background Subtraction

  • Kim, Young-Ju
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • v.9 no.2
    • /
    • pp.366-369
    • /
    • 2005
  • Background subtraction is mainly used for real-time extraction and tracking of moving objects in image sequences. Outdoor environments contain many changing factors, such as gradually changing illumination, swaying trees, and suddenly moving objects, that adaptive processing must take into account. A Gaussian mixture model (GMM) is normally used to subtract the background adaptively in response to such scene changes, and adaptive GMMs with improved real-time performance have been developed. For on-line background subtraction, this paper applies the improved adaptive GMM, which uses a small constant learning rate ${\alpha}$ and therefore cannot adapt quickly to sudden object movement. To address this, the paper proposes and evaluates a method that dynamically controls ${\alpha}$ using adaptive selection of the number of component distributions and the global variances of pixel values.
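
The slow adaptation that a small constant ${\alpha}$ causes can be illustrated with a single-pixel mixture update. The sketch below is a generic Stauffer-Grimson-style update, not the method of the paper: the class `PixelGMM`, the helper `frames_until_background`, the 2.5-sigma match rule, the 0.7 background ratio, the variance floor, and the simplification of using ${\alpha}$ directly as the mean and variance update rate are all assumptions made for illustration.

```python
import math


class PixelGMM:
    """Online Gaussian mixture for one grayscale pixel (illustrative sketch)."""

    def __init__(self, k=3, alpha=0.01, init_var=100.0, min_var=4.0, bg_ratio=0.7):
        self.k = k                # maximum number of components
        self.alpha = alpha        # constant learning rate
        self.init_var = init_var  # variance given to a newly created component
        self.min_var = min_var    # floor that keeps a component from collapsing
        self.bg_ratio = bg_ratio  # cumulative weight treated as background
        self.comps = []           # each component is [weight, mean, variance]

    def update(self, x):
        """Absorb pixel value x; return True if x is classified as foreground."""
        a = self.alpha
        # Rank components by weight / sigma: heavy, narrow ones come first.
        self.comps.sort(key=lambda c: c[0] / math.sqrt(c[2]), reverse=True)

        # The background is the shortest ranked prefix whose weight exceeds bg_ratio.
        n_bg, total = 0, 0.0
        for weight, _, _ in self.comps:
            n_bg += 1
            total += weight
            if total > self.bg_ratio:
                break

        # A component matches if x lies within 2.5 standard deviations of its mean.
        matched = None
        for i, (_, mean, var) in enumerate(self.comps):
            if abs(x - mean) <= 2.5 * math.sqrt(var):
                matched = i
                break

        if matched is None:
            # No match: add a component, or replace the lowest-ranked one.
            new = [a, float(x), self.init_var]
            if len(self.comps) < self.k:
                self.comps.append(new)
            else:
                self.comps[-1] = new
        else:
            # Match: move weight toward the matched component, then adapt it.
            for i, comp in enumerate(self.comps):
                comp[0] = (1.0 - a) * comp[0] + (a if i == matched else 0.0)
            comp = self.comps[matched]
            diff = x - comp[1]
            comp[1] += a * diff
            comp[2] = max((1.0 - a) * comp[2] + a * diff * diff, self.min_var)

        # Renormalize so the weights sum to one.
        norm = sum(c[0] for c in self.comps)
        for comp in self.comps:
            comp[0] /= norm
        return matched is None or matched >= n_bg


def frames_until_background(alpha, old=100, new=200, warmup=50, limit=1000):
    """Count frames until a suddenly changed pixel value is accepted as background."""
    model = PixelGMM(alpha=alpha)
    for _ in range(warmup):
        model.update(old)
    for n in range(1, limit + 1):
        if not model.update(new):
            return n
    return limit


if __name__ == "__main__":
    for alpha in (0.01, 0.1):
        print(alpha, frames_until_background(alpha))
```

With these assumed settings, the larger learning rate absorbs the new pixel value in far fewer frames than the small one, which is the trade-off that motivates controlling ${\alpha}$ dynamically.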


GENERALIZED GAUSSIAN PRIOR FOR ICA

  • Choi, Seungjin
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 1999.10b
    • /
    • pp.467-469
    • /
    • 1999
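  • Independent component analysis (ICA) is a statistical method that represents given data as a linear combination of statistically independent components. One of its main applications is blind separation, which recovers the original statistically independent sources from their linear mixtures without any prior information. A variety of neural learning algorithms have been proposed for ICA and source separation, and nonlinear functions play an important role in these learning algorithms. This paper introduces a generalized Gaussian prior and presents an efficient source separation algorithm that separates mixtures of sources with diverse probability distributions. The merits of the proposed method are examined through simulations.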
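
As a rough illustration of why a generalized Gaussian prior yields a family of nonlinear functions, the sketch below evaluates the density and its score function (the negative derivative of the log density, which is the nonlinearity in maximum-likelihood-style ICA updates). The parameterization with shape `alpha` and scale `beta`, and the function names `gg_pdf` and `gg_score`, are assumptions; the paper's own notation may differ.

```python
import math


def gg_pdf(y, alpha, beta=1.0):
    """Generalized Gaussian density with shape alpha and scale beta."""
    norm = alpha / (2.0 * beta * math.gamma(1.0 / alpha))
    return norm * math.exp(-((abs(y) / beta) ** alpha))


def gg_score(y, alpha, beta=1.0):
    """Negative derivative of log gg_pdf with respect to y."""
    if y == 0.0:
        # Convention chosen here: the score is undefined at 0 when alpha <= 1.
        return 0.0
    return (alpha / beta ** alpha) * abs(y) ** (alpha - 1.0) * math.copysign(1.0, y)


if __name__ == "__main__":
    # alpha = 2 gives a Gaussian (linear score); alpha = 1 gives a Laplacian (sign score).
    for alpha in (0.5, 1.0, 2.0, 4.0):
        print(alpha, gg_pdf(1.0, alpha), gg_score(1.0, alpha))
```

Under this parameterization, `alpha` below 2 models heavy-tailed (super-Gaussian) sources and `alpha` above 2 models flatter (sub-Gaussian) ones, so one parameter covers sources with different distributions.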


Image Analysis for Surveillance Camera Based on 3D Depth Map

  • Lee, Subin;Seo, Yongduek
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2012.07a
    • /
    • pp.286-289
    • /
    • 2012
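  • This paper proposes a method for detecting and tracking moving people in surveillance camera footage using 3D depth information. The proposed method separates moving people from the background with a GMM (Gaussian mixture model), divides the separated regions into blobs by CCL (connected-component labeling), and tracks those blobs. For the case where two blobs merge during this division step, the paper proposes a method that uses 3D depth information to separate them. Experiments show the results of the proposed method.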
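
The blob-division step can be sketched as connected-component labeling on a binary foreground mask. This is a minimal breadth-first, 4-connected version; the function name `label_blobs` and the choice of 4-connectivity are assumptions, and the depth-based splitting of merged blobs is not shown.

```python
from collections import deque


def label_blobs(mask):
    """Label 4-connected foreground regions of a binary mask (list of rows of 0/1).

    Returns (labels, count), where labels has the same shape as mask,
    background cells are 0, and blobs are numbered from 1 in scan order.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or labels[r][c]:
                continue
            # Unlabeled foreground pixel: start a new blob and flood it.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


if __name__ == "__main__":
    mask = [
        [1, 1, 0, 0, 0],
        [1, 0, 0, 1, 1],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
    ]
    labels, count = label_blobs(mask)
    print(count)
    for row in labels:
        print(row)
```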


Comprehensive Performance Analysis and Comparison of various Digital communication Systems in an Multipath Fading Channel with additive Mixture of Gaussian and Impulsive Noise [Part-2]

  • Kim, Hyun-Chul;Ko, Bong-Jin;Kong, Byung-Ok;Cho, Sung-Joon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.14 no.3
    • /
    • pp.280-292
    • /
    • 1989
  • In this paper, error rate equations are derived for digitally modulated signals transmitted through a channel with both Gaussian/impulsive noise and multipath fading. Using the derived error probability equations for ASK, QAM, CPSK, DPSK, FSK, and MSK signals, the error rate performance of digital modulation systems is evaluated and plotted as a function of CNR, the impulsive index, the ratio of Gaussian noise power to impulsive noise power, and the fading figure. The results show that, in a deep fading environment, errors are caused more frequently by Gaussian noise than by impulsive noise. A comparison of the systems confirms that PSK is superior to the other systems in both deep and shallow fading environments.


Acoustic Model Transformation Method for Speech Recognition Employing Gaussian Mixture Model Adaptation Using Untranscribed Speech Database

  • Kim, Wooil
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.19 no.5
    • /
    • pp.1047-1054
    • /
    • 2015
  • This paper presents an acoustic model transformation method that uses an untranscribed speech database to improve speech recognition. In the presented method, an adapted GMM is obtained with a conventional adaptation method, and the most similar Gaussian component is selected from the adapted GMM. The bias vector between the mean vectors of the clean GMM and the adapted GMM is used to update the mean vectors of the HMM. Combined with MAP or MLLR, the presented GAMT improves speech recognition performance in car noise and speech babble conditions compared to MAP or MLLR used alone. The experimental results show that the presented method makes effective use of an untranscribed speech database for acoustic model adaptation and increases speech recognition accuracy.
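
The bias-based mean update can be sketched as follows. The abstract does not specify how the most similar component is chosen, so this sketch makes two loud assumptions: component `k` of the adapted GMM corresponds to component `k` of the clean GMM, and similarity is squared Euclidean distance between an HMM mean and the clean GMM means. The function name `shift_hmm_means` is hypothetical.

```python
def shift_hmm_means(hmm_means, clean_gmm_means, adapted_gmm_means):
    """Shift each HMM mean by the bias of its most similar GMM component.

    All arguments are lists of equal-length vectors (lists of floats).
    Assumption: clean and adapted GMM components share the same index.
    """
    shifted = []
    for mean in hmm_means:
        # Assumed similarity measure: nearest clean GMM mean in Euclidean distance.
        k = min(
            range(len(clean_gmm_means)),
            key=lambda i: sum((m - g) ** 2 for m, g in zip(mean, clean_gmm_means[i])),
        )
        # Bias vector between the adapted and clean means of that component.
        bias = [a - c for a, c in zip(adapted_gmm_means[k], clean_gmm_means[k])]
        shifted.append([m + b for m, b in zip(mean, bias)])
    return shifted


if __name__ == "__main__":
    clean = [[0.0, 0.0], [10.0, 10.0]]
    adapted = [[1.0, -1.0], [12.0, 11.0]]
    hmm = [[0.5, 0.5], [9.0, 10.0]]
    print(shift_hmm_means(hmm, clean, adapted))
```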

Dimension-Reduced Audio Spectrum Projection Features for Classifying Video Sound Clips

  • Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.3E
    • /
    • pp.89-94
    • /
    • 2006
  • For audio indexing and targeted search of specific audio or corresponding visual content, the MPEG-7 standard has adopted a sound classification framework in which dimension-reduced Audio Spectrum Projection (ASP) features are used to train continuous hidden Markov models (HMMs) for classifying various sounds. MPEG-7 employs Principal Component Analysis (PCA) or Independent Component Analysis (ICA) for the dimension reduction. Other well-established techniques include Non-negative Matrix Factorization (NMF), Linear Discriminant Analysis (LDA), and the Discrete Cosine Transform (DCT). In this paper, we compare the performance of different dimension reduction methods with Gaussian mixture models (GMMs) and HMMs in classifying video sound clips.
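
Of the listed techniques, DCT-based reduction is the simplest to sketch: transform a spectral frame and keep only the first few coefficients. The version below is an unnormalized DCT-II written out directly; the function name `dct_reduce` and the use of a plain list as the spectral frame are assumptions, and it is not the MPEG-7 ASP pipeline.

```python
import math


def dct_reduce(frame, n_keep):
    """Return the first n_keep unnormalized DCT-II coefficients of frame."""
    n = len(frame)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(frame))
        for k in range(n_keep)
    ]


if __name__ == "__main__":
    # A smooth 8-bin frame: most of its energy falls in the low-order coefficients.
    frame = [1.0, 1.2, 1.5, 1.9, 2.0, 1.8, 1.4, 1.1]
    print(dct_reduce(frame, 3))
```

Keeping `n_keep` coefficients turns each frame into an `n_keep`-dimensional feature vector, which is the role the dimension reduction step plays before GMM or HMM training.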

Speaker Identification Using GMM Based on LPCA

  • Seo, Chang-Woo;Lee, Youn-Jeong;Lee, Ki-Yong
    • Speech Sciences
    • /
    • v.12 no.2
    • /
    • pp.171-182
    • /
    • 2005
  • An efficient GMM (Gaussian mixture modeling) method for speaker identification is proposed, based on LPCA (local principal component analysis) with VQ (vector quantization). To reduce the dimension and correlation of the feature vectors, the method applies principal component analysis locally. It first partitions the data space into several disjoint regions by VQ and then performs PCA in each region. Finally, the GMM for the speaker is obtained from the transformed feature vectors in each region. Compared to the conventional GMM method with diagonal covariance matrices, the proposed method requires less storage and runs faster while maintaining the same performance.


A Classification Method Using Data Reduction

  • Uhm, Daiho;Jun, Sung-Hae;Lee, Seung-Joo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.12 no.1
    • /
    • pp.1-5
    • /
    • 2012
  • Data reduction is widely used in data mining to make analysis more tractable. Principal component analysis (PCA) and factor analysis (FA) are popular techniques: both reduce the number of variables to avoid the curse of dimensionality, in which computing time grows exponentially with the number of variables. Many dimension reduction methods have therefore been published. Data augmentation is another approach to analyzing data efficiently. The support vector machine (SVM) algorithm is a representative technique for dimension augmentation: it maps the original data to a high-dimensional feature space to find the optimal decision plane. Both data reduction and augmentation have been used to solve diverse problems in data analysis. In this paper, we compare the strengths and weaknesses of dimension reduction and augmentation for classification and propose a classification method based on data reduction. We carry out comparative experiments to verify the performance of the proposed method.