• Title/Summary/Keyword: Speech estimators

Search Result 6, Processing Time 0.017 seconds

Comparison of Two Speech Estimation Algorithms Based on Generalized-Gamma Distribution Applied to Speech Recognition in Car Noisy Environment (자동차 잡음환경에서의 음성인식에 적용된 두 종류의 일반화된 감마분포 기반의 음성추정 알고리즘 비교)

  • Kim, Hyoung-Gook;Lee, Jin-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.8 no.4
    • /
    • pp.28-32
    • /
    • 2009
  • This paper compares two speech estimators under a generalized Gamma distribution for DFT-based single-microphone speech enhancement methods. For the speech enhancement, the noise estimation based on recursive averaging spectral values by spectral minimum noise is applied to two speech estimators based on the generalized Gamma distribution using $\kappa$=1 or $\kappa$=2. The performance of two speech enhancement algorithms is measured by recognition accuracy of automatic speech recognition(ASR) in car noisy environment.

  • PDF

Speech Enhancement Using Multiple Kalman Filter (다중칼만필터를 이용한 음성향상)

  • 이기용
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.225-230
    • /
    • 1998
  • In this paper, a Kalman filter approach for enhancing speech signals degraded by statistically independent additive nonstationary noise is developed. The autoregressive hidden markov model is used for modeling the statistical characteristics of both the clean speech signal and the nonstationary noise process. In this case, the speech enhancement comprises a weighted sum of conditional mean estimators for the composite states of the models for the speech and noise, where the weights equal to the posterior probabilities of the composite states, given the noisy speech. The conditional mean estimators use a smoothing spproach based on two Kalmean filters with Markovian switching coefficients, where one of the filters propagates in the forward-time direction with one frame. The proposed method is tested against the noisy speech signals degraded by Gaussian colored noise or nonstationary noise at various input signal-to-noise ratios. An app개ximate improvement of 4.7-5.2 dB is SNR is achieved at input SNR 10 and 15 dB. Also, in a comparison of conventional and the proposed methods, an improvement of the about 0.3 dB in SNR is obtained with our proposed method.

  • PDF

Speech Estimators Based on Generalized Gamma Distribution and Spectral Gain Floor Applied to an Automatic Speech Recognition (잡음에 강인한 음성인식을 위한 Generalized Gamma 분포기반과 Spectral Gain Floor를 결합한 음성향상기법)

  • Kim, Hyoung-Gook;Shin, Dong;Lee, Jin-Ho
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.8 no.3
    • /
    • pp.64-70
    • /
    • 2009
  • This paper presents a speech enhancement technique based on generalized Gamma distribution in order to obtain robust speech recognition performance. For robust speech enhancement, the noise estimation based on a spectral noise floor controled recursive averaging spectral values is applied to speech estimation under the generalized Gamma distribution and spectral gain floor. The proposed speech enhancement technique is based on spectral component, spectral amplitude, and log spectral amplitude. The performance of three different methods is measured by recognition accuracy of automatic speech recognition (ASR).

  • PDF

Speech Enhancement Using Microphone Array with MMSE-STSA Estimator Based Post-Processing (MMSE-STSA 추정치에 기반한 후처리를 갖는 마이크로폰 배열을 이용한 음성 개선)

  • Kwon Hong Seok;Son Jong Mok;Bae Keun Sung
    • Proceedings of the KSPS conference
    • /
    • 2002.11a
    • /
    • pp.187-190
    • /
    • 2002
  • In this paper, a speech enhancement system using microphone array with MMSE-STSA (Minimum Mean Square Error-Short Time Spectral Amplitude) estimator based post-processing is proposed. Speech enhancement is first carried out by conventional delay-and-sum beamforming (DSB). A new MMSE-STSA estimator is then obtained by refining MMSE-STSA estimators from each microphone, which is applied to the output of conventional DSB to obtain additional speech enhancement. Computer simulation for white and pink noises show that the proposed system is superior to other approaches.

  • PDF

Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition (자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석)

  • Song, Myung-Suk;Lee, Chang-Heon;Lee, Seok-Pil;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.2E
    • /
    • pp.86-99
    • /
    • 2010
  • This paper analyzes the performance of various single channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules such as a gain estimator, a noise power spectrum estimator, a priori signal to noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the roles of each module. Simulation results show that the Wiener filter outperforms other gain functions such as minimum mean square error-short time spectral amplitude (MMSE-STSA) and minimum mean square error-log spectral amplitude (MMSE-LSA) estimators when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods including the decision directed module to estimate a priori SNR and the SAP estimation module helps to improve the performance of the enhancement algorithm for speech recognition systems.

Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.1
    • /
    • pp.49-54
    • /
    • 2016
  • In this paper, we propose a vocal separation method using weighted ${\beta}$-order minimum mean wquare error estimation (WbE) based on kernel back-fitting algorithm. In spoken speech enhancement, it is well-known that the WbE outperforms the existing Bayesian estimators such as the minimum mean square error (MMSE) of the short-time spectral amplitude (STSA) and the MMSE of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm for improving the vocal separation performance from monaural music signal. The experimental results show that the proposed method achieves better separation performance than other existing methods.