• Title/Summary/Keyword: MMSE-STSA

Search Result 12, Processing Time 0.032 seconds

Speech Enhancement Using Microphone Array with MMSE-STSA Estimator Based Post-Processing (MMSE-STSA 추정치에 기반한 후처리를 갖는 마이크로폰 배열을 이용한 음성 개선)

  • Kwon Hong Seok;Son Jong Mok;Bae Keun Sung
    • Proceedings of the KSPS conference
    • /
    • 2002.11a
    • /
    • pp.187-190
    • /
    • 2002
  • In this paper, a speech enhancement system using microphone array with MMSE-STSA (Minimum Mean Square Error-Short Time Spectral Amplitude) estimator based post-processing is proposed. Speech enhancement is first carried out by conventional delay-and-sum beamforming (DSB). A new MMSE-STSA estimator is then obtained by refining MMSE-STSA estimators from each microphone, which is applied to the output of conventional DSB to obtain additional speech enhancement. Computer simulation for white and pink noises show that the proposed system is superior to other approaches.

  • PDF

Performance Analysis of Noisy Speech Recognition Depending on Parameters for Noise and Signal Power Estimation in MMSE-STSA Based Speech Enhancement (MMSE-STSA 기반의 음성개선 기법에서 잡음 및 신호 전력 추정에 사용되는 파라미터 값의 변화에 따른 잡음음성의 인식성능 분석)

  • Park Chul-Ho;Bae Keun-Sung
    • MALSORI
    • /
    • no.57
    • /
    • pp.153-164
    • /
    • 2006
  • The MMSE-STSA based speech enhancement algorithm is widely used as a preprocessing for noise robust speech recognition. It weighs the gain of each spectral bin of the noisy speech using the estimate of noise and signal power spectrum. In this paper, we investigate the influence of parameters used to estimate the speech signal and noise power in MMSE-STSA upon the recognition performance of noisy speech. For experiments, we use the Aurora2 DB which contains noisy speech with subway, babble, car, and exhibition noises. The HTK-based continuous HMM system is constructed for recognition experiments. Experimental results are presented and discussed with our findings.

  • PDF

Comparison of Recognition Performance for Preprocessing Method of USE STSA with Approximated Modified Bessel Function (Modified Bessel 함수 근사화를 적용한 MMSE STSA 전처리 기법의 음성인식 성능 비교)

  • Son Jong Mok;Kim Min Sung;Bae Keun Sung
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.125-128
    • /
    • 2001
  • 본 연구에서는 음성신호의 왜곡에 대해 음성 부재 확률을 고려한 MMSE(Minimum Mean Square Error) STSA(Short-Time Spectral Amplitude Estimator)를 전처리기로 도입하여 HMM(Hidden Markov Model)에 기반 한 음성인식시스템의 인식성능을 평가하였다. 음성인식 시스템의 실시간 구현을 고려하여, MMSE STSA 기법을 음성개선을 위한 전처리기로 사용할 때 MMSE STSA의 이득계산 과정에서 많은 계산량이 요구되는 modified Bessel 함수를 근사 화하여 사용하였다.

  • PDF

Feature Extraction through the post processing of WFBA based on MMSE-STSA for Robust Speech Recognition (강인한 음성인식을 위한 MMSE-STSA기반 후처리 가중필터뱅크분석을 통한 특징추출)

  • Jung Sungyun;Bae Keunsung
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.39-42
    • /
    • 2004
  • 본 논문에서는, 잡음음성에 강인한 음성인식을 위한 특징추출 방법을 제시한다. 제시한 방법은 2 단계 잡음제거 과정으로 구성되어 있다. 첫번째 단계는 MMSE-STSA 음성개선기법을 통해 잡음음성신호를 개선시키는 과정이고, 두 번째 단계는, MMSE-STSA 의 개선된 음성에 후처리 가중필터뱅크분석을 통해 잔여잡음의 영향을 감소시키는 과정이다. 제안한 방법의 성능평가를 위해, AURORA2의 잡음음성 DB 중 테스트 집합 A 에 대해 인식실험을 수행하고, 결과를 기존 방법들과 비교, 검토한다.

  • PDF

Robust Speech Recognition in the Car Interior Environment having Car Noise and Audio Output (자동차 잡음 및 오디오 출력신호가 존재하는 자동차 실내 환경에서의 강인한 음성인식)

  • Park, Chul-Ho;Bae, Jae-Chul;Bae, Keun-Sung
    • MALSORI
    • /
    • no.62
    • /
    • pp.85-96
    • /
    • 2007
  • In this paper, we carried out recognition experiments for noisy speech having various levels of car noise and output of an audio system using the speech interface. The speech interface consists of three parts: pre-processing, acoustic echo canceller, post-processing. First, a high pass filter is employed as a pre-processing part to remove some engine noises. Then, an echo canceller implemented by using an FIR-type filter with an NLMS adaptive algorithm is used to remove the music or speech coming from the audio system in a car. As a last part, the MMSE-STSA based speech enhancement method is applied to the out of the echo canceller to remove the residual noise further. For recognition experiments, we generated test signals by adding music to the car noisy speech from Aurora 2 database. The HTK-based continuous HMM system is constructed for a recognition system. Experimental results show that the proposed speech interface is very promising for robust speech recognition in a noisy car environment.

  • PDF

Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.35 no.1
    • /
    • pp.49-54
    • /
    • 2016
  • In this paper, we propose a vocal separation method using weighted ${\beta}$-order minimum mean wquare error estimation (WbE) based on kernel back-fitting algorithm. In spoken speech enhancement, it is well-known that the WbE outperforms the existing Bayesian estimators such as the minimum mean square error (MMSE) of the short-time spectral amplitude (STSA) and the MMSE of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm for improving the vocal separation performance from monaural music signal. The experimental results show that the proposed method achieves better separation performance than other existing methods.

Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment (잡음환경에서 음성인식 성능향상을 위한 바이너리 마스크를 이용한 스펙트럼 향상 방법)

  • Choi, Gab-Keun;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.7
    • /
    • pp.468-474
    • /
    • 2010
  • The major factor that disturbs practical use of speech recognition is distortion by the ambient and channel noises. Generally, the ambient noise drops the performance and restricts places to use. DSR (Distributed Speech Recognition) based speech recognition also has this problem. Various noise cancelling algorithms are applied to solve this problem, but loss of spectrum and remaining noise by incorrect noise estimation at low SNR environments cause drop of recognition rate. This paper proposes methods for speech enhancement. This method uses MMSE-STSA for noise cancelling and ideal binary mask to compensate damaged spectrum. According to experiments at noisy environment (SNR 15 dB ~ 0 dB), the proposed methods showed better spectral results and recognition performance.

Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition (자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석)

  • Song, Myung-Suk;Lee, Chang-Heon;Lee, Seok-Pil;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.2E
    • /
    • pp.86-99
    • /
    • 2010
  • This paper analyzes the performance of various single channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules such as a gain estimator, a noise power spectrum estimator, a priori signal to noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the roles of each module. Simulation results show that the Wiener filter outperforms other gain functions such as minimum mean square error-short time spectral amplitude (MMSE-STSA) and minimum mean square error-log spectral amplitude (MMSE-LSA) estimators when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods including the decision directed module to estimate a priori SNR and the SAP estimation module helps to improve the performance of the enhancement algorithm for speech recognition systems.

Comparison of Recognition Per formance of Noisy Speech Depend ing on Preprocessing Methods (전처리 기법에 따른 잡음음성의 인식성능 비교)

  • Son Jong Mok;Lee Yong Ju;Bae Keun Sung
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.31-34
    • /
    • 2000
  • 본 연구에서는 부가잡음에 의한 음성신호의 왜곡에 대해 다양한 음성개선 기법을 전처리기로 도입하여 HMM(Hidden Markov Model)에 기반 한 음성인식 시스템의 인식성능을 평가하였다. 음성개선 기법으로는 MMSE(Minimun Mean Square Error) STSA(Short-Time Spectral Amplitude Estimator) 기법과 웨이브렛 영역에서의 UWD(Undecimated Wavelet Denoising), CWD(Conventional Wavelet Denoising) 기법을 적용하였다. 잡음이 없는 데이터로 훈련한 음성인식시스템에 잡음음성을 입력할 때 각 음성개선기법을 전처리기로 사용하여 신호대잡음비(Signal to Noise Ratio)에 따른 인식 성능을 비교하였다.

  • PDF

A Robust Speech Recognition Method Combining the Model Compensation Method with the Speech Enhancement Algorithm (음질향상 기법과 모델보상 방식을 결합한 강인한 음성인식 방식)

  • Kim, Hee-Keun;Chung, Yong-Joo;Bae, Keun-Seung
    • Speech Sciences
    • /
    • v.14 no.2
    • /
    • pp.115-126
    • /
    • 2007
  • There have been many research efforts to improve the performance of the speech recognizer in noisy conditions. Among them, the model compensation method and the speech enhancement approach have been used widely. In this paper, we propose to combine the two different approaches to further enhance the recognition rates in the noisy speech recognition. For the speech enhancement, the minimum mean square error-short time spectral amplitude (MMSE-STSA) has been adopted and the parallel model combination (PMC) and Jacobian adaptation (JA) have been used as the model compensation approaches. From the experimental results, we could find that the hybrid approach that applies the model compensation methods to the enhanced speech produce better results than just using only one of the two approaches.

  • PDF