• Title/Summary/Keyword: Minimum Mean Square Error Log-Spectral Amplitude Estimator

Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition (자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석)

  • Song, Myung-Suk; Lee, Chang-Heon; Lee, Seok-Pil; Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea, v.29 no.2E, pp.86-99, 2010
  • This paper analyzes the performance of various single-channel speech enhancement algorithms when they are applied as a preprocessor to automatic speech recognition (ASR) systems. The speech enhancement system is first divided into four major functional modules: a gain estimator, a noise power spectrum estimator, an a priori signal-to-noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the role of each module. Simulation results show that the Wiener filter outperforms other gain functions, such as the minimum mean square error short-time spectral amplitude (MMSE-STSA) and minimum mean square error log-spectral amplitude (MMSE-LSA) estimators, when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods that include the decision-directed module for estimating the a priori SNR and the SAP estimation module help to improve the performance of the enhancement algorithm for speech recognition systems.
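
The gain functions and the decision-directed a priori SNR estimator named in this abstract can be sketched per frequency bin as follows. This is a minimal sketch using the commonly cited textbook forms, not the paper's own configuration: the function names, the smoothing factor `alpha=0.98`, and the floor `xi_min` are assumptions made for illustration. Here `xi` denotes the a priori SNR and `gamma` the a posteriori SNR.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x, max_iter=200, eps=1e-12):
    """Exponential integral E1(x) for x > 0."""
    if x <= 0.0:
        raise ValueError("exp1 is only defined here for x > 0")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0
        for k in range(1, max_iter):
            term *= -x / k  # term == (-x)^k / k!
            delta = -term / k
            total += delta
            if abs(delta) < eps * abs(total):
                break
        return total
    # Continued fraction for E1(x), evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)


def wiener_gain(xi):
    """Wiener filter gain as a function of the a priori SNR."""
    return xi / (1.0 + xi)


def lsa_gain(xi, gamma):
    """MMSE-LSA gain: Wiener gain scaled by exp(0.5 * E1(v))."""
    v = max(xi * gamma / (1.0 + xi), 1e-10)  # assumed floor to keep E1 finite
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))


def decision_directed_xi(prev_amp, noise_psd, gamma, alpha=0.98, xi_min=1e-3):
    """Decision-directed a priori SNR estimate for one frequency bin.

    prev_amp is the enhanced spectral amplitude from the previous frame;
    alpha and xi_min are assumed values, not taken from the paper.
    """
    xi = alpha * prev_amp ** 2 / noise_psd + (1.0 - alpha) * max(gamma - 1.0, 0.0)
    return max(xi, xi_min)


# Usage: one bin, one frame, with made-up numbers.
xi = decision_directed_xi(prev_amp=1.0, noise_psd=1.0, gamma=3.0)
print(wiener_gain(xi), lsa_gain(xi, 3.0))
```

Because `exp(0.5 * E1(v))` is never below 1, the MMSE-LSA gain in this form is never smaller than the Wiener gain for the same `xi`, so it attenuates less at a given a priori SNR.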

Speech Processing System Using a Noise Reduction Neural Network Based on FFT Spectrums

  • Choi, Jae-Seung
    • Journal of information and communication convergence engineering, v.10 no.2, pp.162-167, 2012
  • This paper proposes a speech processing system for noise reduction in environments with background noise. The system is based on a model of the human auditory system and a noise reduction neural network that operates on fast Fourier transform (FFT) amplitude and phase spectra. The proposed system first reduces noise using the neural network, then performs auditory processing frame by frame after detecting voiced and transitional sections in each frame. The results of the proposed system are compared with those of a conventional spectral subtraction method and a minimum mean-square error log-spectral amplitude estimator at different noise levels. The effectiveness of the proposed system is confirmed experimentally by measuring the signal-to-noise ratio (SNR). In this experiment, the maximum improvement in output SNR with the proposed method, relative to a conventional spectral subtraction method, is approximately 11.5 dB for car noise and 11.0 dB for street noise.
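
The spectral subtraction baseline and the output-SNR measurement mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the paper, so the forms below are assumptions: basic magnitude subtraction with a spectral floor, and SNR computed as clean-signal power over residual-error power. The names `snr_db`, `spectral_subtract`, and `floor` are hypothetical.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of an estimate against the clean reference signal."""
    signal_power = sum(s * s for s in clean)
    error_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / error_power)


def spectral_subtract(noisy_mag, noise_mag, floor=0.01):
    """Magnitude spectral subtraction for one frame.

    Subtracts the estimated noise magnitude in each bin and keeps at least
    floor * noisy magnitude so that no bin goes negative (assumed floor).
    """
    return [max(y - n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


# Usage with made-up samples: SNR improvement is the difference of two output SNRs.
clean = [1.0, -1.0, 0.5, -0.5]
method_a = [0.9, -1.1, 0.6, -0.4]   # hypothetical output of one method
method_b = [0.7, -1.3, 0.8, -0.2]   # hypothetical output of another method
print(snr_db(clean, method_a) - snr_db(clean, method_b))
```

Under this definition, a statement such as "11.5 dB better" corresponds to the difference between the output SNRs of the two methods on the same noisy input.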

The Study on Speaker Change Verification Using SNR based weighted KL distance (SNR 기반 가중 KL 거리를 활용한 화자 변화 검증에 관한 연구)

  • Cho, Joon-Beom; Lee, Ji-eun; Lee, Kyong-Rok
    • Journal of Convergence for Information Technology, v.7 no.6, pp.159-166, 2017
  • This paper reports experiments aimed at improving the verification performance of speaker change detection on broadcast news. The approach enhances the noisy input speech and applies a KL distance $D_s$ that uses an SNR-based weighting function $w_m$. The baseline system (Experiment 0) verifies speaker changes using the GMM-UBM based KL distance $D$. Experiment 1 adds enhancement of the noisy input speech using MMSE Log-STSA. Experiment 2 applies the new KL distance $D_s$ to the system of Experiment 1. Experiments were conducted at a missed detection rate (MDR) of 0% so that no speaker change is missed. The false alarm rate (FAR) of Experiment 0 was 71.5%. The FAR of Experiment 1 was 67.3%, 4.2 percentage points lower than that of Experiment 0. The FAR of Experiment 2 was 60.7%, 10.8 percentage points lower than that of Experiment 0.
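
The abstract does not define $D_s$ or $w_m$, so the following is only one plausible reading, offered as a sketch: a symmetric KL divergence between corresponding diagonal-covariance Gaussian components of two adapted GMMs, combined with normalized per-component weights. How the weights would be derived from SNR is left open here; the `weights` argument and both function names are hypothetical.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians."""
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def weighted_kl_distance(gmm_a, gmm_b, weights):
    """Weighted sum of symmetric KL divergences over matched components.

    gmm_a and gmm_b are lists of (mean, variance) pairs, one per component.
    weights stands in for an SNR-based weighting (assumed, not from the paper)
    and is normalized to sum to 1.
    """
    total_w = sum(weights)
    dist = 0.0
    for (ma, va), (mb, vb), w in zip(gmm_a, gmm_b, weights):
        sym = kl_diag_gauss(ma, va, mb, vb) + kl_diag_gauss(mb, vb, ma, va)
        dist += (w / total_w) * sym
    return dist


# Usage with made-up one-dimensional, two-component models.
gmm_a = [([0.0], [1.0]), ([2.0], [1.0])]
gmm_b = [([1.0], [1.0]), ([2.0], [1.0])]
print(weighted_kl_distance(gmm_a, gmm_b, weights=[1.0, 3.0]))
```

In this sketch, components with larger weights contribute more to the distance, which is the general effect a weighting function would have regardless of its exact definition.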