Background Noise Classification in Noisy Speech of Short Time Duration Using Improved Speech Parameter

  • Received : 2016.05.03
  • Reviewed : 2016.05.27
  • Published : 2016.09.30

Abstract

In speech recognition, background noise causes incorrect responses to speech input, in particular speech being misjudged as background noise, and this lowers the recognition rate. Because countermeasures against this kind of noise are not simple, more sophisticated noise processing techniques are required. This paper therefore proposes an algorithm that distinguishes stationary or non-stationary background noise from short-duration speech in noisy environments. The proposed algorithm uses an improved speech feature parameter as its key measure for telling the different types of background noise apart from speech. It then estimates the type of background noise with a multi-layer perceptron neural network. Experiments confirmed that the background noise and the speech signals can be distinguished.
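The pipeline outlined in the abstract (a feature parameter per short segment, then a multi-layer perceptron that estimates the signal class) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not define the improved speech parameter, so frame log-energy statistics and zero-crossing rate stand in for it, the three signal classes are synthetic, and the network size and learning rate are arbitrary.

```python
import numpy as np

def segment_features(signal, frame_len=256):
    """Stand-in features (NOT the paper's parameter):
    mean log-energy, std of log-energy, mean zero-crossing rate."""
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    log_energy = np.log(np.mean(frames ** 2, axis=1) + 1e-10)
    # Fraction of adjacent sample pairs whose sign differs, per frame.
    zcr = np.mean(np.signbit(frames[:, 1:]) != np.signbit(frames[:, :-1]), axis=1)
    return np.array([log_energy.mean(), log_energy.std(), zcr.mean()])

def make_segment(kind, rng, n=2048):
    """Hypothetical synthetic signals: 0 = stationary noise,
    1 = non-stationary noise, 2 = speech-like tone."""
    t = np.arange(n)
    if kind == 0:  # white Gaussian noise
        return rng.normal(0.0, 1.0, n)
    if kind == 1:  # white noise switched on and off in bursts
        gate = np.sin(2 * np.pi * t / 1024 + rng.uniform(0, 2 * np.pi)) > 0
        return rng.normal(0.0, 1.0, n) * (0.05 + gate)
    f0 = rng.uniform(0.01, 0.03)  # fundamental, in cycles per sample
    tone = np.sin(2 * np.pi * f0 * t) + 0.5 * np.sin(4 * np.pi * f0 * t)
    return tone + rng.normal(0.0, 0.05, n)

def make_dataset(rng, per_class=40):
    X = np.array([segment_features(make_segment(k, rng))
                  for k in range(3) for _ in range(per_class)])
    y = np.repeat(np.arange(3), per_class)
    return X, y

def softmax(Z):
    Z = Z - Z.max(axis=1, keepdims=True)  # subtract row max for stability
    E = np.exp(Z)
    return E / E.sum(axis=1, keepdims=True)

def train_mlp(X, y, n_classes, hidden=8, epochs=1000, lr=0.2, seed=0):
    """One tanh hidden layer, softmax output, full-batch back-propagation."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, n_classes))
    b2 = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]  # one-hot targets
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        P = softmax(H @ W2 + b2)
        losses.append(-np.mean(np.log(P[np.arange(len(y)), y] + 1e-12)))
        dZ2 = (P - Y) / len(y)             # gradient at the output layer
        dH = (dZ2 @ W2.T) * (1 - H ** 2)   # back-propagate through tanh
        W2 -= lr * (H.T @ dZ2)
        b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), losses

def predict_proba(params, X):
    W1, b1, W2, b2 = params
    return softmax(np.tanh(X @ W1 + b1) @ W2 + b2)

rng = np.random.default_rng(0)
X_train, y_train = make_dataset(rng)
X_test, y_test = make_dataset(rng)
# Standardize with training statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
params, losses = train_mlp((X_train - mu) / sigma, y_train, n_classes=3)
proba = predict_proba(params, (X_test - mu) / sigma)
accuracy = np.mean(proba.argmax(axis=1) == y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setup the zero-crossing rate mainly separates the tone from the two noise classes, while the spread of frame log-energy separates stationary from non-stationary noise. Real recordings would need the paper's actual parameter and real noise data to say anything about its reported results.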
