A Study on a New Pre-emphasis Method Using the Short-Term Energy Difference of Speech Signal

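  • Kim Dong-Jun (School of Electronics, Information, Communication and Semiconductor Engineering, College of Science and Engineering, Cheongju University)
  • Kim Ju-Ri (Department of Electronic Engineering, College of Science and Engineering, Cheongju University)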
  • Published : 2001.12.01

Abstract

Pre-emphasis is an essential step in speech signal processing. Two methods are widely used: the typical method, which uses a fixed value near unity, and the optimal method, which uses the autocorrelation ratio of the signal. This study proposes a new pre-emphasis method based on the short-term energy difference of the speech signal, which can effectively compensate for the glottal source and lip radiation characteristics. Speech analysis, including spectrum estimation and formant detection, is performed with the proposed pre-emphasis, and the results are compared with those of the two conventional methods. Analysis of five single vowels showed that the proposed method enhanced the spectral shapes, gave nearly constant formant frequencies, and avoided the overlap of two adjacent formants. Comparison with FFT spectra verified these results and showed the accuracy of the proposed method. The computational complexity of the proposed method was reduced to about 50% of that of the optimal method.
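The two conventional methods named in the abstract can be illustrated with a minimal sketch. It assumes the usual first-order form y[n] = x[n] - mu * x[n-1], with mu fixed near unity for the typical method and mu = R(1)/R(0) for the autocorrelation-ratio method. The abstract does not give these formulas, so the function names, the value 0.95, and the sinusoidal test frame are all illustrative assumptions. The proposed energy-difference method is not shown because its formula is not stated here.

```python
import math

def preemphasize(x, mu):
    """Apply y[n] = x[n] - mu * x[n-1], treating the sample before x[0] as 0."""
    if not x:
        return []
    return [x[0]] + [x[n] - mu * x[n - 1] for n in range(1, len(x))]

def autocorr_ratio(frame):
    """Return R(1)/R(0), the lag-1 to lag-0 autocorrelation ratio of a frame."""
    r0 = sum(s * s for s in frame)
    r1 = sum(frame[n] * frame[n - 1] for n in range(1, len(frame)))
    return r1 / r0 if r0 > 0 else 0.0

# Hypothetical test frame: a 100 Hz sinusoid at 8 kHz, 240 samples (30 ms).
fs = 8000
frame = [math.sin(2 * math.pi * 100 * n / fs) for n in range(240)]

# Typical method: fixed coefficient near unity (0.95 is an assumed value).
fixed = preemphasize(frame, 0.95)

# Autocorrelation-ratio method: coefficient adapted to each frame.
mu_opt = autocorr_ratio(frame)
adaptive = preemphasize(frame, mu_opt)
```

For a low-frequency frame such as this one, the ratio R(1)/R(0) lies close to 1, so the adaptive filter attenuates the low-frequency content strongly. A frame with more high-frequency energy would yield a smaller coefficient and milder filtering.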
