Fast Harmonic Synthesis Method for Sinusoidal Speech-Audio Model

정현파 음성-오디오 모델의 빠른 하모닉 합성 방법

  • Kim, Gyu-Jin (Department of Radio Engineering, Chungbuk National University) ;
  • Kim, Jong-Hark (Department of Radio Engineering, Chungbuk National University) ;
  • Jung, Gyu-Hyeok (Department of Radio Engineering, Chungbuk National University) ;
  • Lee, In-Sung (Department of Radio Engineering, Chungbuk National University)
  • Published : 2007.07.25

Abstract

Most harmonic synthesis methods that use phase information employ quadratic or cubic phase interpolation. These methods are computationally expensive to implement because every component sine wave must be synthesized on a per-sample basis. In this paper, we propose a fast harmonic synthesis method for sinusoidal speech/audio coding, based on quadratic and cubic phase functions, to overcome this complexity problem. To derive it, we define an over-sampling function and a phase modulation function by constraining the parameters of the phase function to be independent of the harmonic index, and then derive a fast synthesis method using the IFFT. Experimental results show that the proposed method significantly reduces the complexity of the conventional cosine synthesis method while maintaining performance.

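Most harmonic synthesis methods that use quadratic or cubic phase interpolation require a large amount of computation to implement, because each sinusoidal component is synthesized sample by sample. To address this problem, this paper proposes a fast harmonic synthesis method for sinusoidal speech and audio models with quadratic and cubic phase terms. The proposed method defines an over-sampling function and a phase modulation function by constraining the coefficients of the quadratic and cubic phase functions to be independent of the harmonic, and derives a synthesis formula based on the Inverse Fast Fourier Transform (IFFT). A comparison of computational cost and Segment SNR (Segment Signal-to-Noise Ratio) against synthesis using the cosine function confirmed that the proposed method markedly reduces the computation without degrading sound quality.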
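The contrast drawn in the abstract, per-sample cosine synthesis versus IFFT-based synthesis, can be illustrated with a minimal NumPy sketch. It is not the paper's derivation. It assumes one plausible reading of the constraint: harmonic k follows the phase k·theta(n) + phi_k, where theta(n) is a single quadratic phase trajectory shared by all harmonics. Under that assumption one IFFT produces an oversampled period of the harmonic sum, and theta(n) selects where to read it. The function names, the table size, and the linear interpolation step are all hypothetical choices made for illustration.

```python
import numpy as np

def direct_synthesis(amps, phases, theta):
    """Reference method: evaluate every harmonic cosine at every sample."""
    k = np.arange(1, len(amps) + 1)[:, None]  # harmonic indices, shape (K, 1)
    return np.sum(amps[:, None] * np.cos(k * theta[None, :] + phases[:, None]), axis=0)

def ifft_synthesis(amps, phases, theta, table_size=1024):
    """Sketch of IFFT-based synthesis (assumed structure, not the paper's exact method).

    One IFFT tabulates a single oversampled period of the harmonic sum;
    the shared phase trajectory theta(n) is then used to read the table.
    """
    spectrum = np.zeros(table_size, dtype=complex)
    spectrum[1:len(amps) + 1] = amps * np.exp(1j * phases)
    # table[m] = sum_k A_k * cos(2*pi*k*m/M + phi_k); numpy's ifft divides by M
    table = np.real(np.fft.ifft(spectrum)) * table_size
    # Map the wrapped phase to a fractional table position
    pos = (theta % (2 * np.pi)) * table_size / (2 * np.pi)
    base = np.floor(pos)
    frac = pos - base
    i0 = base.astype(int) % table_size
    i1 = (i0 + 1) % table_size
    # Linear interpolation between neighbouring table entries
    return (1 - frac) * table[i0] + frac * table[i1]

# Illustrative frame: 160 samples at 8 kHz, pitch gliding from 100 Hz to 110 Hz
frame_len = 160
n = np.arange(frame_len)
w0 = 2 * np.pi * 100 / 8000
w1 = 2 * np.pi * 110 / 8000
theta = w0 * n + (w1 - w0) * n**2 / (2 * frame_len)  # quadratic phase trajectory

rng = np.random.default_rng(0)
num_harmonics = 20
amps = 1.0 / np.arange(1, num_harmonics + 1)
phases = rng.uniform(-np.pi, np.pi, num_harmonics)

reference = direct_synthesis(amps, phases, theta)
fast = ifft_synthesis(amps, phases, theta)
print("max abs difference:", np.max(np.abs(reference - fast)))
```

In this sketch the per-sample work is one table lookup with interpolation, so it no longer grows with the number of harmonics. The table size trades the cost of the IFFT against interpolation accuracy.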
