Nonlinear Speech Production Modeling using Nonlinear Autoregressive Exogenous based on Support Vector Machine

서포트 벡터 머신 기반 비선형 외인성 자귀회귀를 이용한 비선형 조음 모델링

  • Jang, Seung-Jin (Department of Biomedical Engineering, College of Health & Science, Yonsei University) ;
  • Kim, Hyo-Min (Department of Biomedical Engineering, College of Health & Science, Yonsei University) ;
  • Park, Young-Choel (Department of Computer & Telecommunication Engineering, Yonsei University) ;
  • Choi, Hong-Shik (Department of Otolaryngology, College of Medicine, Yonsei University) ;
  • Yoon, Young Ro (Department of Biomedical Engineering, College of Health & Science, Yonsei University)
  • Published : 2007.11.09

Abstract

In this paper, a Nonlinear Autoregressive Exogenous (NARX) model based on Least Squares Support Vector Regression (LS-SVR) is proposed and tested for producing natural-sounding speech. This nonlinear synthesizer accurately reproduces voiced sounds and preserves naturalness cues such as jitter and shimmer, which LPC does not retain. However, the results for some phonations differ considerably from the original sounds. We attribute this to the inability of a single-band model to control and decompose the high-frequency components. A multi-band model with a wavelet filterbank is therefore adopted in place of the single-band model, and it yields improved stability. In conclusion, nonlinear speech modeling using NARX based on LS-SVR can reconstruct synthesized sounds that closely resemble the original voiced sounds.
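For readers unfamiliar with the technique, the following is a minimal sketch of a NARX predictor fitted with LS-SVR. It is not the authors' implementation. The toy signal, the lag orders `p` and `q`, the RBF kernel, and the hyperparameters `gamma` and `sigma` are all assumptions chosen for illustration, and the function names are hypothetical. The fit uses the standard LS-SVR dual formulation, in which training reduces to solving one linear system.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * (A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

def narx_regressors(y, u, p, q):
    """Build rows [y(n-1..n-p), u(n-1..n-q)] and the matching targets y(n)."""
    start = max(p, q)
    X = np.array([np.concatenate((y[n - p:n][::-1], u[n - q:n][::-1]))
                  for n in range(start, len(y))])
    return X, y[start:]

def lssvr_fit(X, t, gamma, sigma):
    """Solve the LS-SVR dual system [[0, 1^T], [1, K + I/gamma]] [b, alpha] = [0, t]."""
    m = len(t)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma) + np.eye(m) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], t)))
    return sol[0], sol[1:]  # bias b, dual weights alpha

def lssvr_predict(X_train, b, alpha, X_new, sigma):
    """Evaluate f(x) = sum_i alpha_i * K(x, x_i) + b for each row of X_new."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + b

# Toy data (assumption): a sinusoidal exogenous input driving a simple
# nonlinear recursion. This stands in for a speech signal; it is not speech.
fs = 8000.0
n = np.arange(400)
u = np.sin(2.0 * np.pi * 100.0 * n / fs)
y = np.zeros(len(n))
for k in range(2, len(n)):
    y[k] = 0.5 * y[k - 1] - 0.3 * y[k - 2] + np.tanh(2.0 * u[k - 1])

p, q = 2, 1                # lag orders (assumed)
gamma, sigma = 1e3, 0.5    # regularization and kernel width (assumed)

X, t = narx_regressors(y, u, p, q)
b, alpha = lssvr_fit(X, t, gamma, sigma)
pred = lssvr_predict(X, b, alpha, X, sigma)
rmse = float(np.sqrt(np.mean((pred - t) ** 2)))
print(f"one-step-ahead training RMSE: {rmse:.4f}")
```

This sketch evaluates one-step-ahead prediction only. A synthesizer would instead run the model freely, feeding each predicted sample back into the regressor vector, which is the setting where the stability concerns mentioned in the abstract arise.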

Keywords