Analysis and synthesis of pseudo-periodicity on voice using source model approach

Korean title: A study on the analysis and implementation of pseudo-periodic phenomena in voice

  • Received : 2016.11.02
  • Accepted : 2016.12.09
  • Published : 2016.12.31

Abstract

The purpose of this work is to analyze and synthesize the pseudo-periodicity of voice using a source model. A speech signal is nearly, but not completely, periodic. Periodicity contributes significantly to prosody, emotional state, and similar properties, whereas pseudo-periodicity contributes to the distinction between normal and abnormal states and to the naturalness of normal speech. Pseudo-periodicity is typically measured through parameters such as jitter and shimmer. When the pseudo-periodic nature of voice is studied using collected natural voice, only the distributions of these parameters can be observed, and those are limited by the size of the collected data. If voice samples can be generated in a controlled manner, more diverse experiments can be conducted. In this study, the probability distributions of vowel pitch variation are obtained from the speech signal. Based on these probability distributions, vocal fold pulses with a designated jitter value are synthesized. The target and re-analyzed jitter values are then compared to check the validity of the method. The jitter synthesis method was found to be useful for normal voice synthesis.
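
The synthesize-then-re-analyze check described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the paper derives its probability distributions from recorded vowels, whereas the sketch assumes independent Gaussian pitch periods purely for illustration. The function names, the 8 ms mean period, and the 1 % target are hypothetical. Jitter is measured here as local jitter, the mean absolute difference between consecutive periods divided by the mean period.

```python
import math
import random


def synthesize_pulse_onsets(n_pulses, mean_period_ms, target_jitter_pct, seed=0):
    """Return pulse onset times (ms) whose periods are perturbed at random.

    Assumption (not from the paper): periods are i.i.d. Gaussian. For such
    periods with standard deviation sigma, the expected absolute difference
    between neighbours is 2 * sigma / sqrt(pi), so sigma is chosen to make
    the expected local jitter equal the target.
    """
    rng = random.Random(seed)
    sigma = (target_jitter_pct / 100.0) * mean_period_ms * math.sqrt(math.pi) / 2.0
    onsets = [0.0]
    for _ in range(n_pulses - 1):
        onsets.append(onsets[-1] + rng.gauss(mean_period_ms, sigma))
    return onsets


def local_jitter_pct(onsets_ms):
    """Re-analyze local jitter (%) from a list of pulse onset times."""
    periods = [b - a for a, b in zip(onsets_ms, onsets_ms[1:])]
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_abs_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_abs_diff / mean_period


# Hypothetical settings: 125 Hz voice (8 ms period), 1 % target jitter.
target = 1.0
onsets = synthesize_pulse_onsets(5000, 8.0, target)
measured = local_jitter_pct(onsets)
print(f"target {target:.2f} %, re-analyzed {measured:.2f} %")
```

With enough pulses the re-analyzed value should land close to the target. A distribution estimated from real speech could replace the Gaussian draw without changing the re-analysis step.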
