Gender Analysis in Elderly Speech Signal Processing

  • Lee, JiYeoun (Department of Biomedical Engineering, Jungwon University)
  • Received : 2018.09.04
  • Accepted : 2018.10.20
  • Published : 2018.10.28

Abstract

Age-related changes in the vocal cords can alter the frequency of speech, and the speech signals of the elderly can be automatically distinguished from normal speech signals through various analyses. The purpose of this study is to provide a tool that is easily accessible to elderly and disabled people, who risk being excluded from a rapidly changing technological society, and to improve voice recognition performance. In this study, the sex of the subjects was reported as part of the sex analysis, and equal numbers of female and male voice samples were used. In addition, gender analysis was applied by restricting the voice set to elderly speakers rather than using voices of all ages. Finally, we applied the methodology of reviewing standards and reference models to reduce gender differences. The subjects were 10 Korean women and 10 men aged 70 to 80. When the F0 values extracted directly from the waveform were compared with the F0 values extracted by the TF32 and WaveSurfer speech analysis programs, WaveSurfer analyzed the F0 of elderly voices better than TF32. However, a voice analysis program designed for elderly voices is still needed. In conclusion, analyzing the voices of the elderly will improve the speech recognition and synthesis capabilities of existing smart medical systems.
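As an illustration of what extracting F0 from a waveform and comparing it against a program's output can involve, the following is a minimal Python sketch. It is not the procedure used in the study, which the abstract does not describe. The autocorrelation method, the function names `estimate_f0` and `mean_abs_diff`, the 60 to 400 Hz search range, and the synthetic 100 Hz test tone are all assumptions made for this example.

```python
import math


def estimate_f0(samples, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 in Hz for one voiced frame from its autocorrelation peak.

    Assumption: a plain (unnormalized) autocorrelation is used here purely
    for illustration; it is not the method of the study, TF32, or WaveSurfer.
    """
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]                   # remove DC offset
    lag_min = int(sample_rate / f0_max)               # shortest period searched
    lag_max = min(int(sample_rate / f0_min), n - 1)   # longest period searched
    best_lag, best_score = 0, 0.0
    for lag in range(lag_min, lag_max + 1):
        score = sum(x[i] * x[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # Return 0.0 when no positive autocorrelation peak was found.
    return sample_rate / best_lag if best_lag else 0.0


def mean_abs_diff(reference_f0, program_f0):
    """Mean absolute difference in Hz between two equal-length F0 lists."""
    pairs = list(zip(reference_f0, program_f0))
    return sum(abs(r - p) for r, p in pairs) / len(pairs)


# Synthetic stand-in for a voiced frame: a 100 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 100 * i / sample_rate) for i in range(800)]
print(estimate_f0(tone, sample_rate))  # 100.0
```

With reference F0 values measured from the waveform in one list and the values reported by an analysis program in another, `mean_abs_diff` gives a single figure for how closely that program tracks the reference. A smaller value indicates closer agreement.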

Keywords

Elderly voice;Fundamental frequency;Gender analysis;Disordered voice;TF32 WaveSurfer;Sex analysis

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)
