Development of medical/electrical convergence software for classification between normal and pathological voices

  • Moon, Ji-Hye (Department of Biomedical Engineering, Jungwon University)
  • Lee, JiYeoun (Department of Biomedical Engineering, Jungwon University)
  • Received : 2015.10.30
  • Accepted : 2015.12.20
  • Published : 2015.12.28

Abstract

Software that can analyze speech disorders would be widely applicable across many convergence fields. This paper implements a user-friendly program based on CART (Classification and Regression Trees) analysis that distinguishes normal from pathological voices using a combination of acoustic and HOS (higher-order statistics) parameters, bringing together medical information and signal processing. The acoustic parameters are Jitter(%) and Shimmer(%). The proposed HOS parameters are the means and variances of skewness (MOS and VOS) and of kurtosis (MOK and VOK). The database consists of 53 normal and 173 pathological voices distributed by Kay Elemetrics. When the acoustic and proposed parameters are used together to generate the decision tree, the average accuracy is 83.11%. Finally, we developed a program with a more user-friendly interface and framework.
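The six parameters named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes pitch periods and peak amplitudes have already been extracted, it uses the common relative-perturbation definition for Jitter(%) and Shimmer(%), and it computes skewness and kurtosis frame by frame directly on the signal. The function names, the frame length, and the hop size are hypothetical choices, and kurtosis here is the plain fourth standardized moment (not excess kurtosis).

```python
from statistics import fmean, pvariance


def perturbation_percent(values):
    """Mean absolute difference between consecutive values, relative to
    their mean, in percent. Applied to pitch periods this gives a
    Jitter(%)-style measure; applied to peak amplitudes, Shimmer(%)."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * fmean(diffs) / fmean(values)


def skewness_kurtosis(frame):
    """Third and fourth standardized moments of one frame.
    Returns (0.0, 0.0) for a constant frame (an assumption made here
    to avoid dividing by zero)."""
    mean = fmean(frame)
    var = pvariance(frame, mu=mean)
    if var == 0.0:
        return 0.0, 0.0
    sd = var ** 0.5
    skew = fmean([((x - mean) / sd) ** 3 for x in frame])
    kurt = fmean([((x - mean) / sd) ** 4 for x in frame])
    return skew, kurt


def hos_parameters(signal, frame_len=400, hop=200):
    """Means and variances of frame-wise skewness and kurtosis.
    frame_len and hop are illustrative values, not taken from the paper."""
    stats = [skewness_kurtosis(signal[i:i + frame_len])
             for i in range(0, len(signal) - frame_len + 1, hop)]
    skews = [s for s, _ in stats]
    kurts = [k for _, k in stats]
    return {
        "MOS": fmean(skews), "VOS": pvariance(skews),
        "MOK": fmean(kurts), "VOK": pvariance(kurts),
    }
```

For one recording, `perturbation_percent(periods)`, `perturbation_percent(amplitudes)`, and the four values from `hos_parameters(signal)` together form the six-element feature vector that a decision tree would consume.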

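Software that can identify pathological voices would be highly useful in many convergence fields, such as telemedicine and speech therapy. This paper implements a program that discriminates normal from pathological voices through CART (Classification And Regression Trees) analysis, combining acoustic parameters, which are medical information describing the perturbation of vocal fold vibration, with parameters based on higher-order statistics from signal processing. The acoustic parameters used are Jitter(%) and Shimmer(%). The higher-order-statistics parameters proposed in this study are the means and variances of skewness and kurtosis. Using /a/ vowel utterances from 53 normal and 173 pathological speakers randomly selected from the Kay Elemetrics database, we implemented a decision-tree-based algorithm for pathological voice discrimination that achieves an average performance of 83.15%. Based on these results, and with future commercialization in mind, we developed a pathological voice discrimination program that includes convergence-type functionality for generating content through a user-friendly framework.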
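The decision-tree step can be illustrated with the core operation that CART repeats recursively: searching for the feature and threshold that minimize the weighted Gini impurity of the two resulting groups. The sketch below performs a single split only. It is not the authors' program; the function names and the convention of labeling normal voices 0 and pathological voices 1 are assumptions made for illustration.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 = normal, 1 = pathological)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n  # fraction of pathological samples
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels):
    """Find the single (feature index, threshold) pair that minimizes the
    weighted Gini impurity of the left (<= threshold) and right groups.
    Returns (feature_index, threshold, weighted_gini); the index and
    threshold are None if no split improves on the unsplit impurity."""
    n = len(rows)
    best = (None, None, gini(labels))
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best
```

A full tree would apply `best_split` again to each resulting group until a stopping rule is met. Each row would hold one recording's feature values, for example Jitter(%), Shimmer(%), MOS, VOS, MOK, and VOK.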
