Machine Learning based Speech Disorder Detection System

Jung, Junyoung; Kim, Gibak

doi:10.5909/JBE.2017.22.2.253

Journal of Broadcast Engineering (방송공학회논문지)

Volume 22, Issue 2, Pages 253-256, 2017
pISSN 1226-7953, eISSN 2287-9137

The Korean Institute of Broadcast and Media Engineers (한국방송∙미디어공학회)

Jung, Junyoung (School of Electrical Engineering, Soongsil Univ.) ;
Kim, Gibak (School of Electrical Engineering, Soongsil Univ.)

Received : 2017.01.05
Accepted : 2017.03.02
Published : 2017.03.30

Abstract

This paper presents the implementation of a speech disorder detection system based on machine learning classification. Speech problems are a common early symptom of stroke and other brain injuries, so detecting disordered speech may lead to corrective intervention and prompt medical treatment of strokes or other cerebrovascular accidents. The detection system extracts features from the input speech and classifies them using machine learning algorithms. Ten machine learning algorithms, combined with various feature scaling methods, were used to discriminate disordered speech from normal speech. The system was evaluated on the TORGO database, which contains dysarthric speech collected from speakers with either cerebral palsy or amyotrophic lateral sclerosis.

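This paper aims to detect disordered speech using machine learning based classification. Among speech disorders, dysarthria is known to be caused mainly by brain diseases such as cerebral palsy, Parkinson's disease, and stroke. Detecting such disordered speech makes early treatment possible when an acute brain disease such as a stroke occurs. Disordered speech detection can be carried out by extracting feature vectors from the input speech and classifying them with machine learning. The experiments used TORGO, a database of disordered speech, and evaluated detection performance for ten machine learning algorithms and various feature vector scaling methods.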
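The abstract describes a two-stage pipeline: feature vectors are scaled and then passed to a classifier. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not name the ten algorithms, the scaling methods, or the features, so everything here is an assumption made for illustration: the two synthetic features (`make_synthetic`), standardization as the scaling method, and a nearest-centroid classifier as a stand-in for the paper's algorithms. No TORGO data is used.

```python
import random
import statistics


def fit_standard_scaler(rows):
    """Per-feature (mean, std) estimated on the training rows only."""
    params = []
    for column in zip(*rows):
        std = statistics.pstdev(column)
        params.append((statistics.fmean(column), std if std > 0 else 1.0))
    return params


def apply_scaler(rows, params):
    """Standardize each feature to zero mean and unit variance."""
    return [[(x - m) / s for x, (m, s) in zip(row, params)] for row in rows]


def fit_centroids(rows, labels):
    """Nearest-centroid 'training': one mean vector per class."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [statistics.fmean(c) for c in zip(*members)]
    return centroids


def predict(rows, centroids):
    """Assign each row to the class whose centroid is closest."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(centroids, key=lambda c: dist2(row, centroids[c])) for row in rows]


def accuracy(predicted, truth):
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def make_synthetic(n, rng):
    """Hypothetical features: label 0 = normal, 1 = disordered (assumption)."""
    rows, labels = [], []
    for i in range(n):
        label = i % 2
        informative = rng.gauss(float(label), 0.3)  # small scale, separates classes
        nuisance = rng.gauss(0.0, 500.0)            # large scale, carries no signal
        rows.append([informative, nuisance])
        labels.append(label)
    return rows, labels


rng = random.Random(0)
train_x, train_y = make_synthetic(400, rng)
test_x, test_y = make_synthetic(200, rng)

# Without scaling, the large-scale nuisance feature dominates the distances.
raw_acc = accuracy(predict(test_x, fit_centroids(train_x, train_y)), test_y)

# With scaling, both features contribute on a comparable scale.
params = fit_standard_scaler(train_x)
scaled_train = apply_scaler(train_x, params)
scaled_test = apply_scaler(test_x, params)
scaled_acc = accuracy(predict(scaled_test, fit_centroids(scaled_train, train_y)), test_y)

print(f"accuracy without scaling: {raw_acc:.3f}")
print(f"accuracy with scaling:    {scaled_acc:.3f}")
```

The scaler parameters are estimated on the training split only and then reused on the test split, which avoids leaking test statistics into training. On this synthetic data the distance-based classifier benefits from scaling because the uninformative feature has a much larger numeric range. Whether the same holds for the paper's features and algorithms is not something the abstract states.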
