통합 검색 | Korea Science

Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
- 대한음성학회지:말소리
- /
- 제64호
- /
- pp.77-88
- /
- 2007
Nowadays, mel-frequency cesptral coefficients (MFCCs) and Gaussian mixture models (GMMs) are used for the pathological voice detection. This paper suggests a method to improve the performance of the pathological/normal voice classification based on the MFCC-based GMM. We analyze the characteristics of the mel frequency-based filterbank energies using the fisher discriminant ratio (FDR). And the feature vectors through the linear discriminant analysis (LDA) transformation of the filterbank energies (FBE) and the MFCCs are implemented. An accuracy is measured by the GMM classifier. This paper shows that the FBE LDA-based GMM is a sufficiently distinct method for the pathological/normal voice classification, with a 96.6% classification performance rate. The proposed method shows better performance than the MFCC-based GMM with noticeable improvement of 54.05% in terms of error reduction.
PDF

장승진;최성희;김효민;최홍식;윤영로
- 대한전기학회:학술대회논문집
- /
- 대한전기학회 2007년도 제38회 하계학술대회
- /
- pp.1797-1798
- /
- 2007
In field of voice pathology, diverse statistics extracted form pitch estimation were commonly used to assess voice quality. In this study, we proposed robust pitch detection algorithm which can estimate pitch of pathological voices in benign vocal fold lesions. we also compared our proposed algorithm with three established pitch detection algorithms; autocorrelation, simplified inverse filtering technique, and nonlinear state-space embedding methods. In the database of total pathological voices of 99 and normal voices of 30, an analysis of errors related with pitch detection was evaluated between pathological and normal voices, or among the types of pathological voices. According to the results of pitch errors, gross pitch error showed some increases in cases of pathological voices; especially excessive increase in PDA based on nonlinear time-series. In an analysis of types of pathological voices classified by aperiodicity and the degree of chaos, the more voice has aperiodic and chaotic, the more growth of pitch errors increased. Consequently, it is required to survey the severity of tested voice in order to obtain accurate pitch estimates.
PDF

장승진;최성희;김효민;최홍식;윤영로
- 대한전자공학회:학술대회논문집
- /
- 대한전자공학회 2007년도 하계종합학술대회 논문집
- /
- pp.407-408
- /
- 2007
Robust pitch estimation is an important study in many areas of speech processing. In voice pathology, diverse statistics extracted form pitch were commonly used to test voice quality. In this study, we compared several established pitch detection algorithms (PDAs) for verification of adequacy of the PDAs. In the database of total pathological voices of 99 and normal voices of 30, an analysis of errors related with pitch detection was evaluated between pathological and normal voices, or among the types of pathological voices such as benign vocal fold lesions; polyp, nodule, and cysts. Consequently, it is required to survey the severity of tested voice in order to obtain accurate pitch estimates.
PDF

이지연;정상배;최흥식;한민수
- 대한음성학회지:말소리
- /
- 제66호
- /
- pp.61-72
- /
- 2008
This paper proposes a method to improve pathological and normal voice classification performance by combining multiple features such as auditory-based and higher-order features. Their performances are measured by Gaussian mixture models (GMMs) and linear discriminant analysis (LDA). The combination of multiple features proposed by the frame-based LDA method is shown to be an effective method for pathological and normal voice classification, with a 87.0% classification rate. This is a noticeable improvement of 17.72% compared to the MFCC-based GMM algorithm in terms of error reduction.
PDF

Wang, Jianglin;Jo, Cheol-Woo
- 음성과학
- /
- 제13권3호
- /
- pp.7-18
- /
- 2006
Diagnosis of pathological voice is one of the important issues in biomedical applications of speech technology. This study focuses on the discrimination of voice disorder using HMM (Hidden Markov Model) for automatic detection between normal voice and vocal fold disorder voice. This is a non-intrusive, non-expensive and fully automated method using only a speech sample of the subject. Speech data from normal people and patients were collected. Mel-frequency filter cepstral coefficients (MFCCs) were modeled by HMM classifier. Different states (3 states, 5 states and 7 states), 3 mixtures and left to right HMMs were formed. This method gives an accuracy of 93.8% for train data and 91.7% for test data in the discrimination of normal and vocal fold disorder voice for sustained /a/.
PDF