• Title/Summary/Keyword: pathological speech

Search results: 44

An analysis of a statistical difference of acoustic Parameters' distribution between normal voice and pathological voice (병적 음성과 정상 음성의 음향학적 파라미터 분포에 대한 통계적 분석)

  • 김용주;권순복;김기련;신민철;조철우;왕수건
    • Proceedings of the IEEK Conference / 2001.06d / pp.249-252 / 2001
  • The most basic means of communication among humans is the voice. Even setting voice technologies aside, using the voice is important and convenient in everyday life. However, speech recognition systems cannot always expect a normal voice as their input signal. Generally speaking, a pathological voice, as opposed to a normal one, is a voice with a problem in the larynx, and it can also be a special case of input voice. The distortion of a speech signal by environmental effects, i.e., noise or the transmission channel, has of course been raised as a problem, but here we take up pathological voices with laryngeal disease, which is an intrinsic distortion factor in the voice. In this research, we examine the difference in the distribution of acoustic parameters between normal and pathological voices using a statistical method.

Sample selection approach using moving window for acoustic analysis of pathological sustained vowels according to signal typing

  • Lee, Ji-Yeoun
    • Phonetics and Speech Sciences / v.3 no.3 / pp.99-108 / 2011
  • Perturbation parameters such as jitter, shimmer, and signal-to-noise ratio (SNR) are usually estimated either from a subjectively selected segment or from the whole of a given pathological voice signal, although many other regions of the signal could be analyzed. In this paper, pathological voice signals were classified as type 1, 2, 3, or 4 according to the narrow-band spectrogram, and the differences between perturbation parameter values extracted from the subjectively selected portion and from the entire signal tended to grow from type 1 to type 4 signals. Therefore, a sample selection method based on a moving window is proposed for analyzing type 2 and 3 signals as well as type 1 signals. Although type 3 signals cannot normally be analyzed with perturbation analysis, they were analyzed by using the moving window to select samples in which the error count is less than 10. At present, there is no method capable of analyzing type 4 signals. Future research will endeavor to determine the best way to evaluate such voices.
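
The moving-window selection described in this abstract could be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not say how the error count is computed, so the per-frame `errors` flags, the window length, and the function name `select_windows` are all assumptions. Only the threshold of fewer than 10 errors comes from the abstract.

```python
from typing import List, Tuple


def select_windows(errors: List[int], window: int,
                   max_errors: int = 10) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of windows with fewer than max_errors errors.

    errors: hypothetical per-frame flags (1 = analysis error, 0 = no error).
    window: hypothetical window length in frames.
    """
    if window <= 0 or window > len(errors):
        return []
    selected = []
    count = sum(errors[:window])  # error count of the first window
    for start in range(len(errors) - window + 1):
        if start > 0:
            # Slide by one frame: add the frame that entered, drop the one that left.
            count += errors[start + window - 1] - errors[start - 1]
        if count < max_errors:
            selected.append((start, start + window))
    return selected
```

Perturbation parameters would then be estimated only on the returned segments rather than on a hand-picked segment or the whole recording.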

Artificial Intelligence for Clinical Research in Voice Disease (후두음성 질환에 대한 인공지능 연구)

  • Jungirl, Seok;Tack-Kyun, Kwon
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.33 no.3 / pp.142-155 / 2022
  • Diagnosis using voice is non-invasive and can be implemented with a variety of voice recording devices; it can therefore serve as a screening or diagnostic support tool that helps clinicians assess laryngeal voice disease. Artificial intelligence algorithms such as machine learning, led by the latest deep learning technology, were first applied to binary classification that distinguishes normal from pathological voices and have since contributed to improving the accuracy of multi-class classification of various types of pathological voices. However, no conclusions that can be applied in the clinical field have yet been reached. Most studies on pathological speech classification have used the sustained short vowel /ah/, which is relatively easier to work with than continuous or running speech. However, continuous speech has the potential to yield more accurate results because additional information can be obtained from the change in the voice signal over time. This review explains terms related to artificial intelligence research and the latest trends in machine learning and deep learning algorithms, and introduces the latest research results and their limitations to provide future directions for researchers.

Separation of Periodic and Aperiodic Components of Pathological Speech Signal (장애음성의 주기성분과 잡음성분의 분리 방법에 관하여)

  • Jo Cheolwoo;Li Tao
    • Proceedings of the KSPS conference / 2003.10a / pp.25-28 / 2003
  • The aim of this paper is to analyze pathological voice by separating the signal into periodic and aperiodic parts. The separation was performed recursively on the residual signal of the voice signal. Starting from an initial estimate of the aperiodic part of the spectrum, the aperiodic part is determined by an extrapolation method. The periodic part is then obtained by subtracting the aperiodic part from the original spectrum. An HNR parameter is derived from this separation. The statistics of the parameter values are compared with those of jitter and shimmer for normal, benign, and malignant cases.
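
Once a spectrum has been split into periodic and aperiodic parts, a harmonics-to-noise ratio can be computed as an energy ratio in decibels. The sketch below assumes the usual definition, HNR = 10·log10(E_periodic / E_aperiodic), and takes hypothetical magnitude spectra as input. The paper's recursive separation and extrapolation steps are not reproduced here, and the function name `hnr_db` is illustrative.

```python
import math
from typing import Sequence


def hnr_db(periodic: Sequence[float], aperiodic: Sequence[float]) -> float:
    """HNR in dB from magnitude spectra of the periodic and aperiodic parts.

    Both inputs are assumed (hypothetically) to be already-separated
    magnitude spectra on the same frequency grid.
    """
    e_periodic = sum(m * m for m in periodic)    # energy of the periodic part
    e_aperiodic = sum(m * m for m in aperiodic)  # energy of the aperiodic part
    if e_aperiodic == 0:
        return math.inf   # no noise energy at all
    if e_periodic == 0:
        return -math.inf  # no harmonic energy at all
    return 10.0 * math.log10(e_periodic / e_aperiodic)
```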

Speech Intelligibility of Alaryngeal Voices and Pre/Post Operative Evaluation of Voice Quality using the Speech Recognition Program(HUVOIS) (음성인식프로그램을 이용한 무후두 음성의 말 명료도와 병적 음성의 수술 전후 개선도 측정)

  • Kim, Han-Su;Choi, Seong-Hee;Kim, Jae-In;Lee, Jae-Yol;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.2 / pp.92-97 / 2004
  • Background and Objectives: The purpose of this study was to objectively examine pre- and postoperative voice quality and the intelligibility of alaryngeal voices using the speech recognition program HUVOIS. Materials and Methods: Two laryngologists and one speech pathologist rated 'G', 'R', and 'B' on the GRBAS scale and rated speech intelligibility on the NTID rating scale, using a standard paragraph. Acoustic estimates such as jitter, shimmer, and HNR were also obtained from Lx Speech Studio. Results: The speech recognition rate did not differ significantly between pre- and postoperative pathological voice samples, although voice quality (G, B) and acoustic values (jitter, HNR) improved significantly after surgery. Among alaryngeal voices, the reed type electrolarynx 'Moksori' scored highest in both speech intelligibility and speech recognition rate, whereas esophageal speech scored lowest. A correlation between speech intelligibility and speech recognition rate was found in alaryngeal voices but not in pathological voices. Conclusion: The current study did not show that the speech recognition program HUVOIS, used in a telephone program, is an objective and efficient method for assisting the subjective GRBAS scale.

Perturbation and Nonlinear Dynamic Analysis of Sustained Vowels in Normal and Pathological Voices

  • Lee, Ji-Yeoun;Choi, Seong-Hee;Jiang, Jack J.;Hahn, Min-Soo;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.2 no.1 / pp.113-120 / 2010
  • In this paper, we investigate the acoustic characteristics of sustained voices from normal subjects and patients with laryngeal pathologies. Perturbation methods (including jitter and shimmer), signal-to-noise ratio (SNR), and nonlinear dynamic methods (such as correlation dimension) are used to analyze normal and pathological voices. We find that jitter does not statistically discriminate between normal and pathological voices, but a significant difference is found for shimmer, SNR, and correlation dimension. The results suggest that nonlinear dynamic analysis may be valuable for the analysis of normal and pathological voices but perturbation analysis should be applied with caution for pathological voice analysis.
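
For reference, jitter and shimmer are commonly defined as the mean absolute difference between consecutive cycle periods (or peak amplitudes) divided by the mean period (or amplitude). The sketch below assumes this common "local" definition and hypothetical per-cycle measurements as input. The abstract does not state which variants the authors computed, so the function names and inputs are illustrative only.

```python
from typing import Sequence


def local_perturbation(values: Sequence[float]) -> float:
    """Mean absolute cycle-to-cycle difference relative to the mean, in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return 100.0 * mean_diff / mean_value


def jitter_percent(periods_s: Sequence[float]) -> float:
    """Local jitter from hypothetical per-cycle periods (seconds)."""
    return local_perturbation(periods_s)


def shimmer_percent(peak_amplitudes: Sequence[float]) -> float:
    """Local shimmer from hypothetical per-cycle peak amplitudes."""
    return local_perturbation(peak_amplitudes)
```

Both measures rely on reliable cycle-by-cycle extraction, which is one reason perturbation analysis needs caution on strongly aperiodic pathological voices.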

Implementation of Voice Source Simulator Using Simulink (Simulink를 이용한 음원모델 시뮬레이터 구현)

  • Jo, Cheol-Woo;Kim, Jae-Hee
    • Phonetics and Speech Sciences / v.3 no.2 / pp.89-96 / 2011
  • In this paper, the design and implementation of a voice source simulator using Simulink and Matlab are discussed in detail. The simulator is implemented following the model-based design concept. Voice sources can be analyzed and manipulated through various factors by choosing options in the GUI and by selecting pre-defined or user-created blocks. This kind of simulation tool can simplify the analysis of speech signals for purposes such as voice quality analysis, pathological voice analysis, and speech coding. Basic analysis functions are also provided for comparing the original signal with the manipulated ones.

Perturbation and Perceptual Analysis of Pathological Sustained Vowels according to Signal Typing

  • Lee, Ji-Yeoun;Choi, Seong-Hee;Jiang, Jack J.;Hahn, Min-Soo;Choi, Hong-Shik
    • Phonetics and Speech Sciences / v.2 no.2 / pp.109-115 / 2010
  • In this paper, we investigate signal typing based on the visual impression of distinctive spectrograms. Pathological voices are classified into signal type 1, 2, 3, or 4, after which perturbation parameters are estimated and perceptual ratings are assigned using the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). The results suggest that perturbation analysis can be applied only to type 1 and 2 signals and that the perceptual ratings of overall grade increase with each signal type. Good inter-rater reliability is shown among the three raters. We recommend that pathological voices be labeled with both signal type and CAPE-V ratings to describe their characteristics clearly.

Detection of Pathological Voice Using Linear Discriminant Analysis

  • Lee, Ji-Yeoun;Jeong, Sang-Bae;Choi, Hong-Shik;Hahn, Min-Soo
    • MALSORI / no.64 / pp.77-88 / 2007
  • Mel-frequency cepstral coefficients (MFCCs) and Gaussian mixture models (GMMs) are now widely used for pathological voice detection. This paper proposes a method to improve pathological/normal voice classification based on the MFCC-based GMM. We analyze the characteristics of the mel-frequency filterbank energies using the Fisher discriminant ratio (FDR). Feature vectors are then obtained by applying a linear discriminant analysis (LDA) transformation to the filterbank energies (FBEs) and to the MFCCs, and accuracy is measured with the GMM classifier. The results show that the FBE LDA-based GMM is an effective method for pathological/normal voice classification, with a classification rate of 96.6%. The proposed method outperforms the MFCC-based GMM, with an error reduction of 54.05%.
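
Two quantities in this abstract lend themselves to a short worked sketch. The abstract gives neither formula, so both functions below rest on assumptions: the FDR is taken in its common two-class, single-feature form, (mean_a - mean_b)² / (var_a + var_b), and "error reduction" is read as a relative reduction, (e_base - e_new) / e_base. The function names are illustrative.

```python
from statistics import mean, pvariance
from typing import Sequence


def fisher_discriminant_ratio(class_a: Sequence[float],
                              class_b: Sequence[float]) -> float:
    """Assumed two-class FDR for one feature, e.g. one filterbank channel."""
    spread = pvariance(class_a) + pvariance(class_b)
    if spread == 0:
        raise ValueError("both classes have zero variance")
    return (mean(class_a) - mean(class_b)) ** 2 / spread


def baseline_error(new_error: float, relative_reduction: float) -> float:
    """Invert reduction = (e_base - e_new) / e_base to recover e_base."""
    if not 0.0 <= relative_reduction < 1.0:
        raise ValueError("relative_reduction must be in [0, 1)")
    return new_error / (1.0 - relative_reduction)
```

Under the relative-reduction reading, a 96.6% classification rate (3.4% error) with a 54.05% error reduction would correspond to a baseline error of about 7.4%, since `baseline_error(3.4, 0.5405)` evaluates to roughly 7.40. This figure is an inference from the two reported numbers, not a value stated in the abstract.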

Speech Rate Analysis of Dysarthric Patients with Parkinson's Disease and Multiple System Atrophy (파킨슨병과 다계통위축증 환자군 간의 말속도 비교평가)

  • Kim, Hyang-Hee;Lee, Mi-Sook;Kim, Sun-Woo;Lee, Won-Yong
    • Speech Sciences / v.10 no.4 / pp.221-227 / 2003
  • The diadochokinetic (DDK) speech task has been used for many years as an evaluation tool for speakers with dysarthria. This study attempted to differentially diagnose multiple system atrophy (MSA) from idiopathic Parkinson's disease (IPD) using patients' performance on the DDK task (i.e., alternate motion rate (AMR)). The subjects included 11 cases of pathologically confirmed MSA and 16 IPD patients, all of whom presented with parkinsonian syndrome. The speech sample of each patient was analyzed acoustically using the MSP™ (Motor Speech Profile, a module of CSL). The results showed that the average DDK rate was significantly faster in the IPD group than in the MSA group for all three syllables (i.e., /puh/, /tuh/, and /kuh/). We propose the average DDK rate as a core clinical trait for differentiating the two pathological conditions.
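
An average DDK rate is typically expressed in syllables per second. The sketch below assumes that syllable onset times are already available and that the rate is the number of syllable intervals divided by the elapsed time. How the Motor Speech Profile computes its rate is not described in the abstract, so the `ddk_rate` function and its input are hypothetical.

```python
from typing import Sequence


def ddk_rate(onsets_s: Sequence[float]) -> float:
    """Average DDK rate (syllables per second) from syllable onset times.

    onsets_s: hypothetical onset times in seconds, in ascending order.
    """
    if len(onsets_s) < 2:
        raise ValueError("need at least two syllable onsets")
    duration = onsets_s[-1] - onsets_s[0]
    if duration <= 0:
        raise ValueError("onsets must span a positive duration")
    # n onsets delimit n - 1 complete syllable intervals.
    return (len(onsets_s) - 1) / duration
```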
