• Title/Summary/Keyword: Acoustic characteristics of voice

Age classification of emergency callers based on behavioral speech utterance characteristics (발화행태 특징을 활용한 응급상황 신고자 연령분류)

  • Son, Guiyoung;Kwon, Soonil;Baik, Sungwook
    • The Journal of Korean Institute of Next Generation Computing
    • v.13 no.6 / pp.96-105 / 2017
  • In this paper, we investigate speaker age classification by analyzing voice calls to an emergency center. We classified callers as adult or elderly using behavioral speech utterance features and an SVM (Support Vector Machine) classifier. Through analysis of the emergency center call data, we selected two behavioral features: silent pause and turn-taking latency. These features were chosen as the age classification criteria on the basis of that analysis, and statistical analysis showed them to be significant (p < 0.05). We evaluated 200 samples (adult: 100, elderly: 100) with 5-fold cross-validation using the SVM classifier and achieved 70% accuracy with the two features combined, which is higher than the accuracy obtained with a single feature. These results suggest that behavioral speech utterance features offer a new method for age classification. In future work, we will combine acoustic information (MFCC) with the behavioral features on real voice data. This work should also contribute to the development of emergency situation judgment systems that rely on age classification.
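
The two behavioral features named in this abstract can be illustrated with a minimal Python sketch. The abstract does not define them precisely, so the definitions here are assumptions: a silent pause is taken as the gap between two consecutive caller segments, and turn-taking latency as the gap between the end of an operator segment and the start of the caller's reply. The `Segment` type, the speaker labels, and `behavioral_features` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    speaker: str  # hypothetical labels: "operator" or "caller"
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds


def _mean(values):
    """Arithmetic mean, or 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def behavioral_features(segments, caller="caller"):
    """Return (mean silent pause, mean turn-taking latency) for the caller.

    Assumed definitions (not taken from the paper):
      - silent pause: gap between two consecutive caller segments
      - turn-taking latency: gap between another speaker's segment
        and the caller segment that follows it
    Segments are expected in chronological order.
    """
    pauses, latencies = [], []
    for prev, cur in zip(segments, segments[1:]):
        if cur.speaker != caller:
            continue
        gap = max(0.0, cur.start - prev.end)  # overlapping speech counts as 0
        if prev.speaker == caller:
            pauses.append(gap)
        else:
            latencies.append(gap)
    return _mean(pauses), _mean(latencies)
```

The resulting pair of values per call could then serve as a two-dimensional feature vector for a classifier such as the SVM mentioned above.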

Effects of Continuous Speech Therapy in Patients with Non-fluent Aphasia Using kMIT (kMIT를 이용한 비유창성 실어증 환자 음성 언어의 치료효과 연구)

  • Lee Ju Hee;Ko Myun Hwan;Kim Hyun Gi;Hong Ki Hwan
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
    • v.16 no.2 / pp.158-164 / 2005
  • Melody intonation therapy (MIT) aims to improve the linguistic aspects of verbal utterance in aphasic patients by utilizing the intact right brain. Aphasic patients with good comprehension, poor fluency, and little available speech are thought to be ideal candidates. The purpose of this study was to investigate the effects of Korean Melody intonation therapy (kMIT) in patients with non-fluent aphasia. Five male patients with non-fluent aphasia participated in this study. Their mean age was 49.9 years. Each therapy session took 45-50 minutes, once a week for six months. The Aphasic Screen Test (RISS) was used to assess language parameters such as auditory comprehension, oral expression, reading, writing, and calculation ability before and after kMIT. Mean length of utterance, verbal intelligibility, and articulation disorder were also assessed. Computerized Speech Lab was used to assess the acoustic characteristics of the patients before and after kMIT. The results are as follows: 1) Auditory comprehension, oral expression, reading, writing, and calculation ability of the subjects increased after kMIT. However, only oral expression showed a significant difference (p<0.05). 2) Mean length of utterance of the five patients generally increased after kMIT. 3) After kMIT, verbal intelligibility increased and showed a significant difference (p<0.05). 4) Misarticulation rate generally decreased after kMIT. 5) Voice onset time of the alveolar lenis /t/ and velar lenis /k/ gradually decreased after kMIT. 6) Intonation patterns in yes/no questions, however, gradually increased after kMIT.

COMPARISON OF SPEECH PATTERNS ACCORDING TO THE DEGREE OF SURGICAL SETBACK IN MANDIBULAR PROGNATHIC PATIENTS (하악골 전돌증 수술 후 하악골 이동량에 따른 발음 양상에 관한 비교 연구)

  • Shin, Ki-Young;Lee, Dong-Keun;Oh, Seung-Hwan;Sung, Hun-Mo;Lee, Suk-Hang
    • Maxillofacial Plastic and Reconstructive Surgery
    • v.23 no.1 / pp.48-58 / 2001
  • After performing mandibular setback surgery, we found some changes in the patterns and organs of speech. This investigation was undertaken to examine the nature and degree of changes in speech patterns according to the amount of surgical setback in mandibular prognathic patients. Thirteen patients with skeletal Class III malocclusion were studied before surgery and over 6 months after surgery. They had undergone mandibular setback via bilateral sagittal split ramus osteotomy (BSSRO). We split the patients into two groups: Group 1 included patients whose mandibular setback was 6 mm or less, and Group 2 those whose setback was above 6 mm. The control group consisted of two adults with normal speech patterns. A phonetician performed narrow phonetic transcriptions of tape-recorded words and sentences produced by each patient, and the acoustic characteristics of the plosives, fricatives, and flaps were analyzed with a phonetic computer program (Computerized Speech Lab (CSL) Model 4300B (USA)). The results are as follows: 1. Generally, patients showed longer closure duration of plosives, shorter VOT (voice onset time), and a higher ratio of closure duration to VOT. 2. Patients showed a diffuse distribution of frication noise energy in fricatives more frequently than the control group. 3. In fricatives, the compact form was more frequent in Group 1 than in Group 2. 4. Generally, the short closure duration for /ㄹ/ was not realized in the patients' flaps. Instead, the flap was realized as a fricative, a sonorant with a vowel-like formant structure, or a trill-type consonant. 5. Abnormality of the patients' articulation was reduced, but adaptation of their articulation after surgery was not perfect, and the degree of adaptation differed according to the degree of surgical setback.

Acoustic differences according to the epileptic focus in benign partial epilepsy with centrotemporal spikes patients (양성 부분 간질 환아에서 간질 발생 위치에 따른 음성언어 분석)

  • Kim, Jung Tae;Choi, Sang Hoon;Kim, Sun Jun
    • Clinical and Experimental Pediatrics
    • v.50 no.9 / pp.896-900 / 2007
  • Purpose: The aim of this study was to investigate speech problems in benign rolandic epilepsy (BRE) according to the seizure focus on EEG and semiology. Methods: Twenty-three patients [right origin (13 patients) or left side (10 patients)] who met the BRE criteria of the International League Against Epilepsy (ILAE) were prospectively enrolled. We excluded patients who had an abnormal MRI or showed spikes on both sides in EEG. Computerized Speech Lab was used to assess the speech characteristics of the patients. Results: The error pattern of laryngeal articulation in BRE was exclusively substitution of stop consonants, and these errors were more frequent in the left group (16.0% vs 25.5%). Voice onset time (VOT) of stop consonants and total duration (TD) of words were prolonged in both groups compared with the normal control group, especially in the left group (P<0.05). The first formant of the vowel /o/ and the second formant of /e/ were significantly decreased in the left group (P<0.05). The right group scored wider on pitch range ($192.9 \pm 54.0$ Hz) and energy range in spontaneous speech ($14.2 \pm 6.4$ dB) than the left group ($233.3 \pm 12.5$ Hz and $19.4 \pm 9.3$ dB, respectively, P>0.05). Counting from 5 to 9 took longer in the left group than in the right group ($8.6 \pm 1.7$ vs $7.9 \pm 1.8$ s). Conclusion: Our data suggest that interictal spikes and seizures on either centrotemporal side, especially the left side, may induce speech problems. We recommend logopedic and phoniatric evaluation of speech in BRE patients.

Automatic detection and severity prediction of chronic kidney disease using machine learning classifiers (머신러닝 분류기를 사용한 만성콩팥병 자동 진단 및 중증도 예측 연구)

  • Jihyun Mun;Sunhee Kim;Myeong Ju Kim;Jiwon Ryu;Sejoong Kim;Minhwa Chung
    • Phonetics and Speech Sciences
    • v.14 no.4 / pp.45-56 / 2022
  • This paper proposes an optimal methodology for automatically diagnosing chronic kidney disease (CKD) and predicting its severity from patients' utterances. In patients with CKD, the voice changes due to weakening of the respiratory and laryngeal muscles and vocal fold edema. Previous studies have phonetically analyzed the voices of patients with CKD, but none has attempted to classify them. In this paper, the utterances of patients with CKD were classified using a variety of utterance types (sustained vowel, sentence, general sentence), feature sets [handcrafted features, extended Geneva Minimalistic Acoustic Parameter Set (eGeMAPS), CNN-extracted features], and classifiers (SVM, XGBoost). A total of 1,523 utterances, amounting to 3 hours, 26 minutes, and 25 seconds, were used. F1-scores of 0.93 for automatic diagnosis of the disease, 0.89 for a 3-class problem, and 0.84 for a 5-class problem were achieved. The highest performance was obtained with the combination of general sentence utterances, the handcrafted feature set, and XGBoost. The results suggest that a general sentence utterance, which can reflect all of a speaker's speech characteristics, and an appropriate feature set extracted from it are adequate for the automatic classification of CKD patients' utterances.
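
For readers unfamiliar with the metric reported above, the following is a minimal Python sketch of an F1-score for a multi-class problem. The abstract does not say which averaging scheme was used, so macro-averaging (the unweighted mean of per-class F1) is an assumption here, and `macro_f1` is a hypothetical helper name.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores.

    Per class, F1 = 2*TP / (2*TP + FP + FN). A class with no true
    positives, false positives, or false negatives contributes 0.0.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

With severity grades encoded as integer labels, the same function would apply unchanged to a 3-class or a 5-class setup.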

DFT-spread OFDM Communication System for the Power Efficiency and Nonlinear Distortion in Underwater Communication (수중통신에서 비선형 왜곡과 전력효율을 위한 DFT-spread OFDM 통신 시스템)

  • Lee, Woo-Min;Ryn, Heung-Gyoon
    • The Journal of Korean Institute of Communications and Information Sciences
    • v.35 no.8A / pp.777-784 / 2010
  • Recently, the need for underwater communication and the demand for transmitting and receiving various data, such as voice or high-resolution images, have been increasing. The performance of an underwater acoustic communication system is influenced by the characteristics of the underwater channel. In particular, multipath delay spread causes ISI (inter-symbol interference), which degrades communication performance. In this paper, we study the OFDM technique to overcome delay spread in the underwater channel, using a CP (cyclic prefix) to compensate for it. However, OFDM suffers from a very high PAPR (peak-to-average power ratio). We therefore use the DFT-spread OFDM method to avoid the nonlinear distortion caused by high PAPR and to improve amplifier efficiency. DFT-spread OFDM achieves a large PAPR reduction because DFT spreading before the IFFT loads each parallel data symbol onto all subcarriers. We show the performance of the OFDM system under delay spread and verify that DFT-spread OFDM is more suitable than OFDM for underwater communication. We also analyze performance for two subcarrier mapping methods (interleaved and localized). In the simulation results, DFT-spread OFDM is about 5 to 6 dB better than OFDM at $10^{-4}$. In terms of BER by subcarrier mapping, the interleaved method is about 3.5 dB better than the localized method at $10^{-4}$.
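
The PAPR argument in this abstract can be illustrated with a small Python sketch that compares plain OFDM with DFT-spread OFDM under localized and interleaved subcarrier mapping. This is not the authors' simulation: it has no underwater channel, no cyclic prefix, and no BER measurement. The QPSK modulation, the sizes (16 data symbols on a 64-point IFFT), the seed, and all function names are assumptions chosen for illustration.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) DFT or inverse DFT; adequate for these small sizes."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def papr_db(samples):
    """Peak-to-average power ratio of a block of samples, in dB."""
    powers = [abs(s) ** 2 for s in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def map_subcarriers(freq, n_total, mode):
    """Place len(freq) values on an n_total-point grid.

    "localized": adjacent subcarriers starting at index 0.
    "interleaved": equally spaced across the whole band.
    """
    grid = [0j] * n_total
    step = n_total // len(freq) if mode == "interleaved" else 1
    for i, value in enumerate(freq):
        grid[i * step] = value
    return grid


def mean_papr(n_symbols=40, m=16, n_total=64, seed=0):
    """Mean PAPR in dB over random QPSK blocks for the three schemes."""
    rng = random.Random(seed)
    qpsk = [complex(a, b) / math.sqrt(2) for a in (1, -1) for b in (1, -1)]
    totals = {"ofdm": 0.0, "localized": 0.0, "interleaved": 0.0}
    for _ in range(n_symbols):
        data = [rng.choice(qpsk) for _ in range(m)]
        # Plain OFDM: data symbols go straight onto the subcarriers.
        ofdm_grid = map_subcarriers(data, n_total, "localized")
        totals["ofdm"] += papr_db(dft(ofdm_grid, inverse=True))
        # DFT-spread OFDM: an m-point DFT precedes subcarrier mapping.
        spread = dft(data)
        for mode in ("localized", "interleaved"):
            grid = map_subcarriers(spread, n_total, mode)
            totals[mode] += papr_db(dft(grid, inverse=True))
    return {name: total / n_symbols for name, total in totals.items()}
```

Calling `mean_papr()` returns a dictionary of mean PAPR values in dB. With interleaved mapping and no pulse shaping, the time-domain signal is a scaled repetition of the constant-modulus QPSK symbols, so its PAPR is essentially 0 dB, while plain OFDM is expected to show the highest value of the three.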