• Title/Summary/Keyword: Connected speech

Acoustic Features of Phonatory Offset-Onset in the Connected Speech between a Female Stutterer and Non-Stutterers (연속구어 내 발성 종결-개시의 음향학적 특징 - 말더듬 화자와 비말더듬 화자 비교 -)

  • Han, Ji-Yeon;Lee, Ok-Bun
    • Speech Sciences / v.13 no.2 / pp.19-33 / 2006
  • The purpose of this paper was to examine the acoustic characteristics of the phonatory offset-onset mechanism in the connected speech of female adults with stuttering and with normal nonfluency. The phonatory offset-onset mechanism refers to the laryngeal articulatory gestures required to mark word boundaries in the phonetic contexts of connected speech. This mechanism comprised 7 patterns identified from speech spectrograms. The study compared the acoustic features of connected speech produced by female adults with stuttering (n=1) and with normal nonfluency (n=3). Speech tokens in V_V, V_H, and V_S contexts were selected for analysis. Speech samples were recorded with Sound Forge, and the spectrographic analysis was conducted using Praat. Results revealed that the female speaker who stutters (block type) exhibited more laryngealization gestures in the V_V context. In both the V_H and V_S contexts, the laryngealization gesture was characterized more by a complete glottal stop or glottal fry. The results were discussed from theoretical and clinical perspectives.

Analysis of Error Patterns in Korean Connected Digit Telephone Speech Recognition (한국어 연속 숫자음 전화 음성 인식에서의 오인식 유형 분석)

  • Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung;Kim Sang Hun
    • MALSORI / no.46 / pp.77-86 / 2003
  • Channel distortion and coarticulation effects in Korean connected digit telephone speech make it difficult to achieve high connected digit recognition performance in the telephone environment. In this paper, as basic research toward improving the recognition performance of Korean connected digit telephone speech, recognition error patterns are investigated and analyzed. The Korean connected digit telephone speech database released by SiTEC and the HTK system are used for the recognition experiments. The DWFBA and MRTCN methods are used for feature extraction and channel compensation, respectively. Experimental results are discussed with our findings.

Visual Presentation of Connected Speech Test (CST)

  • Jeong, Ok-Ran;Lee, Sang-Heun;Cho, Tae-Hwan
    • Speech Sciences / v.3 / pp.26-37 / 1998
  • The Connected Speech Test (CST) was developed to test hearing aid performance using realistic stimuli (connected speech) presented in a background of noise with a visible speaker. The CST has not been investigated as a measure of speech reading ability using only its visual portion. Thirty subjects were administered the 48 test lists of the CST in visual-only presentation mode. Statistically significant differences were found between the 48 test lists and between the 12 passages of the CST (48 passages divided into 12 groups of 4 lists, which were averaged). No significant differences were found between male and female subjects; however, in all but one case, females scored better than males. No significant differences were found between students in communication disorders and students in other departments. Intra- and inter-subject variability across test lists and passages was high. Suggestions for further research include making the scoring of the CST more contextually based and changing the speaker for the CST.

The Optimal and Complete Prompts Lists for Connected Spoken Digit Speech Corpus (연결 숫자음 인식기 학습용 음성DB 녹음을 위한 최적의 대본 작성)

  • Yu Ha-Jin
    • Proceedings of the KSPS conference / 2003.05a / pp.131-134 / 2003
  • This paper describes an efficient algorithm to generate compact and complete prompts lists for a connected spoken digits database. In building a connected spoken digit recognizer, we have to acquire speech data in various contexts. However, in many speech databases the lists are made with random generators. We provide an efficient algorithm that can generate compact and complete lists of digits in various contexts. This paper includes a proof of the optimality and completeness of the algorithm.

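The abstract does not spell out the algorithm, so the following is only an illustrative sketch of what a compact and complete prompt list can look like, not the author's method. It assumes that "context" means the immediately preceding digit, and it swaps in a standard de Bruijn sequence, which contains every ordered digit pair exactly once. The function names and the prompt length of 5 are hypothetical.

```python
def de_bruijn(k, n):
    """Return a cyclic de Bruijn sequence over the alphabet 0..k-1 for order n."""
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def digit_prompts(prompt_len=5):
    """Cut the sequence into prompts that overlap by one digit, so no pair is lost."""
    cyclic = de_bruijn(10, 2)      # 100 digits, every ordered pair once (cyclically)
    linear = cyclic + cyclic[:1]   # 101 digits, every ordered pair once (linearly)
    step = prompt_len - 1          # assumed overlap of one digit between prompts
    chunks = [linear[i:i + prompt_len] for i in range(0, len(linear) - 1, step)]
    return ["".join(map(str, chunk)) for chunk in chunks]


prompts = digit_prompts()
pairs = {p[i:i + 2] for p in prompts for i in range(len(p) - 1)}
print(len(prompts), len(pairs))  # 25 prompts covering all 100 ordered digit pairs
```

Under these assumptions the list is complete (all 100 digit pairs occur) and compact (no pair is recorded twice), which a randomly generated list of the same size would not guarantee.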

The Optimal and Complete Prompts Lists Generation Algorithm for Connected Spoken Word Speech Corpus (연결 단어 음성 인식기 학습용 음성DB 녹음을 위한 최적의 대본 작성 알고리즘)

  • Yu Ha-Jin
    • The Journal of the Acoustical Society of Korea / v.23 no.2 / pp.187-191 / 2004
  • This paper describes an efficient algorithm to generate compact and complete prompts lists for a connected spoken word speech corpus. In building a connected spoken digit recognizer, we have to acquire speech data in various contexts. However, in many speech databases the lists are made with random generators. We provide an efficient algorithm that can generate compact and complete lists of digits in various contexts. This paper includes a proof of the optimality and completeness of the algorithm.

A Study of Acoustic Measurement in Connected Speech with Dysphonia (음성장애 연속구어의 음향학적 분석)

  • Lee, Myoung-Soon
    • Phonetics and Speech Sciences / v.3 no.4 / pp.109-115 / 2011
  • The purposes of this study were to identify acoustic parameters of connected speech and to contribute to the acoustic analysis of dysphonic voice in patients' natural speech as well as in sustained vowel phonation. Acoustic parameters of the sentences included LTAS (long-term average spectrum) mean and spectral slope over the frequency ranges 0-4 kHz, 0-6 kHz, 0-8 kHz, and 0-12.5 kHz, as well as HNR. Acoustic parameters of the vowel 'a' included jitter, RAP, shimmer, NHR, and HNR. Based on the 'G' of GRBAS for the severity of dysphonia, two experienced raters classified the speakers into four groups: control, mild, moderate, and severe dysphonia. The connected speech consisted of two sentences extracted from the 'stroll' passage. Parameters of the vowel and the LTAS mean of the sentences were measured with CSL. The spectral slope of the sentences and the HNR of the vowel and the sentences were measured with Praat. Data were statistically analyzed with the Spearman correlation and the Kruskal-Wallis test using SPSS 12.0. The results of this study are as follows. First, jitter, RAP, shimmer, and NHR were significantly different between the groups. Second, for several frequency ranges, the LTAS mean and spectral slope of the sentences were significantly different between the groups. Third, the HNR of the sentences was significantly different between the groups. Fourth, correlations were found between the HNR and NHR of the vowel and the HNR of the sentences. Accordingly, this study concluded that LTAS, spectral slope, and HNR were predictive parameters of dysphonic voice in connected speech.

Voice Quality of Dysarthric Speakers in Connected Speech (연결발화에서 마비말화자의 음질 특성)

  • Seo, Inhyo;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.5 no.4 / pp.33-41 / 2013
  • This study investigated the perceptual and cepstral/spectral characteristics of phonation and their relationships in dysarthria in connected speech. Twenty-two participants were divided into two groups; the eleven dysarthric speakers were paired with age- and gender-matched healthy control participants. A perceptual evaluation was performed by three speech pathologists using the GRBAS scale, and the cepstral/spectral characteristics of phonation were compared between the two groups' connected speech. Results showed that dysarthric speakers were rated significantly worse (higher ratings) in G (overall dysphonia grade), B (breathiness), and S (strain), while the smoothed cepstral peak prominence (CPPs) was significantly lower. The CPPs were significantly correlated with the perceptual ratings, including G, B, and S. The utility of CPPs is supported by its strong relationship with perceptually rated dysphonia severity in dysarthric speakers. The receiver operating characteristic (ROC) analysis showed that a threshold of 5.08 dB for the CPPs achieved good classification of dysarthria, with 63.6% sensitivity and perfect specificity (100%). These results indicate that the CPPs reliably distinguished between healthy controls and dysarthric speakers. However, the CPP frequency (CPP F0) and the low-high spectral ratio (L/H ratio) were not significantly different between the two groups.
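The reported sensitivity and specificity can be read as simple ratios, sketched below. This is a minimal illustration, not the authors' analysis: it assumes that all eleven dysarthric speakers and all eleven controls entered the ROC analysis and that a CPPs value below the 5.08 dB threshold counts as a positive (dysarthric) decision. The function names are hypothetical, and the counts are inferred from the percentages rather than stated in the abstract.

```python
def is_flagged(cpps_db, threshold_db=5.08):
    """Assumed decision rule: a CPPs value below the threshold is flagged as dysarthric."""
    return cpps_db < threshold_db


def sensitivity(true_pos, false_neg):
    """Share of dysarthric speakers that the threshold flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of healthy controls that the threshold leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Inferred counts (an assumption): 7 of 11 dysarthric speakers flagged, 0 of 11 controls flagged.
print(round(100 * sensitivity(7, 4), 1))   # 63.6
print(round(100 * specificity(11, 0), 1))  # 100.0
```

Read this way, the threshold never misclassifies a control but misses roughly a third of the dysarthric speakers.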

Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone speech (채널보상기법을 사용한 전화 음성 연속숫자음의 인식 성능향상)

  • Kim Min Sung;Jung Sung Yun;Son Jong Mok;Bae Keun Sung
    • MALSORI / no.44 / pp.73-82 / 2002
  • Channel distortion degrades the performance of a speech recognizer in the telephone environment. It mainly results from the bandwidth limitation and variation of the transmission channel. Variation in channel characteristics usually appears as a baseline shift in the cepstrum domain, so the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance of Korean connected digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN) and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, by adding variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). The experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to an improvement of 1.72 percentage points over MFCC alone, i.e., an error reduction rate of 14.82%.

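The mean-subtraction idea in this abstract can be sketched in a few lines. This is a minimal utterance-level illustration, not the paper's implementation: the abstract says only that MCMN adds variance normalization to CMN, so the exact form used here (dividing by the per-coefficient standard deviation) is an assumption, and the real-time RTCN variants are not covered. All function names are hypothetical. The last function reproduces the arithmetic behind the reported error reduction rate.

```python
from statistics import fmean, pstdev


def cepstral_mean_normalize(frames):
    """CMN: subtract each coefficient's mean over all frames of the utterance."""
    n_coeffs = len(frames[0])
    means = [fmean(f[k] for f in frames) for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def mean_variance_normalize(frames, eps=1e-8):
    """Assumed MCMN-style step: CMN followed by division by the standard deviation."""
    centered = cepstral_mean_normalize(frames)
    n_coeffs = len(frames[0])
    stds = [pstdev(f[k] for f in centered) for k in range(n_coeffs)]
    return [[f[k] / (stds[k] + eps) for k in range(n_coeffs)] for f in centered]


def error_reduction_rate(baseline_acc, improved_acc):
    """Relative reduction of the error rate, in percent (accuracies given in percent)."""
    return 100.0 * (improved_acc - baseline_acc) / (100.0 - baseline_acc)


# A constant channel offset per coefficient (the "baseline shift") vanishes after CMN.
clean = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
shifted = [[c0 + 0.7, c1 - 1.3] for c0, c1 in clean]
same = all(
    abs(x - y) < 1e-9
    for a, b in zip(cepstral_mean_normalize(clean), cepstral_mean_normalize(shifted))
    for x, y in zip(a, b)
)
print(same)  # True

# 90.11% with MRTCN versus 90.11 - 1.72 = 88.39% with MFCC alone.
print(round(error_reduction_rate(88.39, 90.11), 1))  # 14.8
```

The second printout agrees with the reported 14.82% up to rounding, which supports reading the 1.72% figure as an absolute gain in recognition rate.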

A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database (훈련음성 데이터에 적응시킨 필터뱅크 기반의 MFCC 특징파라미터를 이용한 전화음성 연속숫자음의 인식성능 향상에 관한 연구)

  • Jung Sung Yun;Kim Min Sung;Son Jong Mok;Bae Keun Sung;Kang Jeom Ja
    • Proceedings of the KSPS conference / 2003.05a / pp.119-122 / 2003
  • In general, triangular filters are used in the filter bank when the MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses specific filter shapes in the filter bank, obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to get the filter coefficients. In this paper, we carry out speech recognition experiments using the new approach given in [1] on a large amount of telephone speech data, namely the telephone speech database of Korean connected digits released by SITEC. Experimental results are discussed with our findings.

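The abstract gives no details of the method in [1], so the sketch below shows only one plausible reading: for a single filter-bank band, the filter weights are taken to be the first principal component of the training spectra restricted to that band, replacing the usual triangular shape. The function names, the use of power iteration, and the sign convention are all assumptions made for illustration.

```python
import math


def leading_principal_component(rows, iterations=200):
    """Unit-norm first principal component of `rows`, found by power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d) of the band spectra.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Uniform start vector; this fails only if it is orthogonal to the component.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Assumed sign convention: keep the weights mostly positive, like a usual filter.
    if sum(v) < 0:
        v = [-x for x in v]
    return v


def band_output(band_spectrum, weights):
    """Filter-bank output for one band: weighted sum of its spectral bins."""
    return sum(w * s for w, s in zip(weights, band_spectrum))


# Toy training spectra for a 3-bin band whose energy varies along the shape (1, 2, 1).
training = [[t, 2.0 * t, t] for t in (0.0, 1.0, 2.0, 3.0)]
weights = leading_principal_component(training)
print([round(w, 3) for w in weights])  # [0.408, 0.816, 0.408]
```

In this toy case the learned weights recover the shape (1, 2, 1), scaled to unit norm, which is the direction along which the training spectra vary.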

Boundary Tones of Intonational Phrase-Final Morphemes in Dialogues (대화체 억양구말 형태소의 경계성조 연구)

  • Han, Sun-Hee
    • Speech Sciences / v.7 no.4 / pp.219-234 / 2000
  • The study of boundary tones in connected speech or dialogues is one of the most underdeveloped areas of Korean prosody. This paper concerns the boundary tones of intonational phrase-final morphemes which are shown in the speech corpus of dialogues. Results of phonetic analysis show that different kinds of boundary tones are realized, depending on the positions of the intonational phrase-final morphemes in the sentences. This study has also shown that boundary tone patterning is somewhat related to the sentence structure, and for better speech recognition and speech synthesis, it presents a simple model of boundary tones based on the fundamental frequency contour. The results of this study will contribute to our understanding of the prosodic pattern of Korean connected speech or dialogues.
