• Title/Summary/Keyword: Pauses

Korean prosodic properties between read and spontaneous speech (한국어 낭독과 자유 발화의 운율적 특성)

  • Yu, Seungmi;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.14 no.2 / pp.39-54 / 2022
  • This study aims to clarify the prosodic differences between speech types by examining Korean read speech and spontaneous speech in the Korean part of the L2 Korean Speech Corpus (a speech corpus for Korean as a foreign language). To this end, articulation length, articulation speed, pause length and frequency, and the average fundamental frequency of sentences were set as variables and analyzed statistically (t-test, correlation analysis, and regression analysis). The results showed that read speech and spontaneous speech differed structurally in the form of the prosodic phrases constituting each sentence, and that the prosodic elements differentiating the speech types were articulation length, pause length, and pause frequency. The correlation between articulation speed and articulation length was highest in read speech, indicating that the longer a given sentence is, the faster the speaker speaks. In spontaneous speech, however, the correlation between articulation length and pause frequency within a sentence was high. Overall, spontaneous speech produces more pauses because short intonation phrases are continuously strung together to make a sentence, and as a result, the sentence is lengthened.
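The variables named in this abstract can be illustrated with a minimal Python sketch. It assumes a hypothetical annotation format in which a sentence is a list of time-stamped intervals, each carrying a syllable count (zero for a pause), and it assumes articulation speed means syllables per second of articulated speech. Neither assumption comes from the paper, so the definitions may differ from those the authors used.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One labeled stretch of a sentence (hypothetical annotation format)."""
    start: float    # seconds
    end: float      # seconds
    syllables: int  # 0 marks a silent pause


def prosodic_measures(intervals):
    """Return per-sentence articulation and pause measures."""
    speech = [iv for iv in intervals if iv.syllables > 0]
    pauses = [iv for iv in intervals if iv.syllables == 0]

    articulation_length = sum(iv.end - iv.start for iv in speech)
    pause_length = sum(iv.end - iv.start for iv in pauses)
    syllables = sum(iv.syllables for iv in speech)

    # Assumed definition: syllables per second, pauses excluded.
    speed = syllables / articulation_length if articulation_length else 0.0

    return {
        "articulation_length": articulation_length,
        "articulation_speed": speed,
        "pause_length": pause_length,
        "pause_frequency": len(pauses),
    }
```

Calling `prosodic_measures` on each sentence of a corpus would yield one row of variables per sentence, which could then be passed to a t-test or correlation routine.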

Comparison of Speech Rate and Long-Term Average Speech Spectrum between Korean Clear Speech and Conversational Speech

  • Yoo, Jeeun;Oh, Hongyeop;Jeong, Seungyeop;Jin, In-Ki
    • Korean Journal of Audiology / v.23 no.4 / pp.187-192 / 2019
  • Background and Objectives: Clear speech is an effective communication strategy used in difficult listening situations that draws on techniques such as accurate articulation, a slow speech rate, and the inclusion of pauses. Although excessively slow speech and improperly amplified spectral information can degrade overall speech intelligibility, certain amplitude increments of the mid-frequency bands (1 to 3 dB) and speech rates around 50% slower than those of conversational speech have been reported as factors that can improve speech intelligibility. The purpose of this study was to identify whether amplitude increments in the mid-frequency bands and slower speech rates were evident in Korean clear speech as they were in English clear speech. Subjects and Methods: To compare the acoustic characteristics of the two methods of speech production, the voices of 60 participants were recorded during conversational speech and then again during clear speech using standardized sentence material. Results: The speech rate and long-term average speech spectrum (LTASS) were analyzed and compared. Speech rates for clear speech were slower than those for conversational speech. Increased amplitudes in the mid-frequency bands were evident in the LTASS of clear speech. Conclusions: The observed differences in the acoustic characteristics between the two types of speech production suggest that Korean clear speech can be an effective communication strategy to improve speech intelligibility.

Dialect classification based on the speed and the pause of speech utterances (발화 속도와 휴지 구간 길이를 사용한 방언 분류)

  • Jonghwan Na;Bowon Lee
    • Phonetics and Speech Sciences / v.15 no.2 / pp.43-51 / 2023
  • In this paper, we propose an approach for dialect classification based on the speed and pause of speech utterances as well as the age and gender of the speakers. Dialect classification is one of the important techniques for speech analysis. For example, an accurate dialect classification model can potentially improve the performance of speaker or speech recognition. According to previous studies, research based on deep learning using Mel-Frequency Cepstral Coefficients (MFCC) features has been the dominant approach. We focus on the acoustic differences between regions and conduct dialect classification based on the extracted features derived from the differences. In this paper, we propose an approach of extracting underexplored additional features, namely the speed and the pauses of speech utterances along with the metadata including the age and the gender of the speakers. Experimental results show that our proposed approach results in higher accuracy, especially with the speech rate feature, compared to the method only using the MFCC features. The accuracy improved from 91.02% to 97.02% compared to the previous method that only used MFCC features, by incorporating all the proposed features in this paper.

Comparison of Novel Telemonitoring System Using the Single-lead Electrocardiogram Patch With Conventional Telemetry System

  • Soonil Kwon;Eue-Keun Choi;So-Ryoung Lee;Seil Oh;Hee-Seok Song;Young-Shin Lee;Sang-Jin Han;Hong Euy Lim
    • Korean Circulation Journal / v.54 no.3 / pp.140-153 / 2024
  • Background and Objectives: Although a single-lead electrocardiogram (ECG) patch may provide advantages for detecting arrhythmias in outpatient settings owing to user convenience, its comparative effectiveness for real-time telemonitoring in inpatient settings remains unclear. We aimed to compare a novel telemonitoring system using a single-lead ECG patch with a conventional telemonitoring system in an inpatient setting. Methods: This was a single-center, prospective cohort study. Patients admitted to the cardiology unit for arrhythmia treatment who required a wireless ECG telemonitoring system were enrolled. A single-lead ECG patch and conventional telemetry were applied simultaneously in hospitalized patients for over 24 hours for real-time telemonitoring. The basic ECG parameters, arrhythmia episodes, and signal loss or noise were compared between the 2 systems. Results: Eighty participants (mean age 62±10 years, 76.3% male) were enrolled. The three most common indications for ECG telemonitoring were atrial fibrillation (66.3%), sick sinus syndrome (12.5%), and atrioventricular block (10.0%). The intra-class correlation coefficients for detecting the number of total beats, atrial and ventricular premature complexes, maximal, average, and minimal heart rates, and pauses were all over 0.9 with p values for reliability <0.001. Compared to the conventional system, the novel system demonstrated significantly lower signal noise (median 0.3% [0.1-1.6%] vs. 2.4% [1.4-3.7%], p<0.001) and fewer episodes of signal loss (median 22 [2-53] vs. 64 [22-112] episodes, p=0.002). Conclusions: The novel telemonitoring system using a single-lead ECG patch offers performance comparable to that of a conventional system while significantly reducing signal loss and noise.

The Effects of Slide-Covered Contemporary Charging Automated External Defibrillator on Rapidity and Convenience of Defibrillation (슬라이드 커버 동시충전 자동심장충격기가 제세동의 신속성과 편의성에 미치는 효과)

  • Park, Si-Eun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.8 / pp.325-333 / 2020
  • This study compares the rapidity and subjective convenience of T-AED and SC-AED for health care providers and the general public. Subjects were randomly allocated to T-AED (n=77) and SC-AED (n=79) groups. Each group conducted defibrillation, and the rapidity of defibrillation was then measured as peri-shock pause, pre-shock pause, hesitation pause, and post-shock pause. Defibrillation and chest compression delay times for both devices were analyzed by t-test. At the end of the experiment, subjects answered a questionnaire on the subjective convenience of defibrillation, measured as confidence, convenience, and clear decision. Comparison of rapidity by t-test revealed significantly shortened peri-shock pause (11.22s), pre-shock pause (11.04s), and hesitation pause (2.15s) in the SC-AED group, as compared to the T-AED group (p<0.001). However, no significant differences were observed for post-shock pause values. For subjective convenience, the difference in confidence (T-AED: 7.62±1.25VAS vs. SC-AED: 7.80±0.98VAS, p=0.358) was not significant, whereas convenience (T-AED: 7.05±1.36VAS vs. SC-AED: 8.95±0.89VAS, p<0.001) and clear decision (T-AED: 6.58±1.73VAS vs. SC-AED: 9.08±0.98VAS, p=0.001) showed statistically significant differences. Our results indicate that, compared to T-AED, SC-AED significantly shortens pauses. Moreover, it is more convenient for the user and significantly aids clear decisions.

Analysis of Korean Spontaneous Speech Characteristics for Spoken Dialogue Recognition (대화체 연속음성 인식을 위한 한국어 대화음성 특성 분석)

  • 박영희;정민화
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.330-338 / 2002
  • Spontaneous speech is ungrammatical and exhibits serious phonological variations, which make recognition extremely difficult compared with read speech. In this paper, for conversational speech recognition, we analyze transcriptions of real conversational speech and then classify the characteristics of conversational speech from the speech recognition perspective. Reflecting these features, we obtain a baseline system for conversational speech recognition. The classification consists of long durations of silence, disfluencies, and phonological variations; each of them is further classified by similar features. To deal with these characteristics, first, we update the silence model and add a filled pause model and a garbage model; second, we add multiple phonetic transcriptions to the lexicon for the most frequent phonological variations. In our experiments, our baseline morpheme error rate (MER) is 31.65%; we obtain MER reductions of 2.08% for the silence and garbage models, 0.73% for the filled pause model, and 0.73% for phonological variations. Finally, we obtain 27.92% MER for conversational speech recognition, which will be used as a baseline for further study.
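An error rate of the kind reported in this abstract is commonly computed as the Levenshtein (edit) distance between reference and recognized token sequences, divided by the reference length. The Python sketch below applies that common definition to morpheme sequences. It is an assumption that the paper scores MER this way, and the function names are illustrative only.

```python
def edit_distance(ref, hyp):
    """Minimum number of substitutions, deletions, and insertions
    needed to turn the sequence `ref` into the sequence `hyp`."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, 1):
        cur = [i]  # deleting the first i reference tokens
        for j, h in enumerate(hyp, 1):
            cur.append(min(
                prev[j] + 1,             # deletion
                cur[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = cur
    return prev[-1]


def morpheme_error_rate(ref, hyp):
    """Error rate in percent over morpheme tokens (assumed definition)."""
    if not ref:
        raise ValueError("reference must contain at least one morpheme")
    return 100.0 * edit_distance(ref, hyp) / len(ref)
```

For example, a four-morpheme reference recognized with one substitution and one deletion would score 50% under this definition.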

Relationships between rhythm and fluency indices and listeners' ratings of Korean speakers' English paragraph reading (리듬 및 유창성 지수와 한국 화자의 영어 읽기 발화 청취 평가의 관련성)

  • Hyunsong Chung
    • Phonetics and Speech Sciences / v.14 no.4 / pp.25-33 / 2022
  • This study investigates the relationships between rhythm and fluency indices and listeners' ratings of the rhythm and fluency of Korean college students' English paragraph reading. Seventeen university students read and recorded a passage from "The North Wind and the Sun" twice, before and after three months of English pronunciation instruction. Seven in-service and pre-service English teachers in graduate school assessed the rhythm and fluency of the utterances. In addition, the values of 14 indices of rhythm and fluency were extracted from each recording, and the relationships between the indices and the listeners' ratings were analyzed. The rhythm indices of the speakers in this study did not differ significantly from those of native English speakers presented in previous studies in %V, VarcoV, and nPVIV, but were higher in ΔV, ΔC, and VarcoC and lower in speech rate. The level of rhythm and fluency demonstrated by Korean college students was comparable, at least in terms of objective values for certain indices. The fluency indices, such as percentage of pauses, articulation rate, and speech rate, contributed significantly more to predicting both rhythm and fluency ratings than the rhythm indices.
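The rhythm indices named in this abstract have widely used definitions, sketched below in Python for lists of vocalic and consonantal interval durations in seconds. The sketch uses the population standard deviation. Whether the study used population or sample deviation, and how it segmented the intervals, is not stated in the abstract, so treat those choices as assumptions.

```python
from statistics import mean, pstdev


def percent_v(vocalic, consonantal):
    """%V: share of total duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(durations):
    """ΔV or ΔC: standard deviation of interval durations
    (population form; an assumption, see the note above)."""
    return pstdev(durations)


def varco(durations):
    """VarcoV or VarcoC: standard deviation normalized by the mean."""
    return 100.0 * pstdev(durations) / mean(durations)


def npvi(durations):
    """Normalized pairwise variability index over successive intervals."""
    ratios = [
        abs(a - b) / ((a + b) / 2)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * mean(ratios)
```

Passing vocalic durations to `delta`, `varco`, and `npvi` gives ΔV, VarcoV, and the vocalic nPVI; passing consonantal durations to `delta` and `varco` gives ΔC and VarcoC.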