• Title/abstract/keywords: casual speech

다른 발화 속도의 또렷한 음성과 대화체로 발화한 영어문장 인지 (The perception of clear and casual English speech under different speed conditions)

  • 이서배
    • 말소리와 음성과학 (Phonetics and Speech Sciences)
    • Vol. 10, No. 2
    • pp. 33-37
    • 2018
  • Korean students with much exposure to the relatively slow and clear speech used in most English classes in Korea can be expected to have difficulty understanding the casual style that is common in the everyday speech of English speakers. This research attempted to investigate an effective way to utilize casual speech in English education, by exploring the way different speech styles (clear vs. casual) affect Korean learners' comprehension of spoken English. Twenty Korean university students and two native speakers of English participated in a listening session. The English utterances were produced in different speech styles (clear slow, casual slow, clear fast, and casual fast). The Korean students were divided into two groups by English proficiency level. The results showed that the Korean students achieved 69.4% comprehension accuracy, while the native speakers of English demonstrated almost perfect results. The Korean students (especially the low-proficiency group) had more problems perceiving function words than they did perceiving content words. Responding to the different speech styles, the high-proficiency group had more difficulty listening to utterances with phonological variation than they did listening to utterances produced at a faster speed. The low-proficiency group, however, struggled with utterances produced at a faster speed more than they did with utterances with phonological variation. The pedagogical implications of the results are discussed in the concluding section.

Korean speakers hyperarticulate vowels in polite speech

  • Oh, Eunhae;Winter, Bodo;Idemaru, Kaori
    • 말소리와 음성과학 (Phonetics and Speech Sciences)
    • Vol. 13, No. 3
    • pp. 15-20
    • 2021
  • In line with recent attention to the multimodal expression of politeness, the present study examined the association between polite speech and acoustic features by analyzing vowels produced in casual and polite speech contexts in Korean. Fourteen adult native speakers of Seoul Korean produced utterances in two social conditions designed to elicit polite (professor) and casual (friend) speech. Vowel duration and the first (F1) and second (F2) formants of seven sentence- and phrase-initial monophthongs were measured. The results showed that polite speech shares acoustic similarities with vowel production in clear speech: speakers showed greater vowel space expansion in polite than in casual speech in an effort to enhance perceptual intelligibility. In particular, female speakers hyperarticulated (front) vowels in polite speech, independent of speech rate. The implications for the acoustic encoding of social stance in polite speech are further discussed.

대학생들이 또렷한 음성과 대화체로 발화한 영어문단의 구글음성인식 (Google speech recognition of an English paragraph produced by college students in clear or casual speech styles)

  • 양병곤
    • 말소리와 음성과학 (Phonetics and Speech Sciences)
    • Vol. 9, No. 4
    • pp. 43-50
    • 2017
  • These days, the voice models of speech recognition software are sophisticated enough to process people's natural speech without any prior training. However, little research has been reported on the use of speech recognition tools in pronunciation education. This paper examined Google speech recognition of a short English paragraph produced by Korean college students in clear and casual speech styles in order to diagnose and resolve students' pronunciation problems. Thirty-three Korean college students participated in the recording of the English paragraph. The Google soundwriter was employed to collect data on the word recognition rates of the paragraph. Results showed that the total word recognition rate was 73% with a standard deviation of 11.5%. The word recognition rate for clear speech was around 77.3%, while that for casual speech was 68.7%. The low recognition rate of casual speech was attributed both to individual pronunciation errors and to the software itself, as shown in its fricative recognition. The distribution of unrecognized words varied by participant and proficiency group. From the results, the author concludes that speech recognition software is useful for diagnosing the pronunciation problems of individuals or groups. Further studies on the progressive improvement of learners' erroneous pronunciations would be desirable.

파킨슨 환자의 클리어 스피치 전후 음향학적 공기역학적 특성 (An aerodynamic and acoustic characteristics of Clear Speech in patients with Parkinson's disease)

  • 신희백;고도흥
    • 말소리와 음성과학 (Phonetics and Speech Sciences)
    • Vol. 9, No. 3
    • pp. 67-74
    • 2017
  • An increase in speech intelligibility has been found in Clear Speech compared to conversational speech. Clear Speech is characterized by decreased articulation rates and increased frequency and length of pauses. The objective of the present study was to investigate the immediate improvement in speech intelligibility in 10 patients with Parkinson's disease (age range: 46 to 75 years) using Clear Speech. The experiment was performed using the Phonatory Aerodynamic System 6600 after the participants read the first sentence of a Sanchaek passage and the "List for Adults 1" in the Sentence Recognition Test (SRT) using casual speech and Clear Speech. Acoustic and aerodynamic parameters that affect speech intelligibility were measured, including mean F0, F0 range, intensity, speaking rate, mean airflow rate, and respiratory rate. In the Sanchaek passage, the use of Clear Speech resulted in significant differences in mean F0, F0 range, speaking rate, and respiratory rate compared with the use of casual speech. In the SRT list, significant differences were seen in mean F0, F0 range, and speaking rate. Based on these findings, the authors claim that speech intelligibility can be affected by adjusting breathing and tone in Clear Speech. Future studies should identify the benefits of Clear Speech through auditory-perceptual studies and evaluate programs that use Clear Speech to increase intelligibility.

명료발화와 보통발화에서 파킨슨병환자 음성의 켑스트럼 및 스펙트럼 분석 (Characteristics of voice quality on clear versus casual speech in individuals with Parkinson's disease)

  • 신희백;심희정;정훈;고도흥
    • 말소리와 음성과학 (Phonetics and Speech Sciences)
    • Vol. 10, No. 2
    • pp. 77-84
    • 2018
  • The purpose of this study is to examine the acoustic characteristics of Parkinsonian speech under different utterance conditions, using acoustic and auditory-perceptual analysis. The subjects were 15 patients (M=7, F=8) with Parkinson's disease, who were asked to read sentences aloud under different utterance conditions (clear/casual). The sentences read by each subject were recorded, and the recorded speech was subjected to cepstrum and spectrum analysis using Analysis of Dysphonia in Speech and Voice (ADSV). Additionally, an auditory-perceptual evaluation of the recorded speech was conducted with respect to breathiness and loudness. Results indicate that in clear speech there was a statistically significant increase in cepstral peak prominence (CPP) and a decrease in the L/H ratio SD (standard deviation of the ratio of low- to high-frequency spectral energy) and CPP F0 SD values. In the auditory-perceptual evaluation, a decrease in breathiness and an increase in loudness were noted. Furthermore, CPP was found to be highly correlated with breathiness and loudness. This provides objective evidence of the immediate usefulness of clear speech intervention in improving the voice quality of Parkinsonian speech.

개별화자 음성의 특징 파라미터 분석 (An Analysis of Phonetic Parameters for Individual Speakers)

  • 고도흥
    • 음성과학 (Speech Sciences)
    • Vol. 7, No. 2
    • pp. 177-189
    • 2000
  • This paper investigates how individual speakers' speech can be distinguished using acoustic parameters such as amplitude, pitch, and formant frequencies. Word samples from fifteen male speakers in their 20s from three different regions were recorded in two different modes (i.e., casual and clear speech) in quiet settings and were analyzed with a Praat macro script. In order to determine individual speakers' acoustic values, the total duration of voicing segments was measured at five different time points. Results showed a high correlation between the formant frequencies $F_1$ and $F_2$ among the speakers, although there was little correlation between amplitude and pitch. Statistical grouping showed that individual speakers' voices were not grouped according to regional dialect in either casual or clear speech. In addition, the difference between the maximum and minimum amplitude was about 10 dB, which is a perceptually audible degree. These acoustic data can provide meaningful guidelines for implementing speaker identification and speaker verification algorithms.

Deep Level Situation Understanding for Casual Communication in Humans-Robots Interaction

  • Tang, Yongkang;Dong, Fangyan;Yoichi, Yamazaki;Shibata, Takanori;Hirota, Kaoru
    • International Journal of Fuzzy Logic and Intelligent Systems
    • Vol. 15, No. 1
    • pp. 1-11
    • 2015
  • A concept of Deep Level Situation Understanding is proposed to realize human-like natural communication (called casual communication) among multiple agents (e.g., humans and robots/machines). Deep level situation understanding consists of surface level understanding (such as gesture/posture understanding, facial expression understanding, and speech/voice understanding), emotion understanding, intention understanding, and atmosphere understanding, obtained by applying the customized knowledge of each agent and by taking thoughtfulness into consideration. The proposal aims to reduce the burden on humans in humans-robots interaction, so as to realize harmonious communication by eliminating unnecessary trouble or misunderstandings among agents, and ultimately to help create a peaceful, happy, and prosperous humans-robots society. A simulated experiment is carried out to validate the deep level situation understanding system in a scenario in which a meeting room is reserved between a human employee and a secretary robot. The proposed system is intended for use in service robot systems, to smooth communication and avoid misunderstanding among agents.

시간 영역 파형 패턴에 기반한 한국어 모음 'ㅗ'의 음성 인식 (Speech Recognition of the Korean Vowel 'ㅗ' Based on Time Domain Waveform Patterns)

  • 이재원
    • 정보과학회 컴퓨팅의 실제 논문지 (KIISE Transactions on Computing Practices)
    • Vol. 22, No. 11
    • pp. 583-590
    • 2016
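  • As interest in the Internet of Things grows rapidly across almost every area of everyday human life, speech recognition is becoming established as an important means of HCI. Demand for speech recognition systems in mobile environments is also growing rapidly. Server-based speech recognition systems for mobile environments generally offer high speed and high recognition rates, but because they perform recognition in units of words stored in a database, they require an Internet connection and a large amount of computation on the server. This paper proposes a new recognition method for the Korean vowel 'ㅗ' as part of a phoneme-based Korean speech recognition system. Because the proposed method operates on waveform patterns in the time domain rather than on frequency-domain analysis, it can significantly reduce computational cost. Component algorithms for detecting the typical waveform patterns of the vowel 'ㅗ' are presented and combined to make the final decision. Experimental results confirm that the proposed method achieves a recognition accuracy of 89.9%.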

경상방언 대학생들이 발음한 국어 한자어 장단음 분석 (An Analysis of Short and Long Syllables of Sino-Korean Words Produced by College Students with Kyungsang Dialect)

  • 양병곤
    • 말소리와 음성과학 (Phonetics and Speech Sciences)
    • Vol. 7, No. 4
    • pp. 131-138
    • 2015
  • The initial syllables of a pair of Sino-Korean words are generally differentiated in meaning by either short or long duration. They are realized differently depending on the dialect and generation of the speakers. Recent research has reported that the temporal distinction has gradually faded away. The aim of this study is to examine whether college students with Kyungsang dialect make the distinction temporally, using the statistical method of mixed effects models. Thirty students participated in the recording of five pairs of Korean words in clear or casual speaking styles. The author measured the durations of the initial syllables of the words and made a descriptive analysis of the data, then applied mixed effects models with gender, length, and style as fixed effects and subject and syllable as random effects, and tested their effects on initial syllable duration. Results showed, first, that college students with Kyungsang dialect did not produce the long and short syllables with any statistically significant difference between them. Second, there was a significant difference in the duration of the initial syllables between male and female students. Third, there was also a significant difference in the duration of the initial syllables produced in the clear and casual styles. The author concluded that college students with Kyungsang dialect do not produce long and short Sino-Korean syllables distinctively, and that any statistical analysis of the temporal aspect should be made carefully, considering both fixed and random effects. Further studies would be desirable to examine the production and perception of the initial syllables by speakers of various dialects, generations, and age groups.

시간 영역 벌크 지표에 기반한 한국어 모음 'ㅜ'의 음성 인식 (Speech Recognition of the Korean Vowel 'ㅜ' Based on Time Domain Bulk Indicators)

  • 이재원
    • 정보과학회 컴퓨팅의 실제 논문지 (KIISE Transactions on Computing Practices)
    • Vol. 22, No. 11
    • pp. 591-600
    • 2016
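  • Advances in networking and computing technology have increasingly brought computing into almost every everyday environment in which people live. In addition, as interest in the Internet of Things grows rapidly, speech recognition is becoming established as an important means of HCI. This paper proposes a new recognition method for the Korean vowel 'ㅜ' as part of a phoneme-based Korean speech recognition system. Because the proposed method analyzes bulk indicators computed in the time domain rather than performing frequency-domain analysis, it can significantly reduce computational cost. Four component algorithms that use bulk indicators to detect the typical waveform patterns of the vowel 'ㅜ' are presented and combined to make the final decision. Experimental results confirm that the proposed method achieves a recognition accuracy of 90.1%, with a recognition speed of 0.68 msec per eojeol (a space-delimited word unit in Korean).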