• Title/Summary/Keyword: Speech Tone

Korean native speakers' perceptive aspects on Korean wh & yes-no questions produced by Chinese Korean learners (중국인학습자들의 한국어 의문사의문문과 부정사의문문에 대한 한국어원어민 화자의 지각양상)

  • Yune, YoungSook
    • Phonetics and Speech Sciences / v.6 no.4 / pp.37-45 / 2014
  • Korean wh-questions and yes-no questions have the same morphological structure. In speech, however, the two types of questions are distinguished by prosodic differences. In this study, we examined whether Korean native speakers can distinguish wh-questions and yes-no questions produced by Chinese learners of Korean based on the prosodic information contained in the sentences. For this purpose, we conducted a perception test in which 15 Korean native speakers participated. The results show that the two types of interrogative sentences produced by Chinese learners of Korean were not distinguished by consistent pitch contours. These results indicate that Chinese learners of Korean cannot match prosodic meaning to prosodic form. The most salient prosodic feature used perceptually by native speakers to discriminate the two types of interrogative sentences is the pitch difference between the F0 peak of the wh-word and the boundary tone.

A Study on Korean Intonation Using Momel (Momel을 이용한 한국어의 억양 연구)

  • Kim, Sun-Hee;Yoo, Hyun-Ji;Hong, Hye-Jin;Lee, Ho-Young
    • MALSORI / no.63 / pp.85-100 / 2007
  • This paper proposes a way to extract intonation patterns using Momel, a pitch stylization algorithm, and presents the results of analyzing speech corpora in comparison with those of earlier research. Two speech corpora are used: one is the set of sound files obtained from the K-ToBI web site, and the other consists of 80 passages read by 4 speakers (2 male and 2 female). The results show that Momel provides significant pitch targets which can be labeled as H and L tones within prosodic units such as the Accentual Phrase (AP) and the Intonation Phrase (IP). The resulting AP patterns and IP boundary tone patterns correspond to those reported in earlier research. Thus, this study will contribute to the study of intonation as well as to the development of automatic intonation labeling systems.

Prosodic Characteristics of Korean Distance Speech (한국어 원거리 음성의 운율적 특성)

  • Lee, Sook-hyang;Kim, Sun-Hee;Kim, Jong-Jin
    • Proceedings of the KSPS conference / 2005.11a / pp.87-90 / 2005
  • The aim of this paper is to investigate the prosodic characteristics of Korean distant speech. Thirty-six 2-syllable words produced by 4 speakers (2 male and 2 female) in both distant-talking and normal environments were used. The results showed that the ratios of the second syllable to the first syllable in vowel duration and vowel energy were significantly larger in the distant-talking environment than in the normal environment, and that the f0 range was also larger in the distant-talking environment. In addition, an 'HL%' contour boundary tone in the second syllable and/or an 'L+H' contour tone in the first syllable were used in the distant-talking environment.

A Study of the Prosodic Characteristics of Homographs with Context Cues by Subjects with Right and Left Hemisphere Damage (문맥 내에서 좌우반구 손상자의 동음어에 대한 운율 산출 비교)

  • Lee, Myoung-Soon
    • Phonetics and Speech Sciences / v.2 no.1 / pp.13-21 / 2010
  • The purpose of this study was to examine the prosodic characteristics of sentence-level utterances containing homographs with context cues in patients with neurogenic communication disorders. Homographs which may be affected by prosody, especially tonic length features, were used to investigate this matter. Tone, duration, pitch, and pitch peak were analyzed to examine the characteristics of prosody in patients with lesions in the left or right hemisphere and in normal controls. The whole process was recorded using Praat 4.3.14, and for the statistical analyses, a three-way ANOVA with multiple comparisons, Chi-square tests, and a one-way ANOVA were carried out using SPSS 12.0 for Windows. The conclusions of this study are as follows. First, the length of syllables and vowels in Korean homographs differed depending on the meaning, but the difference between groups was not significant. Second, patients with lesions in the right hemisphere showed a significant difference in pitch. Third, the frequency of pitch peak and tone in 'short' tone syllables differed between groups. Overall, the prosody of homographs was not clearly differentiated between groups. Accordingly, more detailed studies of acoustic and other parameters that could reveal prosodic differences between groups are needed in the future.

A Multi-speaker Speech Synthesis System Using X-vector (x-vector를 이용한 다화자 음성합성 시스템)

  • Jo, Min Su;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.675-681 / 2021
  • With the recent growth of the AI speaker market, the demand for speech synthesis technology that enables natural conversation with users is increasing. Therefore, there is a need for a multi-speaker speech synthesis system that can generate voices of various tones. To synthesize natural speech, the system must be trained with a large-capacity, high-quality speech DB. However, collecting a high-quality, large-capacity speech database uttered by many speakers is very difficult in terms of recording time and cost. Therefore, it is necessary to train the speech synthesis system using a speech DB of a very large number of speakers with a small amount of training data for each speaker, and a technique for naturally expressing the tone and prosody of multiple speakers is required. In this paper, we propose a technique for constructing a speaker encoder by applying the deep learning-based x-vector technique used in speaker recognition, and for synthesizing a new speaker's tone from a small amount of data through the speaker encoder. In the multi-speaker speech synthesis system, the module that synthesizes the mel-spectrogram from input text is Tacotron2, and the vocoder that generates the synthesized speech is WaveNet with a mixture of logistic distributions. The x-vector extracted from the trained speaker embedding neural network is added to Tacotron2 as an input to express the desired speaker's tone.
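
The abstract says only that the x-vector is "added to Tacotron2 as an input". One common way to condition a sequence model on a speaker embedding is to append the same vector to every encoder time step. The sketch below illustrates that idea with plain Python lists. The function name `condition_on_speaker`, the concatenation strategy, and the toy dimensions are assumptions made for illustration, not details taken from the paper.

```python
def condition_on_speaker(encoder_outputs, speaker_embedding):
    """Append one speaker embedding to every encoder time step.

    Assumption: conditioning is done by concatenation. The abstract does
    not say how the x-vector enters Tacotron2.
    """
    return [list(frame) + list(speaker_embedding) for frame in encoder_outputs]


# Toy data: 2 time steps of 3-dimensional encoder output and a
# 2-dimensional stand-in for an x-vector (real x-vectors are much larger).
encoder_outputs = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
x_vector = [1.0, -1.0]

conditioned = condition_on_speaker(encoder_outputs, x_vector)
for frame in conditioned:
    print(frame)
```

Each output frame keeps its original values and gains the speaker dimensions, so a decoder reading these frames sees the speaker identity at every step.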

Coarticulation and vowel reduction in the neutral tone of Beijing Mandarin

  • Lin Maocan
    • Proceedings of the KSPS conference / 1996.10a / pp.207-207 / 1996
  • The neutral tone is one of the most important distinguishing features in Beijing Mandarin, but there are two completely different views on its linguistic function: a special tone (Xu, 1980) versus weak stress (Chao, 1968). In this paper, the acoustic manifestation of the neutral tone is explored to show that it is closely related to weak stress. 122 disyllabic words in which the second syllable carries the neutral tone, including 22 stress pairs, were uttered by a native male speaker of the Beijing dialect and analysed with a Kay Digital Sonagraph 5500-1. The results of the acoustic analysis are as follows: 1) The first two formants of the medial and the syllabic vowel move towards those of the central vowel with a greater magnitude in the syllable with the neutral tone than in the syllable with any of the four normal tones. Also, the vowel ending and the nasal codas /n/ and /ŋ/ in the syllable with the neutral tone tend to be deleted. 2) In syllables with the neutral tone, there is strong carryover coarticulation between the medial and syllabic vowel and the preceding unvoiced consonant. In general, the vowel is moved towards the position of the central vowel with greater magnitude by a coronal consonant than by a labial or velar consonant. 3) In the syllable with the neutral tone, when and only when it precedes a syllable with tone-4, the high vowel following [f], [ts'], [s], [ts'], [s], [tc'] or [c] tends to be voiceless. 4) It can be seen from the acoustic results of the 22 stress pairs that the duration of the syllable with the neutral tone is on average reduced to 55% of that of the syllable with the four normal tones, and the duration of the final in the syllable with the neutral tone is on average reduced to 45% of that of the final in the syllable with the four normal tones (Lin & Yan 1980). 5) The F0 contour of the neutral tone is highly dependent on the preceding normal tone (Lin & Yan 1993).
For a number of languages it has been found that the vowel space is reduced as the level of stress placed upon the vowel is reduced (Nord 1986). Therefore we reach the conclusion that the syllable with the neutral tone is related to weak stress (Lin & Yan 1990). The neutral tone is not a special tone because the preceding normal tone.

Applying the Speech Register Principle to young children's Perception of the Intelligent Service Robot (언어 사용력(Speech Register)원리를 활용한 유아의 교육용 로봇 인식)

  • Hyun, Eun-Ja;Lee, Ha-Won;Yeon, Hye-Min
    • The Journal of the Korea Contents Association / v.12 no.10 / pp.532-540 / 2012
  • The purpose of this study is to explore young children's perception of IrobiQ, a teacher-assistive robot. The participants were fifty 5-year-olds attending 3 kindergarten centers who had experienced the robot for at least 2 years. The study was based on the "hypothesis of speech register". Each child was read a storybook by a researcher and asked to choose which of a robot, a friend, and a toy best matched human speech tones and accents. The findings were that the children perceived the robot as a hybrid entity, not as a complete human, though they perceived it as closer to a human than to an artificial thing. They were likely to use cognitive distinctions that are unique to human beings as the criteria to justify their answers. These results suggest that the traditional binary ontological category (animate vs. inanimate) should be reconsidered to include a hybrid entity.

A Personal Sound Amplification Product Compared to a Basic Hearing Aid for Speech Intelligibility in Adults with Mild-to-Moderate Sensorineural Hearing Loss

  • Choi, Ji Eun;Kim, Jinryoul;Yoon, Sung Hoon;Hong, Sung Hwa;Moon, Il Joon
    • Journal of Audiology & Otology / v.24 no.2 / pp.91-98 / 2020
  • Background and Objectives: This study aimed to compare functional hearing with the use of a personal sound amplification product (PSAP) or a basic hearing aid (HA) among sensorineural hearing impaired listeners. Subjects and Methods: Nineteen participants with mild-to-moderate sensorineural hearing loss (SNHL) (26-55 dB HL; pure-tone average, 0.5-4 kHz) were prospectively included. No participants had prior experience with HAs or PSAPs. Audiograms, speech intelligibility in both quiet and noisy environments, speech quality, and preference were assessed in three different listening conditions: unaided, with the HA, and with the PSAP. Results: The use of PSAP was associated with significant improvement in pure-tone thresholds at 1, 2, and 4 kHz compared to the unaided condition (all p<0.01). In the quiet environment, speech intelligibility was significantly improved after wearing a PSAP compared to the unaided condition (p<0.001), and this improvement was better than the result obtained with the HA. The PSAP also demonstrated similar improvement in the most comfortable levels compared to those obtained with the HA (p<0.05). However, there was no significant improvement of speech intelligibility in a noisy environment when wearing the PSAP (p=0.160). There was no significant difference in the reported speech quality produced by either device or in participant preference for the PSAP or HA. Conclusions: The current result suggests that PSAPs provide considerable benefits to speech intelligibility in a quiet environment and can be a good alternative to compensate for mild-to-moderate SNHL.
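
The inclusion criterion above is stated as a pure-tone average (PTA) of 26-55 dB HL over 0.5-4 kHz. A minimal sketch of how such a criterion can be computed follows. It assumes the PTA is the mean of the thresholds at 0.5, 1, 2, and 4 kHz, which is a common convention but is not spelled out in the abstract. The function names and the example audiogram are hypothetical.

```python
from statistics import mean

# Assumption: the four frequencies below define the PTA. The abstract
# gives only the range "0.5-4 kHz".
PTA_FREQS_HZ = (500, 1000, 2000, 4000)


def pure_tone_average(thresholds_db_hl):
    """Mean hearing threshold (dB HL) over the PTA frequencies.

    thresholds_db_hl maps frequency in Hz to threshold in dB HL.
    """
    return mean(thresholds_db_hl[f] for f in PTA_FREQS_HZ)


def meets_inclusion_range(pta, low=26.0, high=55.0):
    """True if the PTA falls inside the 26-55 dB HL range given in the abstract."""
    return low <= pta <= high


# Hypothetical audiogram for one ear (not data from the study).
audiogram = {250: 20, 500: 30, 1000: 35, 2000: 45, 4000: 50, 8000: 60}
pta = pure_tone_average(audiogram)
print(pta, meets_inclusion_range(pta))
```

Thresholds at 250 Hz and 8 kHz are ignored by design, since they fall outside the stated PTA range.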

Analysis of Speech Signals by linear prediction and It's Application (선형 예측법에 의한 음성신호의 분석과 그 응용 방안)

  • 김명규
    • Journal of the Korean Institute of Telematics and Electronics / v.18 no.4 / pp.27-33 / 1981
  • In this paper, the effect of tone variation in speech signals is discussed by showing the variations of the linear prediction model spectra and the estimated vocal tract shape for Korean vowels. As an application of the analysis results, a speech synthesis scheme based on the combination of phonemes is also discussed on the basis of experimental results.
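
The abstract does not say which linear prediction procedure was used. As an illustration of how a linear prediction model spectrum can be obtained, the sketch below uses the standard autocorrelation method with the Levinson-Durbin recursion, in pure Python. All function names are hypothetical, and the test signal is a synthetic two-pole impulse response rather than speech.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns (a, err), where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
    and err is the final prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


def lpc_spectrum_db(a, err, n_points=256):
    """Model power spectrum err / |A(e^{jw})|^2 in dB on [0, pi)."""
    spectrum = []
    for m in range(n_points):
        w = math.pi * m / n_points
        re = sum(a[k] * math.cos(w * k) for k in range(len(a)))
        im = -sum(a[k] * math.sin(w * k) for k in range(len(a)))
        spectrum.append(10.0 * math.log10(err / (re * re + im * im)))
    return spectrum


# Synthetic test signal: impulse response of x[n] = 1.2 x[n-1] - 0.8 x[n-2] + d[n].
signal = []
for n in range(200):
    prev1 = signal[n - 1] if n >= 1 else 0.0
    prev2 = signal[n - 2] if n >= 2 else 0.0
    signal.append(1.2 * prev1 - 0.8 * prev2 + (1.0 if n == 0 else 0.0))

coeffs, error = levinson_durbin(autocorrelation(signal, 2), 2)
spectrum = lpc_spectrum_db(coeffs, error)
peak_bin = spectrum.index(max(spectrum))
print(coeffs, peak_bin)
```

For this all-pole test signal a second-order model recovers the generating coefficients almost exactly, and the spectral peak marks the resonance. With real vowels, a higher order would be needed to capture several formants.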
