• Title/Summary/Keyword: prosodic domain

Search Result 24

Effects of pitch accent and prosodic boundary on English vowel production by native versus nonnative (Korean) speakers. (영어의 강세와 운율경계가 모음 발화에 미치는 영향에 관한 음향 연구;원어민과 한국인을 대상으로)

  • Hur, Yu-Na;Kim, Sa-Hyang;Cho, Tae-Hong
    • Proceedings of the KSPS conference / 2007.05a / pp.240-242 / 2007
  • The goal of this paper is to investigate the effects of three factors, namely phrasal accent (accented vs. unaccented), prosodic boundary (IP-initial vs. IP-medial) and coda voicing (e.g., bed vs. bet), on the acoustic realization of English vowels (/i, ɪ, ɛ, æ/) as produced by native (Canadian) and nonnative (Korean) speakers. The speech corpus included 16 minimal pairs (e.g., bet-bat, bet-bed) embedded in a sentence. Results show that the phonological contrast between vowels is maximized when they are accented, though the contrast maximization pattern was not the same for the English and Korean speakers. However, domain-initial position does not affect the phonetic manifestation of vowels. Results also show that the phonological contrast due to coda voicing is maximized only when the vowels are accented. These results suggest that the phonetic realization of vowels is affected by phrasal accent only, and not by position within the prosodic domain.

  • PDF

Corpus-based Korean Text-to-speech Conversion System (콜퍼스에 기반한 한국어 문장/음성변환 시스템)

  • Kim, Sang-hun;Park, Jun;Lee, Young-jik
    • The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.24-33 / 2001
  • This paper describes a baseline implementation of a corpus-based Korean TTS system. Conventional TTS systems that use small speech databases still generate machine-like synthetic speech. To overcome this problem, we introduce a corpus-based TTS system that can generate natural synthetic speech without prosodic modification. The corpus should contain the natural prosody of the source speech and multiple instances of each synthesis unit. To build phone-level synthesis units, we train a speech recognizer on the target speech and then perform automatic phoneme segmentation. We also detect fine pitch periods using laryngograph signals, which are used for prosodic feature extraction. For break strength allocation, four levels of break indices are determined by pause length and attached to phones to reflect prosodic variation at phrase boundaries. To predict break strength from text, we use statistical information about POS (part-of-speech) sequences. The best triphone sequences are selected by a Viterbi search that minimizes the cumulative Euclidean distance of the concatenation distortion. To obtain high-quality synthetic speech suitable for commercial use, we introduce a domain-specific database. By adding the domain-specific database to the general-domain database, we can greatly improve the quality of synthetic speech in that domain. In a subjective evaluation, the new Korean corpus-based TTS system shows better naturalness than the conventional demisyllable-based one.

  • PDF
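
The unit selection step described in the abstract above (a Viterbi search that minimizes cumulative Euclidean concatenation distortion) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the representation of each candidate unit as a `(head, tail)` pair of feature vectors, the function name `viterbi_select`, and the absence of any target cost are assumptions made here for brevity.

```python
from math import dist  # Euclidean distance, Python 3.8+


def viterbi_select(candidates):
    """Pick one candidate unit per position, minimizing total join cost.

    candidates[t] is a list of hypothetical (head, tail) feature-vector
    pairs: `head` describes the left edge of the unit and `tail` its right
    edge. The join cost between consecutive units is the Euclidean distance
    between the earlier unit's tail and the later unit's head.
    Returns (path, total_cost), where path[t] indexes into candidates[t].
    """
    if not candidates or any(not c for c in candidates):
        raise ValueError("every position needs at least one candidate")

    cost = [0.0] * len(candidates[0])  # best cumulative cost per candidate
    back = []                          # back-pointers, one list per step

    for prev, cur in zip(candidates, candidates[1:]):
        new_cost, pointers = [], []
        for head, _tail in cur:
            # Cumulative cost of reaching this unit from each predecessor.
            joins = [cost[i] + dist(prev[i][1], head) for i in range(len(prev))]
            best_i = min(range(len(prev)), key=joins.__getitem__)
            new_cost.append(joins[best_i])
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    j = min(range(len(cost)), key=cost.__getitem__)
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total


example = [
    [((0, 0), (1, 0)), ((0, 0), (5, 5))],
    [((1, 0), (2, 0)), ((9, 9), (2, 0))],
    [((2, 0), (3, 0))],
]
path, total = viterbi_select(example)
```

In the toy `example`, the first candidate at each position joins its neighbours without any spectral gap, so the search selects it throughout. A real system would also need the feature extraction and candidate retrieval steps, which the abstract does not detail.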

Pitch Patterns of Interrogative Sentences in relation to the Focus (초점과 관련된 의문문 억양 패턴 실험)

  • Kim, Mi-Ran;Shin, Dong-Hyun;Choe, Jae-Woong;Kim, Kee-Ho
    • Speech Sciences / v.7 no.4 / pp.203-217 / 2000
  • In spoken language, the characteristics of prosodic realization are related to the meaning of the utterance. The pitch pattern of an interrogative sentence, which differs from that of a declarative sentence, can be considered in this respect. If we consider the question-answer pair, we find that the most important variation comes from the intended meaning of asking. In this paper, we experiment with four kinds of interrogative sentences and show that the difference in their pitch patterns can be explained in relation to focus phenomena; that is, the differences in the boundary tones of interrogative sentences are due to differences in the prosodic domain of focus. To relate the explanation to focus phenomena, we divide focus into two categories: emphatic focus, which plays a role in delivering the speaker's intended meaning for sentence interpretation, and informational focus, which delivers the central intended meaning of the utterance. The results can be summarized in three points. First, a high boundary tone delivers the meaning of asking. Second, the different boundary tones found in wh-questions and alternative questions are just phonetic variations caused by focusing. Third, the high-rise boundary tone in echo questions is related to the meaning of surprise or incredulity, which agrees with the existing consensus that the speaker's attitude of surprise can raise the pitch range. From these results we can distinguish between boundary type and phonetic variation, and we can also assign appropriate meanings to the different boundary tones in interrogative sentences, which have been regarded as merely a part of sentence type.

  • PDF

Effects of Prosodic Strengthening on the Production of English High Front Vowels /i, ɪ/ by Native vs. Non-Native Speakers (원어민과 비원어민의 영어 전설 고모음 /i, ɪ/ 발화에 나타나는 운율 강화 현상)

  • Kim, Sahyang;Hur, Yuna;Cho, Taehong
    • Phonetics and Speech Sciences / v.5 no.4 / pp.129-136 / 2013
  • This study investigated how acoustic characteristics (i.e., duration, F1, F2) of English high front vowels /i, ɪ/ are modulated by boundary- and prominence-induced strengthening in native vs. non-native (Korean) speech production. The study also examined how the durational difference in vowels due to the voicing of a following consonant (i.e., voiced vs. voiceless) is modified by prosodic strengthening in two different (native vs. non-native) speaker groups. Five native speakers of Canadian English and eight Korean learners of English (intermediate-advanced level) produced 8 minimal pairs with the CVC sequence (e.g., 'beat'-'bit') in varying prosodic contexts. Native speakers distinguished the two vowels in terms of duration, F1, and F2, whereas non-native speakers only showed durational differences. The two groups were similar in that they maximally distinguished the two vowels when the vowels were accented (F2, duration), while neither group showed boundary-induced strengthening in any of the three measurements. The durational differences due to the voicing of the following consonant were also maximized when accented. The results are discussed further in terms of phonetics-prosody interface in L2 production.

Accentual Effects on Lateralization

  • Kim, Soo-Jung
    • Speech Sciences / v.8 no.3 / pp.15-30 / 2001
  • Lateralization, the change of a coronal nasal into a lateral in an l-n sequence, has been considered to be prosodically unrestricted, e.g. an utterance-span rule, in Korean (Han 1993, Park 1990). However, aerodynamic data on the nasal do not corroborate these claims. In this paper, I look at how lateralization can best be characterized. Specifically, I ask whether its domain is best treated via a syntax-based (Nespor & Vogel 1986, Selkirk 1984) or an intonation-based approach (Pierrehumbert 1980, Jun 1993) to prosodic structure. Based on nasal airflow data as a means of monitoring velum activity coincident with a nasal stop in an l-n sequence, combined with pitch tracks to define an accentual phrase, I argue that lateralization is neither an utterance-span rule nor a syntax-based rule. Sentences recorded with a potential environment for lateralization show that lateralization occurs within an accentual phrase but is blocked across accentual phrase boundaries. When intonation-based and syntax-based models disagree about phrase boundaries, lateralization only occurs where the intonation-based model predicts it will. This indicates that lateralization is best defined as an accentual phenomenon, being sensitive to the accentual phrase. This finding lends further support to an intonation-based model of Korean prosodic structure (Jun 1993).

  • PDF

A pragmatically-oriented study of intonation and focus (억양과 초점에 관한 화용론적 연구)

  • Lee, Yeong-Kil
    • MALSORI / no.38 / pp.1-24 / 1999
  • There is an indisputable connection between prosody and focus. The focal prominence in Korean, a prosodic realization of pitch prominence in an utterance, defines a focused constituent, the domain of which is identified by the Focus Identification Principle. To this is added the Basic Focus Rule, which makes it possible to capture and interpret the focal domain, which can then be tested against the available context. The focal domain can be made contextually available by setting it off with information structure (IS) boundaries identified by the Information Structure Identification Principle. The fragment of the utterance enclosed within the IS boundaries can be recognized as 'new' information with the help of the Focus Domain Identification Rule. Since information structures are pragmatically tied to semantic levels of grammatical systems, the Basic Focus Rule is then replaced by the Focal Prominence Principle, which ensures focal prominence within the focal domain. Close relationships exist between patterns of intonation and their expressiveness in terms of giving a pragmatically-oriented description of focus. This is particularly manifested in Korean sentences containing contrastiveness.

  • PDF

Case Drop and Prosodic Structure in Korean

  • Hong, Sung-Hoon
    • Speech Sciences / v.7 no.3 / pp.35-51 / 2000
  • The goal of this paper is to examine how Case Drop (the drop of case markers) correlates with prosodic structure in Korean. On the assumption that intervocalic Lenis Stop Voicing (LSV) applies within the domain of the Accentual Phrase (AP), voicing analyses are performed on intervocalic lenis stop consonants before and after Case Drop. A statistical analysis reveals that the drop of the nominative and accusative case markers significantly alters the AP structure. Pitch values are then extracted to verify whether such changes in the AP structure conform to the pitch properties proposed for the AP (Jun 1993, 1998). The results show that the AP structure suggested by LSV does not always coincide with that imposed by the pitch properties.

  • PDF

An Acoustic Analysis of the Aspiration Merger in Korean

  • Mi, Jang
    • Phonetics and Speech Sciences / v.3 no.1 / pp.67-75 / 2011
  • In Korean, 'Aspiration Merger' is the result of a heteromorphemic sequence of a lenis stop and /h/ becoming a single aspirated stop word-medially. However, the contrast between lenis stop-plus-/h/ and an underlying aspirated stop is maintained when they span Phonological Phrase boundaries. By varying the position in the prosodic domain between the APP (Across Phonological Phrase) and PPM (Phonological Phrase Medial) positions, the phonetic properties of the two categories are compared. In the results for noise duration and change of intensity, lenis stop-plus-/h/ shows a large difference between the APP and PPM positions. The noise duration comparison shows that the two categories are completely neutralized into an aspirated stop in the PPM position and that this complete neutralization is sensitive to prosodic phrasing.

  • PDF

SPEECH SYNTHESIS IN THE TIME DOMAIN BY PITCH CONTROL USING LAGRANGE INTERPOLATION (TD-PCULI)

  • Kang, Chan-Hee;Shin, Yong-Jo;Kim, Yun-Seok;Kang, Dae-Soo;Lee, Jong-Heon;Kwon, Ki-Hyung;An, Jeong-Keun;Sea, Sung-Tae;Chin, Yong-Ohk
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.984-990 / 1994
  • In this paper, a new time-domain speech synthesis method using monosyllables is proposed. Its aims are to overcome the degradation of synthetic speech quality caused by frequency-domain synthesis methods and to develop a time-domain algorithm for prosodic control. When a time-domain method is used with the monosyllable as the synthesis unit, the main issues are controlling the pitch period and smoothing the energy pattern. As a solution to the pitch control problem, a method using Lagrange interpolation is suggested. As a solution to the energy smoothing problem, an algorithm that can control the shape of the amplitude envelope of a monosyllable is proposed. In the experiments, it was possible to synthesize unrestricted Korean speech, including prosody control. According to the MOS evaluation, the quality and naturalness of the synthesized speech improved to a good level.

  • PDF
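
The abstract above names Lagrange interpolation as the tool for pitch control but does not say how it is applied. As a generic illustration only, the sketch below evaluates the Lagrange polynomial through a few anchor points; treating the anchors as (time, pitch period) pairs, and the names `lagrange_interpolate`, `anchor_times` and `anchor_periods`, are assumptions made here and are not taken from the paper.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate at x the Lagrange polynomial through the points (xs[i], ys[i]).

    P(x) = sum_i ys[i] * prod_{j != i} (x - xs[j]) / (xs[i] - xs[j])
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    if len(set(xs)) != len(xs):
        raise ValueError("interpolation nodes must be distinct")

    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


# Hypothetical anchors: pitch period (ms) at three time points (arbitrary units).
anchor_times = [0.0, 1.0, 2.0]
anchor_periods = [8.0, 7.0, 8.0]
midpoint_period = lagrange_interpolate(anchor_times, anchor_periods, 0.5)
```

With these three anchors the interpolating polynomial is the parabola (x - 1)^2 + 7, so intermediate pitch periods vary smoothly between the anchor values. With many anchors, a high-degree Lagrange polynomial can oscillate, so a practical system would probably interpolate over short local windows.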

Prosodic Phrasing and Focus in Korea

  • Baek, Judy Yoo-Kyung
    • Proceedings of the KSPS conference / 1996.10a / pp.246-246 / 1996
  • Purpose: Some properties of prosodic phrasing and some acoustic and phonological effects of contrastive focus on the tonal pattern of Seoul Korean are explored, based on a brief experiment analyzing the fundamental frequency (=F0) contour of the author's speech. Data Base and Analysis Procedures: The examples were chosen to contain mostly nasal and liquid consonants, since it is difficult to track down the formants in stops and fricatives during their corresponding consonantal intervals, and stops may cause an unwanted increase in the F0 value due to their burst into the following vowel. All examples were recorded three times and the spectrum of the most stable repetition was generated, from which the F0 contour of each sentence was obtained, with peaks higher than 250 Hz interpreted as a high tone (=H). The result is then discussed within the prosodic hierarchy framework of Selkirk (1986) and compared with the tonal pattern of the Northern Kyungsang dialect of Korean reported in Kenstowicz & Sohn (1996). Prosodic Phrasing: In N.K. Korean, H never appears both on the object and on the verb in a neutral sentence, which indicates that the object and the verb form a single Phonological Phrase (=φ), given that there is only one pitch peak for each φ. However, Seoul Korean shows that both the object and the verb have an H of their own, indicating that they are not contained in one φ. This violates the Optimality constraint Wrap-XP (=Enclose a lexical head and its arguments in one φ), while N.K. Korean obeys the constraint by grouping a VP in a single φ. This asymmetry can be resolved through a constraint that favors the separate grouping of each lexical category and is ranked higher than Wrap-XP in Seoul Korean but lower in N.K. Korean: Align-$x^{lex}$ (=Align the left edge of a lexical category with that of a φ).
    (1) nuna-ka manIl-Il mEk-nIn-ta ('sister-NOM garlic-ACC eat-PRES-DECL')
        a. (LLH) (LLH) (HLL): Seoul Korean
        b. (LLH) (LLL LHL): N.K. Korean
    Focus and Phrasing: Two major effects of contrastive focus on phonological phrasing are found in Seoul Korean: (a) the peak of an Intonational Phrase (=IP) falls on the focused element; and (b) focus has the effect of deleting all the following prosodic structures. A focused element always attracts the peak of the IP, showing an increase of approximately 30 Hz compared with the peak of a non-focused IP. When a subject is focused, no H appears either on the object or on the verb, and a focused object is never followed by a verb with H. The post-focus deletion of prosodic boundaries is forced through the interaction of Stress-Focus (=If F is a focus and DF is its semantic domain, the highest prominence in DF will be within F) and Rightmost-IP (=The peak of an IP projects from the rightmost φ). First, Stress-Focus requires the peak of the IP to fall on the focused element. Then, to avoid violating Rightmost-IP, all the boundaries after the focused element should delete, minimizing the number of φ's intervening from the right edge of the IP. (2) (omitted) Conclusion: In general, there seem to be no direct alignment constraints between the syntactically focused element and the edge of φ determined in phonology; all the alignment effects come from a single requirement that the peak of the IP projects from the rightmost φ, as proposed in Truckenbrodt (1995).

  • PDF