Search | Korea Science

Modelling Duration In Text-to-Speech Systems

Chung Hyunsong
- MALSORI / no.49 / pp.159-174 / 2004
The development of the durational component of prosody modelling in text-to-speech conversion of spoken English and Korean was reviewed and discussed, showing the strengths and weaknesses of each approach. The possibility of integrating linguistic feature effects into the duration modelling of TTS systems was also investigated. This paper claims that current approaches to language timing synthesis still require an understanding of how segmental duration is affected by context. Three modelling approaches were discussed: sequential rule systems, Classification and Regression Tree (CART) models and Sums-of-Products (SoP) models. The CART and SoP models perform well in predicting segment duration in English, but this is not the case for the SoP modelling of spoken Korean.

A Comparative Study of Syllable Structures between French and Korean in Real Utterances (실제 발화 상황에서 프랑스어와 한국어의 음절구조 비교)

Lee, Eun-Yung
- Speech Sciences / v.10 no.2 / pp.237-248 / 2003
This paper compares the syllable structures of French and Korean by analyzing speech data recorded in actual utterances in the two languages. The French syllable structure data are taken from F. Wioland's research. For Korean, the primary data are drawn from a 30-minute radio interview in which two male TV anchors in their early 60s talk to each other. The secondary data were collected by having the primary data replicated by two male announcers in their early 20s who broadcast on the university radio station of KAIST. Based on the French and Korean data, this paper provides the statistical frequency of each type of syllable structure in each language through acoustic analysis of spectrograms and gives a phonetic account of the characteristics of each syllable type in the two languages. The paper also discusses the distributional conditions under which each syllable structure occurs in the speech context.

The cerebral representation related to lexical ambiguity and idiomatic ambiguity (어휘적 중의성 및 관용적 중의성을 처리하는 대뇌 영역)

Yu Gisoon;Kang Hongmo;Jo Kyungduk;Kang Myungyoon;Nam Kichun
- Proceedings of the KSPS conference / 2003.10a / pp.79-82 / 2003
The purpose of this study is to examine the regions of the cerebrum that handle lexical and idiomatic ambiguity. The stimulus sets consist of two parts, and each part has 20 sets of sentences. For each part, 10 sets are experimental conditions and the other 10 sets are control conditions. Each set has two sentences, the 'context' and 'target' sentences, and a sentence-verification question to ensure that patients concentrate on the task. The results from 15 patients showed significant activation in the right frontal lobe of the cerebral cortex for both kinds of ambiguity. This means that the right hemisphere participates in the resolution of ambiguity, and that no regions are specific to lexical ambiguity or idiomatic ambiguity alone.

Syntactic ambiguity and phonological structure (통사적 모호성과 음운 구조)

Lim Un
- MALSORI / no.42 / pp.57-69 / 2001
Syntactic ambiguity can usually be resolved by context, especially in reading and writing. Because phonological structure, including stress, intonation and phonological phenomena, can be pronounced differently according to different syntactic structures, syntactic ambiguity can be resolved by phonological structure in listening and speaking. The objective of this study was to survey how Korean English teachers apply phonological structures in order to resolve syntactic ambiguity. The results of this study are as follows. First, Korean English teachers applied Compound Stress Rules well when the second word was not branched, but they did not apply them well when the second word was branched. Second, several Korean English teachers did not apply Nuclear Stress Rules well. They usually put the strongest stress on the first word. Third, Korean English teachers did not differentiate the appropriate situations for applying palatalization. They applied palatalization both within a single Phonological Phrase and across separate Phonological Phrases. Fourth, Korean English teachers did not apply stress shifting when a stress clash occurred. Because they did not apply stress shifting, they put the strongest stress on an inappropriate syllable.

Unseen Model Prediction using an Optimal Decision Tree (Optimal Decision Tree를 이용한 Unseen Model 추정방법)

Kim Sungtak;Kim Hoi-Rin
- MALSORI / no.45 / pp.117-126 / 2003
Decision tree-based state tying has been proposed in recent years as the most popular approach for clustering the states of context-dependent hidden Markov model-based speech recognition. The aims of state tying are to reduce the number of free parameters and to predict the state probability distributions of unseen models. When doing state tying, however, the size of the decision tree is very important for word-independent recognition. In this paper, we try to construct an optimized decision tree based on the average of the feature vectors in the state pool and the number of seen models. We observed that the proposed optimal decision tree is effective in predicting the state probability distribution of unseen models.

A Comparative Study between English and Korean Speakers on the Acoustic Characteristics of Focus Realization in English Focus Sentences (영어 초점구문에 나타나는 초점 발화의 음향 음성적 특성 비교 연구: 미국인 화자와 한국인 화자를 중심으로)

Kim, Kee-Ho
- Speech Sciences / v.11 no.2 / pp.89-104 / 2004
This paper investigates previous theories on English focus realization and attempts to find out the overall acoustic characteristics of English focus. It has been argued in previous studies that English focus can be defined as new information that is not recoverable from the context (Halliday 1967), a complementary element of presupposition (Jackendoff 1972), and what is predicated about the topic in a sentence (Sgall 1973, Gundel 1974). The phonetic realization of English focus in an utterance has been said to be either L+H*/H* or a falling accent. Yet this is a more or less simplified pattern that is not based on real data obtained from native speakers of English, and it does not consider the various pragmatic and contextual situations. In our experiments we found that native speakers uttered English focus sentences in different ways according to the focus structure. Another notable result is that Korean speakers, when provided with the same experimental material, are able neither to distinguish different focus types nor to deaccent the elements that are not focused in an utterance.

Acoustic Features of Phonatory Offset-Onset in the Connected Speech between a Female Stutterer and Non-Stutterers (연속구어 내 발성 종결-개시의 음향학적 특징 - 말더듬 화자와 비말더듬 화자 비교 -)

Han, Ji-Yeon;Lee, Ok-Bun
- Speech Sciences / v.13 no.2 / pp.19-33 / 2006
The purpose of this paper was to examine the acoustic characteristics of the phonatory offset-onset mechanism in the connected speech of female adults with stuttering and with normal nonfluency. The phonatory offset-onset mechanism refers to the laryngeal articulatory gestures required to mark word boundaries in the phonetic contexts of connected speech. This mechanism included 7 patterns based on the speech spectrogram. This study showed the acoustic features of connected speech produced by female adults with stuttering (n=1) and with normal nonfluency (n=3). Speech tokens in V_V, V_H, and V_S contexts were selected for the analysis. Speech samples were recorded with Sound Forge, and the spectrographic analysis was conducted using Praat. Results revealed that the female speaker who stutters (with a block type of stuttering) exhibited more laryngealization gestures in the V_V context. In both the V_H and V_S contexts, the laryngealization gesture was characterized more by a complete glottal stop or glottal fry. The results were discussed from theoretical and clinical perspectives.

Gradient Reduction of $C_1$ in /pk/ Sequences

Son, Min-Jung
- Speech Sciences / v.15 no.4 / pp.43-60 / 2008
Instrumental studies (e.g., aerodynamic, EPG, and EMMA) have shown that the first of two stops in sequence can sometimes be articulatorily reduced in time and space, either gradiently or categorically. The current EMMA study aims to examine possible factors, both linguistic (e.g., speech rate, word boundary, and prosodic boundary) and paralinguistic (e.g., natural context and repetition), that induce gradient reduction of $C_1$ in /pk/ cluster sequences. EMMA data were collected from five Seoul-Korean speakers. The results show that gradient reduction of lip aperture seldom occurs, being quite restricted both in speaker frequency and in token frequency. The results also suggest that place assimilation is not a lexical process, implying that speakers have not fully developed this process to the point of phonologizing it at the abstract level.

New Interpretation on Intensification (된소리 현상의 새 분석)

Lee Mi-Jae
- MALSORI / no.25_26 / pp.27-37 / 1993
This paper deals with the nature and function of intensification in Korean in a wider range of contexts that has not received proper attention. It focuses on previously unobserved areas of intensification, such as the sound split of polysemy, e.g. [s'eda], [kyongk'i] by n of intensification, the North Korean application of intensification to [wonsu], and the intensification of borrowed English. The recent phenomenon of 'gwua' intensification is examined in two groups of people, young students and people over 65 years old, by means of sociolinguistic analysis. The result shows that this intensification is a form of student violent power and a mark of extreme solidarity among activist students. In conclusion, the nature of the so-called saisiot[t], e.g. intensification, is a voiceless tensed pause, and its functions are the polarization of the original meaning of the word, the sound split of polysemy and the attachment of social values by intensification.

Phonological Error Patterns: Clinical Aspects on Coronal Feature (음운 오류 패턴: 설정성 자질의 임상적 고찰)

Kim, Min-Jung;Lee, Sung-Eun
- Phonetics and Speech Sciences / v.2 no.4 / pp.239-244 / 2010
The purpose of this study is to investigate two phonological error patterns involving the coronal feature in children with functional articulation disorders and to compare them with those of general children. We tested 120 children with functional articulation disorders and 100 general children aged 2 to 4 years with the 'Assessment of Phonology & Articulation for Children (APAC)'. The results were as follows: (1) 37 disordered children substituted [+coronal] consonants for [-coronal] consonants (fronting of velars) and 9 disordered children substituted [-coronal] consonants for [+coronal] consonants (backing to velars). (2) These two phonological patterns were affected by the articulatory place of the following phoneme. (3) The fronting pattern of children with articulation disorders was similar to that of general children, but their backing pattern was different from that of general children. These results show the clinical usefulness of the coronal feature in phonological pattern analysis, the need for articulatory assessment in various phonetic contexts, and the importance of error contexts in clinical judgment.