• Title/Abstract/Keywords: word duration

Search results: 145 items (processing time: 0.019 s)

DMS 모델을 이용한 음성인식에 관한 연구 (A Study on Speech Recognition using DMS Model)

  • 안태옥;변용규
    • The Journal of the Acoustical Society of Korea / Vol. 13, No. 2E / pp.41-50 / 1994
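  • This study proposes the DMS (Dynamic Multi-Section) model, which is based on information about similar characteristics within word patterns. The model divides each word into a time series of several sections and registers, for every section, its duration information and a feature vector representing that section. The procedure for building a model from word patterns repeats the matching between the word patterns and the model, reflecting the representative feature vectors and the duration information according to distance, so that the accumulated matching distance is minimized. In addition to the speech recognition experiments with the proposed model, recognition experiments using the DP, HMM, and MSVQ methods were carried out for comparison under the same conditions and with the same data. Also, for speech recognition using the proposed DMS model, recognition by the DMS/DP method and the recognition rate by DMS/VQ is 89.3%. Furthermore, the recognition rate of DMS/DP using the DMS model is 95.8%, and that of DMS/VQ is 96.8%. Therefore, compared with the widely used DP, HMM, and MSVQ methods, recognition by the DMS/VQ method using the DMS model achieves a higher recognition rate while also reducing memory requirements and computation, which demonstrates the usefulness of the DMS model proposed in this study.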
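
The model-building loop described in this abstract (split a word into sections, store a duration and a representative vector per section, re-match until the accumulated distance stops improving) can be illustrated with a minimal sketch. It is not the authors' implementation: the names `build_model` and `segment`, the squared Euclidean distance, the uniform initial split, and training from a single utterance are all assumptions made for illustration, and the duration term of the matching distance is left out.

```python
from typing import List, Tuple

Vector = List[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two feature vectors (assumed metric)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> Vector:
    """Component-wise mean, used as the section's representative vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def segment(frames: List[Vector], centroids: List[Vector]) -> List[int]:
    """Dynamic programming: assign frames to contiguous, non-empty sections
    (in order) so that the accumulated frame-to-centroid distance is minimal."""
    T, K = len(frames), len(centroids)
    INF = float("inf")
    cost = [[INF] * T for _ in range(K)]   # cost[k][t]: best cost with frame t in section k
    stay = [[True] * T for _ in range(K)]  # True if frame t-1 was in the same section
    cost[0][0] = sq_dist(frames[0], centroids[0])
    for t in range(1, T):
        for k in range(min(K, t + 1)):
            d = sq_dist(frames[t], centroids[k])
            from_same = cost[k][t - 1]
            from_prev = cost[k - 1][t - 1] if k > 0 else INF
            if from_prev < from_same:
                cost[k][t] = from_prev + d
                stay[k][t] = False
            else:
                cost[k][t] = from_same + d
    # Backtrack from the last frame in the last section.
    labels = [0] * T
    k = K - 1
    for t in range(T - 1, -1, -1):
        labels[t] = k
        if t > 0 and not stay[k][t]:
            k -= 1
    return labels


def build_model(frames: List[Vector], n_sections: int,
                max_iters: int = 20) -> List[Tuple[int, Vector]]:
    """Return one (duration_in_frames, representative_vector) pair per section."""
    T = len(frames)
    if T < n_sections:
        raise ValueError("need at least one frame per section")
    labels = [t * n_sections // T for t in range(T)]  # uniform initial split
    for _ in range(max_iters):
        centroids = [mean([f for f, s in zip(frames, labels) if s == k])
                     for k in range(n_sections)]
        new_labels = segment(frames, centroids)
        if new_labels == labels:  # segmentation no longer changes
            break
        labels = new_labels
    return [(labels.count(k),
             mean([f for f, s in zip(frames, labels) if s == k]))
            for k in range(n_sections)]


# Toy one-dimensional "utterance" with three visibly different stretches.
toy = [[0.0], [0.1], [0.0], [5.0], [5.1], [9.0]]
model = build_model(toy, n_sections=3)
print([duration for duration, _ in model])
```

For recognition, an input word would be matched against each stored word model in the same way, and the model with the smallest accumulated distance would be selected; how the duration information enters that distance is not specified in the abstract.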


Functional Reorganization Associated with Semantic Language Processing in Temporal Lobe Epilepsy Patients after Anterior Temporal Lobectomy: A Longitudinal Functional Magnetic Resonance Image Study

  • Kim, Jae-Hun;Lee, Jong-Min;Kang, Eun-Joo;Kim, June-Sic;Song, In-Chan;Chung, Chun-Kee
    • Journal of Korean Neurosurgical Society / Vol. 47, No. 1 / pp.17-25 / 2010
  • Objective: The focus of this study is brain plasticity associated with semantic aspects of language function in patients with medial temporal lobe epilepsy (mTLE). Methods: Using longitudinal functional magnetic resonance imaging (fMRI), patterns of brain activation were observed in twelve left and seven right unilateral mTLE patients during a word-generation task relative to a pseudo-word reading task before and after anterior temporal section surgery. Results: No differences were observed in precentral activations in patients relative to normal controls (n = 12), and surgery did not alter the phonological-associated activations. The two mTLE patient groups showed left inferior prefrontal activations associated with semantic processing (word-generation > pseudo-word reading), as did control subjects. The amount of semantic-associated activation in the left inferior prefrontal region was negatively correlated with epilepsy duration in both patient groups. Following temporal resection, semantic-specific activations in the inferior prefrontal region became more bilateral in left mTLE patients, but more left-lateralized in right mTLE patients. The longer the duration of epilepsy in the patients, the larger the increase in the left inferior prefrontal semantic-associated activation after surgery in both patient groups. Semantic activation of the intact hippocampus, which had been negatively correlated with seizure frequency, normalized after the epileptic side was removed. Conclusion: These results indicate alteration of the semantic language network related to recruitment of the left inferior prefrontal cortex and functional recovery of the hippocampus contralateral to the epileptogenic side, suggesting an intra- and inter-hemispheric reorganization following surgery.

Phonetic Aspects of English Stress Produced by South Kyungsang Korean Speakers

  • Yi, Do-Kyong
    • 음성과학 / Vol. 13, No. 1 / pp.55-66 / 2006
  • A purpose of this study is to investigate the acoustic characteristics of English stress produced by the two groups of South Kyungsang (henceforth, SK) Korean speakers: high-proficiency and low-proficiency with reference to English native speakers. Another purpose is to compare results from the high- and low-proficiency SK Korean subjects with those of the native speakers, and to provide an analytical account of how approximate the high-proficiency SK Korean subjects' production is to the native speakers' and how different the low-proficiency SK Korean subjects' is from the native speakers'. Results indicated that the native speakers' main strategy used in producing stressed syllables was duration while the high-proficiency SK Korean subjects' was predominantly pitch-oriented. The low-proficiency SK Korean subjects' pitch patterns showed regularity, emphasizing the penultimate syllable with pitch. In comparing duration among the three groups, both groups of the SK Korean subjects became more even in their duration values for each syllable as the structure of the word or the sentence became more complex.


대용량 운율 음성데이타를 이용한 자동합성방식 (Automatic Synthesis Method Using Prosody-Rich Database)

  • 김상훈
    • Proceedings of the Acoustical Society of Korea Conference (한국음향학회:학술대회논문집) / 15th Workshop on Speech Communication and Signal Processing, 1998 (KSCSP 98, Vol. 15, No. 1) / pp.87-92 / 1998
  • In general, synthesis unit databases have been constructed by recording isolated words. In that case, each word boundary carries a typical prosodic pattern such as falling intonation or preboundary lengthening. To get natural synthetic speech from this kind of database, the original speech must be artificially distorted. However, that artificial process instead resulted in unnatural, unintelligible synthetic speech because of excessive prosodic modification of the speech signal. To overcome these problems, we gathered thousands of sentences for the synthesis database. To make phone-level synthesis units, we trained a speech recognizer with the recorded speech and then segmented phone boundaries automatically. In addition, we used a laryngograph for epoch detection. From the automatically generated synthesis database, we chose the best phone and directly concatenated it without any prosody processing. To select the best phone among multiple phone candidates, we used prosodic information such as break strength of word boundaries, phonetic contexts, cepstrum, pitch, energy, and phone duration. From the pilot test, we obtained some positive results.


음향 측정과 지각 판단에 의한 한국인 영어의 운율 연구 (A Study Using Acoustic Measurement and Perceptual Judgment to identify Prosodic Characteristics of English as Spoken by Koreans)

  • 구희산
    • 음성과학 / Vol. 2 / pp.95-108 / 1997
  • The purpose of this experimental study was to investigate prosodic characteristics of English as spoken by Koreans. Test materials were four English words, a sentence, and a paragraph. Six female Korean speakers and five native English speakers participated in acoustic and perceptual experiments. Pitch and duration of word syllables were measured from signals and spectrograms made by the Signalize 3.04 software program for Power Mac 7200. In the perceptual experiment, accent position, intonation patterns, rhythm patterns and phrasing were evaluated by the five native English speakers. Preliminary results from this limited study show that prosodic characteristics of Koreans include (1) pitch on the first part of a word and sentence is lower than that of English speakers, but the pitch on the last part is the opposite; (2) word prosody is quite similar to that of an English speaker, but sentence prosody is quite different; (3) the weakest point of sentence prosody spoken by Koreans is in the rhythmic pattern.


영어 폐쇄자음 발음 뒤에 나타나는 모음추가 현상 (Extra Vowel Addition Produced in Korean Students' English Pronunciation of Word-final Stop Consonants)

  • 황영순
    • 음성과학 / Vol. 7, No. 4 / pp.169-186 / 2000
  • This paper aims to confirm the mispronunciations made by native Korean students that stem from differences between the phonetic and phonological systems of English and Korean, and to identify remaining work through experiment. Many Korean students tend to differentiate word-final stop consonants not by vowel duration or by allophones but by the phoneme of the consonant itself. In English, stop sounds vary among aspirated, unaspirated, and unreleased realizations depending on context. In Korean, however, these are not allophones of a phoneme but distinct phonemes. Therefore, many Korean students are apt to add an extra vowel sound /i/ after the final stop consonant in the CVC form, owing both to their failure to perceive the difference between phonemes and allophones of stop consonants and to the influence of the Korean sound-sequence relationship. Since the replacement of allophones and extra vowel addition do not change the meaning, their importance has largely been overlooked. Nevertheless, this kind of study is essential for the precise learning and use of the English language.


An Acoustic Study of English Sentence Stress and Rhythm Produced by Korean Speakers

  • Kim, Ok-Young
    • 음성과학 / Vol. 14, No. 1 / pp.121-135 / 2007
  • The purpose of this paper is to examine how Korean speakers realize English stress and rhythm at the sentence level, and investigate what different acoustic characteristics of English sentence stress and rhythm Korean speakers have, compared with those of American English speakers. Stressed words in the sentence were analyzed in terms of duration, fundamental frequency, and intensity of the stressed vowel in the word with neutral stress and with emphatic stress, respectively. According to the results, when the words had emphatic stress, both Koreans' and Americans' F0 and intensity of the stressed vowel were higher than those with neutral stress. Korean speakers of English realized the sentence stress with shorter vowel duration and higher F0 than American English speakers when the words had emphatic stress. The analysis of the timing of the sentence with increased unstressed syllables showed that both Americans and Koreans produced the sentence with longer duration as the number of unstressed syllables increased. However, the duration of unstressed syllables between stressed syllables by Koreans was longer than that by Americans. Americans seemed to produce unstressed syllables between stressed syllables faster than Koreans for regular intervals of stressed syllables. This analysis implies that if there are more unstressed syllables between stressed syllables, Koreans might produce unstressed syllables and the whole sentence with longer duration.


ACOUSTIC FEATURES DIFFERENTIATING KOREAN MEDIAL LAX AND TENSE STOPS

  • Shin, Ji-Hye
    • 대한음성학회 conference proceedings (대한음성학회:학술대회논문집) / 대한음성학회 October 1996 conference / pp.53-69 / 1996
  • Much research has been done on the cues differentiating the three Korean stops in word-initial position. This paper focuses on a more neglected area: the acoustic cues differentiating the medial tense and lax unaspirated stops. Eight adult Korean native speakers, four males and four females, pronounced sixteen minimal pairs containing the two series of medial stops with different preceding vowel qualities. The average duration of vowels before lax stops is 31 msec longer than before their tense counterparts (70 msec for lax vs. 39 msec for tense). In addition, the average duration of the stop closure of tense stops is 135 msec longer than that of lax stops (69 msec for lax vs. 204 msec for tense). These durational differences are so large that they may be phonologically determined, not phonetically. Moreover, vowel duration varies with the speaker's sex. Female speakers have 5 msec shorter vowel duration before both stops. The quality of voicing, tense or lax, is also a cue to these two stop types, as it is in initial position, but the relative duration of the stops appears to be a much more important cue. The duration of the stop changes stop perception, while that of the preceding vowel does not. The consequences of these results for the phonological description of Korean, as well as for the synthesis and automatic recognition of Korean, will be discussed.


The interlanguage Speech Intelligibility Benefit for Korean Learners of English: Production of English Front Vowels

  • Han, Jeong-Im;Choi, Tae-Hwan;Lim, In-Jae;Lee, Joo-Kyeong
    • 말소리와 음성과학 / Vol. 3, No. 2 / pp.53-61 / 2011
  • The present work is a follow-up study to that of Han, Choi, Lim and Lee (2011), where an asymmetry in the source segments eliciting the interlanguage speech intelligibility benefit (ISIB) was found such that the vowels which did not match any vowel of the Korean language were likely to elicit more ISIB than matched vowels. In order to identify the source of the stronger ISIB in non-matched vowels, acoustic analyses of the stimuli were performed. Two pairs of English front vowels, [i] vs. [ɪ] and [ɛ] vs. [æ], were recorded by English native talkers and two groups of Korean learners according to their English proficiency, and then their vowel duration and the frequencies of the first two formants (F1, F2) were measured. The results demonstrated that the non-matched vowels such as [ɪ] and [æ] produced by Korean talkers seemed to show more deviated acoustic characteristics from those of the natives, with longer duration and with closer formant values to the matched vowels, [i] and [ɛ], than those of the English natives. Combining the results of acoustic measurements in the present study and those of word identification in Han et al. (2011), we suggest that relatively better performance in word identification by Korean talkers/listeners than the native English talkers/listeners is associated with the shared interlanguage of Korean talkers and listeners.


KOREAN CONSONANT RECOGNITION USING A MODIFIED LVQ2 METHOD

  • Makino, Shozo;Okimoto, Yoshiyuki;Kido, Ken'iti;Kim, Hoi-Rin;Lee, Yong-Ju
    • Proceedings of the Acoustical Society of Korea Conference (한국음향학회:학술대회논문집) / Fifth Western Pacific Regional Acoustics Conference, Seoul, Korea, 1994 / pp.1033-1038 / 1994
  • This paper describes recognition results using the modified Learning Vector Quantization (MLVQ2) method which we proposed previously. First, we investigated the duration of 29 Korean consonants and found that the variances of the duration were extremely large compared to other languages. We carried out preliminary recognition experiments for three stop consonants P, T and K. From the recognition results, we defined the optimum conditions for the learning. Then we applied the MLVQ2 method to the recognition of Korean consonants. The training was carried out using the phoneme samples in the 611-word vocabulary uttered by 2 male speakers, where each of the speakers uttered two repetitions. The recognition experiment was carried out for the phoneme samples in two repetitions of the 611-word vocabulary uttered by another male speaker. The recognition scores for the twelve plosives were 68.2% for the test samples. The recognition scores for the 29 Korean consonants were 64.8% for the test samples.
