• Title/Summary/Keyword: Non-native Speech

Annotation of a Non-native English Speech Database by Korean Speakers

  • Kim, Jong-Mi
    • Speech Sciences / v.9 no.1 / pp.111-135 / 2002
  • An annotation model of a non-native speech database has been devised, wherein English is the target language and Korean is the native language. The proposed annotation model features overt transcription of predictable linguistic information in native speech by the dictionary entry and several predefined types of error specification found in native language transfer. The proposed model is, in that sense, different from other previously explored annotation models in the literature, most of which are based on native speech. The validity of the newly proposed model is shown by its consistent annotation of 1) salient linguistic features of English, 2) contrastive linguistic features of English and Korean, 3) actual errors reported in the literature, and 4) the newly collected data in this study. The annotation method in this model adopts two widely accepted conventions, the Speech Assessment Methods Phonetic Alphabet (SAMPA) and the TOnes and Break Indices (ToBI). In the proposed annotation model, SAMPA is employed exclusively for segmental transcription and ToBI for prosodic transcription. The annotation of non-native speech is used to assess the speaking ability of English as a Foreign Language (EFL) learners.

Correlation analysis of linguistic factors in non-native Korean speech and proficiency evaluation (비원어민 한국어 말하기 숙련도 평가와 평가항목의 상관관계)

  • Yang, Seung Hee;Chung, Minhwa
    • Phonetics and Speech Sciences / v.9 no.3 / pp.49-56 / 2017
  • Much research attention has been directed to identifying how native speakers perceive non-native speakers' oral proficiency. To investigate the generalizability of previous findings, this study examined segmental, phonological, accentual, and temporal correlates of native speakers' evaluation of the proficiency of L2 Korean speech produced by learners of various levels and nationalities. Our experimental results show that proficiency ratings by native speakers correlate significantly not only with rate of speech, but also with segmental accuracy. Segmental errors have the highest correlation with the proficiency of L2 Korean speech. We further verified this finding across substitution, deletion, and insertion error rates. Although phonological accuracy was expected to be highly correlated with the proficiency score, it was the least influential measure. Another new finding in this study is that the role of pitch and accent has so far been underemphasized in studies of non-native Korean speech perception. This work will serve as the groundwork for the development of an automatic assessment module in a Korean CAPT system.
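
As a rough illustration of the kind of correlation analysis this abstract describes, the sketch below computes a correlation coefficient between proficiency ratings and a segmental error rate. The choice of Pearson's r, the function name `pearson_r`, and all data values are assumptions made for illustration; the abstract does not state which coefficient or data the authors used.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data: one native-rater score and one segmental error rate per speaker.
ratings = [1.0, 2.0, 3.0, 4.0, 5.0]
segmental_error_rate = [0.40, 0.33, 0.25, 0.18, 0.10]

r = pearson_r(ratings, segmental_error_rate)
print(f"r = {r:.3f}")  # strongly negative for this made-up data
```

With data shaped like this, a strongly negative r would mean that higher-rated speakers make fewer segmental errors, which is the direction of the relationship the abstract reports.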

Acoustic analysis of English lexical stress produced by Korean, Japanese and Taiwanese-Chinese speakers

  • Jung, Ye-Jee;Rhee, Seok-Chae
    • Phonetics and Speech Sciences / v.10 no.1 / pp.15-22 / 2018
  • Stressed vowels in English are usually produced using longer duration, higher pitch, and greater intensity than unstressed vowels. However, many English as a foreign language (EFL) learners have difficulty producing English lexical stress because their mother tongues do not have such features. In order to investigate if certain non-native English speakers (Korean, Japanese, and Taiwanese-Chinese native speakers) are able to produce English lexical stress in a native-like manner, speech samples were extracted from the L2 learners' corpus known as AESOP (the Asian English Speech cOrpus Project). Sixteen disyllabic words were analyzed in terms of the ratio of duration, pitch, and intensity. The results demonstrate that non-native English speakers are able to produce English stress in a similar way to native English speakers, and all speakers (both native and non-native) show a tendency to use duration as the strongest cue in producing stress. The results also show that the duration ratio of native English speakers was significantly higher than that of non-native speakers, indicating that native speakers produce a bigger difference in duration between stressed and unstressed vowels.
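
The stressed-to-unstressed ratios mentioned in this abstract can be sketched as below. The class `Vowel`, the function `stress_ratios`, the units, and the measurement values are hypothetical; the abstract does not specify how the authors extracted or normalized the measurements.

```python
from dataclasses import dataclass

@dataclass
class Vowel:
    duration_ms: float   # vowel duration in milliseconds
    pitch_hz: float      # mean fundamental frequency in Hz
    intensity_db: float  # mean intensity in dB

def stress_ratios(stressed: Vowel, unstressed: Vowel) -> dict:
    """Ratio of each acoustic cue in the stressed vowel to the unstressed one.

    A ratio above 1.0 means the cue is larger in the stressed vowel.
    """
    return {
        "duration": stressed.duration_ms / unstressed.duration_ms,
        "pitch": stressed.pitch_hz / unstressed.pitch_hz,
        "intensity": stressed.intensity_db / unstressed.intensity_db,
    }

# Made-up measurements for one disyllabic word.
ratios = stress_ratios(
    Vowel(duration_ms=120.0, pitch_hz=200.0, intensity_db=72.0),
    Vowel(duration_ms=80.0, pitch_hz=160.0, intensity_db=60.0),
)
for cue, value in ratios.items():
    print(f"{cue}: {value:.2f}")
```

Comparing such ratios across speaker groups is one way to express the finding that native speakers show a larger duration difference between stressed and unstressed vowels.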

How Korean Learner's English Proficiency Level Affects English Speech Production Variations

  • Hong, Hye-Jin;Kim, Sun-Hee;Chung, Min-Hwa
    • Phonetics and Speech Sciences / v.3 no.3 / pp.115-121 / 2011
  • This paper examines how L2 speech production varies according to learners' L2 proficiency level. L2 speech production variations are analyzed by quantitative measures at word and phone levels using a corpus of English spoken by Korean learners. Word-level variations are analyzed using correctness to explain how speech realizations differ from the canonical forms, while accuracy is used for analysis at phone level to reflect phone insertions and deletions together with substitutions. The results show that the speech production of learners with different L2 proficiency levels differs considerably in terms of performance and individual realizations at word and phone levels. These results confirm that the speech production of non-native speakers varies according to their L2 proficiency levels, even though they share the same L1 background. Furthermore, they will contribute to improving the non-native speech recognition performance of ASR-based English language education systems for Korean learners of English.
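
The abstract does not give formulas for correctness and accuracy. A minimal sketch, assuming the definitions commonly used in speech recognition scoring (correctness ignores insertions, accuracy penalizes them), might look like this. The function names and counts are illustrative only.

```python
def correctness(n_ref: int, subs: int, dels: int) -> float:
    """Percent of reference units realized correctly; insertions are ignored."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * (n_ref - subs - dels) / n_ref

def accuracy(n_ref: int, subs: int, dels: int, ins: int) -> float:
    """Like correctness, but insertions also count as errors."""
    if n_ref <= 0:
        raise ValueError("n_ref must be positive")
    return 100.0 * (n_ref - subs - dels - ins) / n_ref

# Made-up counts: 10 canonical phones, 1 substitution, 1 deletion, 2 insertions.
print(correctness(10, subs=1, dels=1))      # 80.0
print(accuracy(10, subs=1, dels=1, ins=2))  # 60.0
```

Under these assumed definitions, accuracy can never exceed correctness, and it can become negative when a learner inserts many extra phones.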

The Interlanguage Speech Intelligibility Benefit for Listeners (ISIB-L): The Case of English Liquids

  • Lee, Joo-Kyeong;Xue, Xiaojiao
    • Phonetics and Speech Sciences / v.3 no.1 / pp.51-65 / 2011
  • This study attempts to investigate the interlanguage speech intelligibility benefit for listeners (ISIB-L), examining Chinese talkers' production of English liquids and its perception by native listeners and by non-native Chinese and Korean listeners. An Accent Judgment Task was conducted to measure non-native talkers' and listeners' phonological proficiency, and two levels of proficiency groups (high and low) participated in the experiment. The English liquids /l/ and /r/ produced by Chinese talkers were considered in terms of positions (syllable initial and final), contexts (segment, word and sentence) and lexical density (minimal vs. non-minimal pair) to see if these factors play a role in ISIB-L. Results showed that both matched and mismatched interlanguage speech intelligibility benefits for listeners occurred except for the initial /l/. Non-native Chinese and Korean listeners, though only those with high proficiency, were more accurate at identifying initial /r/, final /l/ and final /r/, but initial /l/ was significantly more intelligible to native listeners than to non-native listeners. There was evidence of contextual and lexical density effects on ISIB-L. No ISIB-L was demonstrated in sentence context, but both matched and mismatched ISIB-L were observed in word context; this finding held true only for high proficiency listeners. Listeners recognized the targets better in the non-minimal pair (sparse density) environment than in the minimal pair (higher density) environment. These findings suggest that ISIB-L for English liquids is influenced by talkers' and listeners' proficiency, syllable position in association with L1 and L2 phonological structure, context, and word neighborhood density.

Optimizing Multiple Pronunciation Dictionary Based on a Confusability Measure for Non-native Speech Recognition (타언어권 화자 음성 인식을 위한 혼잡도에 기반한 다중발음사전의 최적화 기법)

  • Kim, Min-A;Oh, Yoo-Rhee;Kim, Hong-Kook;Lee, Yeon-Woo;Cho, Sung-Eui;Lee, Seong-Ro
    • MALSORI / no.65 / pp.93-103 / 2008
  • In this paper, we propose a method for optimizing a multiple pronunciation dictionary used for modeling pronunciation variations of non-native speech. The proposed method removes some confusable pronunciation variants in the dictionary, resulting in a reduced dictionary size and less decoding time for automatic speech recognition (ASR). To this end, a confusability measure is first defined based on the Levenshtein distance between two different pronunciation variants. Then, the number of phonemes for each pronunciation variant is incorporated into the confusability measure to compensate for ASR errors due to words of a shorter length. We investigate the effect of the proposed method on ASR performance, where Korean is selected as the target language and Korean utterances spoken by Chinese native speakers are considered as non-native speech. It is shown from the experiments that an ASR system using the multiple pronunciation dictionary optimized by the proposed method can provide a relative average word error rate reduction of 6.25%, with 11.67% less ASR decoding time, as compared with that using a multiple pronunciation dictionary without the optimization.
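
To make the idea of a Levenshtein-based confusability measure concrete, the sketch below computes the edit distance between two pronunciation variants given as phoneme lists. The abstract does not say how phoneme count enters the measure, so dividing by the longer variant's length in `normalized_distance` is purely an assumption, as are the function names and the example phoneme strings.

```python
def levenshtein(a, b):
    """Edit distance between two phoneme sequences (lists of phoneme labels)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, pa in enumerate(a, start=1):
        curr = [i]
        for j, pb in enumerate(b, start=1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete pa
                            curr[j - 1] + 1,      # insert pb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def normalized_distance(a, b):
    """Assumed length compensation: edit distance divided by the longer length.

    Values near 0.0 indicate highly confusable variants.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest

# Hypothetical pronunciation variants that differ in one phoneme.
variant_1 = ["k", "a", "t"]
variant_2 = ["k", "a", "d"]
print(levenshtein(variant_1, variant_2))          # 1
print(normalized_distance(variant_1, variant_2))
```

Without some form of length compensation, a single-phoneme difference would look the same for a three-phoneme word and a ten-phoneme word, which is presumably why the abstract mentions the number of phonemes per variant.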

Automatic proficiency assessment of Korean speech read aloud by non-natives using bidirectional LSTM-based speech recognition

  • Oh, Yoo Rhee;Park, Kiyoung;Jeon, Hyung-Bae;Park, Jeon Gue
    • ETRI Journal / v.42 no.5 / pp.761-772 / 2020
  • This paper presents an automatic proficiency assessment method for a non-native Korean read utterance using bidirectional long short-term memory (BLSTM)-based acoustic models (AMs) and speech data augmentation techniques. Specifically, the proposed method considers two scenarios, with and without prompted text. The proposed method with the prompted text performs (a) a speech feature extraction step, (b) a forced-alignment step using a native AM and non-native AM, and (c) a linear regression-based proficiency scoring step for the five proficiency scores. Meanwhile, the proposed method without the prompted text additionally performs Korean speech recognition and a subword un-segmentation for the missing text. The experimental results indicate that the proposed method with prompted text improves the performance for all scores when compared to a method employing conventional AMs. In addition, the proposed method without the prompted text has a fluency score performance comparable to that of the method with prompted text.

Phonological Process and Word Recognition in Continuous Speech: Evidence from Coda-neutralization (음운 현상과 연속 발화에서의 단어 인지 - 종성중화 작용을 중심으로)

  • Kim, Sun-Mi;Nam, Ki-Chun
    • Phonetics and Speech Sciences / v.2 no.2 / pp.17-25 / 2010
  • This study explores whether Koreans exploit their native coda-neutralization process when recognizing words in Korean continuous speech. According to the phonological rules of Korean, the coda-neutralization process must come before the liaison process whenever liaison occurs between 'words', which results in liaison consonants being coda-neutralized ones such as /b/, /d/, or /g/, rather than non-neutralized ones like /p/, /t/, /k/, /ʧ/, /ʤ/, or /s/. Consequently, if Korean listeners use their native coda-neutralization rules when processing speech input, word recognition will be hampered when non-neutralized consonants precede vowel-initial targets. Word-spotting and word-monitoring tasks were conducted in Experiments 1 and 2, respectively. In both experiments, listeners recognized words faster and more accurately when vowel-initial target words were preceded by coda-neutralized consonants than when preceded by non-neutralized ones. The results show that Korean listeners exploit the coda-neutralization process when processing their native spoken language.

Effects of Prosodic Strengthening on the Production of English High Front Vowels /i, ɪ/ by Native vs. Non-Native Speakers (원어민과 비원어민의 영어 전설 고모음 /i, ɪ/ 발화에 나타나는 운율 강화 현상)

  • Kim, Sahyang;Hur, Yuna;Cho, Taehong
    • Phonetics and Speech Sciences / v.5 no.4 / pp.129-136 / 2013
  • This study investigated how acoustic characteristics (i.e., duration, F1, F2) of English high front vowels /i, ɪ/ are modulated by boundary- and prominence-induced strengthening in native vs. non-native (Korean) speech production. The study also examined how the durational difference in vowels due to the voicing of a following consonant (i.e., voiced vs. voiceless) is modified by prosodic strengthening in two different (native vs. non-native) speaker groups. Five native speakers of Canadian English and eight Korean learners of English (intermediate-advanced level) produced 8 minimal pairs with the CVC sequence (e.g., 'beat'-'bit') in varying prosodic contexts. Native speakers distinguished the two vowels in terms of duration, F1, and F2, whereas non-native speakers only showed durational differences. The two groups were similar in that they maximally distinguished the two vowels when the vowels were accented (F2, duration), while neither group showed boundary-induced strengthening in any of the three measurements. The durational differences due to the voicing of the following consonant were also maximized when accented. The results are discussed further in terms of phonetics-prosody interface in L2 production.

The Effects of Korean Coda-neutralization Process on Word Recognition in English (한국어의 종성중화 작용이 영어 단어 인지에 미치는 영향)

  • Kim, Sun-Mi;Nam, Ki-Chun
    • Phonetics and Speech Sciences / v.2 no.1 / pp.59-68 / 2010
  • This study addresses the issue of whether non-proficient Korean (L1)-English (L2) bilinguals are affected by the native coda-neutralization process when recognizing words in English continuous speech. Korean phonological rules require that if liaison occurs between 'words', then the coda-neutralization process must come before the liaison process, which results in liaison consonants being coda-neutralized ones such as /b/, /d/, or /g/, rather than non-neutralized ones like /p/, /t/, /k/, /ʧ/, /ʤ/, or /s/. Consequently, if Korean listeners apply their native coda-neutralization rules to English speech input, word detection will be easier when coda-neutralized consonants precede target words than when non-neutralized ones do. Word-spotting and word-monitoring tasks were used in Experiments 1 and 2, respectively. In both experiments, listeners detected words faster and more accurately when vowel-initial target words were preceded by coda-neutralized consonants than when preceded by non-neutralized ones. The results show that Korean listeners exploit their native phonological process when processing English, irrespective of whether the native process is appropriate or not.
