• Title/Abstract/Keywords: Non-native Speech

Search results: 77 items

The Contribution of Prosody to the Foreign Accent of Chinese Talkers' English Speech

  • Liu, Xing;Lee, Joo-Kyeong
    • 말소리와 음성과학 / Vol. 4, No. 3 / pp. 59-73 / 2012
  • This study investigates the contribution of prosody to the foreign accent in Chinese speakers' English production by examining synthesized speech that crosses native and non-native talkers' prosody and segments. To create the stimuli for the foreign accent ratings, we transplanted gender-matched native speakers' prosody onto non-native talkers' segments and vice versa, using the TD-PSOLA algorithm. Eight native English listeners judged the foreign accent and comprehensibility of the transplanted stimuli. The synthesized stimuli were perceived as more strongly foreign-accented, regardless of speakers' proficiency, when English speakers' prosody was crossed with Chinese speakers' segments. This suggests that segments contribute more than prosody to native listeners' evaluation of foreign accent. When transplanted onto English speakers' segments, Chinese speakers' prosody differed between high and low proficiency in duration rather than pitch: a stronger foreign accent was detected when low-proficiency Chinese speakers' duration was crossed with English speakers' segments. This indicates that prosody, and more specifically duration, plays a role, although its overall contribution is smaller than that of segments. A follow-up acoustic analysis found that the temporal features that made duration more prominent than pitch were speaking rate, pause duration, and pause frequency. Finally, foreign accent and comprehensibility were not significantly correlated: native listeners had no difficulty understanding even highly foreign-accented speech.

The Effect of Visual Cues in the Identification of the English Consonants /b/ and /v/ by Native Korean Speakers (한국어 화자의 영어 양순음 /b/와 순치음 /v/ 식별에서 시각 단서의 효과)

  • 김윤현;고성룡
    • 말소리와 음성과학 / Vol. 4, No. 3 / pp. 25-30 / 2012
  • This study investigated whether native Korean listeners could use visual cues to identify the English consonants /b/ and /v/. Both auditory and audiovisual tokens of minimal word pairs were used, with the target phonemes in word-initial or word-medial position. Participants were instructed to decide which consonant they heard in a 2 × 2 design: cue (audio-only, audiovisual) and location (word-initial, word-medial). Mean identification scores were significantly higher in the audiovisual than in the audio-only condition, and in word-initial than in word-medial position. In addition, following signal detection theory, sensitivity (d') and response bias (c) were calculated from hit rates and false alarm rates. These measures showed that the higher identification rate in the audiovisual condition was associated with an increase in sensitivity. There were no significant differences in response bias across conditions. This result suggests that native Korean speakers can use visual cues when identifying confusable non-native phonemic contrasts, and that visual cues can enhance non-native speech perception.
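
The sensitivity and bias measures mentioned in this abstract can be illustrated with the standard signal detection formulas, d' = z(H) − z(F) and c = −(z(H) + z(F)) / 2, where H is the hit rate and F the false alarm rate. The sketch below is only an illustration: the function name, the count-based interface, and the log-linear correction (adding 0.5 to each cell) are assumptions, not details reported by the authors.

```python
from statistics import NormalDist


def dprime_and_bias(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) computed from raw response counts.

    Assumption: a log-linear correction (add 0.5 to each cell) is applied
    so that hit and false alarm rates never reach exactly 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    z_hit, z_fa = z(hit_rate), z(fa_rate)

    d_prime = z_hit - z_fa            # sensitivity
    criterion = -0.5 * (z_hit + z_fa)  # response bias c
    return d_prime, criterion
```

Under this formulation, a higher d' with an unchanged c corresponds to the pattern the abstract describes: better discrimination without a shift in response preference.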

Effects of base token for stimuli manipulation on the perception of Korean stops among native and non-native listeners

  • Oh, Eunjin
    • 말소리와 음성과학 / Vol. 12, No. 1 / pp. 43-50 / 2020
  • This study investigated whether listeners' perceptual patterns varied according to the base token selected for stimulus manipulation. Voice onset time (VOT) and fundamental frequency (F0) values were orthogonally manipulated, each in seven steps, using naturally produced words that contained a lenis (/kan/) and an aspirated (/kʰan/) stop in Seoul Korean. Both native and non-native groups gave significantly more aspirated responses to the stimuli constructed from /kʰan/, evidencing the use of minor cues left in the stimuli after manipulation. For the native group, the use of the VOT and F0 cues in stop categorization did not differ depending on whether the base token included the lenis or the aspirated stop. This indicates that the results of previous studies, which investigated the relative importance of acoustic cues in native listeners' perception of the Korean stop contrasts using a single base token, remain tenable. For the non-native group, the pattern of F0 cue use differed as a function of the base token selected. Some findings indicated that listeners used alternative cues to identify the stop contrast when the major cues sounded ambiguous. The non-native group's use of the manipulated VOT and F0 cues was not native-like, suggesting that non-native listeners may have perceived the minor cues as stable in the context of the manipulated cue combinations.

Differences in Vowel Duration Due to the Underlying Voicing of the Following Coda Stop in Russian and English: Native and Non-native Values

  • Oh, Eun-Jin
    • 음성과학 / Vol. 13, No. 3 / pp. 19-33 / 2006
  • This study explores whether Russian, known to have a process of syllable-final devoicing, reveals differences in vowel duration as a function of the underlying voicing of the coda stop. This paper also examines whether non-native speakers of Russian and English learn typical L2 values in vowel duration. The results indicate that vowels in Russian have a slightly longer mean duration before a voiced stop than before a voiceless stop (a mean difference of 9.52 ms), but in most cases the differences did not exhibit statistical significance. In English the mean difference was 60.05 ms, and the differences were in most cases statistically significant. All native Russian speakers of English produced larger absolute differences in vowel duration for English than for Russian, and all native English speakers of Russian produced smaller absolute differences for Russian than for English. More experienced learners seemed to achieve more native-like values of vowel duration than less experienced learners did, suggesting that learning occurs gradually as the learners gain more experience with the L2.

Speech Rhythm Metrics for Automatic Scoring of English Speech by Korean EFL Learners

  • 장태엽
    • 대한음성학회지:말소리 / No. 66 / pp. 41-59 / 2008
  • Knowledge of the linguistic rhythm of the target language plays a major role in foreign language proficiency. This study attempts to discover valid rhythm features that can be used in the automatic assessment of non-native English pronunciation. Eight previously proposed and two novel rhythm metrics are investigated with 360 tokens of read English speech obtained from 27 Korean learners and 9 native speakers. Some of the speech-rate-normalized interval measures and above-word-level metrics are found to be effective enough to be applied to automatic scoring, as they are significantly correlated with speakers' proficiency levels. It is also shown that metrics need to be selected dynamically depending on the structure of the target sentences. Results from a preliminary auto-scoring experiment using multiple regression analysis suggest that appropriate control of unexpected input utterances is also desirable for better performance.
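
The abstract does not name the rhythm metrics it tested. As an illustration of what a speech-rate-normalized interval measure looks like, the sketch below computes two widely used ones: the normalized Pairwise Variability Index (nPVI) and the variation coefficient (Varco). Treating these as representative is an assumption, as are the function names and the use of the population standard deviation for Varco.

```python
from statistics import mean, pstdev


def npvi(durations):
    """Normalized Pairwise Variability Index over successive intervals.

    Each pairwise difference is divided by the mean of the pair, which
    removes most of the effect of overall speaking rate.
    """
    if len(durations) < 2:
        raise ValueError("nPVI needs at least two intervals")
    terms = [
        abs(a - b) / ((a + b) / 2.0)
        for a, b in zip(durations, durations[1:])
    ]
    return 100.0 * sum(terms) / len(terms)


def varco(durations):
    """Standard deviation of interval durations as a percentage of the mean.

    Assumption: population standard deviation (pstdev) is used here.
    """
    return 100.0 * pstdev(durations) / mean(durations)
```

Both functions take a list of interval durations (for example, vocalic intervals in milliseconds from a hypothetical segmentation); because each is a ratio, the unit cancels out.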

A Study of Comparing Speech Act Data from Two Differing Data-gathering Instruments

  • Suh, Jae-Suk
    • 영어어문교육 / Vol. 13, No. 3 / pp. 77-97 / 2007
  • To compare data on the speech act of requesting obtained by two different methods, a study was conducted with both native and non-native speakers of English as subjects, and data were collected by means of actual e-mail writing and a DCT (discourse completion test). The analysis showed that, despite some similarities, e-mail and DCT requests differed considerably in several important aspects: amount of talk, directness level, downgraders, and supportive moves, all of which play an important role in making a request sound less imposing and more polite. It was also shown that the requests of non-native speakers differed considerably from those of native speakers in these four aspects under both data-gathering methods. Based on the findings, some suggestions are made for both further research and L2 classrooms.

AI-based language tutoring systems with end-to-end automatic speech recognition and proficiency evaluation

  • Byung Ok Kang;Hyung-Bae Jeon;Yun Kyung Lee
    • ETRI Journal / Vol. 46, No. 1 / pp. 48-58 / 2024
  • This paper presents the development of language tutoring systems for nonnative speakers by leveraging advanced end-to-end automatic speech recognition (ASR) and proficiency evaluation. Given the frequent errors in non-native speech, high-performance spontaneous speech recognition must be applied. Our systems accurately evaluate pronunciation and speaking fluency and provide feedback on errors by relying on precise transcriptions. End-to-end ASR is implemented and enhanced by using diverse non-native speaker speech data for model training. For performance enhancement, we combine semisupervised and transfer learning techniques using labeled and unlabeled speech data. Automatic proficiency evaluation is performed by a model trained to maximize the statistical correlation between the fluency score manually determined by a human expert and a calculated fluency score. We developed an English tutoring system for Korean elementary students called EBS AI Peng-Talk and a Korean tutoring system for foreigners called KSI Korean AI Tutor. Both systems were deployed by South Korean government agencies.

How Different are Learner Speech and Loanword Phonology?

  • Kim, Jong-Mi
    • 말소리와 음성과학 / Vol. 1, No. 3 / pp. 3-18 / 2009
  • Do loanword properties emerge in the acquisition of a foreign language and, if so, how? Classic studies in adult language learning assumed loanword properties that range from near-ceiling to near-chance levels of appearance depending on speech proficiency. The present research argues that such variations reflect different phonological types rather than speech proficiency. To investigate the difference between learner speech and loanword phonology, the current research analyzes speech data from 92 Korean speakers at five different proficiency levels, who read 19 pairs of English words and sentences that contained loanwords. The experimental method is primarily acoustic: it tests whether the phonological processes found in loanwords (e.g., the insertion of an epenthetic vowel at the end of the word stamp) also appear in learner speech, in comparison with native speech from 11 English speakers and 11 Korean speakers. The data investigated cover segment deletion, insertion, substitution, and alternation in both learner speech and native speech. The results indicate that learner speech in many cases does not show the loanword properties, and that whether it does depends on the type of phonological process. Target pronunciation is acquired relatively easily in cases of segment deletion, insertion, substitution, and alternation, except when the loanword property involves the successful command of the target phonology, such as the de-aspiration of [p] in apple. Such cases of difficult learning contrast sharply with the cases of easy learning in the development of learner speech, particularly beyond the intermediate level of proficiency. Overall, learner speech departs from loanword phonology and develops toward native speech values, depending on the phonological contrasts in the native and foreign languages.

The identification of Korean vowels /o/ and /u/ by native English speakers

  • Oh, Eunhae
    • 말소리와 음성과학 / Vol. 8, No. 1 / pp. 19-24 / 2016
  • The Korean high back vowels /o/ and /u/ have been reported to be in a state of near-merger, especially among young female speakers. Along with cross-generational changes, the position of the vowel within a word has been reported to yield different phonetic realizations. The current study examines native English speakers' ability to attend to the phonetic cues that distinguish the two merging vowels, and the effect of position (word-initial vs. word-final) on identification accuracy. Twenty-eight two-syllable words containing /o/ or /u/ in either initial or final position were produced by native female Korean speakers. The CV part of each target word was excised and presented to six native English speakers. The results showed that although identification accuracy was lowest for /o/ in word-final position (41%), it rose to 80% in word-initial position. Acoustic analyses of the target vowels showed that /o/ and /u/ were differentiated on the height dimension only in word-initial position, suggesting that English speakers may have perceived the distinctive F1 difference retained in the prominent position.

Automatic Pronunciation Diagnosis System of Korean Students' English Using Purification Algorithm (정제 알고리즘을 이용한 한국인 화자의 영어 발화 자동 진단 시스템)

  • 양일호;김민석;유하진;한혜승;이주경
    • 말소리와 음성과학 / Vol. 2, No. 2 / pp. 69-75 / 2010
  • We propose an automatic pronunciation diagnosis system that evaluates the pronunciation of a foreign language without requiring the text of the utterance. We recorded English utterances spoken by native and Korean speakers, and the utterances spoken by Koreans were evaluated by native speakers on three criteria: fluency, accuracy of phones, and intonation. The system evaluates the utterances of test Korean speakers based on the difference in log-likelihood between two models: one trained on English speech uttered by native speakers, and the other trained on English speech uttered by Korean speakers. We also applied a purification algorithm to increase class differentiability. The purification detects and eliminates non-speech frames, such as short pauses and occlusive silences, that do not help to discriminate between utterances. As a result, the proposed system correlates more highly with the human scores than the baseline system does.
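
The scoring idea in this abstract, a log-likelihood difference between a native-speaker model and a Korean-speaker model computed after non-speech frames are discarded, can be sketched as follows. This is not the authors' system: the abstract does not specify the model type, so single diagonal Gaussians stand in for both models, and the `keep` mask is a hypothetical placeholder for the output of the purification step.

```python
import math


def gaussian_loglik(frame, mean, var):
    """Log-likelihood of one feature frame under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        for x, m, v in zip(frame, mean, var)
    )


def llr_score(frames, native_model, learner_model, keep=None):
    """Mean per-frame log-likelihood difference (native minus learner).

    frames:        list of feature vectors (lists of floats)
    native_model:  (mean, var) pair, a stand-in for the native-speaker model
    learner_model: (mean, var) pair, a stand-in for the Korean-speaker model
    keep:          optional list of booleans; False marks a frame that a
                   purification step (hypothetical here) judged non-speech
    """
    if keep is None:
        keep = [True] * len(frames)
    kept = [f for f, k in zip(frames, keep) if k]
    if not kept:
        raise ValueError("no frames left after filtering")
    diffs = [
        gaussian_loglik(f, *native_model) - gaussian_loglik(f, *learner_model)
        for f in kept
    ]
    return sum(diffs) / len(diffs)
```

A positive score means the retained frames are better explained by the native-speaker model; dropping frames that both models explain equally poorly is one way such filtering could sharpen the difference between classes.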
