Multicriteria-Based Computer-Aided Pronunciation Quality Evaluation of Sentences

Yoma, Nestor Becerra; Berrios, Leopoldo Benavides; Sepulveda, Jorge Wuth; Torres, Hiram Vivanco

doi:10.4218/etrij.13.0112.0016

ETRI Journal

Volume 35, Issue 1, Pages 89-99, 2013, pISSN 1225-6463, eISSN 2233-7326

Electronics and Telecommunications Research Institute (한국전자통신연구원)

Yoma, Nestor Becerra (Department of Electrical Engineering, Universidad de Chile)
Berrios, Leopoldo Benavides (Department of Electrical Engineering, Universidad de Chile)
Sepulveda, Jorge Wuth (Department of Electrical Engineering, Universidad de Chile)
Torres, Hiram Vivanco (Department of Linguistics, Universidad de Chile)

Received : 2012.01.07
Accepted : 2012.07.16
Published : 2013.02.01

https://doi.org/10.4218/etrij.13.0112.0016

Abstract

This paper defines the sentence-based pronunciation evaluation task in terms of subjective criteria. Three subjective criteria (the minimum subjective word score, the mean subjective word score, and the first impression) are proposed and modeled as combinations of word-based assessments. These subjective criteria are then approximated with objective sentence pronunciation scores obtained by combining word-based metrics. No a priori study of common mistakes is required: class-based language models incorporate both correct and incorrect pronunciations, and the incorrect pronunciations are generated automatically from a competitive lexicon and the phonetic rules of the students' mother and target languages. The procedure is applicable to any second-language learning context, and subjective-objective sentence score correlations greater than or equal to 0.5 can be achieved when the proposed sentence-based pronunciation criteria are approximated with combinations of word-based scores. Finally, the subjective-objective sentence score correlations reported here are comparable with those published elsewhere for methods that require a priori studies of pronunciation errors.
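As a rough illustration of the idea in the abstract, the sketch below combines per-word scores into a sentence score using the minimum and the mean, then measures agreement with subjective sentence scores through the Pearson correlation coefficient. It is not the authors' implementation. The function names, the 1-to-5 score scale, and all numeric values are assumptions made up for this example, and the first-impression criterion is left out because the abstract does not say how it is computed.

```python
from math import sqrt
from statistics import mean


def sentence_score_min(word_scores):
    """Sentence score taken as the lowest word score in the sentence."""
    return min(word_scores)


def sentence_score_mean(word_scores):
    """Sentence score taken as the average word score in the sentence."""
    return mean(word_scores)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: objective word scores for four sentences and one
# subjective score per sentence. These numbers are invented for illustration
# and are not taken from the paper.
objective_word_scores = [
    [4, 5, 3],
    [2, 2, 1],
    [5, 5, 4],
    [3, 1, 4],
]
subjective_sentence_scores = [3, 1, 4, 2]

objective_min = [sentence_score_min(ws) for ws in objective_word_scores]
objective_mean = [sentence_score_mean(ws) for ws in objective_word_scores]

print("min criterion correlation:",
      pearson(objective_min, subjective_sentence_scores))
print("mean criterion correlation:",
      pearson(objective_mean, subjective_sentence_scores))
```

In a real evaluation the word scores would come from the recognizer-based word metrics and the subjective scores from human raters; the toy lists above only show the shape of the computation.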

