Fluency Scoring of English Speaking Tests for Nonnative Speakers Using a Native English Phone Recognizer

  • Received : 2015.05.21
  • Accepted : 2015.06.16
  • Published : 2015.06.30

Abstract

We propose a new method for automatic fluency scoring of English speaking tests taken by nonnative speakers in a free-talking style. Unlike previous methods, the proposed method does not require transcriptions of the spoken utterances. First, an input utterance is segmented into a phone sequence by a phone recognizer trained on native speech databases. For each utterance, a feature vector with 6 features is extracted from the segmentation results of the phone recognizer. A fluency score is then computed by applying support vector regression (SVR) to the feature vector; the SVR parameters are learned from the rater scores for the utterances. In computer experiments with 3 tests taken by 48 Korean adults, we show that speech rate, phonation time ratio, and smoothed unfilled pause rate are the best features for fluency scoring. The correlation between the rater score and the SVR score is 0.84, which is higher than the correlation of 0.78 among raters. Although this is slightly lower than the correlation of 0.90 obtained when the transcribed texts are given, it implies that the proposed method can be used as a preprocessing tool for fluency evaluation of speaking tests.
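As a rough illustration of the feature-extraction step, the following sketch computes three timing features from a phone-level segmentation. It is not the authors' implementation: the `Segment` type, the `"sil"` label for non-speech, the 0.2-second minimum pause length, and the exact feature definitions (phones per second, speech time over total time, pauses per second) are all assumptions chosen for illustration. The abstract does not specify how the unfilled pause rate is smoothed, so no smoothing is applied here.

```python
from dataclasses import dataclass
from typing import Dict, List

SILENCE = "sil"        # hypothetical label the recognizer gives to non-speech
MIN_PAUSE_SEC = 0.2    # assumed minimum duration for a silence to count as a pause


@dataclass
class Segment:
    """One recognizer output segment: a label with start/end times in seconds."""
    label: str
    start: float
    end: float


def fluency_features(segments: List[Segment]) -> Dict[str, float]:
    """Compute illustrative timing features from a time-ordered segmentation."""
    if not segments:
        raise ValueError("segmentation is empty")
    total = segments[-1].end - segments[0].start
    if total <= 0:
        raise ValueError("utterance duration must be positive")

    phones = [s for s in segments if s.label != SILENCE]
    # Every sufficiently long silence counts, including leading and trailing
    # ones; this is a simplifying assumption.
    pauses = [s for s in segments
              if s.label == SILENCE and (s.end - s.start) >= MIN_PAUSE_SEC]
    phonation_time = sum(s.end - s.start for s in phones)

    return {
        "speech_rate": len(phones) / total,             # phones per second
        "phonation_time_ratio": phonation_time / total,  # speech time / total time
        "unfilled_pause_rate": len(pauses) / total,      # pauses per second
    }


example = [
    Segment("sil", 0.0, 0.5),
    Segment("hh", 0.5, 0.6),
    Segment("ah", 0.6, 0.8),
    Segment("sil", 0.8, 0.9),   # shorter than MIN_PAUSE_SEC, not counted as a pause
    Segment("l", 0.9, 1.0),
    Segment("ow", 1.0, 1.5),
    Segment("sil", 1.5, 2.0),
]
features = fluency_features(example)
```

A feature vector built this way for each utterance could then be passed, together with the rater scores, to any SVR implementation (for example `sklearn.svm.SVR`) to learn the scoring function.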
