Comparison of Adult and Child's Speech Recognition of Korean

한국어에서의 성인과 유아의 음성 인식 비교

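  • 유재권 (Intelligent Multimedia Laboratory, Department of Computer Science and Information Communication, Duksung Women's University)
  • 이경미 (Department of Computer Science, Duksung Women's University)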
  • Received : 2011.01.14
  • Accepted : 2011.03.31
  • Published : 2011.05.28

Abstract

Most Korean speech databases have been built from adults' speech rather than children's speech, whereas children's speech databases already exist for various other languages. Because children's and adults' speech differ widely in acoustic and linguistic characteristics, a Korean children's speech database needs to be developed. To examine these differences in Korean, we built HMM-based speech recognizers and tested them according to gender, age, and the presence or absence of VTLN (Vocal Tract Length Normalization). The results show that a recognizer trained on children's speech achieves a much higher recognition rate than one trained on adults' speech, and that VTLN helps improve the recognition rate in Korean.
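To make the VTLN step concrete, the following is a minimal sketch of a piecewise-linear frequency warp, one common way to implement vocal tract length normalization. The abstract does not say which warping function or warp-factor search was used, so the function name `vtln_warp`, the 8 kHz band edge, and the 0.875 cutoff ratio are all illustrative assumptions, not the authors' settings.

```python
def vtln_warp(freq_hz, alpha, f_max=8000.0, cut_ratio=0.875):
    """Warp one frequency (Hz) with a piecewise-linear VTLN function.

    Hypothetical helper for illustration only; f_max and cut_ratio are
    assumed values, not parameters reported in the paper.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not 0.0 <= freq_hz <= f_max:
        raise ValueError("freq_hz must lie in [0, f_max]")

    # Below the cutoff the axis is scaled by alpha. The cutoff shrinks
    # when alpha > 1 so that alpha * cut never exceeds cut_ratio * f_max.
    cut = cut_ratio * f_max * min(1.0, 1.0 / alpha)
    if freq_hz <= cut:
        return alpha * freq_hz

    # Above the cutoff a second linear segment joins (cut, alpha * cut)
    # to (f_max, f_max), so the band edge stays fixed.
    slope = (f_max - alpha * cut) / (f_max - cut)
    return alpha * cut + slope * (freq_hz - cut)


if __name__ == "__main__":
    # Print the warp of a few frequencies for three candidate factors.
    for alpha in (0.9, 1.0, 1.1):
        warped = [round(vtln_warp(f, alpha), 1) for f in (500.0, 2000.0, 7500.0)]
        print(alpha, warped)
```

Under the convention used in this sketch, a factor below 1 moves spectral peaks toward lower frequencies and a factor above 1 moves them higher. In practice the factor is usually chosen per speaker, for example by trying a small grid of candidate values and keeping the one the acoustic model scores best.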

Keywords

Speech Recognition; Comparison of Adult and Children's Speech; HMM; VTLN
